Offensive Language Detection Using Machine Learning
Offensive Language Detection Using Machine Learning (OLDUM) aims to develop a prototype system that uses machine learning to detect offensive words in the Pashto language. It is intended to help social media applications and websites automate the screening of audio/voice notes and thereby stop offensive activity.
This project will act as a prototype for law enforcement agencies and social media platforms to detect offensive talk in phone calls and audio files. Through this system, we want to help law enforcement and social media companies trace such critical calls and stop the use of offensive language and cyberbullying.
The scope of our final year project (FYP) is limited to selected words only. We will cover a limited set of Pashto words, which will be enough to train and deploy OLDUM. Initially, we will train OLDUM on isolated words and connected words. After the FYP prototype is complete, the system can be upgraded to handle spontaneous speech.
The system takes call audio as input and records suspicious words. It monitors calls based on the suspicious words stored in the dataset. A recording that contains such words is marked as suspicious so that it can be reviewed by the user.
Creation of a dataset of suspicious and non-suspicious words.
When the system is provided with new audio, it should match the words in the audio against the existing dataset. The dataset can be updated later. The system will be able to process one call at a time. The dataset will be limited to selected words and the system will be trained on those words, although there will be an option to expand the dataset as needed. The system will need high processing power, so the hardware specifications will have to be chosen accordingly.
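The matching and flagging step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes a separate speech recognition stage has already turned one call into a list of recognized words, and the names `WordDataset`, `review_call`, and the placeholder entry `word_a` are hypothetical, not part of the OLDUM design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class WordDataset:
    """Hypothetical store of suspicious words; it can be expanded later."""
    suspicious: Set[str] = field(default_factory=set)

    def add(self, word: str) -> None:
        # Normalise so that matching ignores case and stray whitespace.
        self.suspicious.add(word.strip().lower())


def review_call(recognized_words: List[str], dataset: WordDataset) -> Dict[str, object]:
    """Check the words of ONE call against the dataset.

    `recognized_words` is assumed to come from an upstream recognizer
    (not shown here). The call is marked suspicious if any word matches.
    """
    hits = [w for w in recognized_words if w.strip().lower() in dataset.suspicious]
    return {"suspicious": bool(hits), "matched_words": hits}


if __name__ == "__main__":
    dataset = WordDataset()
    dataset.add("word_a")  # placeholder entry, not a real dataset word
    print(review_call(["hello", "word_a"], dataset))
```

A set lookup keeps each check cheap, and adding a word with `add` corresponds to the option of expanding the dataset later.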
Keeping abreast of new keywords and monitoring all calls manually is a huge problem faced by law enforcement agencies and social applications. Researchers have made many efforts to systematically digitize this process through machine learning techniques. Pashto is still a problem for the agencies to cope with, as the systems that already exist in the literature are based on Urdu and English.
It benefits a less-resourced language like Pashto by making the social media environment more user friendly.
The final deliverable will consist of software (installed on a Raspberry Pi) and hardware.
The system will capture specific Pashto words (audio) through a microphone and mark them as offensive or not.
The software will have a login prompt where the user can first register and then upload audio to check for offensive words.
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Raspberry Pi 4 Model B | Equipment | 1 | 25000 | 25000 |
| Microphone | Equipment | 2 | 4000 | 8000 |
| Extra | Miscellaneous | 1 | 7000 | 7000 |
| Total (in Rs) | | | | 40000 |