Offensive Language Detection Using Machine Learning
Offensive Language Detection Using Machine Learning (OLDUM) aims to develop a prototype system that uses machine learning to detect offensive words in the Pashto language. The goal is to help social media applications and websites automate the screening of audio/voice notes and thereby stop offensive activity.
This project will act as a prototype that law enforcement agencies and social media platforms can use to detect offensive talk in phone calls and audio files. Through this system, we want to help law enforcement agencies and social media companies trace such critical calls and stop the use of offensive language and cyberbullying.
The scope of our FYP is limited to selected words only. We will cover a limited set of Pashto words, enough to train and deploy OLDUM. Initially, we will train OLDUM on isolated words and connected words. Later, once the FYP prototype is complete, the system can be upgraded to handle spontaneous speech.
The system takes call audio as input and records suspicious words. It monitors calls against the suspicious words stored in the dataset and marks a recording as suspicious so that the user can review it.
Creation of a dataset of suspicious and non-suspicious words.
When the system is given new audio, it should match the words in the audio against the existing dataset. The dataset can be updated later. The system will process one call at a time. The dataset will be limited to selected words and the system will be trained on those words, although there will be an option to expand the dataset as needed. The system will need high processing power, so the hardware specifications must be chosen accordingly.
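The matching and flagging step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes a speech recognition stage (not shown) has already turned one call into a list of words, and the names `WordDataset` and `review_recording`, as well as the placeholder entries `word_a` and `word_b`, are hypothetical and not part of the project.

```python
from dataclasses import dataclass, field


@dataclass
class WordDataset:
    """Hypothetical container for the suspicious-word list; it can be extended later."""
    suspicious: set[str] = field(default_factory=set)

    def add(self, word: str) -> None:
        # Normalize so that matching ignores case and surrounding whitespace.
        self.suspicious.add(word.strip().lower())


def review_recording(recognized_words: list[str], dataset: WordDataset) -> dict:
    """Flag one recording if any recognized word appears in the suspicious set."""
    hits = [w for w in recognized_words if w.strip().lower() in dataset.suspicious]
    return {"suspicious": bool(hits), "matched_words": hits}


# Placeholder entries stand in for the selected Pashto words (assumption).
dataset = WordDataset()
for placeholder in ("word_a", "word_b"):
    dataset.add(placeholder)

# Words assumed to come from the recognizer for a single call.
result = review_recording(["hello", "Word_A", "friend"], dataset)
print(result)
```

Keeping the word list in a set makes each lookup cheap, and the `add` method reflects the stated option of expanding the dataset later without retraining this matching step.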
Keeping abreast of new keywords and monitoring all calls manually is a huge problem for law enforcement agencies and social applications. Researchers have made many efforts to systematically digitize this process through machine learning techniques. Pashto is still a problem for the agencies to cope with, whereas Urdu- and English-based systems already exist in the literature.
It benefits a low-resource language like Pashto by helping make the social media environment user friendly.
The final deliverable will consist of software (installed on a Raspberry Pi) and hardware.
The system will listen for specific Pashto words (audio) through a microphone and mark each one as offensive or not.
The software will have a login prompt where the user first registers and then uploads audio to check for offensive words.
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Raspberry Pi 4 Model B | Equipment | 1 | 25000 | 25000 |
| Microphone | Equipment | 2 | 4000 | 8000 |
| Extra | Miscellaneous | 1 | 7000 | 7000 |
| Total (in Rs) | | | | 40000 |