Automated Cyberbullying Detection
The cyberbullying detection system is an automated system that detects bullying comments directed at users. Cyberbullying is a form of bullying or harassment carried out by electronic means. Current approaches to identifying cyberbullying focus mostly on English-language text, so detection in other languages is important. This project proposes an approach for detecting cyberbullying in multiple languages (English, Hindi, and Roman Urdu). It uses machine learning techniques, applying several algorithms such as SVM (Support Vector Machine) and Naive Bayes, among others, to classify input data as bullying or non-bullying. The system would benefit the National Response Centre for Cyber Crime, the FIA department that handles cybercrime. The Prevention of Electronic Crimes Act 2016, passed by the National Assembly, makes cyberstalking a crime, and this project would help the centre detect cyberbullying more easily.
The project will be implemented using machine learning techniques. The model will be trained using a knowledge-base approach, and decisions will be made on labeled data. The system will provide a web interface where a user can write a comment, which is then classified by various machine learning algorithms. A further objective is to incorporate various distributed techniques into the proposed system and study their effect on the time and precision of cyberbullying detection.
Our approach has the following limitations:
• Sarcasm detection is out of the scope of our proposed system.
• The system can handle messages from only one language at a time.
• Finding Roman Urdu datasets can be challenging.
The system would benefit the National Response Centre for Cyber Crime, the FIA department that handles cybercrime. Under the Prevention of Electronic Crimes Act 2016, passed by the National Assembly, cyberbullying is treated as a crime, and this project would help the centre detect it more easily.
Automated cyberbullying detection will be implemented with machine learning, using algorithms such as SVM, Naive Bayes, and Random Forest, among others, to achieve high accuracy.
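To make one of the named algorithms concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing, written with the Python standard library only. The class name `NaiveBayes`, its methods, and the label strings are assumptions made for illustration. The proposal does not specify an implementation, and a real system would more likely rely on an established machine learning library.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical illustration; not the project's actual implementation.
    """

    def fit(self, docs, labels):
        # docs: list of token lists; labels: one class label per document.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Calling `NaiveBayes().fit(docs, labels)` on tokenized, labeled comments returns a model whose `predict(tokens)` yields the more probable label. Working in log space avoids numeric underflow on long comments.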
The machine learning model will be built through the following steps:

Input data:
We will collect the dataset from Twitter using Tweet Binder, an online service for data collection.
Data preprocessing:
Raw data cannot be used directly because it contains noise such as special characters and stop words. We will therefore remove stop words (e.g., a, an, and, are, the) and unnecessary characters (such as #, @, $, and %) as well as URLs.
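The cleaning step described above can be sketched as follows. This is a minimal illustration using only the standard library. The stop-word list is a small hypothetical sample, and the function name `preprocess` is an assumption. The character filter is aimed at Latin-script text (English and Roman Urdu); Devanagari text would likely need a different filter, because this pattern can strip combining vowel signs.

```python
import re

# Small illustrative stop-word sample; a real list would be longer
# and specific to each language.
STOP_WORDS = {"a", "an", "and", "are", "the", "is", "to", "of"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_WORD_RE = re.compile(r"[^\w\s]")  # anything that is not a word char or space


def preprocess(text):
    """Lowercase, strip URLs and special characters, and drop stop words."""
    text = URL_RE.sub(" ", text.lower())
    text = NON_WORD_RE.sub(" ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]
```

For example, `preprocess("You are SUCH a loser!!! @user #fail http://t.co/xyz")` returns `["you", "such", "loser", "user", "fail"]`. URLs are removed before the symbol filter so that their fragments do not survive as tokens.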
Feature Extraction:
Feature extraction will be performed to obtain elements such as pronouns, adjectives, nouns, and shorthand text in the comments, along with statistics on the presence of words in the posts.
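As a rough sketch of this step, the function below turns a token list into simple count features: per-word counts plus counts of pronouns and shorthand tokens. The two lexicons and the name `extract_features` are hypothetical and exist only for illustration. Identifying nouns and adjectives would additionally require a part-of-speech tagger, which is not shown here.

```python
from collections import Counter

# Hypothetical mini-lexicons, for illustration only.
PRONOUNS = {"i", "you", "he", "she", "they", "we", "u"}
SHORTHAND = {"u", "ur", "lol", "omg", "idk", "stfu"}


def extract_features(tokens):
    """Map a token list to a dict of simple count features."""
    features = Counter(f"word={tok}" for tok in tokens)
    features["n_tokens"] = len(tokens)
    features["n_pronouns"] = sum(tok in PRONOUNS for tok in tokens)
    features["n_shorthand"] = sum(tok in SHORTHAND for tok in tokens)
    return dict(features)
```

The `word=` prefix keeps word features from colliding with the summary counts, so a comment containing the literal token `n_tokens` would not overwrite the statistic.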
Train the model:
We will then split the dataset, using 80% for training and 20% for testing, and perform 10-fold cross-validation in all our experiments. Cross-validation averages the score over the folds, so the reported result does not depend on one particular division of the data.
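The split and the cross-validation loop can be sketched in plain Python as below. The sketch assumes that the 10 folds are drawn from the 80% training portion, which the text does not state explicitly. The function names and the `score_fn(train, val)` callback are hypothetical.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n_items, k=10):
    """Yield (train_idx, val_idx) pairs; each index is validated exactly once."""
    indices = list(range(n_items))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def cross_val_mean(score_fn, data, k=10):
    """Average score_fn(train, val) over k folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(data), k):
        train = [data[i] for i in train_idx]
        val = [data[i] for i in val_idx]
        scores.append(score_fn(train, val))
    return sum(scores) / len(scores)
```

A fixed `seed` makes the split reproducible, which helps when comparing several algorithms on identical folds.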
Prediction and Evaluation:
Finally, we will predict the outcome (cyberbullying or non-cyberbullying) on the test portion of the dataset using the trained model and evaluate the model with metrics such as precision, recall, and F-score.
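The metrics named above follow from the counts of true positives, false positives, and false negatives. The sketch below is a minimal illustration that computes them for one positive class and uses the F1 variant of the F-score. The function name and the default label string `"bullying"` are assumptions.

```python
def precision_recall_f1(y_true, y_pred, positive="bullying"):
    """Return (precision, recall, F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Precision and recall matter more than raw accuracy here because bullying comments are usually a minority class: a model that labels everything as non-bullying can score high accuracy while detecting nothing.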
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Tweet Binder API licence for data collection | Equipment | 1 | 41000 | 41000 |
| Deployment on Windows Azure | Equipment | 1 | 28000 | 28000 |
| Introduction to Python (Udemy course) | Miscellaneous | 1 | 4000 | 4000 |
| Natural Language Processing with Python (Udemy course) | Miscellaneous | 1 | 4000 | 4000 |
| Overhead | Miscellaneous | 1 | 2000 | 2000 |
| Total (in Rs) | | | | 79000 |