A study of content based spam SMS detection for local languages
2025-06-28 16:30:07 - Adil Khan
Project Area of Specialization: Artificial Intelligence
Project Summary
A spam message is generally any unwanted message sent to a user's mobile phone. Spam messages include advertisements, free services, promotions, awards, etc. In this project, we try to find the best possible solution for filtering spam SMS messages in local languages. We will use many different frameworks, and the one that best solves the problem will be chosen.
Project Objectives
- To study whether machine learning techniques can be used to distinguish local-language spam SMS messages from normal messages.
- To evaluate and compare the efficiency of different machine learning techniques for content-based spam SMS detection. The one that best solves the problem will be chosen.
Initially, an application will be built and distributed to users. Through this application, data will be gathered: the messages that recipients mark as spam, the recipient's local language, and the sources they consider unwanted. Sources may include, but are not limited to, network providers, companies, and marts. Once the data is gathered, a content-based study of those messages will be carried out to determine whether machine learning techniques can distinguish local-language spam SMS from normal messages. Multiple machine learning frameworks will be evaluated, and the one that best solves the problem will be chosen.
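As an illustration of what a content-based study could look like, the sketch below trains a small multinomial Naive Bayes classifier on word tokens. This is only one candidate technique, chosen here as an assumption; the proposal does not name a specific algorithm. The `TRAIN` messages, the `tokenize` helper, and the `NaiveBayes` class are hypothetical and were invented for this example; they are not data or code from the project.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy messages in Roman Urdu, invented for illustration only.
# Real training data would come from the messages users mark in the app.
TRAIN = [
    ("Mubarak ho! Aap ne free inami balance jeeta hai, abhi call karein", "spam"),
    ("Free offer: sirf aaj 50% discount, jaldi reply karein", "spam"),
    ("Inami scheme mein aap ka number nikla hai, call karein", "spam"),
    ("Kal class kitne baje hai?", "ham"),
    ("Ammi ne kaha hai ghar jaldi aa jana", "ham"),
    ("Main aaj office se late niklunga", "ham"),
]


def tokenize(text):
    """Lowercase the message and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, examples):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter()            # label -> number of messages
        self.vocab = set()
        for text, label in examples:
            tokens = tokenize(text)
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Start from the log prior, then add one log likelihood per token.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # words never seen in training carry no evidence
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = NaiveBayes().fit(TRAIN)
print(model.predict("Free inami balance, call karein"))
print(model.predict("Kal office kitne baje hai"))
```

Word-level tokens are the simplest choice; because Roman Urdu spelling is not standardized, character n-grams might cope better with spelling variants, but that is a design question the study itself would have to answer.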
Benefits of the Project
- The project will determine whether machine learning techniques can be applied to local languages written in Roman script in Pakistan and, if so, identify the framework that best solves the problem of spam messages in those languages.
- The framework identified will perform more efficiently than the other frameworks evaluated.
The final package will contain full details of whether machine learning can distinguish between spam and non-spam messages in Roman Urdu and other local languages in Pakistan. The techniques applied will be compared with one another to find the most efficient one.
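One way such a comparison could be carried out is to score each technique's predictions against the labels users assigned and pick the highest scorer. The sketch below assumes F1 on the spam class as the selection criterion; the proposal does not specify a metric, so this is an assumption. The labels in `GOLD`, the outputs in `PREDICTIONS`, and the names `technique_a` and `technique_b` are hypothetical placeholders, not project results.

```python
def precision_recall_f1(gold, predicted, positive="spam"):
    """Compute precision, recall, and F1 for the positive (spam) class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def best_technique(gold, predictions_by_technique):
    """Return the name of the technique with the highest spam-class F1."""
    return max(
        predictions_by_technique,
        key=lambda name: precision_recall_f1(gold, predictions_by_technique[name])[2],
    )


# Hypothetical labels and classifier outputs, invented for illustration only.
GOLD = ["spam", "spam", "ham", "ham", "spam", "ham"]
PREDICTIONS = {
    "technique_a": ["spam", "ham", "ham", "ham", "spam", "ham"],
    "technique_b": ["spam", "spam", "spam", "ham", "spam", "ham"],
}

for name, predicted in PREDICTIONS.items():
    p, r, f1 = precision_recall_f1(GOLD, predicted)
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print("selected:", best_technique(GOLD, PREDICTIONS))
```

F1 balances missed spam against normal messages wrongly flagged. If blocking a legitimate message is considered more costly than letting spam through, precision could be weighted more heavily instead.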
Final Deliverable of the Project: Software System
Type of Industry: Others, Security
Technologies: Artificial Intelligence (AI), Others
Sustainable Development Goals: Sustainable Cities and Communities, Responsible Consumption and Production
Required Resources
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Google Play Store Account | Equipment | 3 | 4500 | 13500 |
| Advertising the app | Equipment | 500 | 10 | 5000 |
| Total (in Rs) | | | | 18500 |