Adil Khan 11 months ago


ROMANIZED SINDHI TEXT FOR POS TAGGING AND SENTIMENT ANALYSIS


Project Title

Romanized Sindhi Text for POS Tagging and Sentiment Analysis

Project Area of Specialization

Artificial Intelligence

Project Summary

Artificial intelligence is advancing rapidly. It is transforming our world day by day, socially, economically, and politically. Artificial intelligence involves a variety of technologies and tools.

Some of the recent technologies are:

Natural Language Processing (NLP) is the intersection of computer science, linguistics, and machine learning. The field focuses on communication between computers and humans in natural language, that is, on making computers understand and generate human language.

A POS tag (or part-of-speech tag) is a special label assigned to each token (word) in a text corpus to indicate the part of speech and often also other grammatical categories such as tense, number (plural/singular), case, etc.

Sentiment Analysis is one of the most popular NLP techniques. It involves taking a piece of text (e.g., a comment, a review, or a document) and determining whether the sentiment it expresses is positive, negative, or neutral.

Project Objectives

Aims:

The aim of our research is to develop a platform for the Sindhi language that helps people of different regions communicate easily with each other.

Objectives:

To create a dataset for the Romanized Sindhi Language.
To preprocess the data.
To apply an appropriate POS tagger for Romanized Sindhi.
To split the dataset into training and test sets for use with the Naive Bayes Classifier and TF-IDF.
To classify the text for Sentiment Analysis.
To measure model accuracy using the Naive Bayes Classifier and TF-IDF.

Project Implementation Method

Text Documentation:

The Text Documentation phase covers how we collected the data for our research. Here we collected groups of Romanized Sindhi text in the form of unigrams, bigrams, trigrams, and general n-grams.
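The n-gram groups mentioned above can be illustrated with a short Python sketch. This is not the project's own code: the function name `ngrams` and the placeholder tokens are assumptions made for illustration, and a real run would use Romanized Sindhi words.

```python
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every run of n consecutive tokens, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Placeholder tokens (hypothetical); they stand in for Romanized Sindhi words.
tokens = ["t1", "t2", "t3", "t4"]
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
```

A four-token sentence gives four unigrams, three bigrams, and two trigrams, so longer n-grams become sparse quickly on a small dataset.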

Data Preprocessing:

In the Data Preprocessing phase, we read and store the datasets and transform the raw data into a useful and understandable format.
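The write-up does not list the individual preprocessing steps, so the following is only a minimal sketch of what such a step might look like. It assumes the Romanized text uses plain ASCII letters, and the function name `preprocess` is hypothetical.

```python
import re
from typing import List


def preprocess(text: str) -> List[str]:
    """Lowercase the text, drop punctuation and digits, split on whitespace."""
    text = text.lower()
    # Assumption: only a-z letters carry meaning; everything else becomes a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    return text.split()


# Placeholder input, not real Romanized Sindhi.
cleaned = preprocess("Abc, DEF!  ghi 42")
```

If the romanization scheme uses diacritics or apostrophes, the character class would need to be widened so those characters are not stripped.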

Feature Selection:

The Feature Selection phase is a key phase of our research. In this phase, we take the Romanized Sindhi text produced by the previous phase, Data Preprocessing.

Part-Of-Speech (POS) Tagger:

The Part-Of-Speech (POS) Tagger is a key phase of our research. The POS Tagger assigns a grammatical tag to every word in a collection of words.
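The write-up does not say which tagger or tag set was used. As an illustration only, here is a most-frequent-tag baseline (a unigram tagger), a common starting point and not necessarily the project's method. The tokens `w1`, `w2`, `w3` and the tag names are placeholders.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

TaggedSentence = List[Tuple[str, str]]


def train_unigram_tagger(tagged: List[TaggedSentence]) -> Dict[str, str]:
    """Map each word to the tag it carries most often in the training data."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sentence in tagged:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag_tokens(tokens: List[str], model: Dict[str, str],
               default: str = "UNK") -> List[Tuple[str, str]]:
    """Tag each token; words never seen in training get the default tag."""
    return [(tok, model.get(tok, default)) for tok in tokens]


# Hypothetical training data with placeholder tokens and tags.
training = [
    [("w1", "NOUN"), ("w2", "VERB")],
    [("w1", "NOUN"), ("w3", "ADJ")],
    [("w1", "VERB"), ("w2", "VERB")],
]
pos_model = train_unigram_tagger(training)
tagged = tag_tokens(["w1", "w3", "w9"], pos_model)
```

Here `w1` is tagged NOUN because that tag occurs twice against one VERB, and the unseen `w9` falls back to `UNK`. A baseline like this ignores context, so ambiguous words always get the same tag.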

Training Dataset:

In the Training Dataset phase, we created a dataset for the Romanized Sindhi language and trained our models on it.

Sentiment Classification:

Sentiment Classification is the main phase of our research. This phase classifies Romanized Sindhi text as either positive sentiment or negative sentiment.
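To make the classification step concrete, here is a minimal pure-Python sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is an illustration under assumptions, not the project's implementation: the function names are hypothetical, and the English tokens stand in for Romanized Sindhi words.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Model = Tuple[Counter, Dict[str, Counter], Set[str]]


def train_nb(docs: List[Tuple[List[str], str]]) -> Model:
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_docs: Counter = Counter()
    word_counts: Dict[str, Counter] = defaultdict(Counter)
    vocab: Set[str] = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict_nb(tokens: List[str], model: Model) -> str:
    """Return the label with the highest log-probability for the tokens."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = "", -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log((word_counts[label][tok] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Placeholder training data (hypothetical, English stand-ins).
train_docs = [
    (["good", "nice"], "positive"),
    (["good", "great"], "positive"),
    (["bad", "poor"], "negative"),
    (["bad", "awful"], "negative"),
]
nb_model = train_nb(train_docs)
label_a = predict_nb(["good"], nb_model)
label_b = predict_nb(["bad", "nice"], nb_model)
```

With this toy data, `label_a` is `positive` and `label_b` is `negative`, because "bad" occurs twice in the negative class while "nice" occurs only once in the positive class.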

Benefits of the Project

Benefits of Research Work:

Classifying text as either positive or negative sentiment is useful in various applications, such as:

Social media monitoring

Customer feedback

Brand monitoring and reputation management

Customer Support

Product Analysis

Technical Details of Final Deliverable

The aim of our research is to develop a model that helps the Sindhi language in its digital existence. Our work focused on training on a Romanized Sindhi dataset to perform sentiment classification in Python, using the Naïve Bayes Classifier, a supervised machine learning algorithm, and Term Frequency - Inverse Document Frequency (TF-IDF). We developed a 500-word dataset of Romanized Sindhi; the training set consists of 80% of the total data and the testing set consists of the remaining 20%.

The main problem is that people of different regions can speak and understand the Sindhi language but cannot read it in writing, because the written scripts of the Sindhi language differ across regions. Romanized Sindhi offers an easier way to read written Sindhi and would help people of different regions communicate with each other in writing, for example by e-mail, letter, or chat.
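The 80/20 division described above can be sketched as follows. This is an assumption-laden illustration: it treats the dataset as 500 entries, shuffles with a fixed seed for repeatability, and uses a hypothetical helper name `train_test_split`; the write-up does not say how the split was actually drawn.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(items: Sequence[T], test_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the items and return (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder dataset of 500 entries (hypothetical names).
dataset = [f"entry_{i}" for i in range(500)]
train_set, test_set = train_test_split(dataset)
```

With 500 entries, this gives 400 for training and 100 for testing.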

When we tested our Romanized Sindhi dataset in Python using Term Frequency - Inverse Document Frequency (TF-IDF) vectorization to classify text as either positive or negative sentiment, our model reached 67% accuracy.
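For readers unfamiliar with TF-IDF, here is a minimal sketch of the textbook weighting: term frequency within a document multiplied by the log of the inverse document frequency. The function name `tfidf` and the placeholder tokens are assumptions, and library implementations often use smoothed variants that give slightly different numbers.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per document."""
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Placeholder documents (hypothetical tokens).
vectors = tfidf([["a", "b"], ["a", "c"]])
```

The term `a` appears in every document, so its weight is zero, while `b` and `c` each appear in one document and receive a positive weight. This is why TF-IDF favours words that distinguish one text from the others.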

By applying the Naïve Bayes Classifier to classify text as either positive or negative sentiment, our model reached 76% accuracy.
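Accuracy here is the share of test items whose predicted label matches the true label. The sketch below shows the calculation. The label lists are made up for illustration, assuming a test set of 100 entries (20% of 500), in which case 76% accuracy would correspond to 76 correct predictions.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of positions where the predicted label equals the actual label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels: 76 of 100 predictions match the true label.
actual_labels = ["positive"] * 100
predicted_labels = ["positive"] * 76 + ["negative"] * 24
score = accuracy(predicted_labels, actual_labels)
```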

Final Deliverable of the Project

Software System

Core Industry

Other Industries

Core Technology

Artificial Intelligence (AI)

Other Technologies

Sustainable Development Goals

Industry, Innovation and Infrastructure

Required Resources

Item Name	Type	No. of Units	Per Unit Cost (in Rs)	Total (in Rs)
Samsung A72	Equipment	1	70000	70000
Binding Thesis Books	Miscellaneous	7	800	5600
Stationery	Miscellaneous	25	80	2000
Photocopies (rough work)	Miscellaneous	3	300	900
			Total (in Rs)	78500

If you need this project, please contact me on contact@adikhanofficial.com
