Adil Khan 1 year ago
AdiKhanOfficial #FYP Ideas

Named Entity Recognition for Balochi


Project Title

Named Entity Recognition for Balochi

Project Area of Specialization

Artificial Intelligence

Project Summary

Balochi, our mother tongue and the language spoken by Baloch people in Pakistan and around the globe, has not yet been digitalized. By digitalization we mean making a human language understandable to a computer. Balochi has a written form whose script resembles that of Urdu, Arabic, and other Arabic-script languages. Digitalizing a language starts with a set of basic tasks, such as part-of-speech tagging, named entity recognition, word embedding, and dictionary and translation management, after which more complex tasks can be performed. We have chosen named entity recognition (NER), a sub-task of natural language processing (NLP). Its purpose is to identify the named entities mentioned in a text and classify them into predefined categories, such as person names, organizations, locations, addresses, times, dates, currency, percentages, and other numeric values. This identification later helps in extracting meaningful content and values from raw text. Within the overall text analysis process, NER belongs to the field of unknown word recognition. It is an important component of various NLP tasks, such as information retrieval and machine translation. The main idea behind our project is to build an AI model that identifies named entities in Balochi text.

Project Objectives

The main objective of our project is to digitalize the Balochi language, that is, to make it understandable to machines. Digitalizing a language starts with a set of basic tasks, such as part-of-speech tagging, named entity recognition, word embedding, and dictionary and translation management, after which more complex tasks can be performed. We have chosen named entity recognition, which is the process of identifying various named entities (location, address, numeric value, date, time) in a given text. This identification will later help in extracting meaningful content and values from raw text.

Project Implementation Method

An NER system can be developed using three approaches: rule-based, machine learning, and hybrid. A rule-based system is difficult to develop because it requires knowledge of the language and its grammar rules. The machine learning approach offers various statistical NLP tools for training an NER system. The hybrid approach combines the rule-based and statistical approaches. We train our model using a supervised learning algorithm.
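As a rough illustration of the supervised approach, the sketch below trains a very simple baseline tagger that assigns each token the tag it carried most often in the training data. This is not the project's actual algorithm, which the description does not specify. The training sentences, the tag names, and the functions `train` and `predict` are all hypothetical, and the tokens are Latin-script placeholders, not real annotated Balochi data.

```python
from collections import Counter, defaultdict

# Hypothetical toy training set of (token, tag) pairs; "O" means "not an entity".
# Latin-script placeholders stand in for annotated Balochi sentences.
TRAIN = [
    [("Ali", "PERSON"), ("went", "O"), ("to", "O"), ("Quetta", "LOCATION")],
    [("Quetta", "LOCATION"), ("is", "O"), ("far", "O"), ("from", "O"), ("Gwadar", "LOCATION")],
    [("Ali", "PERSON"), ("paid", "O"), ("500", "NUMERIC")],
]


def train(sentences):
    """Learn, for every token seen in training, its most frequent tag."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for token, tag in sentence:
            counts[token][tag] += 1
    return {token: tags.most_common(1)[0][0] for token, tags in counts.items()}


def predict(model, tokens, default="O"):
    """Tag each token with its learned tag; unseen tokens get the default tag."""
    return [(token, model.get(token, default)) for token in tokens]


model = train(TRAIN)
result = predict(model, ["Ali", "visited", "Gwadar"])
print(result)
```

A baseline like this cannot recognize entities it has never seen, which is one reason a real system would likely use contextual features or a sequence model on top of a much larger annotated corpus.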

Benefits of the Project

This project is important in the context of natural language processing. We are specifically going to perform NER for Balochi. The project's outcome will help digitalize Balochi for the modern digital world. It can later serve as a supporting tool for larger NLP tasks, such as information retrieval or the retrieval of specific content from any given Balochi text. The project will familiarize computers with the various named entities of the Balochi language, and the resulting model can be reused in other projects.

Technical Details of Final Deliverable

Our project, Named Entity Recognition for Balochi (NERB), identifies named entities (person, location, address, date, numeric and financial values, etc.) in Balochi text and classifies them into predefined categories. It uses a predefined training set and is trained with a supervised machine learning algorithm on that set. After training, the system takes a sample text, tokenizes it, identifies the named entities among the tokens, and tags them with their categories.
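The tokenize, identify, and tag steps can be sketched as a small pipeline. To keep the example self-contained, a gazetteer lookup plus two simple rules stand in for the trained model. The gazetteer entries, the date format, the tag names, and the function names are assumptions made for illustration, and the sample text is a Latin-script placeholder, not real Balochi.

```python
import re

# Date-like tokens are matched first so that "12/05/2023" stays one token.
TOKEN_RE = re.compile(r"\d+(?:[/-]\d+)+|\w+|[^\w\s]")
DATE_RE = re.compile(r"\d{1,2}[/-]\d{1,2}[/-]\d{2,4}")

# Hypothetical gazetteer standing in for what a trained model would learn.
GAZETTEER = {
    "Ali": "PERSON",
    "Khan": "PERSON",
    "Gwadar": "LOCATION",
    "Quetta": "LOCATION",
}


def tokenize(text):
    """Split text into date-like tokens, words, and punctuation marks."""
    return TOKEN_RE.findall(text)


def tag_token(token):
    """Assign a category to a single token, or "O" if it is not an entity."""
    if DATE_RE.fullmatch(token):
        return "DATE"
    if token.isdigit():
        return "NUMERIC"
    return GAZETTEER.get(token, "O")


def extract_entities(text):
    """Tokenize, tag, and merge adjacent tokens that share the same tag."""
    entities = []
    previous_tag = "O"
    for token in tokenize(text):
        tag = tag_token(token)
        if tag != "O" and tag == previous_tag:
            # Extend the current entity, e.g. "Ali" + "Khan" -> "Ali Khan".
            last_text, _ = entities[-1]
            entities[-1] = (last_text + " " + token, tag)
        elif tag != "O":
            entities.append((token, tag))
        previous_tag = tag
    return entities


entities = extract_entities("Ali Khan reached Gwadar on 12/05/2023 with 500 rupees.")
print(entities)
```

In the actual deliverable, `tag_token` would be replaced by the prediction step of the trained model, while the tokenization and grouping steps could stay largely the same.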

Final Deliverable of the Project

Software System

Core Industry

IT

Other Industries

Core Technology

Artificial Intelligence (AI)

Other Technologies

Sustainable Development Goals

Quality Education

Required Resources

Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs)
50 MB Internet Connection | Equipment | 6 | 5230 | 31380
Modem | Equipment | 2 | 3000 | 6000
Twisted pair cable | Equipment | 300 | 20 | 6000
Ethernet cable | Equipment | 100 | 30 | 3000
Grammarly premium license | Equipment | 4 | 3000 | 12000
8 GB RAM | Equipment | 2 | 5400 | 10800
Printing | Miscellaneous | 10 | 100 | 1000
Stationery | Miscellaneous | 10 | 100 | 1000
Overhead | Miscellaneous | 6 | 1300 | 7800
Total (in Rs) | | | | 78980
If you need this project, please contact me at contact@adikhanofficial.com.