Adil Khan 11 months ago
AdiKhanOfficial #FYP Ideas

An Editor to Autocomplete sentences in Urdu using Natural Language Processing


Project Title

An Editor to Autocomplete sentences in Urdu using Natural Language Processing

Project Area of Specialization

Artificial Intelligence

Project Summary

Natural language processing (NLP) sits at the intersection of computer science, linguistics and machine learning.
The field focuses on communication between computers and humans in natural language, that is, on making
computers understand and generate human language. Autocomplete is widely used in search interfaces, where it
helps users by offering a list of options based on the characters entered in the search field. Our work will lay a
foundation for text prediction in Urdu, an inflected and under-resourced language. The interface application can
be used in different apps and will predict the rest of a sentence after the user types the initial words. It is based
on the N-gram model. An N-gram model is built by counting how often word sequences occur in corpus text and
then estimating probabilities from those counts. Recently, Gmail has been rolling out a feature called "Smart
Compose" that suggests complete sentences in emails so that users do not have to type the whole thing out. Our
intention is similar but not limited to emails: we will extend it to story writing, and it will also benefit columnists
writing articles and blogs in Urdu, as most people are not fluent in Urdu typing.
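
The counting-and-estimating step described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the project's final model: the function names (train_ngram, next_word_probs, complete), the "<s>" and "</s>" boundary markers, the three-sentence toy corpus and the greedy choice of the most probable word are all hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict

def train_ngram(sentences, n=3):
    """Count how often each word follows each (n-1)-word context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Pad with boundary markers so sentence starts and ends are modelled too.
        tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            counts[context][tokens[i]] += 1
    return counts

def next_word_probs(counts, context):
    """Maximum-likelihood estimate: P(word | context) = count / total."""
    followers = counts.get(tuple(context), Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()} if total else {}

def complete(counts, prefix, n=3, max_words=20):
    """Greedily append the most probable next word until the end marker."""
    out = prefix.split()
    tokens = ["<s>"] * (n - 1) + out
    for _ in range(max_words):
        context = tokens[len(tokens) - (n - 1):]
        probs = next_word_probs(counts, context)
        if not probs:
            break  # unseen context: nothing to suggest
        word = max(probs, key=probs.get)
        if word == "</s>":
            break
        out.append(word)
        tokens.append(word)
    return " ".join(out)

# Toy corpus (assumption): a real model would be trained on scraped Urdu text.
corpus = [
    "میں اسکول جاتا ہوں",
    "میں اسکول جاتا ہوں",
    "میں گھر جاتا ہوں",
]
model = train_ngram(corpus, n=3)
first_word = corpus[0].split()[0]
print(complete(model, first_word))
```

With real data, unseen contexts are common, so a practical version would likely add smoothing or back off to shorter contexts.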

Project Objectives

The objectives of our project are:
• To collect a dataset from different Urdu websites and Urdu textbooks using web scraping techniques.
• To pre-process the dataset.
• To develop a deep learning model that can be trained on Urdu text.
• To build a user-friendly editor interface that takes a few initial words and autocompletes the sentence.
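
As an illustration of the pre-processing objective, the sketch below cleans raw scraped text and splits it into sentences. It is only one possible pipeline under our own assumptions: the function name preprocess is hypothetical, and the choices to strip diacritics, to split on the Urdu full stop (U+06D4), the Arabic question mark (U+061F) and "!", and to drop every character outside the Arabic Unicode block are ours, not requirements stated in the proposal.

```python
import re
import unicodedata

# Arabic-script diacritics (U+064B to U+0652) and superscript alef (U+0670).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
# Sentence-ending marks: Urdu full stop, Arabic question mark, exclamation mark.
SENTENCE_END = re.compile(r"[\u06D4\u061F!]+")
# Anything outside the Arabic block that is not whitespace (Latin text, ASCII digits, etc.).
NON_URDU = re.compile(r"[^\u0600-\u06FF\s]")
# Punctuation inside the Arabic block: comma (U+060C) and semicolon (U+061B).
INLINE_PUNCT = re.compile(r"[\u060C\u061B]")

def preprocess(raw_text):
    """Return a list of cleaned, whitespace-tokenised sentences."""
    text = unicodedata.normalize("NFC", raw_text)  # unify equivalent encodings
    text = DIACRITICS.sub("", text)                # assumption: ignore diacritics
    sentences = []
    for chunk in SENTENCE_END.split(text):
        chunk = NON_URDU.sub(" ", chunk)
        chunk = INLINE_PUNCT.sub(" ", chunk)
        tokens = chunk.split()
        if tokens:
            sentences.append(" ".join(tokens))
    return sentences

sample = "میں اسکول جاتا ہوں۔ تم کہاں ہو؟"
for sentence in preprocess(sample):
    print(sentence)
```

The output is one sentence per list entry with words separated by single spaces, which is the input shape a word-level language model typically expects.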

Project Implementation Method

• Tools: Python is the most suitable programming language for our project, as it is a leading language for
NLP thanks to its simple syntax and structure and its rich text-processing tools. We will use Visual Studio Code and
Jupyter Notebook as our development environment. Working with Python in VS Code using the Microsoft Python
extension is simple and productive; the extension makes VS Code an excellent Python editor.

• Modeling and Training: An efficient model will be chosen after testing different models such as a generative
N-gram language model, an RNN and instance-based sentence completion. These models provide a natural approach
to the construction of sentence completion systems. As predictive analysis depends entirely on the accuracy of the
model making the predictions, each of these currently used models will be trained, and the most accurate one will
be selected after a comparative analysis.
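
The comparative analysis could be organised along the following lines. This is a hedged sketch, not the project's evaluation protocol: the proposal does not define "accuracy", so top-k next-word accuracy is our assumption, and the two count-based predictors below are simple stand-ins for the candidate models (N-gram, RNN, instance-based). All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def top_k_accuracy(predict, test_sentences, k=3):
    """Fraction of next-word positions where the true word is in the top-k suggestions."""
    hits = total = 0
    for sentence in test_sentences:
        words = sentence.split()
        for i in range(1, len(words)):
            suggestions = predict(words[:i])[:k]
            hits += words[i] in suggestions
            total += 1
    return hits / total if total else 0.0

def compare_models(candidates, test_sentences, k=3):
    """Score every candidate predictor and return the best name with all scores."""
    scores = {name: top_k_accuracy(fn, test_sentences, k) for name, fn in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

def make_unigram_predictor(train):
    """Stand-in baseline: always suggest the most frequent words overall."""
    freq = Counter(w for s in train for w in s.split())
    ranked = [w for w, _ in freq.most_common()]
    return lambda prefix: ranked

def make_bigram_predictor(train):
    """Stand-in model: suggest the words that most often followed the last word."""
    follow = defaultdict(Counter)
    for s in train:
        words = s.split()
        for a, b in zip(words, words[1:]):
            follow[a][b] += 1
    return lambda prefix: [w for w, _ in follow[prefix[-1]].most_common()]

# Toy train/test split (assumption): real experiments need a held-out Urdu corpus.
train = ["میں اسکول جاتا ہوں", "میں اسکول جاتا ہوں", "میں گھر جاتا ہوں"]
test = ["میں گھر جاتا ہوں"]
candidates = {
    "unigram": make_unigram_predictor(train),
    "bigram": make_bigram_predictor(train),
}
best, scores = compare_models(candidates, test, k=1)
print(best, scores)
```

Because every candidate is wrapped as a function from a word prefix to a ranked list of suggestions, an RNN or instance-based model could be plugged into the same comparison without changing the scoring code.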

Benefits of the Project

The benefits of our project are as follows:
• We aim to help writers who find it difficult to write in Urdu by offering tailored suggestions that complete
their sentences as they type.
• As most people are not fluent with the Urdu keyboard, we aim to provide a service that saves time and reduces
typos.
• In academia, there are many Urdu textbooks; our project can help reduce the time spent typing them.
• Many Urdu novels are published monthly in our country, along with many daily Urdu newspapers; the project will
help in this area as well.

Technical Details of Final Deliverable

The final deliverable will be a machine learning model trained on Urdu text to autocomplete Urdu sentences. The machine learning model used will be an N-gram model. There will also be an editor in order to showcase the machine learning model.

Final Deliverable of the Project

Software System

Core Industry

IT

Other Industries

Core Technology

Artificial Intelligence(AI)

Other Technologies

Sustainable Development Goals

Industry, Innovation and Infrastructure

Required Resources

Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs)
NLP Coursera Specialization | Miscellaneous | 0 | 8000 | 0
Total (in Rs) | 0
If you need this project, please contact me on contact@adikhanofficial.com