An Editor to Autocomplete Sentences in Urdu Using Natural Language Processing
Natural language processing (NLP) lies at the intersection of computer science, linguistics and machine learning.
The field focuses on communication between computers and humans in natural language, that is, on making
computers understand and generate human language. Autocomplete is widely used in search interfaces, where it
assists users by offering a list of options based on the characters entered in the search field. Our work will lay
a foundation for text prediction in Urdu, an inflected and under-resourced language. The interface will be usable
in different apps and will predict sentences after the user types the initial words. It is based on the N-gram
model, which is built by counting how often word sequences occur in a text corpus and then estimating
probabilities from those counts. Gmail recently introduced a feature called "Smart Compose" that suggests
complete sentences in emails so that users do not have to type them out in full. Our intention is similar but not
limited to emails: we will extend it to story writing, and it will also benefit columnists writing articles and
blogs in Urdu, since most people are not fluent in Urdu typing.
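The counting-and-estimating step described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the function names (`train_ngram`, `next_word_probs`, `complete`), the sentence-boundary markers `<s>` and `</s>`, the three-sentence toy corpus and the greedy completion strategy are all assumptions made for illustration, and no smoothing is applied.

```python
from collections import Counter, defaultdict

def train_ngram(sentences, n=2):
    """Count how often each word follows each (n-1)-word context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Pad with hypothetical boundary markers so sentence starts and ends are modelled.
        tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            counts[context][tokens[i + n - 1]] += 1
    return counts

def next_word_probs(counts, context):
    """Maximum-likelihood estimate: count(context, word) / count(context)."""
    followers = counts.get(tuple(context), Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()} if total else {}

def complete(counts, prefix, n=2, max_words=10):
    """Greedily extend the prefix with the most probable next word."""
    words = prefix.split()
    for _ in range(max_words):
        history = ["<s>"] * (n - 1) + words
        context = history[len(history) - (n - 1):]
        probs = next_word_probs(counts, context)
        if not probs:
            break  # unseen context: nothing to suggest without smoothing
        best = max(probs, key=probs.get)
        if best == "</s>":
            break  # the model predicts the sentence ends here
        words.append(best)
    return " ".join(words)

# Toy corpus, invented for this example only.
corpus = [
    "میں اسکول جاتا ہوں",
    "میں بازار جاتا ہوں",
    "وہ اسکول جاتا ہے",
]
counts = train_ngram(corpus, n=2)
first_two_words = " ".join(corpus[1].split()[:2])
completed = complete(counts, first_two_words)  # extends the prefix word by word
```

With a bigram model (`n=2`) each suggestion depends only on the previous word, so a real system would likely need a larger `n`, a much larger corpus and some form of smoothing for unseen contexts.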
The objectives of our project are:
• To collect a dataset from different Urdu websites and Urdu textbooks using web scraping techniques.
• To pre-process the dataset.
• To develop a deep learning model that can be trained on Urdu text.
• To build a user interface for the editor that takes a few initial words and autocompletes the sentence.
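The pre-processing objective is not detailed in the text, so the following is only one plausible sketch of what cleaning scraped Urdu text could involve. Everything in it is an assumption: the function name `clean_sentences`, the choice to strip diacritics, the restriction to the main Arabic Unicode block (U+0600 to U+06FF) and the set of sentence-ending marks. The scraping step itself is not shown.

```python
import re
import unicodedata

# Assumption: sentences end with the Urdu full stop (U+06D4),
# the Arabic question mark (U+061F) or an exclamation mark.
SENTENCE_END = re.compile(r"[\u06D4\u061F!]+")
# Assumption: diacritics (zabar, zer, pesh, ...) are optional and can be dropped.
DIACRITICS = re.compile(r"[\u064B-\u0652]")
# Assumption: keep only characters from the main Arabic Unicode block and whitespace,
# which discards Latin letters, ASCII digits and leftover HTML fragments.
NON_URDU = re.compile(r"[^\u0600-\u06FF\s]")
# Remaining in-block punctuation: comma, semicolon, question mark, full stop.
PUNCTUATION = re.compile(r"[\u060C\u061B\u061F\u06D4]")

def clean_sentences(raw_text):
    """Split raw text into cleaned, whitespace-tokenised Urdu sentences."""
    text = unicodedata.normalize("NFC", raw_text)
    sentences = []
    for chunk in SENTENCE_END.split(text):
        chunk = DIACRITICS.sub("", chunk)
        chunk = NON_URDU.sub(" ", chunk)
        chunk = PUNCTUATION.sub(" ", chunk)
        tokens = chunk.split()
        if tokens:  # skip chunks that contained no Urdu text at all
            sentences.append(" ".join(tokens))
    return sentences

# Invented sample text for illustration.
sample = "میں اسکول جاتا ہوں۔ وہ کہاں ہے؟"
sentences = clean_sentences(sample)  # a list with one cleaned string per sentence
```

The output is a list of space-separated sentences, which is the form a word-level language model would typically be trained on. Whether diacritics and digits should really be removed depends on the corpus and is a design choice left open here.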
• Tools: Python is the most suitable programming language for our project; it is widely used for NLP because of
its simple syntax and structure and its rich text-processing tools. We will use Visual Studio Code and Jupyter
Notebook as our development environments. Working with Python in VS Code using the Microsoft Python
extension is simple and productive; the extension makes VS Code an excellent Python editor.
• Modeling and Training: A model will be chosen after testing different approaches, such as a generative N-gram
language model, an RNN and instance-based sentence completion. These models provide a natural approach to
building sentence completion systems. Because predictive analysis depends entirely on the accuracy of the model
making the predictions, these models, which are in common use today, will each be trained and the most
accurate one will be selected after comparative analysis.
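The comparative analysis could be organised along the following lines. This is a sketch under stated assumptions: the text does not say how accuracy is measured, so top-k next-word accuracy on held-out sentences is used here as one possible metric, and a unigram baseline and a bigram predictor stand in for the candidate models named above (an RNN or an instance-based model would plug in as another callable). All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def top_k_accuracy(predict, sentences, k=3):
    """Share of positions where the true next word is among the top-k suggestions."""
    hits = total = 0
    for sentence in sentences:
        words = sentence.split()
        for i in range(1, len(words)):
            suggestions = predict(words[:i])[:k]
            hits += words[i] in suggestions
            total += 1
    return hits / total if total else 0.0

def make_unigram_predictor(train):
    """Baseline: always suggest the most frequent words, ignoring context."""
    freq = Counter(w for s in train for w in s.split())
    ranked = [w for w, _ in freq.most_common()]
    return lambda history: ranked

def make_bigram_predictor(train):
    """Suggest the words that most often followed the previous word."""
    followers = defaultdict(Counter)
    for s in train:
        words = s.split()
        for prev, nxt in zip(words, words[1:]):
            followers[prev][nxt] += 1
    return lambda history: [
        w for w, _ in followers.get(history[-1], Counter()).most_common()
    ]

# Toy data, invented for this example only.
train = [
    "میں اسکول جاتا ہوں",
    "میں بازار جاتا ہوں",
    "وہ اسکول جاتا ہے",
]
held_out = ["وہ بازار جاتا ہے"]

candidates = {
    "unigram": make_unigram_predictor(train),
    "bigram": make_bigram_predictor(train),
}
results = {name: top_k_accuracy(p, held_out, k=2) for name, p in candidates.items()}
best = max(results, key=results.get)  # name of the most accurate candidate
print(results, best)
```

Because every candidate is just a function from the words typed so far to a ranked list of suggestions, the same evaluation loop can score all models on the same held-out data, which keeps the comparison fair.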
The benefits of our project are as follows:
• We aim to help writers who find it difficult to write in Urdu by offering tailored suggestions that complete
their sentences as they type.
• As most people are not fluent with the Urdu keyboard, we aim to provide a service that saves time and reduces
typos.
• In academics, where there are many Urdu textbooks, our project can help reduce typing time.
• Many Urdu novels are published monthly and many Urdu newspapers daily in our country; the project will
help in this area as well.
The final deliverable will be a machine learning model trained on Urdu text to autocomplete Urdu sentences. The model used will be an N-gram model. An editor will be provided in order to showcase the model.
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| NLP Coursera Specialization | Miscellaneous | 0 | 8000 | 0 |
| Total (in Rs) | | | | 0 |