Adil Khan 9 months ago
AdiKhanOfficial #FYP Ideas

Text Reading Assistant for visually impaired people


Project Title

Text Reading Assistant for visually impaired people

Project Area of Specialization

Artificial Intelligence

Project Summary

The idea of this project is to develop a wearable, affordable speech assistant for visually impaired people. Using Artificial Intelligence-based models, it converts typed or printed text into machine-encoded text, then converts the recognized text into audio output to assist the user (visually challenged or blind).

Blind people are unable to perform visual tasks. For instance, reading text requires a braille reading system or a digital speech synthesizer (if the text is available in digital format). The majority of published printed works do not include braille or audio versions, and digital versions are still a minority. Thus, a device that can convert text in an image to speech has great potential and utility. We propose a Raspberry Pi-based assistive device for visually impaired people that uses different Artificial Intelligence-based models. Optical character recognition (OCR) enables the recognition of text data, and text-to-speech (TTS) synthesis enables text in digital format to be synthesized into a human voice and played through an audio system.

Project Objectives

• To detect text in natural scenes as well as in documents and papers with high accuracy.

• To extract the appropriate region of interest (ROI) for each piece of detected text and apply image processing to it.

• To implement Optical Character Recognition (OCR) and recognize simple words.

• To make OCR more robust by recognizing text from documents as well as from boards of different sizes.

• To remove as much noise as possible and obtain machine-encoded text.

• To implement a TTS method that converts machine-encoded text into voice using a Raspberry Pi.

• To extend the system so that it reads a variety of words and full sentences.

• To develop a systematic procedure for each task that can actually assist a visually impaired person.

• To design and develop a device that visually impaired people can carry, with the camera mounted in an appropriate place.
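
The ROI objective above can be illustrated with a small helper. This is only a sketch under assumptions: the function name `padded_roi`, the `(x1, y1, x2, y2)` box convention, and the 10% padding are hypothetical choices, not part of the project description. It grows a detected box slightly so that character edges are not cut off, and clamps the result to the image bounds.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels; an assumed convention


def padded_roi(box: Box, image_width: int, image_height: int,
               pad_ratio: float = 0.1) -> Box:
    """Grow a detected box by pad_ratio on each side and clamp it to the image."""
    x1, y1, x2, y2 = box
    pad_x = int(round((x2 - x1) * pad_ratio))
    pad_y = int(round((y2 - y1) * pad_ratio))
    return (
        max(0, x1 - pad_x),
        max(0, y1 - pad_y),
        min(image_width, x2 + pad_x),
        min(image_height, y2 + pad_y),
    )
```

With a NumPy-style image array, the region would then typically be cropped as `image[y1:y2, x1:x2]` before being passed to recognition.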

Project Implementation Method

The procedure involves several methods and techniques. The first step is to detect text in a given document, paper, board or natural scene using a deep learning model. If text is detected, the system extracts the ROI of each text region and passes it on for recognition, where Optical Character Recognition (OCR) is performed with the help of a second deep learning model. Once the text has been recognized, it is converted into machine-encoded text. A Text-To-Speech (TTS) method then converts that text into audio, which can be listened to through an audio output device.

Speech synthesis (TTS) enables text in digital format to be synthesized into a human voice and played through an audio system. The objective of TTS is the automatic conversion of unrestricted sentences into spoken discourse in a natural language, resembling the way a native speaker of the language would read the same text.

OCR is used in machine processes such as cognitive computing, machine translation, text-to-speech, key data and text mining. It is a field of research in character recognition, artificial intelligence and computer vision. The whole process will be performed on a Raspberry Pi, with Python as the programming language.
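
The flow described above can be sketched as a chain of four stages. This is a minimal sketch, not the project's actual code: the function name `read_text_aloud` and the four callables passed to it are hypothetical placeholders for the detection model, ROI cropping, the recognition model and the TTS engine.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels; an assumed convention


def read_text_aloud(
    image: object,
    detect: Callable[[object], Sequence[Box]],
    crop: Callable[[object, Box], object],
    recognize: Callable[[object], str],
    speak: Callable[[str], None],
) -> str:
    """Run detection -> ROI extraction -> recognition -> speech on one image."""
    boxes = detect(image)
    if not boxes:
        return ""  # nothing detected, so nothing to read
    # Sort top-to-bottom, then left-to-right: a rough approximation of reading order.
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))
    words: List[str] = []
    for box in ordered:
        text = recognize(crop(image, box)).strip()
        if text:
            words.append(text)
    sentence = " ".join(words)
    if sentence:
        speak(sentence)
    return sentence
```

Passing the stages in as functions keeps the sketch independent of any particular model or library, so each stage can be swapped or tested on its own.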

Benefits of the Project

The number of visually impaired people worldwide is approximately 285 million, in other words, more than 3.86% of the entire population. Around two million people in Pakistan are blind and around 6 million are partially blind. Visual impairment makes life difficult for the people affected, but technology can help with some day-to-day tasks, such as getting information from text, that would otherwise require sight.

Speech is probably the most efficient medium of communication between humans. A wearable, affordable device that converts written text to audio output can therefore offer visually impaired people a more independent lifestyle, enabling them to read newspapers, books, online content, product details, medicine labels, boards and, as part of the future scope, much more.

Technical Details of Final Deliverable

The project has two milestones: text detection and text recognition.
Text Detection: 
Text detection is done using EAST, a deep learning-based algorithm that detects text with a single neural network, eliminating multi-stage approaches. The network predicts text at the word or line level and, by predicting quadrilateral shapes, can detect text in arbitrary orientations. A fully convolutional network locates text in the image, and a non-maximum suppression (NMS) stage then consolidates the several imprecise boxes identified for each text area (word or line) into a single bounding box.
We have set up the environment and the image processing for EAST with OpenCV and Python.
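
The NMS step can be illustrated in plain Python. This is a simplified sketch under stated assumptions: it handles axis-aligned boxes only, whereas EAST predicts rotated or quadrilateral boxes, and the names `iou` and `non_max_suppression` and the 0.5 overlap threshold are illustrative choices rather than the project's settings.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); an assumed convention


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: Sequence[Box], scores: Sequence[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Return the indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Two candidate boxes around the same word overlap heavily, so only the higher-scoring one survives, while a box around a different word elsewhere in the image is kept.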

Text Recognition: 
Text recognition involves the following components:
OCR: Optical Character Recognition is the mechanical or electronic conversion of images of typewritten, handwritten or printed text into machine-encoded text. It is a common method of digitizing printed and handwritten texts so that they can be easily searched, stored more compactly, displayed and edited online, and used in other processing tasks such as language translation and text mining. The proposed system has been developed using neural OCR in OpenCV, which recognizes the text in the document.

Python-tesseract: Python-tesseract is an optical character recognition (OCR) tool for Python. That is, it recognizes and "reads" the text embedded in images.
It is also useful as a stand-alone invocation script for Tesseract, as it can read all image types supported by the Pillow and Leptonica imaging libraries, including JPEG and PNG. When used as a script, Python-tesseract prints the recognized text instead of writing it to a file.
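
A minimal sketch of how Python-tesseract might be called from Python, assuming `pytesseract`, Pillow and the Tesseract binary are installed. `pytesseract.image_to_string` is the library's basic recognition call; the helper names `clean_ocr_text` and `ocr_image_file` and the cleanup rules are illustrative assumptions, not the project's code.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Collapse runs of whitespace and drop empty lines left in OCR output."""
    lines = (re.sub(r"\s+", " ", line).strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def ocr_image_file(path: str, lang: str = "eng") -> str:
    """Recognize the text in an image file and return it cleaned up."""
    # Imported here so the cleanup helper works without the OCR stack installed.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        raw = pytesseract.image_to_string(img, lang=lang)
    return clean_ocr_text(raw)
```

Calling `ocr_image_file` on a photo of a sign (any image path is a placeholder here) would return the recognized text as a plain string, ready to be handed to the TTS stage.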

Text Extraction: At this stage, the recognized content in the scanned image is extracted using an OCR engine. Here we use the Tesseract OCR engine, which separates the recognized characters.
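
Tesseract can also report each recognized word together with a confidence score, which offers one way to discard noise before the text is spoken. The sketch below assumes the dictionary layout returned by `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)`, with parallel `text` and `conf` lists; the helper names and the cutoff of 60 are hypothetical.

```python
from typing import Dict, List, Sequence


def confident_words(data: Dict[str, Sequence], min_conf: float = 60.0) -> List[str]:
    """Keep the words whose confidence is at least min_conf.

    `data` is assumed to hold parallel "text" and "conf" lists.
    Confidence values may arrive as strings or numbers, and -1 marks
    entries that are not words, so each value is converted with float().
    """
    words: List[str] = []
    for text, conf in zip(data["text"], data["conf"]):
        text = str(text).strip()
        if text and float(conf) >= min_conf:
            words.append(text)
    return words


def words_from_image(path: str, min_conf: float = 60.0) -> List[str]:
    """Run Tesseract on an image file and return only the confident words."""
    # Imported here so the filter above can be used without the OCR stack.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
    return confident_words(data, min_conf)
```

The cutoff is a trade-off: a higher value drops more misread words but may also drop correct words from blurry or low-contrast images.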
 
TTS: Text-to-speech (TTS) converts text into voice by means of speech synthesis. Text-to-speech systems were first developed to help the visually impaired by offering the user a computer-generated spoken voice that "reads" text aloud. We have adopted pyttsx3 as our TTS synthesizer. The process comprises two main phases:

• Text processing

• Speech generation
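
The two phases can be sketched as follows. The text-processing step here is a small normalization and sentence split assumed for illustration; the speech-generation step uses pyttsx3's `init`, `setProperty`, `say` and `runAndWait` calls. The helper names and the speaking rate of 150 are assumptions, not the project's settings.

```python
import re
from typing import List


def prepare_sentences(text: str) -> List[str]:
    """Text processing: normalize whitespace and split into sentences."""
    flat = re.sub(r"\s+", " ", text).strip()
    if not flat:
        return []
    # Split after '.', '!' or '?' when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", flat) if s]


def speak_sentences(sentences: List[str], rate: int = 150) -> None:
    """Speech generation: queue each sentence, then play the queue."""
    import pyttsx3  # imported here so text processing works without it

    engine = pyttsx3.init()
    engine.setProperty("rate", rate)
    for sentence in sentences:
        engine.say(sentence)
    engine.runAndWait()
```

Splitting into sentences before speaking gives the synthesizer natural pause points and makes it easier to stop or repeat a single sentence later.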

Software Specification: 
The project currently runs on Windows. The algorithm is implemented in Python. OpenCV is an open-source computer vision library written in C and C++ that runs on Linux, Windows and Mac OS X. OpenCV is designed for computational efficiency, with a strong focus on real-time applications.

Final Deliverable of the Project

HW/SW integrated system

Core Industry

Health

Other Industries

Core Technology

Artificial Intelligence(AI)

Other Technologies

Wearables and Implantables

Sustainable Development Goals

Good Health and Well-Being for People

Required Resources

Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs)
Raspberry Pi 4 4GB for moveable model | Equipment | 1 | 14000 | 14000
Raspberry Pi Zero for glasses | Equipment | 1 | 5000 | 5000
Pi Camera Original | Equipment | 1 | 3000 | 3000
Pi Camera ribbon | Equipment | 1 | 1000 | 1000
Moveable Wooden Prototype Model | Equipment | 1 | 2500 | 2500
Power Source for Raspberry Pi (5 V, 3 A power bank to make the device portable) | Equipment | 1 | 3000 | 3000
Glasses, Packing of Device, Earphones | Miscellaneous | 1 | 1500 | 1500
Original SanDisk microSD card 64 GB | Equipment | 2 | 1500 | 3000
Printing | Miscellaneous | 1 | 1500 | 1500
Transportation | Miscellaneous | 1 | 2000 | 2000
Total (in Rs) | | | | 36500
If you need this project, please contact me on contact@adikhanofficial.com