Punjabi text to speech system with support for visually impaired
According to the World Health Organization, around 40 million people in the world are blind, while another 250 million have some form of visual impairment. About 2 million people are blind in Pakistan. A TTS device can also be a great help for people with learning difficulties. According to specialists, impaired vision can have negative effects on learning and social interaction. It can affect the natural development of intelligence as well as academic, social, and professional ability. Low vision cannot be corrected with glasses. People with low vision cannot read normally printed text; they can only read characters that are large enough. This slows reading and tires the eyes. To help improve the quality of life for people with low vision, a tool that reads text aloud is needed. The degree of vision impairment varies from person to person, so the device developed in this work relies on another sense, hearing, to deliver the information in a text. The device is designed specifically for people with low vision, so that they can use it without asking others for help and can rely on it to support their academic and intellectual development.
People with low vision or other visual impairments cannot clearly see words and letters in ordinary newsprint, books, and magazines. This makes reading difficult, which can disturb the learning process and slow a person's intellectual development. A device is therefore needed to help them read. We are developing such a device, which can scan any kind of text and read it aloud. The device takes images, PDFs, documents, textbooks, and newspapers as input and produces speech as output. It contains separate modules for image processing and voice processing, and it lets the user play and stop the output while it is being read. It is also designed for a low error rate, short processing time, and low cost. A Raspberry Pi 3 is being used to develop the device. The device is intended to act as an artificial eye for visually impaired people, and it does not need any human supervision. No linguistic resources are available for the Punjabi language (Shahmukhi script). We have collected 24 hours of speech data and written the corresponding text transcripts to train the model. We also aim to maximize the quality of the output speech while requiring minimal input from the user.
TTS (Text-to-Speech) is one of the main components of human-machine interaction. As the name suggests, a text-to-speech system converts text into spoken sound, so that a machine (such as a robot) can interact with its users. Our project has two major parts: the design and implementation of a text-to-speech system for the Punjabi language (Shahmukhi script), and a text-to-speech device for visually impaired people.
A. Image Correction module:
a) Gray Scaling:
It is the process of converting a color digital (pixel) image into a grayscale image.
b) Binarization:
It is the process of converting a grayscale image into a binary image, in which every pixel is either black or white.
c) Image Processing module using Optical Character Recognition:
This module performs Optical Character Recognition (OCR). It targets typewritten text and recognizes it one glyph or character at a time.
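The gray scaling and binarization steps above can be sketched as follows. This is only an illustration under stated assumptions, not the device's actual code: the image is represented as nested lists of RGB tuples, the grayscale conversion uses the common ITU-R BT.601 luma weights, and the threshold is chosen with Otsu's method, which the source does not name. The function names `to_grayscale`, `otsu_threshold`, and `binarize` are hypothetical.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[int]]


def to_grayscale(image: RGBImage) -> GrayImage:
    """Convert RGB pixels to 0-255 gray levels (BT.601 weights, an assumption)."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in image
    ]


def otsu_threshold(gray: GrayImage) -> int:
    """Pick the threshold that maximizes between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(256):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(gray: GrayImage, threshold: int) -> List[List[int]]:
    """Map pixels brighter than the threshold to 1 (paper) and the rest to 0 (ink)."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]
```

The binary image produced this way would then be handed to the OCR step. On the actual device a library such as OpenCV would normally do this work much faster; the pure-Python version is only meant to make the two steps explicit.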
B. TTS Correction and Voice Module
In the TTS correction and voice module, text is converted to speech. The output of the OCR is text, which is stored in a file (speech.txt). Flite and eSpeak are then used to convert the text to WAV format, and the resulting text.wav can be played back. Flite and eSpeak are open-source programs that can run on the Raspberry Pi and are available in many languages.
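As a rough sketch of this step, the snippet below builds the command lines that turn speech.txt into text.wav with either engine. It assumes the `espeak` and `flite` command-line programs are installed and relies on their file-input and WAV-output options (`-f`/`-w` for eSpeak, `-f`/`-o` for Flite). The helper names `build_tts_command` and `synthesize` are hypothetical and not part of the project.

```python
import shutil
import subprocess
from typing import List, Optional


def build_tts_command(engine: str, text_path: str, wav_path: str) -> List[str]:
    """Return the command line that reads text_path and writes wav_path."""
    if engine == "espeak":
        # -f: text file to speak, -w: write output to a WAV file
        return ["espeak", "-f", text_path, "-w", wav_path]
    if engine == "flite":
        # -f: text file to speak, -o: output WAV file
        return ["flite", "-f", text_path, "-o", wav_path]
    raise ValueError(f"unsupported TTS engine: {engine}")


def synthesize(text_path: str = "speech.txt",
               wav_path: str = "text.wav") -> Optional[str]:
    """Run the first installed engine; return its name, or None if neither exists."""
    for engine in ("espeak", "flite"):
        if shutil.which(engine):
            subprocess.run(build_tts_command(engine, text_path, wav_path),
                           check=True)
            return engine
    return None
```

Calling `synthesize()` after the OCR step has written speech.txt would leave text.wav ready for playback through the speaker.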
Alongside the languages that are already supported, we are also implementing a Punjabi text-to-speech system of our own design for reading text, since no TTS system has been made for the Shahmukhi script of the Punjabi language. Our implementation of the Punjabi TTS proceeds in two stages.
The first stage is text processing, in which the input text is converted into a phonetic representation. This stage is based on natural language processing (NLP) methods. The second stage is the generation of the sound waveform from the phonetic representation. Our TTS system is based on ideas presented in Tacotron, an end-to-end synthesizer built around synthesizing speech directly from characters. The model takes the text to be synthesized as input and converts it into character embeddings. The output waveforms are then computed by an algorithm.
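To make the "text to character embedding" step more concrete, here is a minimal sketch of the part that comes just before the embedding lookup: mapping each character of a Shahmukhi transcript to an integer ID. The special symbols, the NFC normalization, and the function names `build_vocab` and `encode` are assumptions made for illustration; the source does not describe the actual front end.

```python
import unicodedata
from typing import Dict, Iterable, List

# Special symbols reserved at the start of the vocabulary (an assumption).
PAD, EOS, UNK = "<pad>", "<eos>", "<unk>"


def normalize(text: str) -> str:
    """Apply Unicode NFC so visually identical characters share one code point."""
    return unicodedata.normalize("NFC", text).strip()


def build_vocab(transcripts: Iterable[str]) -> Dict[str, int]:
    """Assign an integer ID to every character seen in the transcripts."""
    symbols = sorted({ch for line in transcripts for ch in normalize(line)})
    vocab = {PAD: 0, EOS: 1, UNK: 2}
    for ch in symbols:
        vocab[ch] = len(vocab)
    return vocab


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Turn text into character IDs, ending with the end-of-sequence ID."""
    ids = [vocab.get(ch, vocab[UNK]) for ch in normalize(text)]
    ids.append(vocab[EOS])
    return ids


# Example: the word "Punjabi" written in Shahmukhi script.
vocab = build_vocab(["پنجابی"])
ids = encode("پنجابی", vocab)
```

In a Tacotron-style model, each of these IDs would index a row of a learned embedding matrix, and the resulting vectors are what the encoder consumes.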
Data Collection:
For the corpus, we have obtained some Punjabi speech data from Radio Pakistan. To make our system more accurate, we are also recording high-quality audio of our own. We are writing all of the text transcripts ourselves.
Language is one of the most important parts of any culture. Punjabi is the 8th most spoken language in the world, and 40% of Pakistan's population speaks it (Wikipedia). Punjabi is written in two scripts, Gurmukhi and Shahmukhi: Gurmukhi is used in most parts of India and Shahmukhi in Pakistan. Multiple Punjabi text-to-speech systems have already been developed for the Gurmukhi script, but none has been developed for Shahmukhi. Our contribution is the design and implementation of a TTS system for the Shahmukhi script of the Punjabi language.
Text-to-speech systems can help those with literacy difficulties, learning disabilities, reduced vision and those learning a new language.
Some people are auditory learners, some are visual learners, and some are kinesthetic learners – most learn best through a combination of the three.
We can extend the reach of our content by making it available both in text and spoken form. It also opens doors to anyone looking for easier ways to access digital content.
TTS eases the internet experience for the 1 in 5 people who have dyslexia, for low-literacy readers, and for others with learning disabilities, by removing the stress of reading and presenting information in an optimal format.
This is a portable device that does not require an internet connection and can be used independently by people with low vision or other visual impairments. The device also has a user interface that allows people to interact with it easily.
to or download the audio file against it.
The device is designed based on the following restrictions:
a) Range of reading distance is 15-30 cm.
b) Character size is a minimum of 8 pt.
c) Maximum size of reading material can be varied.
d) Maximum tilt of the text line is 5 degrees from the vertical.
e) Supported character types include Roman, Egyptian, and sans-serif.
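The numeric restrictions above could be checked before a captured page is sent to OCR. The sketch below is hypothetical: it assumes the distance, character size, and tilt have already been measured by some other part of the system, and the names `ReadingLimits` and `violations` are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReadingLimits:
    """Design restrictions (a), (b), and (d) from the list above."""
    min_distance_cm: float = 15.0
    max_distance_cm: float = 30.0
    min_char_pt: float = 8.0
    max_tilt_deg: float = 5.0


def violations(distance_cm: float, char_pt: float, tilt_deg: float,
               limits: ReadingLimits = ReadingLimits()) -> List[str]:
    """Return a message for every restriction the measurements break."""
    problems = []
    if not limits.min_distance_cm <= distance_cm <= limits.max_distance_cm:
        problems.append("reading distance outside 15-30 cm")
    if char_pt < limits.min_char_pt:
        problems.append("character size below 8 pt")
    if abs(tilt_deg) > limits.max_tilt_deg:
        problems.append("text line tilted more than 5 degrees")
    return problems
```

An empty list means the capture is within the design limits; otherwise the messages could be spoken to the user so the page can be repositioned.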
The module is designed so that no separate physical equipment or stand-like structure is needed to hold the Pi camera module; it is mounted on the casing of the board with two L-clamps. The Pi camera lens is adjusted to capture the text sharply. The distance between the camera module and the text is 15 to 30 cm, the minimum distance that the human eye needs to read text.
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Raspberry pi 4 Model B 8GB | Equipment | 1 | 28500 | 28500 |
| 5MP Raspberry Pi 4 Camera | Equipment | 1 | 5000 | 5000 |
| Braille keyboard | Miscellaneous | 1 | 8000 | 8000 |
| Soundcore Mini 3 Portable Bluetooth Speaker | Equipment | 1 | 10500 | 10500 |
| Bootable SanDisk Ultra 16GB micro SD Card(for testing) | Miscellaneous | 1 | 1000 | 1000 |
| Urdu Keyboard | Miscellaneous | 1 | 1000 | 1000 |
| Sony TX Series Digital Voice Recorder (TX660)(for quality corpus ) | Equipment | 1 | 26000 | 26000 |
| Total (in Rs) | | | | 80000 |