Virtual scene rendering for deaf people using AI based audio analysis
Today, the estimated population of Pakistan is approximately 228 million, of which approximately 10 million people are hearing impaired. Hearing-impaired people find it hard to fit in with the local community because of their disability. A deaf person cannot experience the sounds of his or her surroundings, which creates a barrier between that person and the world of sounds.
This project aims to bridge this gap by using Artificial Intelligence (AI) based analysis of sounds and rendering corresponding virtual scenes on the screen of a wearable device, so that the screen represents the surrounding environment for deaf people. This helps deaf people recognize the sounds from which they are otherwise cut off.
The deliverable of the project is a wearable device capable of processing the surrounding audio signals with AI and displaying virtual scenes to hearing-impaired people. Audio data of the sounds in the surrounding environment is collected. Noise is removed and speech features are extracted using software techniques. Machine learning techniques are used to recognize the audio signals. After recognition, the virtual scenes are displayed on the screen of the wearable device.
The objectives are:
The project is implemented in four steps. The first three steps are software based, and the last step is the development of the wearable device.
In the first step, audio data of surrounding sounds is gathered. The data includes everyday sounds such as dog barking, jackhammer, car horn, engine idling, drilling, street music, construction sounds, gunshots, air conditioner, ambulance siren and children playing.
In the second step, speech analysis of the collected data is performed. It consists of processes such as noise removal and speech feature extraction. The speech feature extraction is based on the Mel Frequency Cepstral Coefficients (MFCC) technique, which splits the given audio into frames and computes the MFCCs for each frame.
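The framing and per-frame MFCC computation described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the frame length, hop size, number of mel filters, number of coefficients and all function names are assumptions, and noise removal is not shown. A real implementation would more likely rely on an audio library with an FFT.

```python
import math

def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def power_spectrum(frame):
    """Apply a Hamming window, then compute a naive DFT power spectrum."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters whose centres are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [min(n_bins - 1, int((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * n_bins
        for k in range(left, centre):      # rising edge
            filt[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge
            filt[k] = (right - k) / (right - centre)
        bank.append(filt)
    return bank

def dct2(values, n_out):
    """Unnormalised DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_out)]

def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of MFCCs per frame of the input signal."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for frame in frame_signal(signal, frame_len, hop):
        power = power_spectrum(frame)
        # Small floor avoids log(0) for empty or silent bands.
        log_energy = [math.log(sum(w * p for w, p in zip(filt, power)) + 1e-10)
                      for filt in bank]
        features.append(dct2(log_energy, n_coeffs))
    return features
```

For example, `mfcc(samples, 8000)` on a list of 1024 samples yields seven frames of 13 coefficients each with these default settings. The per-frame vectors would typically be averaged or otherwise pooled into one feature vector per clip before classification, although the source does not say how this is done.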
In the third step, various Machine Learning (ML) techniques are trained for audio signal recognition. For this purpose, the dataset is first divided into disjoint train and test sets. Training is performed on 70% of the data (train set) and testing on the remaining 30% (test set). The ML techniques used are K-Nearest Neighbors (KNN), Support Vector Machine, Random Forest, Logistic Regression, Decision Tree and Naïve Bayes. All of these techniques are evaluated on the test data to identify the best one for recognizing surrounding audio signals.
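The split-and-evaluate procedure can be illustrated with a small sketch. It assumes each audio clip has already been reduced to a fixed-length feature vector, and it implements only KNN, in plain Python, as a stand-in for the six techniques listed; the synthetic data, the labels, the value of `k` and every function name here are hypothetical and do not come from the project.

```python
import math
import random
from collections import Counter

def train_test_split(features, labels, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into disjoint train and test sets."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([features[i] for i in train_idx], [features[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def accuracy(predict, test_x, test_y):
    """Fraction of test samples for which predict(x) matches the true label."""
    hits = sum(predict(x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)

def make_demo_data(n_per_class=50, seed=1):
    """Synthetic two-class data: two well-separated clusters (illustration only)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label, centre in (("dog_bark", 0.0), ("car_horn", 10.0)):
        for _ in range(n_per_class):
            xs.append([centre + rng.uniform(-1, 1), centre + rng.uniform(-1, 1)])
            ys.append(label)
    return xs, ys

if __name__ == "__main__":
    x, y = make_demo_data()
    train_x, test_x, train_y, test_y = train_test_split(x, y)
    score = accuracy(lambda q: knn_predict(train_x, train_y, q), test_x, test_y)
    print(f"KNN accuracy on the 30% test set: {score:.2f}")
```

Because `accuracy` takes any prediction function, the same held-out test set can be reused to score each candidate model, which is what makes the comparison between techniques fair.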
In the fourth step, the wearable device is developed. It consists of a Raspberry Pi 4, a display screen and a microphone. The Raspberry Pi 4 is programmed with the best-performing Machine Learning technique, and the microphone connected to it captures real-time audio. Based on the ML judgments, the corresponding virtual scene is displayed on the screen.
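One possible shape for the on-device loop is sketched below. Audio capture, classification and drawing are passed in as plain callables so the control flow can be shown without committing to a particular microphone or display library; the label names, the image paths and the choice to redraw only when the recognized label changes are all assumptions made for illustration.

```python
# Hypothetical mapping from a recognized sound label to a scene image.
SCENES = {
    "dog_bark": "scenes/dog.png",
    "car_horn": "scenes/car.png",
    "siren": "scenes/ambulance.png",
}

def run_display_loop(chunks, classify, show, scenes=SCENES):
    """Classify each audio chunk and show its scene when the label changes.

    chunks   -- iterable of audio chunks (e.g. fixed-length sample buffers)
    classify -- callable mapping a chunk to a label string
    show     -- callable that draws the given scene on the display
    Returns the list of scenes shown, in order.
    """
    shown = []
    last_label = None
    for chunk in chunks:
        label = classify(chunk)
        if label == last_label or label not in scenes:
            continue  # nothing new to draw, or no scene known for this sound
        show(scenes[label])
        shown.append(scenes[label])
        last_label = label
    return shown
```

On the device, `chunks` would be fed by the microphone, `classify` would wrap feature extraction plus the selected model, and `show` would draw to the attached screen.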
The project is developed to help the hearing-impaired people:
The deliverable of the project is a wearable device capable of processing the surrounding audio signals with AI and displaying virtual scenes to hearing-impaired people.
The procedure entails gathering audio data from the surrounding environment, removing noise, extracting speech features, and training Machine Learning (ML) algorithms for audio signal recognition.
Finally, the wearable device captures audio signals from the surrounding area and displays a virtual scene on its screen in response to the ML-based judgment about the surrounding sound.
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Raspberry Pi 4 | Equipment | 1 | 47000 | 47000 |
| 3.5-inch RPi Display | Equipment | 1 | 14000 | 14000 |
| Memory card (32 GB) | Equipment | 1 | 2800 | 2800 |
| Mic | Equipment | 1 | 3000 | 3000 |
| Sound card | Equipment | 1 | 2000 | 2000 |
| Adapter (5 volts) | Equipment | 1 | 1200 | 1200 |
| Miscellaneous | Miscellaneous | 1 | 10000 | 10000 |
| Total (in Rs) | | | | 80000 |