HearSmart
HearSmart is Pakistan’s first smart hearing solution for people, with or without hearing impairment, who have trouble focusing on specific sounds or voices. It will let them hear what they want to hear by cleaning the noisy input and isolating individual speakers from a mixture of noise and other sounds.
HearSmart will use deep learning to isolate voices from a mixed audio source containing various noises, including multiple speakers. To accomplish this, it only needs to listen to the target person speaking. The recorded audio is then quickly processed in the cloud, where a neural network learns to extract the target’s voice. Once training is complete, the results are sent back to the system, enabling instantaneous on-device voice isolation.
This project will benefit people who have difficulty hearing in noisy environments, or who cannot understand speech when too many people talk at once. It could also help people working in environments such as factories and airports by allowing them to filter out unwanted noise.
The system first takes noisy input from the environment, extracts spectral and spatial features from it, and passes the resulting feature vector to a deep neural network (DNN) as input.
We train this DNN to learn the spectral mapping from reverberant, or reverberant and noisy, signals to the complex ideal ratio mask (cIRM). The DNN is given a complementary set of features. The input is normalized to have zero mean and unit variance. After normalization, auto-regressive moving average (ARMA) filtering is performed on the input features. The output layer of the DNN is divided into two sublayers, one for the real component and one for the imaginary component of the cIRM. Linear activation functions are used in the output layer, whereas rectified linear functions are used in the hidden layers. The DNN is trained with backpropagation on the mean-square error.
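The two preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the filter order of 2, and the choice to leave edge frames unfiltered are all assumptions, and the ARMA form shown (averaging previously filtered frames with the current and upcoming raw frames) is one commonly used variant.

```python
from statistics import fmean, pstdev


def normalize(track):
    """Scale one feature dimension (a list of values over time) to zero mean, unit variance."""
    mean = fmean(track)
    std = pstdev(track) or 1.0  # assumption: leave constant tracks at zero instead of dividing by 0
    return [(x - mean) / std for x in track]


def arma_filter(track, order=2):
    """Smooth a feature track with a simple ARMA filter (order is an assumed value).

    Each output frame is the average of the previous `order` filtered frames
    and the current plus next `order` raw frames. The first and last `order`
    frames are copied through unchanged.
    """
    out = list(track)
    for t in range(order, len(track) - order):
        past = sum(out[t - order:t])              # already-filtered frames
        present_future = sum(track[t:t + order + 1])  # raw frames
        out[t] = (past + present_future) / (2 * order + 1)
    return out


if __name__ == "__main__":
    feature = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(arma_filter(normalize(feature)))
```

Filtering along time smooths frame-to-frame fluctuations in each feature before it reaches the network.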
The output of the DNN is an estimate of the compressed mask values of the cIRM.
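To make "compressed mask values" concrete, here is a hedged sketch of how a cIRM training target could be computed for a single time-frequency unit and then compressed. The hyperbolic-tangent compression and the constants `K = 10` and `C = 0.1` are assumptions borrowed from common practice in the cIRM literature; the source does not state which compression it uses, and all function names are hypothetical.

```python
import math

K = 10.0  # assumed bound: compressed values lie in (-K, K)
C = 0.1   # assumed steepness of the compression curve


def ideal_mask(noisy, clean):
    """Complex ideal ratio mask M for one T-F unit, defined so that M * noisy == clean.

    `noisy` and `clean` are complex STFT coefficients; `noisy` must be non-zero.
    """
    return clean / noisy


def compress(mask):
    """Squash the real and imaginary parts of the mask into a bounded range."""
    def squash(x):
        # K * (1 - exp(-C x)) / (1 + exp(-C x)) written as a tanh
        return K * math.tanh(C * x / 2)
    return complex(squash(mask.real), squash(mask.imag))


def decompress(estimate):
    """Invert `compress`; valid only while both parts lie strictly inside (-K, K)."""
    def expand(y):
        return (2 / C) * math.atanh(y / K)
    return complex(expand(estimate.real), expand(estimate.imag))


if __name__ == "__main__":
    m = ideal_mask(noisy=3 + 4j, clean=1 + 2j)
    print(m, compress(m), decompress(compress(m)))
```

Compression keeps the regression targets bounded, since the raw real and imaginary mask components can otherwise take very large positive or negative values.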
For each noisy input, the DNN estimates a mask that removes the noise. Multiplying this mask with the noisy input yields a de-noised speech signal.
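Because the mask is complex, the multiplication is a complex product in each time-frequency unit, which is what lets the mask correct phase as well as magnitude. The sketch below illustrates this for one frame; it assumes the real and imaginary mask estimates have already been mapped back from their compressed range, and the function name and argument layout are hypothetical.

```python
def apply_mask(mask_real, mask_imag, noisy_frame):
    """Apply a complex mask to one frame of noisy STFT coefficients.

    mask_real, mask_imag: per-bin outputs of the two DNN sublayers (floats).
    noisy_frame: per-bin complex STFT coefficients of the noisy signal.
    Returns the estimated clean STFT coefficients for the frame.
    """
    if not (len(mask_real) == len(mask_imag) == len(noisy_frame)):
        raise ValueError("mask and frame must have the same number of bins")
    clean = []
    for m_r, m_i, y in zip(mask_real, mask_imag, noisy_frame):
        # Complex product (m_r + j m_i) * (y_r + j y_i), written out explicitly
        s_r = m_r * y.real - m_i * y.imag
        s_i = m_r * y.imag + m_i * y.real
        clean.append(complex(s_r, s_i))
    return clean


if __name__ == "__main__":
    print(apply_mask([0.44, 1.0], [0.08, 0.0], [3 + 4j, 2 - 1j]))
```

A mask of `1 + 0j` leaves a bin unchanged, and an inverse STFT of the masked frames would then give the time-domain de-noised signal.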
This de-noised speech is then passed through a speaker separation algorithm, which returns the separated speech sources. The user can pick the speaker of their choice and suppress the others to hear clearly.



| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Intel NCSM2450.DK1 Movidius Neural Compute Stick | Equipment | 2 | 10615 | 21230 |
| Raspberry Pi 3 Model B+ | Equipment | 1 | 18200 | 18200 |
| LED HP EliteDisplay E231 | Equipment | 1 | 8000 | 8000 |
| Others (electronic components, etc.) | Miscellaneous | 1 | 10000 | 10000 |
| Amazon Web Services (EC2) | Equipment | 1 | 19665 | 19665 |
| Total (in Rs) | | | | 77095 |