ASSISTIVE TECHNOLOGY FOR THE BLIND WITH A FOCUS ON GROCERY SHOPPING
A trip to the grocery store is usually a simple in-and-out 15-minute exercise. But for people who are blind or have limited vision, it can be a major chore. Our aim is to help visually impaired people shop independently by creating machines that can interpret a complex visual scene much as the human brain does.
One of the most important aspects of any technology is how the user interacts with it. Having an intuitive, simple, and functional interface can often be the difference between a successful, widely adopted device and one that is not.
INTERFACES
For our assistive technology, we employ two main modes of providing feedback and guidance to the user: auditory and tactile. To deliver this feedback, we use a custom glove and a pair of smart glasses.
SMART GLASSES
The sensors for obstacle avoidance and the speaker for audio feedback are installed in the glasses, which help the user move through the store while avoiding obstacles.
CUSTOM GLOVE
The custom glove that we employ has both a camera and a series of vibration motors. The camera on the glove gives the system the viewpoint of what the person is reaching out to hold. This viewpoint may differ from that of the camera mounted on the headset, and it is critical for providing guidance all the way to physically picking up the intended product. The attached vibration motors allow the system to provide subtle feedback that conveys which direction the user should move a hand in order to grab the desired product. For example, the system can buzz the right motor to indicate a rightward motion, or the top motor to indicate that the person needs to lift the hand.
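The direction cue described above can be sketched as a small function. This is a minimal illustration, not the project's actual firmware: the function name `pick_motor`, the pixel offsets, the `dead_zone` tolerance, and the "left" and "bottom" motors are all assumptions made for the example (the text only mentions right and top motors).

```python
def pick_motor(dx, dy, dead_zone=20):
    """Return which vibration motor to pulse, or None when the hand is on target.

    dx, dy: offset of the target from the centre of the glove camera frame,
    in pixels (positive dx = target is to the right, positive dy = target is
    above). dead_zone is a hypothetical tolerance in pixels.
    """
    if abs(dx) <= dead_zone and abs(dy) <= dead_zone:
        return None  # close enough: no correction needed
    # Correct the larger error first so only one motor buzzes at a time.
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "top" if dy > 0 else "bottom"
```

Pulsing a single motor per update keeps the cue unambiguous; a real glove might instead vary vibration strength with distance.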
In our system, the auditory feedback combined with the haptic feedback from the glove provides the needed assistance to the shopper.
1) To identify objects in a pantry, including misplaced items.
2) To identify obstacles while walking through the store.
3) To locate packaged objects on the grocery shelf and pick them up.
4) To assist in identifying items in prepared food sections.
Our project will consist of a set of smart glasses and a smart glove. Consider a person standing in front of an aisle. The smart glasses help if the person gets too close to the aisle or is about to bump into something: the sensors detect the obstacle, and the system gives audio and haptic feedback.
The glove consists of a set of speakers, a camera, and a microcontroller running the product, food, and currency detectors. When the person wants to know what is in front of them, they hold the glove up to the item and the backend selects the model: if the item is a product, the product detector runs, and if it is food, the food detector runs. In both cases the result is delivered as audio. After selecting the item, the person takes out money and holds the glove's camera over it. The currency detector then runs and announces the amount, so the person can complete the transaction without being defrauded.
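The model selection step can be pictured as a simple dispatch table. The sketch below is only an assumption about how such a backend might be organized: the name `describe_item`, the `mode` strings, and the `detectors` and `speak` callables are hypothetical, and the text does not say how the system decides which kind of item it is looking at.

```python
def describe_item(mode, frame, detectors, speak):
    """Run the detector registered for `mode` and speak its result.

    mode: hypothetical key such as "product", "food" or "currency".
    detectors: dict mapping each mode to a callable that takes a camera
    frame and returns a text description.
    speak: callable that turns text into audio for the user.
    """
    detector = detectors.get(mode)
    if detector is None:
        raise ValueError(f"unknown mode: {mode!r}")
    result = detector(frame)
    speak(result)  # every detector reports back through audio
    return result
```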
1) People who are blind would benefit from our prototype on an individual basis.
2) Shopping marts can buy this prototype to assist their visually impaired customers.
3) We can also introduce this technology at schools for the blind and at welfare organizations working for visually impaired people.
FOOD RECOGNITION
For food recognition (fruits, vegetables, etc.), we use Inception v3, a pretrained convolutional neural network trained on the ImageNet dataset and capable of classifying 1000 classes. Inception v3 is a widely used image recognition model that has been shown to attain greater than 78.1% accuracy on the ImageNet dataset. The model itself is made up of symmetric and asymmetric building blocks, including convolutions, average pooling, max pooling, concats, dropouts, and fully connected layers. Batchnorm is used extensively throughout the model and applied to activation inputs. Loss is computed via Softmax.
The real-time image classification and speech model is divided into three submodules: webcam image feed, image classification, and text to speech.
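To make the softmax loss concrete, here is a minimal pure-Python sketch of softmax followed by cross-entropy for a single example. It illustrates the general technique only; the function names are chosen for this example and it is not the model's training code.

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[true_index])
```

The loss is small when the network puts most of its probability on the correct class and grows as that probability shrinks.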
1. WEBCAM IMAGE FEED
For this module, OpenCV is used. The function cv2.VideoCapture() is called, and the incoming frames are read. The .read() method is a blocking operation, so the main thread of the Python script is completely blocked until the frame is read from the camera device and returned to our script. This is a problem, as it is critical for our system to run in real time. We can improve the FPS (frames per second) simply by creating a new thread that does nothing but poll the camera for new frames while our main thread handles processing the current frame.
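The threading idea can be sketched as follows. This is an illustrative sketch, not the project's script: the class name `ThreadedCapture` is an assumption, and it accepts any object with a `read()` method that returns `(ok, frame)`, which is the shape returned by `cv2.VideoCapture.read()`.

```python
import threading

class ThreadedCapture:
    """Keep only the most recent frame, read on a background thread."""

    def __init__(self, source):
        # `source` is any object whose read() returns (ok, frame),
        # for example cv2.VideoCapture(0).
        self.source = source
        self.ok, self.frame = source.read()  # prime with one frame
        self.lock = threading.Lock()
        self.stopped = False
        self.thread = threading.Thread(target=self._update, daemon=True)

    def start(self):
        self.thread.start()
        return self

    def _update(self):
        while not self.stopped:
            ok, frame = self.source.read()  # blocking call, off the main thread
            with self.lock:
                self.ok, self.frame = ok, frame

    def read(self):
        # Returns immediately with the latest frame instead of waiting on the camera.
        with self.lock:
            return self.ok, self.frame

    def stop(self):
        self.stopped = True
        if self.thread.is_alive():
            self.thread.join(timeout=1.0)
```

With OpenCV installed, usage would look like `cap = ThreadedCapture(cv2.VideoCapture(0)).start()`, after which the main loop calls `cap.read()` without blocking on the camera.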
2. IMAGE CLASSIFICATION WITH INCEPTION
For this part, some code from the TensorFlow image classification tutorials was adapted. The model was downloaded, and a node lookup class was created to get the human-readable class name from the result.
3. TEXT TO SPEECH
The Google TTS (Text-to-Speech) API is used through gTTS, and each result is saved in a look-up table for better real-time performance.
4. PUTTING IT ALL TOGETHER
All the previous modules are called and the DNN is made ready. Some variables are declared, such as the score and the color and font for the on-screen text. The TensorFlow session is started and the model is loaded. We count the number of frames only to estimate the FPS. The estimate is approximate, because we process the image only every 5 frames, so there are performance peaks at those points. We set the confidence threshold to 50%.
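A look-up table of this kind might be organized as in the sketch below, which synthesizes each phrase at most once and reuses the saved file afterwards. The class name `SpeechCache`, the cache directory, and the injected `synthesize` callable are assumptions made for the example; with gTTS installed, the callable could wrap `gTTS(text=text, lang="en").save(path)`.

```python
import hashlib
import os

class SpeechCache:
    """Look-up table from a phrase to a saved audio file."""

    def __init__(self, synthesize, cache_dir="tts_cache"):
        # `synthesize(text, path)` must write speech audio for `text` to `path`.
        # Hypothetical gTTS version:
        #   lambda text, path: gTTS(text=text, lang="en").save(path)
        self.synthesize = synthesize
        self.cache_dir = cache_dir
        self.table = {}
        os.makedirs(cache_dir, exist_ok=True)

    def path_for(self, text):
        if text not in self.table:
            name = hashlib.sha256(text.encode("utf-8")).hexdigest() + ".mp3"
            path = os.path.join(self.cache_dir, name)
            if not os.path.exists(path):
                self.synthesize(text, path)  # slow call only on a cache miss
            self.table[text] = path
        return self.table[text]
```

Because a classifier repeats the same few labels many times, most lookups hit the table and avoid a round trip to the speech service.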
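The frame-skipping and thresholding logic can be sketched independently of the camera and the network. In the sketch below, `run_loop`, `frames`, and `classify` are hypothetical names: `frames` is any iterable of images and `classify(frame)` is assumed to return a `(label, score)` pair.

```python
import time

def run_loop(frames, classify, every_n=5, threshold=0.5):
    """Classify every `every_n`-th frame and keep confident predictions.

    Returns the accepted labels and a rough frames-per-second estimate.
    """
    accepted = []
    count = 0
    start = time.perf_counter()
    for frame in frames:
        count += 1
        if count % every_n != 0:
            continue  # skip inference on most frames to keep the feed responsive
        label, score = classify(frame)
        if score >= threshold:  # 50% confidence threshold by default
            accepted.append(label)
    elapsed = time.perf_counter() - start
    fps = count / elapsed if elapsed > 0 else float("inf")
    return accepted, fps
```

The FPS figure counts all frames, including the skipped ones, which is why it overstates the true classification rate.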
PRODUCT RECOGNITION
A Python application is developed to capture an image using the camera and then apply OCR to recognize the text. OCR (Optical Character Recognition) systems transform a two-dimensional image of text, which may contain machine-printed or handwritten text, from its image representation into machine-readable text.
CURRENCY DETECTOR
A real-time Pakistani currency detector is built on Tiny YOLOv3 using a custom dataset of Pakistani banknotes. It announces each note through audio output, helping blind users complete transactions without falling victim to fraud.
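Turning detector output into a spoken message could look like the sketch below. It assumes, purely for illustration, that the detector yields `(label, confidence)` pairs whose labels are denomination strings such as "100"; the function name `announce_notes`, the label format, the confidence cutoff, and the spoken total are not taken from the project.

```python
def announce_notes(detections, min_confidence=0.5):
    """Build the sentence to speak from (label, confidence) detections.

    Labels are assumed to be denomination strings such as "100" or "500".
    """
    notes = [int(label) for label, conf in detections if conf >= min_confidence]
    if not notes:
        return "No note detected"
    parts = ", ".join(f"{n} rupees" for n in notes)
    return f"{parts}. Total {sum(notes)} rupees"
```

The resulting string would then be passed to the text-to-speech module.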
SMART GLASSES
The smart glasses provide the system with sensors for obstacle detection and speakers for audio feedback. The audio takes the form of a set of commands.
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Nvidia Jetson Nano | Equipment | 1 | 30000 | 30000 |
| Power supply for Jetson Nano | Miscellaneous | 1 | 1500 | 1500 |
| Micro SD | Miscellaneous | 1 | 1500 | 1500 |
| Acrylic case | Miscellaneous | 1 | 1500 | 1500 |
| Cooler for Jetson Nano | Miscellaneous | 1 | 2500 | 2500 |
| Metal case | Miscellaneous | 0 | 0 | 1500 |
| Arduino UNO with accessories | Equipment | 1 | 5000 | 5000 |
| Arduino nano with accessories | Equipment | 1 | 3000 | 3000 |
| camera | Equipment | 1 | 10000 | 10000 |
| speakers | Miscellaneous | 0 | 0 | 0 |
| Sensors(esx-10n) | Equipment | 2 | 10000 | 20000 |
| Total (in Rs) | | | | 76500 |