Virtual Assistant for blind people

Virtual Blind Assistance is being developed to guide blind people through voice instructions in indoor and outdoor environments. The assistant will detect and recognize objects and help the blind person plan a path. The objects to be detected and recognized include house furniture, humans

2025-06-28 16:36:36 - Adil Khan

Project Title

Virtual Assistant for blind people

Project Area of Specialization

Artificial Intelligence

Project Summary

Virtual Blind Assistance is being developed to guide blind people through voice instructions in indoor and outdoor environments. The assistant will detect and recognize objects and help the blind person plan a path. The objects to be detected and recognized include house furniture, humans, walls, vehicles, and other things.

The blind person wears glasses with an attached camera and carries a hands-free headset and a mobile phone. Whatever the camera captures is sent to the mobile phone as input; the phone processes it and plays the resulting audio through the hands-free headset. In this way the blind person learns what is in front of him.

Project Objectives

Project Implementation Method

During the Implementation phase, the project plan is put into motion and the work of the project is performed. It is important to maintain control and communicate as needed during implementation. Progress is continuously monitored and appropriate adjustments are made and recorded as variances from the original plan.

Hence, as part of execution, we broke our Virtual Assistance into modules to get a broad view of the project and to make it easier to plan and track further implementation throughout.

Modules

Developing your own Voice Assistant App

For now we will use the voice functionality provided by free, open-source STT and TTS libraries such as gTTS. Later, for wider production use and our own identity, we might develop our own assistant voice API.

Voice/speech to text (STT)

This is the process of converting a speech signal into digital data (e.g., text data). The voice may come as a file or a stream. CMU Sphinx can be used for this processing.

Text to speech (TTS)

This is the process of converting digital data (e.g., text data) into a speech signal. The speech may be produced as a file or a stream. We might use gTTS for this processing.

This process renders text or image content as human speech. It is very useful when, for instance, a user wants to hear the correct pronunciation of a foreign word.
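As a minimal sketch of this step, assuming recognition results arrive as plain label strings, the phone could compose one short sentence and hand it to gTTS. The function names `compose_announcement` and `speak_to_file` are hypothetical and not part of the project; only the `gTTS(text=..., lang=...)` constructor and its `save` method come from the gTTS library, which needs an internet connection to synthesize audio.

```python
from collections import Counter


def compose_announcement(labels):
    """Turn detected object labels into one short sentence to be spoken."""
    if not labels:
        return "Nothing detected ahead."
    counts = Counter(labels)  # Counter preserves first-seen label order
    parts = []
    for label, n in counts.items():
        # Naive pluralization, good enough for a sketch.
        parts.append(f"{n} {label}s" if n > 1 else f"one {label}")
    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"Ahead of you: {listing}."


def speak_to_file(text, path="announcement.mp3", lang="en"):
    """Synthesize `text` to an MP3 file with gTTS (assumed to be installed)."""
    from gtts import gTTS  # imported lazily so the rest runs without gTTS

    gTTS(text=text, lang=lang).save(path)
    return path


if __name__ == "__main__":
    print(compose_announcement(["chair", "person", "chair"]))
```

Keeping the sentence building separate from the synthesis call means the wording can be checked without network access, and the TTS engine can be swapped later without touching the text logic.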

Intelligent tagging and decision making

Intelligent tagging and decision making serve to interpret the user's request. For example, in our current setting the user may say: 'I want to reach a certain destination'. The technology will tag a few appropriate paths and suggest some of them according to the user's interests. We are evaluating multiple APIs for these needs.
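To illustrate what interpreting such a request could look like, here is a rough sketch that pulls a destination out of a transcribed sentence with a regular expression. The phrasings, the function name `parse_request`, and the returned dictionary layout are all assumptions made for illustration; the APIs the project is evaluating may work quite differently.

```python
import re

# Hypothetical trigger phrases; a real assistant would need a richer grammar.
_DESTINATION_PATTERN = re.compile(
    r"\b(?:reach|go to|get to|take me to|navigate to)\s+(?:the\s+)?(?P<place>.+)",
    re.IGNORECASE,
)


def parse_request(utterance):
    """Return the intent and destination found in a transcribed utterance."""
    text = utterance.strip().rstrip(".!?")
    match = _DESTINATION_PATTERN.search(text)
    if match:
        return {"intent": "navigate", "destination": match.group("place").strip()}
    return {"intent": "unknown", "destination": None}


if __name__ == "__main__":
    print(parse_request("I want to reach the kitchen"))
```

The extracted destination would then be passed to whichever path-suggestion service is chosen.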

Image recognition

Image recognition is the most important part of our virtual assistant project and the main module we are focusing on. We have chosen and been working with OpenCV because its larger community can help us whenever we run into errors. We will consider deep learning models used together with OpenCV and check whether their accuracy meets our needs for image recognition, object detection, classification, clustering, or mapping.
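Whichever detection model is used, its raw output has to be reduced to something a listener can act on. The sketch below assumes each detection carries a label, a confidence score, and a horizontal bounding-box extent in pixels; the `Detection` class, the 0.5 confidence threshold, and the three-way left/ahead/right split are illustrative assumptions, not choices stated in the proposal.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float  # model score between 0.0 and 1.0
    x: int             # left edge of the bounding box, in pixels
    width: int         # width of the bounding box, in pixels


def horizontal_position(det, frame_width):
    """Classify a detection as left, ahead, or right by its box center."""
    center = det.x + det.width / 2
    if center < frame_width / 3:
        return "on the left"
    if center > 2 * frame_width / 3:
        return "on the right"
    return "ahead"


def describe(detections, frame_width, min_confidence=0.5):
    """Drop weak detections and describe the rest, most confident first."""
    kept = [d for d in detections if d.confidence >= min_confidence]
    kept.sort(key=lambda d: d.confidence, reverse=True)
    return [f"{d.label} {horizontal_position(d, frame_width)}" for d in kept]


if __name__ == "__main__":
    sample = [
        Detection("chair", 0.9, 10, 100),
        Detection("wall", 0.3, 300, 50),
        Detection("person", 0.8, 500, 100),
    ]
    print(describe(sample, frame_width=640))
```

Filtering by confidence before speaking keeps the user from being told about objects the model is unsure of, at the cost of occasionally missing a real obstacle; the threshold would need tuning against real footage.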

Benefits of the Project

Virtual assistants can make it feel like secretaries have come back in style. Some examples of the main benefits for blind people are:

Technical Details of Final Deliverable

The final deliverable will be an integrated system in which Bluetooth glasses, a Bluetooth hands-free headset, and a mobile phone are connected to each other, and the mobile phone is connected to a cloud server. We will show the following functionality during final testing of the project:

A blind person wears the glasses along with the hands-free headset and keeps a mobile phone in his pocket. When the person starts walking, objects seen by the camera are detected using OpenCV and the result is sent to the mobile phone. The phone passes it to the cloud server, which processes it using deep learning algorithms and returns the result to the phone. The phone then sends that result to the hands-free headset using text to speech, so that the blind person understands what is in front of him.

It tells the user about the following things:

Here is the diagram for our final deliverable:

Virtual Assistant for blind people _1582919863.png
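The data flow described above (camera frame, on-device detection, cloud recognition, spoken result) can be sketched as a small loop over pluggable stages. This is only an outline under stated assumptions: `run_pipeline` and its three callables are hypothetical names, and skipping an announcement that repeats the previous one is an added design choice, not something the proposal specifies.

```python
def run_pipeline(frames, detect, classify, speak):
    """Run each frame through detect -> classify -> speak.

    detect(frame)     -> list of candidate regions (on the device)
    classify(regions) -> list of label strings (e.g. from a cloud model)
    speak(message)    -> plays the message to the user
    Returns the list of messages that were actually spoken.
    """
    spoken = []
    last_message = None
    for frame in frames:
        regions = detect(frame)
        if not regions:
            continue  # nothing found, so nothing is sent to the cloud
        labels = classify(regions)
        message = ", ".join(labels)
        # Assumption: do not repeat the same announcement on consecutive frames.
        if message and message != last_message:
            speak(message)
            spoken.append(message)
            last_message = message
    return spoken


if __name__ == "__main__":
    # Stand-in stages so the loop can be tried without any hardware.
    fake_labels = {"f1": "chair", "f2": "chair", "f4": "door"}
    result = run_pipeline(
        frames=["f1", "f2", "f3", "f4"],
        detect=lambda f: [f] if f in fake_labels else [],
        classify=lambda regions: [fake_labels[r] for r in regions],
        speak=print,
    )
    print(result)
```

Passing the stages in as functions lets each module be developed and tested separately, and later replaced by the real camera, cloud, and text-to-speech components.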

Final Deliverable of the Project

HW/SW integrated system

Type of Industry

IT

Technologies

Artificial Intelligence (AI), Internet of Things (IoT), Cloud Infrastructure

Sustainable Development Goals

Good Health and Well-Being for People, Partnerships to achieve the Goal

Required Resources
Item Name                        Type            No. of Units   Per Unit Cost (in Rs)   Total (in Rs)
Jetson Nano                      Equipment       1              21000                   21000
SD card 32 GB                    Equipment       1              800                     800
Charger                          Equipment       1              250                     250
Camera (8 MP) for Jetson Nano    Equipment       1              8000                    8000
Audio speaker for Jetson         Equipment       1              800                     800
Bluetooth camera glasses         Equipment       1              21000                   21000
Cloud servers                    Equipment       1              5000                    5000
Bluetooth hands-free             Equipment       1              600                     600
Document printing                Miscellaneous   1              2000                    2000
Total (in Rs)                                                                           59450
