Integrated Audio-Visual Perceptual System of Socially Interactive Humanoid Robot
The human audio-visual system for object recognition and sound source localization motivates the development of a biologically plausible attention mechanism for gaze shifting and sound source localization in a humanoid robot. A sophisticated acoustic and visual system can strengthen a robot's ability to interact with humans in real time. One of the main issues in social robotics is endowing robots with the ability to direct attention to the people with whom they are interacting in a real-world environment. A humanoid robot's visual system is limited when the target is outside the visual field or the lighting is poor, and vision alone cannot detect a non-visual event that is accompanied only by a sound emission. Auditory processing is therefore required to understand the surrounding environment and to compensate for the narrow visual field.
The benefits of audio-visual attention mechanisms for the social interaction of a humanoid robotic head, compared with audio-only or visual-only approaches, have not yet been evaluated in real scenarios. There is therefore a need to develop an auditory and visual perception system that can improve the social aspects of human-robot interaction. To perceive the real-world environment, a neural network model that detects the origin of a sound through microphones will be developed in this final year project (FYP). The aim of the project is to develop a biologically inspired, integrated audio-visual system that can augment human-robot interaction. This localization-based system will integrate auditory and visual information to improve social interaction by detecting cues through active audition and vision. The humanoid robot will be able to detect objects, locate sound sources, and control its motors effectively.
3. Develop an acoustic system for the humanoid robotic head.
The first task is to localize the sound source. To achieve this, an active audio perceptual system will be developed using microphones. Next, an active visual system will be built to extract information from the sound-originating cues in a real environment. The auditory and visual information from these two active systems will then be integrated in a neural network that moves the head toward the target position, making the humanoid socially interactive. This neural network model will initially be trained on various sound sources at different positions. After training, the humanoid robot will have an audio-visual integration system similar to that of humans. The whole process is shown in the figure below:
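To make the sound localization step more concrete, the sketch below estimates the direction of a sound from two microphone signals using the time difference of arrival. This is a classical cross-correlation baseline, not the neural network model proposed in this project; it only illustrates the kind of cue such a model could learn from. The microphone spacing, sampling rate, speed of sound, and the function names `estimate_delay` and `delay_to_azimuth` are all assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature
MIC_SPACING = 0.20      # m, hypothetical distance between the two microphones
SAMPLE_RATE = 16_000    # Hz, hypothetical sampling rate


def estimate_delay(left, right, sample_rate=SAMPLE_RATE):
    """Return the delay of `right` relative to `left`, in seconds.

    A positive value means the sound reached the left microphone first.
    """
    # Full cross-correlation; output index 0 corresponds to lag -(len(left) - 1).
    corr = np.correlate(right, left, mode="full")
    lag = int(np.argmax(corr)) - (len(left) - 1)
    return lag / sample_rate


def delay_to_azimuth(delay, mic_spacing=MIC_SPACING):
    """Convert a delay to an azimuth in degrees (positive = toward the left mic).

    Uses the far-field model: delay = mic_spacing * sin(azimuth) / speed of sound.
    """
    ratio = np.clip(delay * SPEED_OF_SOUND / mic_spacing, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))


# Synthetic check: the right channel is the left channel delayed by 5 samples.
rng = np.random.default_rng(0)
source = rng.standard_normal(2048)
true_lag = 5
left = source
right = np.concatenate([np.zeros(true_lag), source[:-true_lag]])

delay = estimate_delay(left, right)
azimuth = delay_to_azimuth(delay)
print(f"estimated lag: {round(delay * SAMPLE_RATE)} samples, azimuth: {azimuth:.1f} degrees")
```

With only two microphones this approach cannot distinguish front from back, which is one reason a visual system and head movement are useful complements to the auditory estimate.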

A socially interactive humanoid robot can serve a wide variety of useful purposes around the globe. It can work above ground and underground more efficiently and faster than humans, including in places where human lives are at risk. In the present pandemic, many lives have been lost. Instead of risking valuable lives, robots can serve as companions and assistants in hospitals and other workplaces. An integrated active audio-visual perceptual system gives a humanoid the ability to understand and solve environmental problems as humans do. Such a robust humanoid robot could be a breakthrough for every sector contributing to Pakistan's GDP, extending its benefits in step with rising demand.
A socially interactive humanoid robot equipped with replicated human abilities will be the deliverable. The integrated audio-visual perceptual system will be fully functional and is intended to be useful across many fields of life.
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| DC Geared encoder motor | Equipment | 3 | 500 | 1500 |
| Arduino | Equipment | 1 | 700 | 700 |
| L298D IC | Equipment | 3 | 220 | 660 |
| Microphone | Equipment | 2 | 15000 | 30000 |
| Camera | Equipment | 2 | 4500 | 9000 |
| Raspberry Pi 4 | Equipment | 1 | 16500 | 16500 |
| Jumper wires | Equipment | 1 | 200 | 200 |
| Mechanical body | Equipment | 1 | 10500 | 10500 |
| Battery | Equipment | 1 | 500 | 500 |
| Transportation charges | Miscellaneous | 1 | 3000 | 3000 |
| Printing | Miscellaneous | 1 | 4000 | 4000 |
| Breadboard | Equipment | 2 | 150 | 300 |
| Total (in Rs) | | | | 76860 |