Adil Khan 9 months ago

Real-time Replayed Audio Detection voice-Bot


Project Title

Real-time Replayed Audio Detection voice-Bot

Project Area of Specialization

Artificial Intelligence

Project Summary

This project aims to develop a robust voice-bot application that takes audio as input and determines whether it is live (bona fide / genuine) or fake (replayed). We focus on detecting single- and multi-order voice replay attacks, providing a countermeasure layer that protects automatic speaker verification (ASV) systems from being compromised by replay attacks in real time.

This project covers both research and application development. On the research side, we will develop a robust feature descriptor and use it with either a conventional machine learning classifier or a deep learning classifier for replay detection. We will use two standard datasets, ASVspoof and VSDC, for performance evaluation.

From the development perspective, we will provide a real-time solution for replay spoofing detection that tells the user instantly whether the audio is genuine or replayed. The final product is a mobile voice-bot application that interacts with its users through voice commands and responds by voice as well.

The proposed aELTP features are a 1-D implementation of the ELTP feature set. This implementation takes the benefits that ELTP has shown for face detection in images and applies them to audio signals, specifically to replayed-audio spoof detection. The features are invariant to pitch and energy transformations, which yields a robust, noise-resistant framework.

Project Objectives

  1. To develop a robust and novel feature set that reliably captures the attributes of bona fide and replayed audio.
  2. To develop an effective voice replay detection method that can be easily deployed on resource-constrained portable devices such as smart speakers.
  3. To create a voice replay detector mobile application that interacts with users through voice commands.

Project Implementation Method

We propose the acoustic Enhanced Local Ternary Pattern (aELTP), an adaptation to 1-D audio signals of the Enhanced Local Ternary Pattern (ELTP), which was originally designed for images. aELTP is a stronger audio feature extractor than acoustic Local Ternary Patterns because it is invariant to energy transformations and uses a self-adjusting threshold.

In the first step, the aELTP features of the audio signal are extracted. The audio signal is first split into frames of length 9. Each frame is then quantized and split into negative and positive parts. Finally, the two parts are combined to compute the aELTP descriptors, and histogram binning is performed to obtain the final representation of the audio.
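
The extraction steps above can be sketched in Python as follows. This is only an illustrative sketch, not the project's actual implementation: the threshold rule (a scaled mean absolute deviation of the frame), the `alpha` parameter, the use of non-overlapping frames, the 256-bin histograms, and all function names are assumptions made for the example.

```python
from statistics import fmean

FRAME_LEN = 9               # frame length stated in the text
CENTER = FRAME_LEN // 2     # index of the centre sample


def ternary_codes(frame, alpha=0.5):
    """Quantize the 8 neighbours of the centre sample to -1, 0 or +1."""
    frame = list(frame)
    center = frame[CENTER]
    neighbours = frame[:CENTER] + frame[CENTER + 1:]
    mean = fmean(frame)
    # ASSUMPTION: self-adjusting threshold taken as a scaled mean absolute
    # deviation of the frame; the real aELTP rule may differ.
    threshold = alpha * fmean([abs(s - mean) for s in frame])
    codes = []
    for s in neighbours:
        diff = s - center
        if diff > threshold:
            codes.append(1)
        elif diff < -threshold:
            codes.append(-1)
        else:
            codes.append(0)
    return codes


def pattern_values(codes):
    """Split ternary codes into positive and negative 8-bit patterns."""
    upper = sum(1 << i for i, c in enumerate(codes) if c == 1)
    lower = sum(1 << i for i, c in enumerate(codes) if c == -1)
    return upper, lower


def aeltp_histogram(signal, alpha=0.5):
    """Concatenated, normalized histograms of the two pattern streams."""
    upper_hist = [0] * 256
    lower_hist = [0] * 256
    n_frames = len(signal) // FRAME_LEN   # trailing partial frame is dropped
    for k in range(n_frames):
        frame = signal[k * FRAME_LEN:(k + 1) * FRAME_LEN]
        upper, lower = pattern_values(ternary_codes(frame, alpha))
        upper_hist[upper] += 1
        lower_hist[lower] += 1
    total = max(n_frames, 1)
    return [count / total for count in upper_hist + lower_hist]
```

Because the threshold in this sketch scales with the frame's own spread, multiplying the whole signal by a positive gain leaves every ternary code unchanged. That is one simple way a self-adjusting threshold can give invariance to energy changes.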

These features are fed into a machine learning algorithm, which trains a model. The model is then deployed in a mobile application and on IoT devices, where it can give quick and accurate results.
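
As a rough illustration of the train-then-deploy flow, the sketch below fits a nearest-centroid classifier on feature vectors and serializes it to JSON so that a small client could load it. The source does not name the classifier or the model format; both are stand-ins chosen for brevity, and the toy vectors and function names are made up for the example.

```python
import json
from math import dist


def train_centroids(features, labels):
    """Average the feature vectors of each class (nearest-centroid model)."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def predict(model, vec):
    """Return the label whose centroid is closest to the feature vector."""
    return min(model, key=lambda label: dist(model[label], vec))


# Toy, made-up 2-D feature vectors standing in for real histograms.
train_x = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_y = ["bonafide", "bonafide", "replay", "replay"]

model = train_centroids(train_x, train_y)

# Serialize the model so a mobile or IoT client could load it later.
exported = json.dumps(model)
loaded = json.loads(exported)
print(predict(loaded, [0.85, 0.15]))
```

A model this small is cheap to store and evaluate, which suits resource-constrained devices, but a real system would need a classifier validated on the actual datasets.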

Benefits of the Project

  1. The mobile application can be used for voice authentication and for validating spoof detection systems.
  2. The working model can be deployed on small IoT devices, such as smart door locks, for spoof detection and security.
  3. The research results can serve as a basis for future work.
  4. Compared to older implementations, this framework aims to provide a more lightweight and accurate feature set that can detect replay spoofing attacks in a wider variety of environments.

Technical Details of Final Deliverable

A mobile application in which users can load a machine learning model and select or record an audio file, which the application then analyzes to determine whether the audio is genuine or fake (replayed).

Final Deliverable of the Project

Software System

Core Industry

IT

Other Industries

Media, Security

Core Technology

Artificial Intelligence (AI)

Other Technologies

Internet of Things (IoT)

Sustainable Development Goals

Good Health and Well-Being for People; Industry, Innovation and Infrastructure

Required Resources

Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs)
Amazon Echo Dot 3rd Generation | Equipment | 2 | 9000 | 18000
Google Nest Mini | Equipment | 2 | 8500 | 17000
BM800 Microphone Kit | Equipment | 2 | 11500 | 23000
BYM1 | Equipment | 2 | 3000 | 6000
Mini USB Microphone | Equipment | 4 | 1500 | 6000
Google Playstore Registration | Miscellaneous | 1 | 5000 | 5000
Documentation/Thesis/Reports/Research Paper/Stationery | Miscellaneous | 1 | 5000 | 5000
Total (in Rs) | | | | 80000