Adil Khan 11 months ago
AdiKhanOfficial #FYP Ideas

Multiple Voice Separation with Speaker Diarization

Project Title

Multiple Voice Separation with Speaker Diarization

Project Area of Specialization

Artificial Intelligence

Project Summary

Speech separation is a special case of the source separation problem and a challenging task. We will present a method for separating a mixed audio sequence in which multiple speakers talk simultaneously. A classification model will estimate the number of speakers, and different models will be trained for each speaker. We will evaluate our model on both clean and noisy data. We expect the model to separate the individual voices, transcribe the speech into text, and display the transcript with speaker diarization on our website.
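As a rough illustration of the clean and noisy evaluation conditions mentioned in the summary, the sketch below builds a two-source mixture and a noisy copy of it. Everything here is an assumption for illustration: the pure tones stand in for recorded speakers, and the helper names (`sine`, `mix`, `add_noise`), the 8000 Hz sample rate, and the 10 dB signal-to-noise ratio are hypothetical choices, not part of the proposal.

```python
import math
import random


def sine(freq_hz, n_samples, sample_rate=8000):
    """Stand-in for one speaker: a pure tone (real data would be recorded speech)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n_samples)]


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)


def mix(sources):
    """Sum equal-length sources sample by sample to form one mixed signal."""
    return [sum(samples) for samples in zip(*sources)]


def add_noise(signal, snr_db, seed=0):
    """Add white Gaussian noise, scaled so the signal-to-noise ratio equals snr_db."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    target_noise_power = power(signal) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / power(noise))
    return [s + scale * n for s, n in zip(signal, noise)]


# Two hypothetical "speakers" talking at the same time.
speaker_a = sine(220, 8000)
speaker_b = sine(330, 8000)

clean_mixture = mix([speaker_a, speaker_b])          # clean condition
noisy_mixture = add_noise(clean_mixture, snr_db=10)  # noisy condition
```

Keeping the original sources next to each mixture matters because a separation model is normally scored by comparing its outputs against those known sources.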

Project Objectives

The project aims to develop software that helps individuals and businesses separate and identify voices in any audio recording. It also aims to convert the speakers' audio into text. We expect our blind source separation model to separate multiple voices with at least 50% accuracy, and we will work to improve on that figure.
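The objectives give a target accuracy but do not name the metric used to measure it. One widely used measure of separation quality is the scale-invariant signal-to-distortion ratio (SI-SDR), reported in decibels. The sketch below is only an assumption about how quality could be measured, not the metric chosen by the project; the function name `si_sdr` and the toy signals are hypothetical.

```python
import math


def si_sdr(estimate, reference):
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    # Project the estimate onto the reference to find the "target" component.
    scale = dot / ref_energy
    target = [scale * r for r in reference]
    # Whatever is left over counts as distortion.
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(x * x for x in residual)
    if residual_energy == 0:
        return math.inf   # perfect reconstruction up to scaling
    if target_energy == 0:
        return -math.inf  # estimate contains none of the reference
    return 10 * math.log10(target_energy / residual_energy)


# Toy example: the estimate is the reference plus a small error.
reference = [1.0, 0.0, -1.0, 0.0]
estimate = [1.0, 0.1, -1.0, -0.1]
score_db = si_sdr(estimate, reference)
```

Because the measure is scale-invariant, an estimate that is simply louder or quieter than the reference receives the same score.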

Project Implementation Method

Major application components include

  • Dataset Collection
  • Data Preprocessing
  • Training
  • Validation
  • Prediction
  • Web Application

Benefits of the Project

In terms of business, our goals are:

  • To improve the quality of voice assistance.
  • To use the software in IoT devices.
  • To transcribe audio to text on live-streaming devices.
  • To help identify the speaker of the audio.

Technical Details of Final Deliverable

This project aims to build a model that separates multiple voices/signals from a mixed signal and transcribes the speech into text with speaker diarization.
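To make the final output more concrete, the sketch below shows one possible way to turn diarized segments into a readable transcript. It assumes the separation and transcription stages have already produced `(start, end, speaker, text)` tuples; that tuple layout, the speaker labels, and the function names `merge_segments` and `format_transcript` are hypothetical and not taken from the proposal.

```python
def merge_segments(segments):
    """Merge consecutive segments that belong to the same speaker."""
    merged = []
    for start, end, speaker, text in sorted(segments):
        if merged and merged[-1][2] == speaker:
            prev_start, prev_end, _, prev_text = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), speaker, prev_text + " " + text)
        else:
            merged.append((start, end, speaker, text))
    return merged


def format_transcript(segments):
    """Render one line per speaker turn: [start-end] speaker: text."""
    lines = []
    for start, end, speaker, text in merge_segments(segments):
        lines.append(f"[{start:.1f}s-{end:.1f}s] {speaker}: {text}")
    return "\n".join(lines)


# Hypothetical output of the separation and transcription stages.
segments = [
    (0.0, 1.5, "Speaker 1", "Hello there."),
    (1.5, 2.5, "Speaker 1", "How are you?"),
    (2.0, 3.0, "Speaker 2", "Fine, thanks."),
]
transcript = format_transcript(segments)
```

Note that the time ranges of different speakers may overlap, since the project targets audio in which speakers talk simultaneously.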

Final Deliverable of the Project

Software System

Core Industry

IT

Other Industries

Education

Core Technology

Artificial Intelligence (AI)

Other Technologies

Sustainable Development Goals

Industry, Innovation and Infrastructure

Required Resources

Item Name    Type         No. of Units    Per Unit Cost (in Rs)    Total (in Rs)
GTX 1650     Equipment    1               70000                    70000
Total (in Rs)                                                      70000
If you need this project, please contact me at contact@adikhanofficial.com