Voice Pathology Detection
2025-06-28 16:36:41 - Adil Khan
Project Area of Specialization: Artificial Intelligence
Project Summary
Voice pathology is becoming markedly more common, driven by unhealthy social habits, excessive talking, ageing, and disorders of the throat. People who speak a great deal in their work, such as teachers and announcers, often develop voice pathology in later life. We will develop a research-oriented simulation project that helps a general physician identify voice pathology, a task that normally falls to a specialist ENT doctor, and then refer the patient to that specialist. To achieve this goal, we will extract commonly used features from speech samples and train the system with machine learning techniques, aiming for a better pathology detection rate. Because the project is research oriented, we will also provide an analysis of the different techniques used.
Feature extraction is the process of obtaining useful information from a signal while discarding unwanted information. This paper presents the development of a voice pathology detection model that detects and classifies pathological voices in the Saarbrucken Voice Database (SVD). Researchers have used several feature extraction techniques to assist general physicians and ENT doctors in diagnosing voice pathology. Classification is an important tool for making predictions from a given dataset; in this paper, we use it to predict the presence of voice pathology in patients, applying several machine learning algorithms. For research purposes, we obtained the Saarbrucken Voice Database from a German machine learning repository site and trained the classifiers on it using the WEKA tool.
An automatic voice pathology detection system (AVPDS) can be used by general physicians for early diagnosis. When a voice disorder is detected, they can refer the patient to an ENT specialist for further investigation. Such systems are easy to use and non-invasive. Their biggest advantage is that they can be used in remote areas where specialized clinics are not available. With current technology, a machine can be trained to detect pathology and its different types. Because the voice is closely tied to the vocal cords, pathology can be detected from the voice. Specifically, the size of the vocal cords directly affects the pitch of speech. The vocal folds open during breathing, whereas they vibrate during singing or speaking. Various machine learning algorithms and feature extraction techniques have been proposed for medical applications, and AVPDS is one such application. Speech features fall into two main categories: the first is based on the human hearing system and the second on the speech production system. The most commonly used feature for AVPDS is the Mel-frequency cepstral coefficient (MFCC).
Project Objectives
Different datasets and machine learning techniques are available for implementing this simulation project. Many researchers have used the same dataset or a local one with different machine learning techniques to propose a better model. I will use a benchmark dataset with different feature extraction and machine learning techniques to verify my results and compare them with previous research work. I will validate these results by proposing a system that can identify voice pathology for general physicians.
Project Implementation Method
In a voice pathology classification system, the main goal of the feature extraction step is to compute a sequence of feature vectors that gives a compact representation of the input signal. The detection pipeline has two stages. The first is feature extraction, in which we extract features commonly used for voice. The second is modeling, the machine learning stage, in which we use one or two modeling techniques to build a model for pathology detection. We will train two models, one for normal voice and one for pathological voice. Finally, we will evaluate the test samples against both models and analyze the results.
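The two-model idea described above can be sketched as follows. The proposal does not name the modeling technique, so this sketch assumes, purely for illustration, one diagonal Gaussian per class fitted to feature vectors; the function names and the toy two-dimensional data are hypothetical and are not taken from the project.

```python
import math


def fit_gaussian(vectors):
    """Estimate a per-dimension mean and variance (a diagonal Gaussian)."""
    n = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dims)]
    variances = [
        # Floor the variance so a constant dimension cannot cause division by zero.
        max(sum((v[d] - means[d]) ** 2 for v in vectors) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(vector, model):
    """Log-density of one feature vector under a diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x, mu, var in zip(vector, means, variances)
    )


def classify(frames, normal_model, pathological_model):
    """Score every frame of one recording against both models and pick the better fit."""
    normal_score = sum(log_likelihood(f, normal_model) for f in frames)
    patho_score = sum(log_likelihood(f, pathological_model) for f in frames)
    return "pathological" if patho_score > normal_score else "normal"


# Toy training data (hypothetical values standing in for real feature vectors).
normal_train = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [1.1, 2.2]]
patho_train = [[3.0, 0.5], [3.2, 0.4], [2.8, 0.7], [3.1, 0.6]]

normal_model = fit_gaussian(normal_train)
patho_model = fit_gaussian(patho_train)

print(classify([[1.0, 2.0], [1.1, 1.9]], normal_model, patho_model))
print(classify([[3.0, 0.5], [2.9, 0.6]], normal_model, patho_model))
```

Summing frame-level log-likelihoods gives one decision per recording, which is one simple way to compare a test sample against both models. A richer per-class model could be substituted without changing the decision rule.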
Features used for classification:
Our research uses a common feature, the Mel Frequency Cepstral Coefficient (MFCC). Research shows that MFCCs give better results than other features, as they work well in speaker recognition and in speech and accent recognition.
Overall, the project has three parts:
The first part is finding the best dataset and reviewing it against previous research work. The second part is feature extraction, in which we extract features commonly used for voice. The third part is modeling, the machine learning stage, in which we use one or two modeling techniques to build a model for pathology detection. We will train two models, one for normal voice and one for pathological voice. Finally, we will evaluate the test samples against both models and analyze the results.
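To analyze the results on the test samples, the predictions can be summarized with standard detection metrics. The sketch below is an assumption about how that analysis might look; the proposal does not specify which metrics are reported, and the function name and label strings are hypothetical.

```python
def detection_metrics(true_labels, predicted_labels, positive="pathological"):
    """Compute accuracy, sensitivity, and specificity for a two-class detector."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive:
            if truth == positive:
                tp += 1  # pathological voice correctly detected
            else:
                fp += 1  # normal voice flagged as pathological
        else:
            if truth == positive:
                fn += 1  # pathological voice missed
            else:
                tn += 1  # normal voice correctly passed
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical labels for five test recordings.
truth = ["pathological", "pathological", "pathological", "normal", "normal"]
preds = ["pathological", "pathological", "normal", "normal", "pathological"]
print(detection_metrics(truth, preds))
```

Sensitivity matters most for a referral tool, since a missed pathological voice means a patient who is not sent to the specialist.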
Benefits of the Project
This project is intended for general physicians, who can use the simulation to identify whether a voice is pathological. In most cases, a general physician cannot diagnose a pathological voice because it lies outside their specialization. When examining a patient, a general physician can only tell whether the complaint is an ordinary throat problem, such as a cough. If it is, the physician prescribes medicine; otherwise, based on the detection result, they can refer the patient to a specialist doctor. We are automating this diagnostic step so that patients can receive proper treatment for their throat disease.
Technical Details of Final Deliverable
Mel Frequency Cepstral Coefficient (MFCC):
The MFCC technique is used for noise reduction in voice signals as well as for voice classification, speaker identification, and other speech-related tasks. These coefficients attempt to analyze the vocal tract independently of the vocal folds, which can be damaged by voice pathologies. In this work, the experiments were conducted using 13 MFCCs.
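A minimal, dependency-free sketch of how 13 MFCCs per frame might be computed is given below. Only the coefficient count comes from the text above. The frame length, hop size, number of mel filters, sample rate, and the synthetic test tone are assumptions chosen for illustration. The sketch uses the common mel mapping m = 2595 log10(1 + f/700) and a naive DFT for clarity; a practical system would more likely rely on an optimized signal-processing library.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (slow, but needs no libraries)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge of the triangle
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge, peak of 1.0 at the centre
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs MFCCs per frame of the signal."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1)) for i in range(frame_len)]
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]  # Hamming window
        spectrum = power_spectrum(frame)
        # Floor the filter energies so the logarithm is always defined.
        energies = [max(sum(w * p for w, p in zip(row, spectrum)), 1e-10) for row in bank]
        features.append(dct2([math.log(e) for e in energies], n_coeffs))
    return features


# Synthetic 150 Hz tone, 0.1 s at an assumed 8 kHz sample rate (not real patient data).
rate = 8000
tone = [math.sin(2 * math.pi * 150 * t / rate) for t in range(800)]
vectors = mfcc(tone, rate)
print(len(vectors), "frames,", len(vectors[0]), "coefficients each")
```

Each frame yields one 13-dimensional vector, so a recording becomes the sequence of feature vectors that the modeling stage consumes.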

