Speak it. Compose it.
2025-06-28 16:29:37 - Adil Khan
Project Area of Specialization: Artificial Intelligence
Project Summary
Sentiments are feelings, emotions, likes and dislikes, or opinions that can be expressed through text, images, or videos. Sentiment analysis of web data is an emerging research area within social analytics. Neural Networks (NN) have performed well in image recognition, image prediction, image sentiment analysis, and image classification, which suggests that deep learning is well suited to image sentiment analysis. This project focuses on techniques for extracting sentiments from a video or image. Talk shows are prime hubs for heated discussion, and these dialogues often get out of hand when participants are ruled by their emotions. The result is a chaotic and unpleasant environment that fails to deliver a constructive discussion. Our system aims to ease such situations by detecting when participants are becoming angry and prompting them early, so that things do not escalate to an undesirable extent. It does this by analyzing videos and images of the talk show, extracting the participants' sentiments, and notifying them to remain calm if anger is detected. For this purpose, we aim to use Convolutional Neural Networks (CNN) for their high accuracy in image processing and emotion detection. Results may improve further if the CNN is extended to Attentional Neural Networks or Progressive Convolutional Neural Networks.
Project Objectives
Because of the sensitivity of the topics discussed in live talk shows, unpleasant arguments are almost inevitable. The goal of this project is to help anchors conduct a smooth talk show, one governed by logical discussion rather than anger and abuse. This will be accomplished through the following:
- Frame-by-frame processing of video for facial recognition and feature extraction.
- Anger detection through visual sentiment analysis.
- Alert signal for all participants if discussions turn into heated arguments.
- Identification of the person most responsible for the unsettling atmosphere.
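The last two objectives (raising an alert and identifying the person most responsible) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `AngerMonitor`, the threshold of 0.6, and the 5-frame window are hypothetical choices, and the per-frame anger probabilities are assumed to come from the emotion classifier.

```python
from collections import defaultdict, deque

ANGER_THRESHOLD = 0.6  # assumed cut-off on the mean anger score
WINDOW_SIZE = 5        # assumed number of recent frames to average


class AngerMonitor:
    """Tracks recent per-participant anger scores and flags escalation."""

    def __init__(self, threshold=ANGER_THRESHOLD, window=WINDOW_SIZE):
        self.threshold = threshold
        # One fixed-length history of anger scores per participant.
        self.recent = defaultdict(lambda: deque(maxlen=window))
        # Number of frames in which each participant was above threshold.
        self.angry_frames = defaultdict(int)

    def update(self, frame_scores):
        """frame_scores maps participant id -> anger probability for one frame.

        Returns True when an alert should be raised for this frame.
        """
        alert = False
        for person, score in frame_scores.items():
            history = self.recent[person]
            history.append(score)
            if sum(history) / len(history) >= self.threshold:
                self.angry_frames[person] += 1
                alert = True
        return alert

    def most_responsible(self):
        """Participant with the most above-threshold frames, or None."""
        if not self.angry_frames:
            return None
        return max(self.angry_frames, key=self.angry_frames.get)
```

Averaging over a short window rather than reacting to a single frame is a design choice meant to reduce false alarms from one misclassified frame.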
This project applies concepts from Computer Vision and Artificial Intelligence. Most of the work will be done in Python, as it offers good support for AI libraries. Handcrafted features such as color histograms and local binary patterns will also be used to improve accuracy. Furthermore, effective training of neural network models requires an extensive dataset, which we can obtain through data augmentation. We will use the fer2013 dataset to train our model, and we will extract sentiment from videos by extracting frames from them.
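As a rough illustration of loading fer2013 and applying one simple augmentation, the sketch below assumes the commonly distributed `fer2013.csv` layout (columns `emotion`, `pixels`, `Usage`, with 48x48 grayscale images stored as space-separated pixel values). The function names are hypothetical, and the horizontal flip is only one possible augmentation; the proposal does not say which ones will be used.

```python
import csv
import io

# Label order of the commonly distributed fer2013.csv (assumed here).
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
SIDE = 48  # FER2013 images are 48x48 grayscale


def parse_fer_row(row):
    """Turn one CSV row into (label, image).

    The image is a 48x48 list of rows with pixel values scaled to [0, 1].
    """
    values = [int(v) for v in row["pixels"].split()]
    if len(values) != SIDE * SIDE:
        raise ValueError(f"expected {SIDE * SIDE} pixels, got {len(values)}")
    image = [
        [v / 255.0 for v in values[r * SIDE:(r + 1) * SIDE]]
        for r in range(SIDE)
    ]
    return EMOTIONS[int(row["emotion"])], image


def hflip(image):
    """Horizontal flip, a simple label-preserving augmentation for faces."""
    return [list(reversed(r)) for r in image]


def load_fer(csv_file, usage="Training"):
    """Yield (label, image) pairs for the requested split."""
    for row in csv.DictReader(csv_file):
        if row["Usage"] == usage:
            yield parse_fer_row(row)


# Tiny in-memory stand-in for the real file, for illustration only.
demo_pixels = " ".join("128" for _ in range(SIDE * SIDE))
demo_csv = io.StringIO(f"emotion,pixels,Usage\n0,{demo_pixels},Training\n")
demo_samples = list(load_fer(demo_csv))
```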
Benefits of the Project
Our project offers a solution for problems that require analyzing emotions in videos or images. Any media platform that hosts discussions between people, such as podcasts, can use this tool to notify the host through sentiment analysis whenever a participant is getting angry. It is a simple tool, built on existing models, for maintaining a comfortable environment for discussion in a talk show.
Technical Details of Final Deliverable
Our model will be trained on the dataset, which will contain images or videos of people displaying various emotions. Videos will be converted into images by extracting frames. When these images are passed through the CNN, the emotions they show will be detected and the result displayed to the user.
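Processing every frame of a video is often unnecessary, so one option is to sample frames at a fixed rate before classification. The sketch below only computes which frame indices to keep; the helper names and the default of two samples per second are assumptions, and the actual decoding would be done with a video library of choice.

```python
def frame_indices(total_frames, fps, samples_per_second=2.0):
    """Indices of the frames to keep when sampling a video at a fixed rate."""
    if total_frames <= 0 or fps <= 0 or samples_per_second <= 0:
        return []
    # Keep every step-th frame; never skip below one frame per step.
    step = max(1, round(fps / samples_per_second))
    return list(range(0, total_frames, step))


def frame_timestamps(indices, fps):
    """Timestamp in seconds of each selected frame."""
    return [i / fps for i in indices]
```

For example, a 30 fps clip sampled at 10 frames per second keeps every third frame.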
In the pre-processing phase, coloured images will be converted to RGB format, and image enhancement will be applied to remove noise or edges from the images. The algorithms to be used for this purpose are mentioned in the next section.
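The proposal does not name the enhancement algorithms at this point, so the following is only one possible noise-removal step, assumed for illustration: a 3x3 median filter applied to a single-channel image stored as a list of rows. The function name is hypothetical.

```python
from statistics import median_low


def median_filter_3x3(image):
    """Apply a 3x3 median filter to a 2D list of pixel intensities.

    Border pixels use only the neighbours that exist (no padding).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            # median_low always returns a value present in the neighbourhood.
            row.append(median_low(neighbours))
        result.append(row)
    return result
```

A median filter suppresses isolated outlier pixels (salt-and-pepper noise) while blurring edges less than a mean filter would.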
After pre-processing, faces will be detected in the images and the salient parts of each face extracted. These facial features will then be passed to the CNN for sentiment analysis. The CNN is trained on one part of the dataset and then tested by predicting the sentiment of the extracted features. The CNN's output will be used to generate an appropriate signal if the detected sentiment is an undesired one, i.e. anger. This signal or warning will be used to calm the environment.
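The final step, turning the CNN's output into a warning, might look like the sketch below. It assumes the network emits one raw score per emotion class in the fer2013 label order; the function names, the 0.5 confidence cut-off, and the message text are hypothetical.

```python
import math

# Assumed class order, matching the usual fer2013 labels.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels=EMOTIONS):
    """Return the most probable label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def warning_for(scores, min_confidence=0.5):
    """Return a warning message when anger is predicted confidently, else None."""
    label, confidence = classify(scores)
    if label == "angry" and confidence >= min_confidence:
        return f"Anger detected ({confidence:.0%}). Please remain calm."
    return None
```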
Final Deliverable of the Project: Software System
Core Industry: Media
Other Industries: Others, Telecommunication
Core Technology: Artificial Intelligence (AI)
Other Technologies: Big Data
Sustainable Development Goals: Industry, Innovation and Infrastructure
Required Resources

| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Camera | Equipment | 1 | 10000 | 10000 |
| Laptop | Equipment | 1 | 60000 | 60000 |
| Total (in Rs) | | | | 70000 |