TOP PRODUCT HUNTING FOR E-COMMERCE WEBSITE THROUGH CONSUMER COMMENT SENTIMENT ANALYSIS
2025-06-28 16:29:51 - Adil Khan
Project Area of Specialization
Artificial Intelligence
Project Summary
The main objective of our project is to compare multiple products and determine which of them is the best. Finding the right product can be difficult, so our project assists the user by providing a list of products ranked from best to worst on the basis of comment sentiment analysis. Comment analysis matters when determining the top products because a comment captures the customer's opinion of that particular product.
Secondly, this can help the owner of the company identify real or fake products on their website. For example, a person may give a product a 5-star review but mention in the comment that the delivery service was poor, which lowers the perceived quality of that product. A customer who needs something urgently should know about this drawback. The point of the example is that if a product with a 5-star rating scores very badly in the analysis of its comments, either the product is fake or the reviews were bought.
Lastly, the problems mentioned above can be a great hindrance when it comes to buying the right product. Our main motive is therefore to provide the customer with the best products available on the website by ranking them from best to worst using comment-based sentiment analysis. Furthermore, this analysis can help organizations measure the return on investment of their marketing campaigns and improve their customer service. Since sentiment analysis gives organizations a view into their customers' emotions, they can become aware of an upcoming crisis in good time and manage it accordingly.
Project Objectives
The objectives of the thesis are as follows:
- Save customers time in choosing the right product.
Explanation: Products will be compared with one another so that the top products in terms of review analysis are displayed. This saves customers the time of reading through long comment sections, even for products with high ratings.
- Comparison between products.
Explanation: Our project aims to compare the reviews of a single product with the reviews of multiple products of the same domain. The product that gains the most positive reviews will be displayed at the top, which distinguishes it from the competing products.
- Distinguishing fake and real products.
Explanation: The aim of the project is to differentiate and evaluate products on the basis of their authenticity. The authenticity of the products displayed to users will be indicated on the basis of the analysis of valid reviews.
- Help companies meet customer expectations.
Explanation: The sentiment analysis conducted on the products will benefit not only the customers but also the companies and manufacturers of these products. It will help companies and manufacturers improve products that are about to be launched, using the reviews and expectations previously expressed about products already launched.
Project Implementation Method
The steps involved in the methodology are as follows. We discuss each of them in detail.
Research Work:
We did extensive research by reading and studying many research articles, all of which are included in the literature review. We also watched a couple of videos on how to apply a classification model and divide the data into training and test sets.
Dataset collection:
The dataset used in our project was created from a collection of footwear products taken from the official sites of Pakistani footwear brands. The products are divided into five categories: Flats, Sandals, Casuals/Sneakers, Heels and Sportswear. The dataset is an Excel spreadsheet with the data fields Product ID, Brand, Product Name, Price, Reviews, Ratings, Category, Size, Color and Image.
- Product Id
- Name
- Color
- Review
- Rating
- Brand
- Categories
- Price
In this phase we worked on the following things:
- Column duplication.
- Analyzing data
- Removing redundancy from rows
- Remove column inconsistency
- Missing values
- Exploratory data analysis and Graphical representation.
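The row-level cleaning steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's actual code: the field names follow the dataset columns listed earlier, while the helper `clean_rows` and the sample rows are hypothetical.

```python
def clean_rows(rows):
    """Drop rows with a missing review or rating, then drop redundant rows.

    `rows` is a list of dicts, one per spreadsheet row (hypothetical layout).
    """
    seen = set()
    cleaned = []
    for row in rows:
        # Normalize values so that "BrandA " and "BrandA" compare as equal.
        normalized = {
            key: "" if value is None else str(value).strip()
            for key, value in row.items()
        }
        # Missing values: skip rows that have no review text or no rating.
        if not normalized.get("Review") or not normalized.get("Rating"):
            continue
        # Redundancy: skip rows whose normalized content was already seen.
        fingerprint = tuple(sorted(normalized.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(normalized)
    return cleaned


# Made-up sample rows, for illustration only.
sample_rows = [
    {"Product Id": "P1", "Brand": "BrandA", "Review": "Very comfortable", "Rating": "5"},
    {"Product Id": "P1", "Brand": "BrandA ", "Review": "Very comfortable", "Rating": "5"},
    {"Product Id": "P2", "Brand": "BrandB", "Review": "", "Rating": "4"},
    {"Product Id": "P3", "Brand": "BrandC", "Review": None, "Rating": "3"},
]
cleaned_rows = clean_rows(sample_rows)
```

In this sample the second row is a redundant copy of the first, and the last two rows have no review text, so only one row is kept.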
Data Pre-processing:
In this phase we worked on the following things:
- Hyperlink or HTML tags
- Punctuations
- Emojis
- Translations
- Interjections
- Non-English words
- Lemmatization
- Stemming
- Stop words
- Brand name
- Greeting words
- Word cloud
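Several of the pre-processing steps above can be illustrated with the Python standard library alone. The sketch below is an assumption about how such a pipeline might look, not the project's implementation: the stop-word and greeting lists are tiny placeholders, the function name `preprocess` is hypothetical, and lemmatization, stemming and translation are left out because they need an external NLP library.

```python
import re
import string

# Tiny placeholder word lists; a real pipeline would use much fuller ones.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "this", "i", "it", "to", "of"}
GREETING_WORDS = {"hello", "hi", "thanks"}

# Translation table that turns every punctuation character into a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(review, brand_names=()):
    """Return the cleaned, lower-cased tokens of one review."""
    text = re.sub(r"<[^>]+>", " ", review)                  # HTML tags
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)      # hyperlinks
    text = text.encode("ascii", "ignore").decode("ascii")   # emojis, other non-ASCII
    text = text.lower().translate(PUNCT_TO_SPACE)           # punctuation
    blocked = STOP_WORDS | GREETING_WORDS | {name.lower() for name in brand_names}
    # Drop stop words, greeting words and (single-word) brand names.
    return [token for token in text.split() if token not in blocked]


tokens = preprocess(
    "Hello! <b>BrandA</b> shoes are GREAT 😍 see https://example.com",
    brand_names=["BrandA"],
)
print(tokens)
```

Dropping all non-ASCII characters is a blunt way to remove emojis; it also discards non-English text written in other scripts, which may or may not be the desired behavior depending on where translation sits in the pipeline.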
Sentiment Analysis:
- Read file
- Tf-idf vectorization
- Word cloud
- Subjectivity
- Polarity
- Sentiment score
- Rating score
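To make the polarity, sentiment score and rating score steps concrete, here is a minimal sketch. The source does not say which library or formula produces these values, so everything here is an assumption: a toy word lexicon stands in for a real sentiment model, and both scores are mapped to a 0 to 100 scale so they can later be compared as percentages.

```python
# Toy lexicon, hypothetical; a real project would rely on a sentiment library.
POSITIVE_WORDS = {"good", "great", "comfortable", "love", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "late", "fake", "uncomfortable"}


def polarity(tokens):
    """Polarity in [-1, 1]: (positive - negative) / number of lexicon hits."""
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    hits = positive + negative
    return 0.0 if hits == 0 else (positive - negative) / hits


def sentiment_percent(tokens):
    """Map polarity from [-1, 1] onto a 0-100 scale."""
    return (polarity(tokens) + 1) * 50


def rating_percent(stars, max_stars=5):
    """Express a star rating as a percentage of the maximum rating."""
    return stars * 100 / max_stars


review_tokens = ["great", "shoes", "delivery", "late", "bad"]
review_sentiment = sentiment_percent(review_tokens)
review_rating = rating_percent(5)
```

Here a 5-star review with one positive and two negative words gets a rating score of 100 but a sentiment score of roughly 33, which is the kind of gap between stars and comment text that the project sets out to detect.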
Label assigning:
We assign labels on the basis of the difference between the sentiment score and the rating score. First, we compute this difference and express it as a percentage, which gives a good indication of whether a product is fake or authentic. For the majority of the data the difference is about 25 to 35 percent, which is normal and justified, so we label a product fake if the difference between its sentiment score and rating score is greater than or equal to 40 percent, and authentic otherwise. By this criterion, approximately 127 products are fake (a difference of 40% or more) and approximately 323 shoes are authentic (a difference below the 40% threshold). We then label each product as recommended or not recommended to the customer on the basis of a second threshold: if the percentage difference between the rating score and the sentiment score is less than 25%, the product is recommended to the user.
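The two thresholds can be written down as a small function. This is a sketch under assumptions: both scores are taken to be already expressed on a 0 to 100 scale, the difference is taken as an absolute value (the text does not say whether it is signed), and the function name `assign_labels` is hypothetical.

```python
def assign_labels(sentiment_pct, rating_pct,
                  fake_threshold=40.0, recommend_threshold=25.0):
    """Return (difference, authenticity label, recommendation label).

    Assumes both scores are percentages and uses the absolute difference.
    """
    difference = abs(rating_pct - sentiment_pct)
    # Fake when the gap is 40 percentage points or more, authentic otherwise.
    authenticity = "fake" if difference >= fake_threshold else "authentic"
    # Recommended only when the gap is strictly below 25 percentage points.
    if difference < recommend_threshold:
        recommendation = "recommended"
    else:
        recommendation = "not recommended"
    return difference, authenticity, recommendation


examples = [
    assign_labels(sentiment_pct=80, rating_pct=90),   # small gap
    assign_labels(sentiment_pct=70, rating_pct=100),  # gap in the 25-40 band
    assign_labels(sentiment_pct=55, rating_pct=100),  # large gap
]
```

Note that with these thresholds a product whose difference falls between 25% and 40% is labeled authentic but not recommended.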
Classification:
In the classification phase, we classify the data on the basis of two pairs of labels: recommended or not recommended, and authentic or fake. We selected SVM as the classification model because, as shown in the literature review table, it has the best accuracy when dealing with textual data.
The last phases are Aspect-based Research on Sentiment Analysis and Topic Modeling.
Benefits of the Project
Direct Benefits:
- Customers save time in choosing the right product, as our web link displays the top products in terms of review analysis.
- Customers can view all the products of a certain category at the same time, as all the products are on the same web page, which shortens the time needed to decide on a purchase.
- The web link helps in reaching new customers, as it lets people choose the best product for themselves on the basis of the sentiment and rating scores provided.
- Remote access is increased, as our project takes the form of a web link that can be accessed easily on laptops, tablets and smartphones anywhere, anytime.
Indirect Benefits:
- Our project aims to perform an unbiased comparison between products: the product with the most positive reviews compared to the other products is displayed as the top product.
- Our project makes it possible to distinguish between a fake and an authentic product on the basis of the analysis of valid reviews.
- Products and companies' understanding of expectations improve with the help of the reviews and comments given on launched products.
- Companies and manufacturers receive assistance in developing new products: with the help of the rating and sentiment scores, they can identify which features of launched products they should add or improve in upcoming products.
The tools required for this project are listed below. We may use only some of them, as required by the project:
- VS code
- Jupyter Notebook
- Colab
- HTML5, CSS, JavaScript and React JS
- Python
The technologies used in our FYP project and the areas of specialization in which we will work are as follows:
- Natural Language Processing
- Text Analysis
- Machine Learning
- Sentiment Scores to the Entities
- Topics
- Themes
- Categories within a Sentence or Phrase
- Deep Learning
- Mining of Unstructured Data for Opinion and Emotion
- Deep Learning Algorithms
- Data Mining
- Mining/Filtering of Data as per user requirement
- Data Mining Algorithms on Big Data
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| RAM | Equipment | 3 | 5000 | 15000 |
| SSD | Equipment | 3 | 6000 | 18000 |
| Battery | Equipment | 2 | 3000 | 6000 |
| Mouse | Equipment | 3 | 2000 | 6000 |
| Printing (Proposal, Mid and Final Report, Log Book Mid and Final) | Miscellaneous | 1 | 5000 | 5000 |
| Internet Connectivity | Miscellaneous | 1 | 3000 | 3000 |
| Data | Miscellaneous | 1 | 2000 | 2000 |
| Total (in Rs) | | | | 55000 |