Sentiment Analysis on Product Reviews: Amazon
Our project is based on machine learning. We will analyze the sentiment that users express in Amazon product reviews, which is a natural language processing task. We will obtain labelled data with two fields: the review comment and its label (positive or negative). We will preprocess the data, apply feature extraction, and convert the words into a vector representation. We will then train our model, evaluate it, and select the best parameters for it. Next, we will integrate the model into a web platform where the user enters the URL of any product. Our web scraping module will fetch the reviews of that product, our trained model will predict the sentiment of each review, and the platform will show the result to the user. In this project, we make sense of the sentiment expressed in the product review section of Amazon. Our model will show sellers how customers feel about their product, so that they can improve the aspects that need improvement or find ways to make the product better for customers. From the customer's point of view, the platform shows the general sentiment towards a product.
The key deliverables of each phase are:
Phase I (Machine Learning Model)
First, we have to download a labelled dataset from the internet so that our learning is supervised. Data can be gathered from numerous sources, but we chose a 1.6 GB dataset that is not restricted to a single product category. It is general-purpose data with more than 10 lakh (1,000,000) entries in the training set and around 5 lakh (500,000) in the test set. The data is available at https://figshare.com/articles/dataset/Amazon_Review_Polarity/13232501/1
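As a rough illustration of the loading step, the sketch below reads labelled reviews from a CSV stream. It assumes the file has no header row and three columns (class index, title, body), with class 1 meaning negative and 2 meaning positive; this layout, the function name `read_reviews`, and the in-memory sample are assumptions for illustration and should be checked against the downloaded files.

```python
import csv
import io

# Assumed label encoding: "1" = negative, "2" = positive.
LABELS = {"1": "negative", "2": "positive"}

def read_reviews(handle):
    """Yield (review_text, label) pairs from a header-less CSV stream.

    Assumed column order: class index, title, body.
    """
    for row in csv.reader(handle):
        if len(row) != 3 or row[0] not in LABELS:
            continue  # skip malformed rows
        class_index, title, body = row
        yield f"{title} {body}".strip(), LABELS[class_index]

# Tiny in-memory sample standing in for the real training file.
sample = io.StringIO('"2","Great","Works well"\n"1","Bad","Broke in a week"\n')
pairs = list(read_reviews(sample))
```

Reading the file as a stream rather than loading it whole keeps memory use low, which matters for a dataset of this size.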
Now that we have downloaded the data, the next step is to clean it as effectively as possible. Cleaning the data is not a simple step; it requires time and effort. Some feature extraction also takes place in this step, which in our case means building the vocabulary of words.
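A minimal sketch of cleaning and vocabulary building is shown below. The specific cleaning rules (lowercasing, dropping HTML tags and non-letter characters), the reserved ids 0 and 1, and the names `clean_text` and `build_vocabulary` are assumptions for illustration, not the project's final pipeline.

```python
import re
from collections import Counter

def clean_text(text):
    """Lowercase, strip HTML tags and non-letter characters, collapse spaces."""
    text = re.sub(r"<[^>]+>", " ", text.lower())
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def build_vocabulary(texts, max_words=20000):
    """Map the most frequent words to integer ids.

    Ids 0 and 1 are left free for padding and unknown words (an assumption).
    """
    counts = Counter(word for text in texts for word in clean_text(text).split())
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}

vocab = build_vocabulary(["Great product!<br>Works great.", "Not great"])
```

Capping the vocabulary at the most frequent words keeps the later embedding table at a manageable size.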
We could use any pre-trained model, but we will use Google's word2vec model. According to our research, it is well trained compared with the alternatives, it is a good source for transfer learning, and it is reported to be robust and to perform well on a wide range of natural language processing datasets. Loading this much data into the model will require high computation and processing power.
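One way to use pre-trained vectors is to copy them into an embedding matrix indexed by word id. The sketch below assumes a vocabulary that maps words to ids starting at 2 and a `vectors` object that supports `word in vectors` and `vectors[word]`; a plain dict is used here as a stand-in for the real word2vec model, and all names and toy values are hypothetical.

```python
import random

def build_embedding_matrix(vocab, vectors, dim, seed=0):
    """Return one row per word id; rows 0 (padding) and 1 (unknown) stay zero.

    `vectors` is any mapping-like object with `in` and `[]` support.
    A plain dict is used below; a loaded word2vec model is assumed to
    behave the same way.
    """
    rng = random.Random(seed)
    matrix = [[0.0] * dim for _ in range(max(vocab.values()) + 1)]
    for word, index in vocab.items():
        if word in vectors:
            matrix[index] = list(vectors[word])
        else:
            # Words missing from the pre-trained model get small random vectors.
            matrix[index] = [rng.uniform(-0.05, 0.05) for _ in range(dim)]
    return matrix

# Toy 3-dimensional vectors; real pre-trained vectors are much larger.
toy_vectors = {"great": [0.9, 0.1, 0.0], "bad": [-0.8, 0.2, 0.1]}
toy_vocab = {"great": 2, "bad": 3, "gizmo": 4}
embedding = build_embedding_matrix(toy_vocab, toy_vectors, dim=3)
```

Only the rows for words in our own vocabulary are kept, so the full pre-trained model does not have to stay in memory during training.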
The next step is model selection. We have decided to use a recurrent neural network (RNN).
The data will then be divided into two parts, train and test: 80% will be used for training and 20% for testing.
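The 80/20 split can be sketched as follows. The shuffle, the fixed seed, and the function name `train_test_split` are assumptions for illustration.

```python
import random

def train_test_split(samples, test_fraction=0.2, seed=42):
    """Shuffle the samples and return (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

train, test = train_test_split(range(10))
```

Fixing the seed makes the split reproducible, so that different models are compared on the same test reviews.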
The trained model will be saved on the local machine.
Phase II (Web Scraping Model)
2.1 Web Scraping
We have to scrape the review data from Amazon.com. We will write a program that queries the web server, requests and retrieves the pages, parses them to extract the reviews, and stores them in a CSV file.
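The parse-and-store part of this step might look like the sketch below. It uses Python's standard `html.parser` as a stand-in for a scraping library, skips the network request entirely, and assumes each review body sits in an element marked `data-hook="review-body"`; that attribute, the sample markup, and the names `ReviewParser` and `reviews_to_csv` are assumptions that must be verified against the live pages.

```python
import csv
import io
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

class ReviewParser(HTMLParser):
    """Collect the text of every element marked data-hook="review-body"."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._depth = 0    # > 0 while inside a review element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            if self._depth:
                self._chunks.append(" ")  # treat line breaks as spaces
            return
        if self._depth:
            self._depth += 1
        elif ("data-hook", "review-body") in attrs:
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.reviews.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)

def reviews_to_csv(html, handle):
    """Parse review texts out of `html` and write them as a one-column CSV."""
    parser = ReviewParser()
    parser.feed(html)
    parser.close()
    writer = csv.writer(handle)
    writer.writerow(["review"])
    writer.writerows([review] for review in parser.reviews)
    return parser.reviews

# Hypothetical page fragment standing in for a downloaded review page.
page = (
    '<div><span data-hook="review-body"><span>Great phone,<br>battery lasts.</span></span>\n'
    '<span data-hook="review-body"><span>Stopped working.</span></span></div>'
)
out = io.StringIO()
reviews = reviews_to_csv(page, out)
```

Keeping the parser separate from the download code means it can be tested on saved pages without sending requests to the site.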
2.2 Preprocessing
After the data has been scraped, the next step is to clean it. Any entries with missing values are removed. Scraped data may be unstructured or otherwise unsuitable for the model, so it is cleaned and checked before being fed into the model. The cleaned data is saved as a CSV file.
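A minimal sketch of this cleaning pass is given below. It assumes the scraped CSV has a single `review` column and that duplicate reviews should be dropped along with empty ones; the column name, the de-duplication rule, and the function name `clean_rows` are assumptions for illustration.

```python
import csv
import io

def clean_rows(handle, column="review"):
    """Return de-duplicated, non-empty review strings from a CSV stream."""
    seen, cleaned = set(), []
    for row in csv.DictReader(handle):
        # Collapse runs of whitespace; a missing value becomes an empty string.
        text = " ".join((row.get(column) or "").split())
        if not text or text.lower() in seen:
            continue
        seen.add(text.lower())
        cleaned.append(text)
    return cleaned

# Sample with an empty value and a case-insensitive duplicate.
raw = io.StringIO('review\nGood  value\n""\ngood value\nToo small\n')
cleaned = clean_rows(raw)
```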
Phase III (Website Building and ML Model Integration)
3.1 Feed the scraped CSV file to the saved model as input
3.2 Dashboard creation using Flask or Django, and website building
Tools and techniques
Python, NumPy, Pandas, JupyterLab, TensorFlow, Beautiful Soup, Flask, web hosting, and a workstation with high computation and processing power and enough memory and storage to save and run the model
Using review data scraped in real time from the Amazon product review section, we analyse the sentiment of the reviews and present the outcome as a well-designed dashboard showing the percentage of positive, negative, and neutral reviews. We will also observe the potential impact of reviews on selling trends. The RNN model is well suited to making predictions on sequential data, such as time series and text. We will also help sellers improve the aspects of the product that buyers do not like.
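The percentages shown on the dashboard could be computed from the model's per-review predictions roughly as follows. The label strings, the rounding to one decimal place, and the function name `sentiment_percentages` are assumptions for illustration.

```python
from collections import Counter

def sentiment_percentages(predictions, labels=("positive", "negative", "neutral")):
    """Return the share of each label as a percentage of all labelled reviews."""
    counts = Counter(predictions)
    total = sum(counts[label] for label in labels)
    if total == 0:
        return {label: 0.0 for label in labels}
    return {label: round(100.0 * counts[label] / total, 1) for label in labels}

summary = sentiment_percentages(["positive", "positive", "negative", "neutral"])
```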
Our model will advise both the buyer and the seller of a product. For instance, if a product has many positive reviews, our model will advise the buyer to buy it, and it will not recommend the product if it has many negative reviews.
The seller, on the other hand, will be advised which areas of the product or service to improve in order to increase sales and meet customer expectations.
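The buyer advice could be reduced to a simple rule on the review percentages, as in the sketch below. The 20-point margin, the extra "mixed reviews" outcome, and the function name `advise_buyer` are assumptions for illustration; the project does not fix a threshold.

```python
def advise_buyer(positive_pct, negative_pct, margin=20.0):
    """Turn review percentages into a simple buying recommendation."""
    if positive_pct - negative_pct >= margin:
        return "recommended"
    if negative_pct - positive_pct >= margin:
        return "not recommended"
    return "mixed reviews"

advice = advise_buyer(positive_pct=80.0, negative_pct=10.0)
```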
Our web app will use a recurrent neural network (RNN) as the machine learning model at the back end. Of the many models available, RNNs are expected to give us better prediction results. A distinctive feature of NLP data is its temporal aspect: each word in a sentence depends greatly on its context. To account for this dependency, we use a recurrent neural network.
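To make the context dependency concrete, the sketch below runs the forward pass of a simple one-layer (Elman-style) RNN in plain Python. The hidden state carries information from earlier words to later ones, so the same words in a different order give a different prediction. The weights are hand-picked and untrained, and the function name and toy word vectors are hypothetical; the actual model would be built and trained with a deep learning library.

```python
import math

def rnn_forward(inputs, w_x, w_h, b, w_out, b_out):
    """Run a one-layer Elman RNN over a sequence and return P(positive).

    inputs: list of input vectors, one per word
    w_x:    hidden-by-input weights,  w_h: hidden-by-hidden weights
    b:      hidden bias,  w_out: output weights,  b_out: output bias
    """
    hidden = [0.0] * len(b)
    for x in inputs:
        # New state depends on the current word AND the previous state.
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_x[j], x))
                + sum(w * hi for w, hi in zip(w_h[j], hidden))
                + b[j]
            )
            for j in range(len(b))
        ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))

# Toy 2-dimensional "word vectors" and untrained, hand-picked weights.
word_a, word_b = [1.0, 0.0], [0.0, 1.0]
w_x = [[1.0, 0.0], [0.0, 1.0]]
w_h = [[0.5, 0.0], [0.0, 0.5]]
bias, w_out, b_out = [0.0, 0.0], [1.0, -1.0], 0.0

p_ab = rnn_forward([word_a, word_b], w_x, w_h, bias, w_out, b_out)
p_ba = rnn_forward([word_b, word_a], w_x, w_h, bias, w_out, b_out)
```

Here `p_ab` and `p_ba` differ even though both sequences contain the same two words, which a bag-of-words model could not distinguish.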
The purpose of the project is to optimize the performance of a product in the global market and to drive traffic to the seller's Amazon listing by introducing a state-of-the-art sentiment analysis model for the product reviews section.
The main driving force behind our end product will be the web scraping model, which will scrape and extract the reviews of our client's product dynamically whenever a new review is added to the product reviews section. We will use Python's most powerful web scraping libraries, such as Selenium, Beautiful Soup 4, and Scrapy. By scraping product reviews from Amazon and labelling them as positive, negative, or neutral, we will have the data needed to train our main sentiment analysis model. Based on this training data, the model will predict the overall sentiment of a product's reviews. The final deliverable will be a full-fledged graphical user interface integrated with our trained machine learning model, available to professional Amazon sellers. A seller will enter the URL of their Amazon product; the system will scrape the reviews, predict the product sentiment, and deliver a detailed report to help them sell their product better on Amazon. This ready-to-use system is the most vital deliverable of the project.
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| 512 GB SSD | Equipment | 1 | 15000 | 15000 |
| Intel Core i7 8th Gen Processor | Equipment | 1 | 45000 | 45000 |
| 16 GB RAM DDR4 3200 MHz | Equipment | 1 | 10000 | 10000 |
| Web Hosting | Miscellaneous | 1 | 5000 | 5000 |
| Colab paid cloud functions | Miscellaneous | 1 | 5000 | 5000 |
| Total (in Rs) | | | | 80000 |