Understanding and Detecting Fake Reviews of Google Play Store Applications
2025-06-28 16:29:53 - Adil Khan
Project Area of Specialization: Artificial Intelligence
Project Summary
Many companies are moving their businesses online as customers increasingly prefer to shop online. Users share a vast amount of information about products, which makes it difficult for other users to decide what to trust. As with other products bought on the Internet, users read through the reviews before deciding to download an app. Some companies use fake reviews to raise the ratings of their applications. In this project, we aim to identify and understand fake reviews.
The Google Play Store contains a growing amount of user feedback in the form of app ratings and reviews. Researchers, and more recently tool vendors, have proposed analytics and data mining solutions that make this feedback usable for developers and analysts, e.g., to support release decisions. Research has also shown that positive feedback improves an app's downloads and sales figures, and thus its success. As a side effect, a market for fake, incentivized app reviews has emerged, with as yet unclear consequences for developers, app users, and app store operators.
Fake reviews threaten the integrity of the Google Play Store. If real users do not trust the reviews, they will probably refrain from reading and writing reviews themselves. This can become a problem for app store operators, as app reviews are a central concept of the app store ecosystem. Fake reviews can have negative implications for app developers and analysts as well. Numerous software and requirements engineering researchers have studied app reviews, e.g., to derive useful development information such as bug reports or to understand and steer the dialogue between users and developers. Further, researchers and more recently tool vendors have suggested tools that derive actionable information for software teams from reviews, such as release priorities and app feature co-existence. None of these works considers fake reviews and their implications. Negative fake reviews, e.g., by competitors reporting false issues, can cause confusion and waste developers' time. Positive fake reviews can lead to wrong insights about real users' needs and requirements. In our project, we therefore aim to study fake reviews, their providers, their characteristics, and how well they can be detected automatically.
1. To find which properties of the corresponding app and reviewer are most useful for determining whether a review is fake.
2. To identify differences between fake and official reviews.
3. To develop, fine-tune, and compare multiple supervised machine learning approaches for detecting fake reviews.
In our project, we will follow a mixed-method approach.
To start, we will adopt the model proposed in [5] for studying fake reviews in the App Store. We will apply the same model to the Google Play Store to identify fake reviews of its apps. The adopted model consists of three phases: the Data Collection Phase, the Data Preparation Phase, and the Data Analysis Phase.
Phase 1: Data Collection Phase
For this study, we will collect two datasets: an official reviews dataset, containing app metadata and reviews from the Google Play Store, and a fake reviews dataset, containing metadata of apps affected by fake reviews and the fake reviews themselves.
• Official reviews dataset
We will use google-play-scraper, a Python library that provides APIs to crawl the Google Play Store without any external dependencies. By analyzing the text of user reviews, we can gain insight into what people like and don't like about an app. Fields of Natural Language Processing (NLP) such as sentiment analysis and topic modeling can help with this, but not if we don't have any reviews to analyze!
Before we get ahead of ourselves, we need to scrape and store some reviews.
We will use the google-play-scraper APIs to crawl the Google Play Store in the following steps:
Step 1: Obtain App IDs
Step 2: Installs and Imports
Step 3: Scrape the Reviews
Step 4: Put the Reviews into a CSV file
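The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function names, the selected columns, the language and country settings, and the review count are all assumptions, and the third-party import is kept inside the scraping function so that the CSV helper can be used without it.

```python
import csv

# Columns kept from each scraped review (an assumed selection). The key
# names follow the dictionaries returned by google-play-scraper's
# `reviews` function; `appId` is added by scrape_reviews below.
FIELDS = ["appId", "reviewId", "userName", "score", "content", "at"]


def scrape_reviews(app_id, count=200):
    """Step 3: fetch up to `count` of the newest reviews for one app."""
    # Step 2: third-party import (pip install google-play-scraper).
    from google_play_scraper import Sort, reviews

    result, _token = reviews(
        app_id, lang="en", country="us", sort=Sort.NEWEST, count=count
    )
    # Tag each review with the app it belongs to (Step 1 supplies the ID).
    return [dict(r, appId=app_id) for r in result]


def write_reviews_csv(rows, path):
    """Step 4: write the selected fields of each review to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        # extrasaction="ignore" drops any keys that are not in FIELDS.
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A call such as `write_reviews_csv(scrape_reviews("com.example.app"), "official_reviews.csv")` would then cover Steps 3 and 4 for one app; the app ID and file name here are placeholders.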
• Fake reviews dataset
We will create the fake reviews dataset in two steps.
• First, we will identify fake review providers by performing a structured manual Google web search.
• Second, we will conduct a disguised questionnaire to collect initial indicators of fake reviews, such as the minimum star rating and length.
Phase 2: Data Preparation Phase
In this phase, we will clean the data, for example by removing records with missing fields.
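A cleaning pass of this kind might look like the sketch below. Only the removal of records with missing fields comes from the description above; the rating range check, the duplicate removal, and the field names `content`, `score`, and `reviewId` are assumptions made for illustration.

```python
def clean_reviews(rows):
    """Drop reviews with missing text or rating, and repeated reviews."""
    cleaned, seen = [], set()
    for row in rows:
        text = (row.get("content") or "").strip()
        score = row.get("score")
        # Skip rows whose review text or star rating is missing.
        if not text or score in (None, ""):
            continue
        # Skip rows whose rating cannot be read as an integer from 1 to 5.
        try:
            score = int(score)
        except (TypeError, ValueError):
            continue
        if not 1 <= score <= 5:
            continue
        # Skip repeats of the same review ID (or the same text if no ID).
        key = row.get("reviewId") or text.lower()
        if key in seen:
            continue
        seen.add(key)
        # Keep a copy with trimmed text and a numeric rating.
        cleaned.append(dict(row, content=text, score=score))
    return cleaned
```

The function returns new dictionaries rather than modifying its input, so the raw scraped data stays available for comparison.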
Phase 3: Data Analysis Phase (Automatic Classification)
We will use machine learning and artificial intelligence techniques to answer the research question and identify fake reviews automatically.
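The proposal does not name a specific algorithm. As one possible baseline, the sketch below shows a small multinomial naive Bayes text classifier written with the standard library only. The choice of naive Bayes, the tokenizer, and the class and method names are assumptions; in practice an established machine learning library would likely be used, and several classifiers would be compared.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a review and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        # `texts` and `labels` are parallel lists of equal length.
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of reviews
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log P(label) plus the sum of log P(word | label).
            score = math.log(n / total)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words unseen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Training would use the two datasets from Phase 1, with labels such as "fake" and "official", via `NaiveBayes().fit(texts, labels)`, after which `predict(review_text)` returns the more likely label for a new review.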
Users will be able to trust the reviews of the apps they download.
Apps will not lose growth to competitors that use fake reviews.
Technical Details of Final Deliverable
An application that can detect the fake reviews among an app's reviews.
Final Deliverable of the Project: Software System
Core Industry: IT
Other Industries:
Core Technology: Artificial Intelligence (AI)
Other Technologies:
Sustainable Development Goals: Decent Work and Economic Growth
Required Resources
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Laptop | Equipment | 1 | 70000 | 70000 |
| Total (in Rs) | | | | 70000 |