Analysis on Twitter Data using Big Data Analytics and Machine Learning Techniques
2025-06-28 16:30:14 - Adil Khan
Project Area of Specialization: Cloud Infrastructure
Project Summary
Every second, on average, around 6,000 tweets are posted on Twitter, which corresponds to over 350,000 tweets per minute, 500 million tweets per day, and around 200 billion tweets per year. Twitter made its live data publicly available so that different processing techniques could be applied to it for fast processing. We fetch live streaming tweets for a specific hashtag (e.g. #Imrankhan, #NayaPakistan) every hour using the Tweepy API and pass them through the main Big Data analytics tools, i.e. Apache Pig, Apache Hive, Apache Spark, and others. We also perform sentiment analysis on those tweets, i.e. we determine whether the tweets about that hashtag are positive, negative, or neutral, with the help of the TextBlob, SentiWordNet, and W-WSD algorithms, and we determine popular hashtags using the Hadoop and Spark platforms.
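To make the three-way classification concrete, the following is a minimal sketch of how a polarity score can be turned into a positive, negative, or neutral label. It is not TextBlob, SentiWordNet, or W-WSD: the tiny word lists, the `polarity` and `label` function names, and the 0.1 threshold are all assumptions made for illustration, standing in for whatever score the chosen algorithm produces.

```python
import re

# Toy lexicons, invented for illustration only; real tools ship far larger ones.
POSITIVE_WORDS = {"good", "great", "love", "hope", "win"}
NEGATIVE_WORDS = {"bad", "worst", "hate", "fail", "corrupt"}


def polarity(text: str) -> float:
    """Return a score in [-1.0, 1.0]: (positive hits - negative hits) / all hits."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(word in POSITIVE_WORDS for word in words)
    neg = sum(word in NEGATIVE_WORDS for word in words)
    hits = pos + neg
    return 0.0 if hits == 0 else (pos - neg) / hits


def label(score: float, threshold: float = 0.1) -> str:
    """Map a polarity score to one of the three classes named in the summary."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Example tweets are made up for this sketch.
    for tweet in [
        "Great speech, love it #NayaPakistan",
        "Worst decision ever #NayaPakistan",
        "Rally starts at 5 pm #NayaPakistan",
    ]:
        print(label(polarity(tweet)), "|", tweet)
```

In a real pipeline the `polarity` function would presumably be replaced by the score from one of the three algorithms, while the thresholding step in `label` could stay the same so that the three results remain comparable.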
Project Objectives
- Big Data analytics on Twitter live streaming data, fetched with the Tweepy API and passed through the main tools of Hadoop as well as Spark.
- Sentiment analysis using three different algorithms, i.e. TextBlob, SentiWordNet, and W-WSD.
- Determining popular hashtags.
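The last objective, determining popular hashtags, can be sketched on a single machine as follows. This is only an illustration using the Python standard library, not the Hadoop or Spark job itself; the function name `popular_hashtags`, the case-insensitive matching, and the rule that a tag counts once per tweet are assumptions.

```python
import re
from collections import Counter

# A hashtag is taken to be '#' followed by letters, digits, or underscores.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def popular_hashtags(tweets, top_n=3):
    """Return the top_n (hashtag, tweet_count) pairs, most frequent first."""
    counts = Counter()
    for text in tweets:
        # Lower-case so #ImranKhan and #imrankhan count as the same tag;
        # the set ensures a tag repeated inside one tweet is counted once.
        counts.update({tag.lower() for tag in HASHTAG_PATTERN.findall(text)})
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Example tweets are made up for this sketch.
    sample = [
        "#ImranKhan speaks about #NayaPakistan",
        "#imrankhan again",
        "#NayaPakistan #imrankhan #cricket",
    ]
    print(popular_hashtags(sample, top_n=2))
```

On a cluster the same idea would typically be expressed as a map step that emits each hashtag and a reduce step that sums the counts.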
In this project we use the waterfall model, also referred to as the linear-sequential life cycle model. It is simple to understand and use. In a waterfall model, each phase must be completed before the next phase can begin, and the phases do not overlap. Our requirements are static, that is, we know our requirements

- We get a clear overview of the sentiment of the tweets, which helps us analyze the overall result.
- We get the words-per-hour count and the popular hashtags through Hadoop, which shows how Big Data techniques can solve very large problems in minutes.
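The words-per-hour count can be pictured as a small MapReduce-style computation. The sketch below runs in plain Python rather than on Hadoop, and it reads "words per hour" as the count of each word within each hour, which is an assumption; the `(timestamp, text)` record layout and the function names `map_phase` and `reduce_phase` are likewise hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def map_phase(records):
    """Emit ((hour, word), 1) pairs, in the style of a MapReduce mapper."""
    for timestamp, text in records:
        # Truncate the ISO 8601 timestamp to the start of its hour.
        hour = datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:00")
        for word in text.lower().split():
            yield (hour, word), 1


def reduce_phase(pairs):
    """Sum the counts for each (hour, word) key, in the style of a reducer."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


if __name__ == "__main__":
    # Example records are made up for this sketch.
    records = [
        ("2019-03-01T10:15:00", "naya pakistan"),
        ("2019-03-01T10:45:00", "pakistan zindabad"),
        ("2019-03-01T11:05:00", "pakistan"),
    ]
    for (hour, word), count in sorted(reduce_phase(map_phase(records)).items()):
        print(hour, word, count)
```

Keying on the pair `(hour, word)` keeps each hourly batch separate, so the same reducer logic works whether one hour or many hours of tweets are processed together.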
After all the processing of Twitter's live streaming data with Big Data analytics, the results will be presented as a dashboard in the MicroStrategy Desktop application. The dashboard will show a graph of the words-per-hour count, three sentiment analysis graphs (one for each algorithm used), and a popular-hashtag graph for the specific hashtag.
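One plausible way to hand results to a dashboard tool is a flat CSV file, assuming the tool can load such files. The sketch below writes per-algorithm sentiment counts in that form; the column names, the `write_counts_csv` function, and the sample numbers are all invented placeholders, not results from the project.

```python
import csv
import io


def write_counts_csv(rows, handle):
    """Write (algorithm, sentiment, tweet_count) rows with a header line."""
    writer = csv.writer(handle)
    writer.writerow(["algorithm", "sentiment", "tweets"])
    for algorithm, sentiment, count in rows:
        writer.writerow([algorithm, sentiment, count])


if __name__ == "__main__":
    # Placeholder numbers, for illustration only.
    rows = [
        ("TextBlob", "positive", 12),
        ("TextBlob", "negative", 5),
        ("TextBlob", "neutral", 8),
    ]
    buffer = io.StringIO()  # swap for open("sentiment.csv", "w", newline="")
    write_counts_csv(rows, buffer)
    print(buffer.getvalue())
```

A long format with one row per algorithm and sentiment class makes it straightforward to draw one graph per algorithm from a single file.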
Final Deliverable of the Project: Software System
Type of Industry: IT
Technologies: Big Data
Sustainable Development Goals: Industry, Innovation and Infrastructure
Required Resources

| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Big Data Analytics Course | Miscellaneous | 2 | 5000 | 10000 |
| Total (in Rs) | | | | 10000 |