Twitter event detection and analysis system
2025-06-28 16:29:52 - Adil Khan
Project Area of Specialization: Computer Science
Project Summary
News is often already trending on social media sites such as Twitter before it airs on TV channels. Witnesses to events post what they see online, and news channels pick up those posts afterwards. However, the volume of posts is too large to filter for newsworthy text by hand, which creates the need for software that automatically and efficiently streams through tweets and detects those that correspond to an event. This project aims to develop a news event detection system that detects newsworthy events from tweets.
Project Objectives
This research project addresses the problem of event detection from tweets. Efficient techniques are planned to filter noisy tweets out of the large Twitter data stream, followed by machine learning and natural language processing (NLP) algorithms for event detection.
An event is described by “what”, “who”, and “where”; therefore, our similarity function will be based on shared hashtags, nouns, entities, and content words. For evaluation, a Twitter events dataset available for download through Google Dataset Search is used. Since the dataset consists only of tweet IDs, the Twitter API is used to retrieve the remaining metadata for the corresponding tweets.
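A minimal sketch of such a similarity function is given here for illustration. It is not the project's implementation: the feature extractor, the stopword list, and the weights are all hypothetical, and capitalized tokens stand in for named entities because no NER model is assumed.

```python
import re

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "in", "on", "at", "of", "to", "is", "and", "for"}

# Hypothetical weights for each feature type (they sum to 1).
WEIGHTS = {"hashtags": 0.4, "entities": 0.4, "words": 0.2}


def extract_features(text):
    """Split a tweet into hashtags, crude entity candidates, and content words."""
    hashtags = {h.lower() for h in re.findall(r"#(\w+)", text)}
    # Assumption: capitalized tokens approximate named entities.
    entities = {w.lower() for w in re.findall(r"\b[A-Z][a-z]+\b", text)}
    words = {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}
    return {"hashtags": hashtags, "entities": entities, "words": words}


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tweet_similarity(t1, t2):
    """Weighted sum of per-feature Jaccard overlaps, in the range [0, 1]."""
    f1, f2 = extract_features(t1), extract_features(t2)
    return sum(w * jaccard(f1[k], f2[k]) for k, w in WEIGHTS.items())
```

Two tweets about the same event should score higher than two unrelated tweets; the weights would need tuning against the evaluation dataset.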
Project Implementation Method
This project lists the algorithms and techniques used to detect tweets that refer to news events on social media and then generate headlines from them. Most of the techniques developed in this research will be based on the Reuters Tracer paper on real-time event detection. The implementation method includes:
Fetch Tweets Using Twitter API
Clustering
Classification
Categorization
Summarization
Benefits of the Project
The system provides fast detection of events. It takes a large number of tweets as input and detects newsworthy events in them. Because the model requires many tweets, the Twitter Consumer API is used to extract them. After extraction, a dynamic threshold-based clustering algorithm groups the tweets into the events they refer to. Summarization, newsworthiness ranking, and categorization are then applied to describe the events and prepare the clusters for final headline generation. The system is intended to benefit news agencies and media channels that want to generate authentic, exclusive headlines.
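The clustering step can be sketched as a single pass over the stream, shown here only as an illustration. The proposal does not specify how the threshold changes, so the rule in `threshold_for` (larger clusters demand a closer match) and its parameters are assumptions, as is the token-overlap similarity.

```python
import re


def tokens(text):
    """Lowercased word tokens of a tweet (the '#' of a hashtag is dropped)."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def threshold_for(cluster, base=0.15, step=0.01, cap=0.3):
    """Hypothetical dynamic rule: the threshold rises with cluster size, up to a cap."""
    return min(cap, base + step * (len(cluster["tweets"]) - 1))


def cluster_stream(tweets):
    """Assign each tweet to its most similar cluster, or start a new one."""
    clusters = []
    for text in tweets:
        toks = tokens(text)
        best, best_sim = None, 0.0
        for c in clusters:
            sim = jaccard(toks, c["vocab"])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold_for(best):
            best["tweets"].append(text)
            best["vocab"] |= toks
        else:
            clusters.append({"tweets": [text], "vocab": set(toks)})
    return clusters


# Made-up sample tweets, for illustration only.
sample = [
    "Earthquake shakes Islamabad this morning",
    "Strong earthquake felt in Islamabad",
    "New smartphone launch event today",
    "Islamabad earthquake damages buildings",
    "Smartphone launch draws big crowd today",
]
clusters = cluster_stream(sample)
for c in clusters:
    print(len(c["tweets"]), c["tweets"][0])
```

On this sample the earthquake tweets and the smartphone tweets end up in separate clusters. Comparing each tweet against a cluster's accumulated vocabulary keeps the pass cheap, at the cost of diluting similarity as a cluster grows.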
Technical Details of Final Deliverable
K-means algorithms will be used to classify, cluster, and summarize the data.
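As a rough illustration of the clustering part only, the sketch below runs K-means on bag-of-words vectors in plain Python. The vectorization and the deterministic farthest-first seeding are assumptions made to keep the example small and reproducible; the proposal does not state how tweets are vectorized or how centroids are initialized.

```python
import re
from collections import Counter


def vectorize(texts):
    """Bag-of-words count vectors over a shared, sorted vocabulary."""
    counts = [Counter(re.findall(r"\w+", t.lower())) for t in texts]
    vocab = sorted(set().union(*counts))
    return [[c[w] for w in vocab] for c in counts]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(points, k, iterations=20):
    """Return one cluster label per point."""
    # Assumption: farthest-first seeding instead of random initialization.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    return labels


# Made-up sample tweets, for illustration only.
texts = [
    "earthquake hits islamabad",
    "earthquake shakes islamabad city",
    "smartphone launch today",
    "smartphone launch event today",
]
labels = kmeans(vectorize(texts), k=2)
print(labels)
```

K-means needs the number of clusters `k` up front, which is hard to know for a live tweet stream; that is one reason a threshold-based method may be used for the initial grouping.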
Final Deliverable of the Project: Software System
Core Industry: Media
Other Industries:
Core Technology: Big Data
Other Technologies:
Sustainable Development Goals: Partnerships to achieve the Goal
Required Resources: