SummItUp Content Summary Generator
2025-06-28 16:36:11 - Adil Khan
Project Area of Specialization
Artificial Intelligence
Project Summary
In today’s day and age, information is in excess and time is limited. SUMM-IT-UP is a web application that generates accurate, coherent, and concise extractive summaries of single or multiple documents, without domain restrictions, based on concept extraction using an ensemble of Natural Language Processing techniques. We achieved accuracy much higher than state-of-the-art approaches in multi-document summarization and comparable results in single-document summarization.
Project Objectives
This project was developed with the problem of excess information and limited time to process it in mind. The aim was to create an application that provides users with reasonably accurate extractive summaries almost instantaneously. The model also had to be lightweight enough to deploy on the web.
Project Implementation Method
The project consisted of two modules: the summarization model and the web app. Both were written with Python as the primary programming language. Bootstrap, Flask, and vanilla JavaScript were used to develop the web app and integrate it with the summarization model.
The model was developed with a focus on the content and context of the data. It consists of the following submodules: data cleaning and preprocessing, coreference resolution, sentence encoding, clustering, ranking, and assembly of the selected sentences. An ensemble of NLP tools and techniques was used to build it, including libraries such as TensorFlow, scikit-learn, gensim, and NLTK.
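As an illustration of the first stage only (data cleaning and preprocessing), here is a minimal sketch. The function names `clean_text` and `split_sentences` and the specific cleaning rules are assumptions made for this example, not the project's actual code. The project lists NLTK among its libraries; this sketch uses only Python's standard library as a stand-in so that it runs without extra dependencies.

```python
import html
import re


def clean_text(raw):
    """Remove markup remnants and collapse whitespace (hypothetical rules)."""
    text = re.sub(r"<[^>]+>", " ", raw)   # drop HTML-like tags
    text = html.unescape(text)            # decode entities such as &amp;
    text = re.sub(r"\s+", " ", text)      # collapse runs of whitespace
    return text.strip()


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when the next
    non-space character is an uppercase letter or a digit."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z0-9])", text)
    return [p for p in parts if p]
```

Calling `split_sentences(clean_text(raw_document))` yields a list of sentences that later stages can encode and cluster. A regex splitter like this mishandles abbreviations such as "Dr. Smith", which is one reason a trained tokenizer is usually preferred in practice.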
Benefits of the Project
This project will prove to be very beneficial in all walks of life. Regardless of the user's profession, the application can summarize any content into a concise version of the original document, saving time, effort, and resources.
Technical Details of Final Deliverable
The project can be divided into two modules: the summarization model and the web app. Python was the primary programming language for both modules.
The model was developed focusing on the content and context of the data. It consists of the following sub modules:
- Data Cleaning and Preprocessing
- Coreference Resolution
- Sentence Embeddings Generation: For this, a skip-thoughts encoder and decoders were trained on a 2016 Wikipedia articles dump.
- Clustering
- Ranking based on contextual similarity and Assembling of selected sentences
An ensemble of NLP tools and techniques was used to build it, including libraries such as TensorFlow, scikit-learn, gensim, and NLTK.
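The embed, cluster, rank, and assemble steps listed above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the project's implementation: the skip-thoughts encoder is replaced by a plain bag-of-words count vector, the clustering is a small hand-written k-means variant using cosine similarity with evenly spaced seeds, and every function name (`embed`, `cluster`, `summarize`, and so on) is hypothetical. Coreference resolution is omitted.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Stand-in encoder: bag-of-words counts (NOT skip-thoughts)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Element-wise mean of a group of sparse vectors."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return Counter({w: c / len(vectors) for w, c in total.items()})


def cluster(vectors, k, iterations=10):
    """Tiny k-means variant; seeds are evenly spaced input vectors."""
    n = len(vectors)
    k = max(1, min(k, n))
    centers = [vectors[i * n // k] for i in range(k)]
    assignment = [0] * n
    for _ in range(iterations):
        # Assign each sentence to its most similar cluster center.
        assignment = [
            max(range(k), key=lambda c: cosine(vec, centers[c]))
            for vec in vectors
        ]
        # Recompute each center as the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centers[c] = centroid(members)
    return assignment, centers


def summarize(sentences, k=2):
    """Keep the sentence closest to each cluster center, in original order."""
    if not sentences:
        return []
    vectors = [embed(s) for s in sentences]
    assignment, centers = cluster(vectors, k)
    chosen = set()
    for c in range(len(centers)):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            chosen.add(max(members, key=lambda i: cosine(vectors[i], centers[c])))
    return [sentences[i] for i in sorted(chosen)]
```

Here `k` controls the summary length: one sentence is kept per cluster, so a larger `k` gives a longer summary. Reassembling the picks in their original order is a simple way to keep the extract readable.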
Final Deliverable of the Project
Software System
Type of Industry
IT
Technologies
Artificial Intelligence (AI)
Sustainable Development Goals
Decent Work and Economic Growth
Required Resources
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Printing | Miscellaneous | 300 | 10 | 3000 |
| Poster | Miscellaneous | 1 | 2000 | 2000 |
| GPU For Training | Equipment | 1 | 60000 | 60000 |
| Total (in Rs) | | | | 65000 |