Document Clustering With Optimized Symbiotic Organism Search
Our final-year project (FYP), “Document Clustering with Optimized Symbiotic Organisms Search”, is a
research-based project that applies a powerful metaheuristic algorithm, Symbiotic Organisms Search (SOS),
to document clustering. The project's areas of specialization are Machine Learning, Data/Text Mining,
and Desktop Application.
Document clustering is a specialized clustering problem in which textual documents are automatically
partitioned into smaller, homogeneous sub-collections, each corresponding to an identifiable subject.
Identifying implicit textual patterns within the documents is challenging because a collection can
contain thousands of textual features.
We propose a soft computing approach to the document clustering problem. Symbiotic Organisms Search (SOS)
is a relatively new, simple, and powerful nature-inspired search algorithm that simulates the symbiotic
interactions organisms use to survive in an ecosystem. SOS uses three phases, Mutualism, Commensalism,
and Parasitism, to search for an optimal solution. SOS does not require any algorithm-specific
parameters, which removes the need for parameter tuning. The main goal of this project is to implement
the SOS algorithm and apply it to find the best clustering solution on the DOC50, NEWS20, Reuters, and
WEBKB data sets.
• In this research we intend to propose a document clustering algorithm using SOS by encoding k-means
solutions as individuals (organisms) in the ecosystem.
• We intend to design, implement, and test our proposed approach.
• We will use standard text mining datasets for our experimentation.
• We will evaluate our proposed approach using one internal evaluation metric (Silhouette) and one external
metric (Purity).
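As a rough illustration of the encoding and the external metric named above, the sketch below treats an organism as a flat vector of k centroids, assigns each document to its nearest centroid, and computes Purity. The function names (`decode_centroids`, `assign`, `purity`), the Euclidean distance, and the toy data are assumptions made for this example, not the project's actual implementation.

```python
from collections import Counter
import math


def decode_centroids(organism, k):
    """Split a flat vector of length k*d into k centroids of dimension d."""
    d = len(organism) // k
    return [organism[i * d:(i + 1) * d] for i in range(k)]


def assign(docs, centroids):
    """Assign each document vector to its nearest centroid (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(doc, centroids[c]))
        for doc in docs
    ]


def purity(labels_pred, labels_true):
    """Fraction of documents that fall in the majority true class of their cluster."""
    clusters = {}
    for p, t in zip(labels_pred, labels_true):
        clusters.setdefault(p, Counter())[t] += 1
    majority_total = sum(c.most_common(1)[0][1] for c in clusters.values())
    return majority_total / len(labels_true)


# Toy data (hypothetical): four 2-D "document vectors" and one organism
# that encodes k = 2 centroids, (0, 0) and (1, 1), as a flat vector.
docs = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
organism = [0.0, 0.0, 1.0, 1.0]
true_labels = ["sports", "sports", "politics", "sports"]

labels = assign(docs, decode_centroids(organism, k=2))
print(labels, purity(labels, true_labels))
```

In a real run the document vectors would come from a text representation such as term weights, and the fitness minimized by SOS would be a clustering objective computed from the same decoded centroids.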
To obtain an optimal solution for the document clustering problem, we use a new, simple, and powerful
metaheuristic algorithm called Symbiotic Organisms Search (SOS). To illustrate how SOS works on numerical
optimization, we use a benchmark function named the Griewank function. This function is a multimodal,
non-separable (its product term couples the variables), and regular function with a global minimum
fmin = 0 at (0, 0, . . ., 0).
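For reference, the commonly used form of the Griewank function is f(x) = 1 + (1/4000) Σ x_i² − Π cos(x_i / √i), with i starting at 1. A minimal Python sketch of that form is shown below; the function name `griewank` is chosen for this example.

```python
import math


def griewank(x):
    """Standard Griewank function: 1 + sum(x_i^2)/4000 - prod(cos(x_i / sqrt(i)))."""
    quadratic = sum(xi * xi for xi in x) / 4000.0
    oscillation = math.prod(
        math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1)
    )
    return 1.0 + quadratic - oscillation


# At the origin the quadratic term is 0 and the cosine product is 1,
# so the function reaches its global minimum of 0.
print(griewank([0.0] * 5))
```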
The first step in SOS is ecosystem initialization: a population of candidate solutions (organisms) is
generated at random within the search bounds, and each one is evaluated with the Griewank function.
The next step is identifying the best solution in the ecosystem, which is the organism with the lowest
fitness value. The mutualism, commensalism, and parasitism phases follow. In each phase, another organism
is selected at random from the ecosystem, a candidate solution is computed according to the rules of that
phase, and the candidate replaces the old solution only if it is fitter. These steps are repeated,
returning to the second step at the start of each iteration, until a termination criterion is met.
Although the Griewank function is bowl-shaped overall, finding its global optimum is still challenging
because the search space contains many local minima. According to Whitley et al., this function is more
difficult to solve in lower dimensions than in higher dimensions. Surprisingly, SOS found the global
optimum before the maximum number of iterations was reached.
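The loop described above can be sketched in plain Python as follows. This is a simplified illustration that follows the usual published description of the three phases, not the project's implementation. The ecosystem size, iteration count, search bounds, seed, and the 50% per-dimension mutation rate used to build the parasite are all assumptions chosen for the example.

```python
import math
import random


def griewank(x):
    """Standard Griewank benchmark; global minimum 0 at the origin."""
    quadratic = sum(v * v for v in x) / 4000.0
    product = math.prod(math.cos(v / math.sqrt(i)) for i, v in enumerate(x, start=1))
    return 1.0 + quadratic - product


def sos(fitness, dim, lo, hi, eco_size=20, max_iter=200, seed=0):
    """Minimize `fitness` with a basic Symbiotic Organisms Search loop."""
    rng = random.Random(seed)
    # Step 1: ecosystem initialization with random organisms inside the bounds.
    eco = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(eco_size)]
    fit = [fitness(x) for x in eco]

    def other(i):
        """Pick a random organism index different from i."""
        j = rng.randrange(eco_size - 1)
        return j if j < i else j + 1

    def try_replace(idx, candidate):
        """Greedy selection: keep the candidate only if it is fitter."""
        candidate = [min(max(v, lo), hi) for v in candidate]
        f = fitness(candidate)
        if f < fit[idx]:
            eco[idx], fit[idx] = candidate, f

    for _ in range(max_iter):
        for i in range(eco_size):
            # Step 2: identify the current best organism (lowest fitness).
            best = eco[min(range(eco_size), key=fit.__getitem__)]

            # Mutualism: organisms i and j both move toward the best organism.
            j = other(i)
            mutual = [(a + b) / 2 for a, b in zip(eco[i], eco[j])]
            bf1, bf2 = rng.randint(1, 2), rng.randint(1, 2)  # benefit factors
            new_i = [a + rng.random() * (b - m * bf1)
                     for a, b, m in zip(eco[i], best, mutual)]
            new_j = [a + rng.random() * (b - m * bf2)
                     for a, b, m in zip(eco[j], best, mutual)]
            try_replace(i, new_i)
            try_replace(j, new_j)

            # Commensalism: only organism i benefits from organism j.
            j = other(i)
            new_i = [a + rng.uniform(-1, 1) * (b - c)
                     for a, b, c in zip(eco[i], best, eco[j])]
            try_replace(i, new_i)

            # Parasitism: a mutated copy of i tries to displace organism j.
            # Assumption: each dimension is re-sampled with probability 0.5.
            j = other(i)
            parasite = [rng.uniform(lo, hi) if rng.random() < 0.5 else v
                        for v in eco[i]]
            try_replace(j, parasite)

    b = min(range(eco_size), key=fit.__getitem__)
    return eco[b], fit[b]


best, value = sos(griewank, dim=5, lo=-600.0, hi=600.0)
print(best, value)
```

Because every phase uses greedy selection, no organism ever gets worse, so the best fitness in the ecosystem can only stay the same or improve from one iteration to the next. For document clustering, `fitness` would be replaced by a clustering objective evaluated on the centroids encoded in each organism.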
• Our FYP is based on the Symbiotic Organisms Search algorithm, which is expected to cluster documents
more accurately and with better fitness values.
• Accurate clustering makes the data easier to understand.
• Using the SOS algorithm is expected to yield faster, higher-quality clustering with reduced complexity.
• Clustering with SOS will leave the data cleaner and better organized.
• The SOS algorithm does not require any algorithm-specific parameters, which is expected to further improve
the purity and fitness of document clustering.
• SOS Algorithm Using Python
• IDE: Jupyter Notebook
• Desktop Application using Python
A desktop application for document clustering driven by a metaheuristic algorithm (Symbiotic Organisms Search), and a project report.
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| GeForce RTX 3050 Storm X 8GB | Equipment | 1 | 70000 | 70000 |
| Project Report and Board | Miscellaneous | 2 | 5000 | 10000 |
| Total (in Rs) | | | | 80000 |