Adil Khan 9 months ago
AdiKhanOfficial #FYP Ideas

MACHINE LEARNING BASED LEUKAEMIA CANCER PREDICTION SYSTEM USING PROTEIN SEQUENTIAL DATA


Project Title

MACHINE LEARNING BASED LEUKAEMIA CANCER PREDICTION SYSTEM USING PROTEIN SEQUENTIAL DATA

Project Area of Specialization

Artificial Intelligence

Project Summary

1. Summary:

Leukemia, a type of blood cancer, is one of the major problems in health sciences today. It is caused by the neoplastic proliferation of white blood cells (WBCs). Several studies have tried to detect the cancer from microscopic images of blood cells, but detection from protein sequence data has not been researched as widely as other techniques. A practical problem is that diagnosis requires a visit to a hematologist, and there were only 10 hematology specialists in KPK as of 24 November 2021. According to the WHO and the Global Cancer Observatory, the mortality rate is highest in Asia:

https://gco.iarc.fr/today/data/factsheets/cancers/36-Leukaemia-fact-sheet.pdf

Leukemia is also generally detected at a stage where recovery has become very difficult. We are therefore developing a system that detects the cancer from protein sequence data. After collecting leukemia data, we will train several machine learning algorithms on this dataset, such as SVM, Random Forest, XGBoost, and logistic regression, and assess the accuracy of each one. The algorithm that delivers the highest accuracy will be embedded in our system to classify whether or not a person is affected by the cancer.
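The comparison step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic (generated with scikit-learn's `make_classification`, not real protein-sequence features), scikit-learn's `GradientBoostingClassifier` stands in for XGBoost (which ships as a separate package), and 5-fold cross-validated accuracy is one reasonable choice of accuracy estimate, not necessarily the one the project will use.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: synthetic features, NOT real protein-sequence features.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

candidates = {
    # SVM and logistic regression are scaled first, as both are scale-sensitive.
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Random forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    # Assumption: stand-in for XGBoost, which is not part of scikit-learn.
    "Gradient boosting": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}

# Pick the candidate with the highest mean accuracy.
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name:20s} {score:.3f}")
print("Selected:", best)
```

Cross-validation is used here instead of a single train/test split so that the selected model is less likely to win by chance on one particular split.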

Project Objectives

Objectives

As discussed, leukemia is usually detected at a stage where the chances of recovery are minimal. We therefore propose a machine learning based technique that uses protein sequences to identify the genes that cause leukemia. Detecting the cancer early could substantially reduce the mortality rate. If we succeed in implementing this project with high accuracy, it can become a flagship project for health sciences and help compensate for the shortage of hematologists (specialists).

Future Work:

We will integrate our system with a PCR test machine so that it can be used on hospital patients in real time, detecting the cancer at a very low cost.

Project Implementation Method

Project Implementation

Environment & Language:

To implement machine learning algorithms we have two options:

Matlab / Octave.

Python.

We intend to use Python because it has highly optimized libraries.

4.2 Libraries:

Numerous libraries are available for implementing a classification problem, such as:

1. Standard library

2. NumPy

3. LIBSVM

4. Pandas

5. Matplotlib

6. Seaborn

7. Scikit-learn

Data Set:

The dataset is one of the most important parts of any machine learning project: with a suitable dataset and related algorithms, a variety of problems can be solved through machine learning. We will take the dataset for CML (chronic myeloid leukemia) from the Universal Protein Resource knowledgebase (UniProtKB) in FASTA file format.
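As a rough idea of how FASTA data could be turned into numeric input for a classifier, the sketch below parses FASTA text and computes the amino-acid composition of each sequence. The function names `read_fasta` and `composition` are hypothetical helpers written for this illustration, the two records are made up (they are not real UniProtKB entries), and amino-acid composition is just one simple feature choice, not necessarily the one the project will use.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def composition(sequence: str) -> List[float]:
    """Return the fraction of each standard amino acid in the sequence."""
    counts = Counter(sequence)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


# Made-up records for illustration only; not real UniProtKB entries.
example = """>demo_1 hypothetical protein
MKTAYIAK
QRQISFVK
>demo_2 hypothetical protein
GGGAAA
""".splitlines()

records = list(read_fasta(example))
features = [composition(seq) for _, seq in records]

for (header, seq), vec in zip(records, features):
    print(header, len(seq), [round(v, 2) for v in vec])
```

To read a real file, the same parser could be called as `read_fasta(open(path))`, where `path` points to a downloaded FASTA file. Each sequence then becomes a fixed-length vector of 20 numbers, regardless of its original length.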

4.4 Implementation Issues and Challenges:

The most difficult challenge is selecting the correct input parameters and finding an optimal fit for the data. With too many parameters the model will over-fit: the algorithm gives excellent results on the training dataset but may produce incorrect output for samples outside it. With too few parameters the classifier will under-fit, because the algorithm does not have enough parameters to determine the correct output. We therefore need to achieve an appropriate fit for the algorithm.
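The gap between training and held-out performance described above can be seen in a very small experiment. The sketch below uses a k-nearest-neighbour classifier on made-up one-dimensional data. Note the assumptions: k-nearest-neighbour is not one of the algorithms named in this proposal, and its flexibility is controlled by `k` rather than by a parameter count. It is used here only because it is the shortest way to show a model that memorises its training data (small `k`) next to one that is too rigid (very large `k`).

```python
import random


def make_data(n, rng):
    """Two noisy 1-D classes: label 1 tends to have larger x."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(1.0 if label else -1.0, 1.5)
        data.append((x, label))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, data, k):
    """Fraction of points in data that are classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
train, held_out = make_data(200, rng), make_data(200, rng)

results = {}
for k in (1, 15, 199):
    results[k] = (accuracy(train, train, k), accuracy(train, held_out, k))
    print(f"k={k:3d}  train={results[k][0]:.2f}  held-out={results[k][1]:.2f}")
```

With `k=1` every training point is its own nearest neighbour, so training accuracy is perfect by construction. The held-out column shows how well each setting generalises, which is why models should be compared on data they were not trained on.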

Methods:


Benefits of the Project

Benefits:

To our knowledge, this is the first classification system based on protein sequences; it can also detect the cancer using a dataset of the involved genes.

This will increase patients' chances of survival, and the system can be used in hospitals in real time.

Technical Details of Final Deliverable

Technical Details:

Our final deliverable will be an app based on a machine learning algorithm. In the future, we will integrate this app with a PCR test machine. We cannot implement the PCR machine integration before the final presentation because it requires a high level of expertise and accuracy, as it relates to human life.

Final Deliverable of the Project

Software System

Core Industry

Medical

Other Industries

Health

Core Technology

Artificial Intelligence (AI)

Other Technologies

Sustainable Development Goals

Good Health and Well-Being for People

Required Resources

Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs)
Matlab / Octave | Equipment | 1 | 0 | 0
Anaconda | Equipment | 1 | 0 | 0
VS Code | Equipment | 1 | 0 | 0
Future Work (PCR Machine) | Equipment | 1 | 70000 | 70000
Total (in Rs) | | | | 70000