Predict Student Academic Performance Using Machine Learning


2025-06-28 16:34:34 - Adil Khan

Project Title

Project Area of Specialization: Artificial Intelligence

Project Summary

Computers do not learn as well as humans do, but many machine-learning algorithms have been found that are effective for some types of learning tasks.

They are especially useful in poorly understood domains where humans might not have the knowledge needed to develop effective knowledge engineering algorithms.

Generally, Machine Learning (ML) explores algorithms that reason from externally supplied instances (input set) to produce general hypotheses, which will make predictions about future instances.

The externally supplied instances are usually referred to as the training set. To induce a hypothesis from a given training set, a learning system needs to make assumptions (biases) about the hypothesis to be learned.

A learning system without any assumption cannot generate a useful hypothesis since the number of hypotheses that are consistent with the training set is usually huge.

Since every inductive learning algorithm uses some biases, it behaves well in some domains where its biases are appropriate while it performs poorly in other domains.
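As a minimal illustration of this point, the sketch below trains two scikit-learn classifiers on a synthetic two-ring dataset. The dataset (`make_circles`) and all parameter values are assumptions chosen for illustration and are not part of the project data. Logistic regression assumes the classes can be separated by a straight line, while k-nearest neighbours assumes that nearby points share a label, so the two behave very differently here.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data: two concentric rings, so no straight line separates the classes.
X, y = make_circles(n_samples=400, noise=0.05, factor=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Bias 1: the classes can be separated by a straight line.
linear_model = LogisticRegression().fit(X_train, y_train)
# Bias 2: points that lie close together tend to share a label.
knn_model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

linear_acc = linear_model.score(X_test, y_test)
knn_acc = knn_model.score(X_test, y_test)
print(f"Logistic regression accuracy: {linear_acc:.2f}")
print(f"k-NN accuracy: {knn_acc:.2f}")
```

On data where a linear boundary is appropriate, the ranking could easily be reversed, which is why several classifiers are worth comparing.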

The ability to predict a student’s performance could be useful in many ways at the university level.

Students’ key demographic characteristics and their marks in a few written assignments can constitute the training set for a supervised machine learning algorithm.

The learning algorithm can then predict the performance of new students, making it a useful tool for identifying students who are likely to perform poorly.

Project Objectives

To predict student academic performance.
To analyze the accuracy of the student performance predictor.

To generalize and compare the predicted results of related classifiers (i.e., Naïve Bayes, Support Vector Machine, K-NN, Logistic Regression and Decision Tree (C4.5)).

This tool is useful for educational institutes and for related departments and companies in similar cases.

Project Implementation Method

The ability to monitor the progress of students’ academic performance is a critical issue for the academic community of higher learning. A system is described that analyzes students’ results using cluster analysis and standard statistical algorithms to arrange their score data according to their level of performance. Following this methodology, I will take the following steps to obtain the desired results:

Data pre-processing:

Data Cleaning
Handling Missing Values
Data Visualization
Data Splitting
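The pre-processing steps above can be sketched with pandas and scikit-learn. The column names (`gender`, `age`, `assignment_1`, `assignment_2`, `passed`) and all values are hypothetical stand-ins for the real student records, and the choices of median/mode imputation and a 25% test split are assumptions rather than the project's confirmed settings.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical student records; column names and values are illustrative only.
raw = pd.DataFrame({
    "gender": ["F", "M", "M", "F", None, "M", "F", "M", "M", "F"],
    "age": [19, 21, 21, 20, 22, np.nan, 19, 23, 23, 20],
    "assignment_1": [78, 45, 45, 88, 52, 61, np.nan, 39, 39, 91],
    "assignment_2": [82, 50, 50, 90, 47, 58, 73, 42, 42, 85],
    "passed": [1, 0, 0, 1, 0, 1, 1, 0, 0, 1],
})

# Data cleaning: remove exact duplicate rows.
df = raw.drop_duplicates().reset_index(drop=True)

# Handling missing values: median for numeric columns,
# most frequent value for the categorical column.
for col in ["age", "assignment_1"]:
    df[col] = df[col].fillna(df[col].median())
df["gender"] = df["gender"].fillna(df["gender"].mode()[0])

# Data visualization would go here, e.g. a histogram of assignment marks
# drawn with Matplotlib; a text summary is printed instead in this sketch.
print(df.describe())

# Encode the categorical column as indicator columns.
X = pd.get_dummies(df.drop(columns="passed"), columns=["gender"])
y = df["passed"]

# Data splitting: hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
print(f"{len(X_train)} training rows, {len(X_test)} test rows")
```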
Applying Proposed Algorithms for Training

I will apply the proposed algorithms to the selected data set:

Naïve Bayes (NB)
Support Vector Machine (SVM)
K-Nearest Neighbors (KNN)
Logistic Regression (LR)
Decision Tree (C4.5)
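A minimal sketch of training these five classifiers with scikit-learn follows. The data is a synthetic stand-in generated with `make_classification`, and the hyperparameters are assumptions. Note that scikit-learn's `DecisionTreeClassifier` implements CART rather than C4.5; setting `criterion="entropy"` only approximates C4.5's information-based splitting.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the student data set (assumption, not real data).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# SVM, KNN and logistic regression are sensitive to feature scale,
# so they are wrapped in a pipeline that standardizes the inputs first.
classifiers = {
    "Naive Bayes": GaussianNB(),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    # CART with entropy splitting, used here as an approximation of C4.5.
    "Decision Tree": DecisionTreeClassifier(criterion="entropy", random_state=0),
}

# Train every classifier on the same training split.
for name, model in classifiers.items():
    model.fit(X_train, y_train)

# Predict the held-out rows with each trained model.
predictions = {name: model.predict(X_test) for name, model in classifiers.items()}
for name, predicted in predictions.items():
    print(f"{name}: first five predictions {predicted[:5]}")
```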
Comparison of Predicted Results
After applying the above-mentioned algorithms, I will evaluate them by comparing their results.
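One way to make such a comparison is to score every classifier with the same cross-validation folds and rank them by mean accuracy. The sketch below assumes 5-fold cross-validation and synthetic data; the helper name `compare_classifiers` is hypothetical, and only three classifiers are shown for brevity (the others can be added to the same dictionary).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(models, X, y, folds=5):
    """Return (name, mean accuracy, std) tuples sorted from best to worst."""
    results = []
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
        results.append((name, scores.mean(), scores.std()))
    return sorted(results, key=lambda row: row[1], reverse=True)


# Synthetic stand-in for the student data set (assumption, not real data).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=0
)

models = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

ranking = compare_classifiers(models, X, y)
for name, mean_acc, std_acc in ranking:
    print(f"{name:<20} {mean_acc:.3f} +/- {std_acc:.3f}")
```

Reporting the standard deviation alongside the mean helps judge whether a difference between two classifiers is larger than the fold-to-fold variation.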

Benefits of the Project

Academic achievement is a major concern for academic institutions all over the world.

The wide use of learning management systems (LMS) generates large amounts of data about teaching and learning interactions.

This data contains hidden knowledge that could be used to enhance the academic achievement of students.

The performance of the student predictive model is evaluated with a set of classifiers, namely Naïve Bayes, K-NN, logistic regression, support vector machine and decision tree.

In addition, we applied ensemble methods to improve the performance of these classifiers.
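The text does not say which ensemble methods were used. As one possible example, the sketch below combines three of the classifiers with scikit-learn's `VotingClassifier` using majority (hard) voting; the choice of voting, the member classifiers and the synthetic data are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the student data set (assumption, not real data).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1, stratify=y
)

# Hard voting: each member predicts a class and the majority wins.
ensemble = VotingClassifier(
    estimators=[
        ("nb", GaussianNB()),
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=1)),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)

ensemble_acc = ensemble.score(X_test, y_test)
print(f"Voting ensemble accuracy: {ensemble_acc:.2f}")
```

Bagging or boosting (for example `BaggingClassifier` or `AdaBoostClassifier` in the same module) would be alternative ensemble choices.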

The obtained results reveal that there is a strong relationship between learners’ behaviors and their academic achievement.

The visited resources feature is the most influential behavioral feature in the students’ performance model. In our future work, we will focus more on analyzing this kind of feature.

After the training process is complete, the predictive model is tested on unlabeled newcomer students; the achieved accuracy is more than 80%.

This result shows how realistic the predictive model is. Lastly, this model can help educators understand learners, identify weak learners, improve the learning process and reduce academic failure rates.

It can also help administrators improve the learning system’s outcomes.

Technical Details of Final Deliverable

The project consists of a front end and a back end.

The front end is based on a graphical user interface.

In the back end, we train the models and apply the algorithms to the training dataset.

The algorithms we used are:

Naïve Bayes
Support Vector Machine
K-Nearest Neighbours
Decision Tree
Logistic Regression

Python programming language
Anaconda Jupyter Notebook
Python libraries (NumPy, os, Matplotlib, pickle, pandas, scikit-learn (sklearn))
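Since pickle is among the listed libraries, one plausible use is saving a model trained in the back end so that the graphical front end can load it without retraining. The sketch below shows that round trip under stated assumptions: the data is synthetic, the decision tree is an arbitrary choice, and the file name `student_model.pkl` is hypothetical.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the student data set (assumption, not real data).
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, random_state=0
)

# Back end: train a model and save it to disk.
model = DecisionTreeClassifier(random_state=0).fit(X, y)
model_path = os.path.join(tempfile.mkdtemp(), "student_model.pkl")  # hypothetical name
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Front end: load the saved model and use it for prediction.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

# The reloaded model should give the same predictions as the original.
same_predictions = np.array_equal(model.predict(X), loaded_model.predict(X))
print("Reloaded model reproduces predictions:", same_predictions)
```

Pickle files should only be loaded from trusted sources, and a saved model is best reloaded with the same scikit-learn version that produced it.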

Final Deliverable of the Project: Software System
Core Industry: IT
Other Industries:
Core Technology: Artificial Intelligence (AI)
Other Technologies:
Sustainable Development Goals: Quality Education

Required Resources

Item Name	Type	No. of Units	Per Unit Cost (in Rs)	Total (in Rs)
Printing	Miscellaneous	5	2000	10000
Smart devices	Equipment	1	10000	10000
			Total (in Rs)	20000
