Adil Khan 9 months ago
AdiKhanOfficial #FYP Ideas

Classification of Network Data using Machine Learning


Project Title

Classification of Network Data using Machine Learning

Project Area of Specialization

Artificial Intelligence

Project Summary

Demand is growing across networking domains, such as corporate and academic networks, for the ability to process, analyze, and evaluate network data and to identify anomalous patterns in it. Developing a scalable, fault-tolerant, and resilient monitoring system that can handle data at massive scale is nontrivial. We present a novel framework for network traffic anomaly detection using machine learning algorithms.

This project addresses anomaly detection in network traffic data using artificial intelligence and machine learning techniques.

The result will not be a run-time system. It will be an AI model that can distinguish a normal packet from an anomalous one. The model could be deployed at run time, but that requires additional run-time engineering, which is NOT part of this project.

Project Objectives

1. Investigate Fraud/Anomalies as They Exist:

Anomaly detection, alone or coupled with prediction, can be an effective means of catching fraud and discovering unusual activity in large and complex networks. It is crucial for banking security, medicine, marketing, the natural sciences, and manufacturing, all of which depend on smooth and secure operations.

2. Cheaper & Faster:

Anomaly detection has the potential to add significant business value. Big data has made it practical, integration with existing delivery mechanisms and advances in delivery models have made it easier to adopt, and advances in machine learning and deep learning have made it cheaper. It lets decision makers manage by exception and empowers e-commerce businesses to respond faster than before.

3. Providing Additional Security:

The main objective of this project is to build a complete simulated environment in which we can detect threats and then identify and block them. It will be based entirely on AI: we will use real network data and its features to train the models, and then build a pipeline for reinforcement learning.

4. Pipeline for Continuous Learning:

A pipeline for continuous learning is essential because the world and its data patterns change rapidly. Continuous learning will be put in place so that the model is updated continuously with new real-time data.

Project Implementation Method

First, we will obtain the data and gather all requirements, then begin the software work. We will start with exploratory data analysis and build a data pipeline that ensures the data is well cleaned and that all values match their data types and standards. After the analysis, if the data is imbalanced, we will apply SMOTE oversampling. We will then build the AI model, trying both classification and clustering and, if required, semi-supervised learning techniques. We will try different classification algorithms such as neural networks, Random Forest, and gradient boosting. We can then combine them with a voting ensemble and compare the results to see which approach performs best.
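The model comparison and voting step described above could look roughly like the following sketch. It assumes scikit-learn is available and uses synthetic data from `make_classification` as a stand-in for the cleaned network features; the sample sizes, hyperparameters, and model names are illustrative assumptions, not the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for cleaned network-flow features (assumption, not the real data).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate classifiers named in the text; hyperparameters are illustrative only.
candidates = [
    ("random_forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gradient_boosting", GradientBoostingClassifier(random_state=0)),
    (
        "neural_net",
        make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
        ),
    ),
]

# Score each model on its own.
test_accuracy = {}
for name, model in candidates:
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)

# Soft voting averages the predicted class probabilities of all candidates.
ensemble = VotingClassifier(estimators=candidates, voting="soft")
ensemble.fit(X_train, y_train)
test_accuracy["voting_ensemble"] = ensemble.score(X_test, y_test)

for name, acc in sorted(test_accuracy.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:>18}: {acc:.3f}")
```

Note that voting combines the models' predictions rather than ranking them, so the individual scores are kept alongside the ensemble score for comparison.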

Benefits of the Project

Anomaly detection is of great significance and is a growing field within cyber security. Because malware in network traffic changes dynamically, traditional tools and techniques are failing to protect networks from attack. More and more organizations have become vulnerable to Internet attacks and intrusions. Organizations big and small spend money on security and buy devices with pre-built procedures and anti-images to detect anomalies, but they remain vulnerable to any zero-day attack. This is where AI comes in. It provides an additional security element that does not simply match the data against anti-images; rather, it looks for outlier patterns and then makes the call about each packet.

Technical Details of Final Deliverable

DATA COLLECTION:

First, we’ll collect our data. We gathered the data for our final year project from an open source, linked below.

https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/

EDA (EXPLORATORY DATA ANALYSIS):

Our next step will be exploratory data analysis:

  • Cleaning (to clean the data according to our requirements)
  • Incorrect value analysis
  • Null value analysis
  • Class balance analysis (to check whether the data is imbalanced)
  • SMOTE oversampling (if the data is imbalanced)
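To illustrate the last two steps, here is a minimal, library-free sketch of a class balance check followed by SMOTE-style oversampling. It is a simplified version of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours) for the two-class case, not the reference implementation; in practice a maintained library implementation would normally be used. The toy feature rows and the function names are hypothetical.

```python
import random
from collections import Counter


def class_counts(labels):
    """Count how many rows belong to each class."""
    return Counter(labels)


def smote_like_oversample(X, y, minority_label, k=3, seed=0):
    """Add synthetic minority rows until both classes have the same size."""
    rng = random.Random(seed)
    minority = [row for row, label in zip(X, y) if label == minority_label]
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    n_needed = sum(1 for label in y if label != minority_label) - len(minority)

    X_new, y_new = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority)
        # k nearest original minority neighbours of the base row (squared Euclidean distance).
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        X_new.append([a + gap * (b - a) for a, b in zip(base, other)])
        y_new.append(minority_label)
    return X_new, y_new


# Toy data: label 0 = normal traffic, label 1 = anomaly (values are made up).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1],
     [5.0, 5.0], [5.5, 4.8], [4.9, 5.3]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

print("before:", dict(class_counts(y)))
X_bal, y_bal = smote_like_oversample(X, y, minority_label=1)
print("after: ", dict(class_counts(y_bal)))
```

Oversampling should be applied only to the training portion of the data, so that synthetic rows never leak into the test set.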

FEATURE SELECTION:

Random Forest feature importances for feature selection.
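One common way to do this, sketched below under the assumption that scikit-learn is used, is to fit a Random Forest and keep the features with the highest impurity-based importances. The synthetic data, the forest size, and the choice of keeping the top three features are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the network features (assumption, not the real dataset).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=3, n_redundant=0, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Impurity-based importances, one value per input column.
importances = forest.feature_importances_

# Keep the indices of the top_k most important features (top_k is an arbitrary choice here).
top_k = 3
selected = np.argsort(importances)[::-1][:top_k]
X_selected = X[:, selected]

for idx in selected:
    print(f"feature {idx}: importance {importances[idx]:.3f}")
```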

SELECTED FEATURE SPLITTING:

Train/test split with K-fold cross-validation.
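The splitting scheme can be sketched without any library as follows: first hold out a test set, then rotate K validation folds over the remaining training rows. The function names, the 80/20 split, and K = 5 are assumptions for illustration; scikit-learn's `train_test_split` and `KFold` provide the same functionality.

```python
import random


def train_test_split_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and hold out a fraction of them as the test set."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (training, validation) index lists, one pair per fold."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


train_idx, test_idx = train_test_split_indices(100, test_fraction=0.2)
splits = list(k_fold(train_idx, k=5))

for fold_number, (training, validation) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(training)} training rows, {len(validation)} validation rows")
```

The held-out test indices are never used during cross-validation, so they remain available for the final evaluation.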

SUPERVISED LEARNING:

Neural networks

Decision tree

XGBoost

AdaBoost

SVM

Gaussian Naive Bayes
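A rough sketch of how these algorithms might be compared under cross-validation is shown below, assuming scikit-learn and synthetic stand-in data. XGBoost is left out because it ships in the separate `xgboost` package; its `XGBClassifier` could be added to the dictionary in the same way. All hyperparameters are defaults or illustrative guesses.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the selected features (assumption, not the real dataset).
X, y = make_classification(n_samples=400, n_features=10, n_informative=4, random_state=1)

# Scale-sensitive models (neural network, SVM) are wrapped with a StandardScaler.
models = {
    "neural_network": make_pipeline(
        StandardScaler(), MLPClassifier(max_iter=1000, random_state=1)
    ),
    "decision_tree": DecisionTreeClassifier(random_state=1),
    "ada_boost": AdaBoostClassifier(random_state=1),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "gaussian_nb": GaussianNB(),
}

# Mean F1 score over 5 folds for each model.
mean_f1 = {
    name: cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    for name, model in models.items()
}

for name, score in sorted(mean_f1.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:>15}: mean F1 = {score:.3f}")
```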

VALIDATION WITH TEST DATA:

We’ll use accuracy, precision, F1 score, and a confusion matrix to measure the quality of predictions.
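For reference, these metrics can be computed from the four confusion-matrix counts as in the sketch below. The labels are made up for illustration (1 = anomaly, 0 = normal), and recall is included only because the F1 score is defined from precision and recall.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 = anomaly, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = binary_metrics(y_true, y_pred)
print("tp, fp, fn, tn =", counts)
print(metrics)
```

On imbalanced traffic data, accuracy alone can look high even when most anomalies are missed, which is why precision and F1 are reported alongside it.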

Final Deliverable of the Project

Software System

Core Industry

IT

Other Industries

Core Technology

Artificial Intelligence(AI)

Other Technologies

Sustainable Development Goals

Required Resources

Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs)
Arduino | Equipment | 1 | 1500 | 1500
ESP-8266-01 microcontroller | Equipment | 1 | 1000 | 1000
Batteries | Equipment | 5 | 150 | 750
D1 Mini Datalogger Shield | Equipment | 1 | 1000 | 1000
microSD card (16GB+) | Equipment | 1 | 1500 | 1500
USB SD card reader | Equipment | 1 | 500 | 500
Resistors | Equipment | 1 | 500 | 500
Wires | Equipment | 1 | 500 | 500
Board (breadboard or veroboard) | Equipment | 1 | 500 | 500
Final box | Equipment | 1 | 1000 | 1000
Type A to micro USB cable | Equipment | 1 | 1000 | 1000
Printing | Miscellaneous | 4 | 750 | 3000
Book binding | Miscellaneous | 4 | 500 | 2000
Total (in Rs) | | | | 14750