Salat (Namaz) Monitoring System using YOLOv3
In the Muslim community, prayer (Salat) is the second pillar of Islam and the most fundamental act of worship, which believers perform five times a day. This project addresses human activity recognition, which consists of classifying a person's activity from data collected by sensors.
2025-06-28 16:34:52 - Adil Khan
Project Area of Specialization: Artificial Intelligence
Project Summary
In the Muslim community, prayer (Salat) is the second pillar of Islam and the most fundamental act of worship, which believers perform five times a day.
Human activity recognition consists of classifying a person's activity from data collected by sensors (e.g. accelerometer, camera, and laser scanner).
In this project, we consider a particular human activity recognition application of special interest to the Muslim community around the world, namely the recognition of the postures of the Islamic prayer, also known as Salat.
Besides its spiritual value, Salat consists of a series of postures that must be executed in a predefined sequence, as instructed by the Prophet Mohamed, peace be upon him.
Artificial Intelligence (AI) is currently one of the most prominent technologies, with a huge impact on societies and on the services provided in many types of applications. One of the main driving factors of AI in the last decade is the emergence of deep learning in computer vision applications, particularly convolutional neural networks (CNNs).
Motion recognition has received significant attention in recent years in the area of computer vision, since it has a wide range of potential applications.
Project Objectives
- To collect or build a dataset
- To recognize the correct postures of Salat (Namaz)
- To alert the individual about his/her mistakes
Among the deep learning algorithms used in computer vision, YOLOv3 is the most attractive for the task of prayer posture recognition. Two facts support this choice. Firstly, YOLOv3 has proven its efficiency compared to other object detection algorithms. Secondly, it has an efficient inference time (up to 45 frames per second), which allows real-time recognition of the prayer postures. The following subsection summarizes the improvements that YOLOv3 introduced over YOLOv1 and YOLOv2.
YOLOv3
In April 2018, YOLOv3 was introduced as an incremental improvement over the previous versions. The overall architecture of YOLOv3 is shown in the figure below. The improvements include:
- The use of multi-label classification. Instead of the mutually exclusive labeling of the previous versions, YOLOv3 uses a logistic classifier to estimate the likelihood that the object carries a specific label. The classification loss is changed accordingly: every label uses the binary cross-entropy loss instead of the general mean squared error loss used in the previous versions.
- The use of a different bounding box prediction. During training, an objectness score of 1 is assigned to the bounding box anchor that best overlaps the ground truth object. An anchor that is not the best match but whose IoU (Intersection over Union) with the ground truth exceeds a threshold (0.7 in the implementation) is ignored. In the end, YOLOv3 associates one bounding box anchor with every ground truth object.
- The change of the output 3D tensor. In YOLOv3, the prediction for a grid cell is made at 3 different scales, and the final bounding box is derived from those scales. This was inspired by feature pyramid networks. Hence, the new dimension of the output 3D tensor is:
S × S × (3 × (5 + C))
where:
– S × S: corresponds to the number of grid cells.
– 5: corresponds to the four bounding box coordinates plus the objectness score.
– B (the number of bounding boxes per grid cell in the previous versions): is omitted because only one bounding box anchor is kept at the end.
– C: corresponds to the number of targeted classes.
- The adoption of a new feature extractor (Darknet-53). It has 53 layers and uses skip connections, similarly to ResNet. It uses both 3×3 and 1×1 convolutions. It gave state-of-the-art accuracy with better speed and fewer computations.
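As a rough illustration of the tensor dimension S × S × (3 × (5 + C)) given above, the following Python sketch computes the output shape at each of the three scales. The 416×416 input size and the strides 32, 16 and 8 are the values of the standard YOLOv3 configuration and are assumptions here, not settings reported for this project. C = 3 is also an assumption, chosen to match the three postures in the project scope, and `yolo_output_shapes` is a hypothetical helper name.

```python
def yolo_output_shapes(input_size=416, num_classes=3,
                       strides=(32, 16, 8), boxes_per_scale=3):
    """Return one (S, S, 3 * (5 + C)) shape per detection scale.

    All default values are illustrative assumptions (standard YOLOv3
    settings with C = 3 posture classes), not this project's settings.
    """
    # 5 = four box coordinates + one objectness score
    depth = boxes_per_scale * (5 + num_classes)
    shapes = []
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError("input_size must be divisible by every stride")
        s = input_size // stride  # number of grid cells per side
        shapes.append((s, s, depth))
    return shapes


if __name__ == "__main__":
    for shape in yolo_output_shapes():
        print(shape)
```

With these assumed values, the three grids have 13, 26 and 52 cells per side, and each cell carries 3 × (5 + 3) = 24 values.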
Figure: YOLOv3 architecture
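The bounding box improvement described in the list above relies on the IoU between an anchor and the ground truth box. The following is a minimal Python sketch of that computation and of picking the best-overlapping anchor. It assumes boxes are given as (x_min, y_min, x_max, y_max) tuples; `iou` and `best_anchor_index` are hypothetical helper names, not functions from the YOLOv3 implementation.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the intersection rectangle
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped to zero when the boxes do not overlap
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_anchor_index(anchor_boxes, truth_box):
    """Index of the anchor that best overlaps the ground truth box."""
    overlaps = [iou(anchor, truth_box) for anchor in anchor_boxes]
    return max(range(len(overlaps)), key=overlaps.__getitem__)


if __name__ == "__main__":
    truth = (0, 0, 10, 10)
    anchors = [(0, 0, 5, 5), (0, 0, 10, 9), (20, 20, 30, 30)]
    print(best_anchor_index(anchors, truth))
```

In this toy example the second anchor overlaps the ground truth most, so it is the one that would receive an objectness score of 1 during training.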
Benefits of the Project
The main motivation for this project is that some people, such as young boys, are hesitant to go to the mosque and learn the correct postures of Salat from the imams, or do not have many resources available. This project will help them by identifying the basic postures of Salat (Namaz): with the help of our dataset, the user can easily recognize the correct postures and perfect their Salat.
Technical Details of Final Deliverable
The Salat Monitoring System in this project will be restricted to three postures performed during Salat: Qayyam, Ruku and Sajdah. The scope of the project encompasses these three postures only. The dataset is collected on the common assumption of how Muslims perform Namaz in their daily life.
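One possible way to turn per-frame detections of these three postures into a sequence check is sketched below in Python. It is only a sketch under stated assumptions: the detector is assumed to emit one label per video frame, runs shorter than `min_run` frames are treated as noise, and the expected order passed to `first_mismatch` is a simplified placeholder supplied by the caller, not a full description of a rak'ah. Both function names are hypothetical.

```python
from itertools import groupby


def collapse_frames(frame_labels, min_run=5):
    """Collapse per-frame labels into a sequence of stable postures.

    Runs shorter than min_run frames are dropped as noise (min_run is
    an illustrative assumption, not a tuned value).
    """
    runs = [(label, sum(1 for _ in group))
            for label, group in groupby(frame_labels)]
    stable = [label for label, length in runs if length >= min_run]
    # Merge neighbours that became adjacent after the noise was dropped
    return [label for label, _ in groupby(stable)]


def first_mismatch(observed, expected):
    """Index of the first posture that deviates from expected, or None."""
    for i, posture in enumerate(observed):
        if i >= len(expected) or posture != expected[i]:
            return i
    return None


if __name__ == "__main__":
    # Placeholder order for illustration only
    expected = ["Qayyam", "Ruku", "Sajdah"]
    frames = ["Qayyam"] * 6 + ["Ruku"] * 2 + ["Qayyam"] * 3 + ["Sajdah"] * 7
    observed = collapse_frames(frames)
    print(observed, first_mismatch(observed, expected))
```

In this toy input the Ruku run lasts only two frames, so it is filtered out and the check reports a deviation at position 1, which is the kind of event the system could use to alert the individual.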
Final Deliverable of the Project: Software System
Core Industry: IT
Other Industries:
Core Technology: Artificial Intelligence (AI)
Other Technologies:
Sustainable Development Goals: Industry, Innovation and Infrastructure
Required Resources
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Report printing and binding | Miscellaneous | 6 | 700 | 4200 |
| Stationery items etc. | Miscellaneous | 10 | 160 | 1600 |
| Panaflex, books | Miscellaneous | 6 | 700 | 4200 |
| A4tech Full HD 1080P Auto Focus Webcam - PK-930HA | Equipment | 2 | 7500 | 15000 |
| Canon EOS 1300D | Equipment | 1 | 46000 | 46000 |
| Tripod | Equipment | 1 | 4000 | 4000 |
| HDD | Equipment | 1 | 5000 | 5000 |
| Total (in Rs) | | | | 80000 |