Object Detection from three dimensional point cloud using deep learning

2025-06-28 16:28:41 - Adil Khan

Project Title

Project Area of Specialization: Artificial Intelligence

Project Summary

In this project, two 3D models shall be acquired: one from the 3D camera and the other from the laser scanner. For the 3D camera, a simple webcam shall be used to capture the entire environment, and algorithms such as Structure from Motion (SfM) and Speeded-Up Robust Features (SURF) shall be used to form a 3D model. The second model is built by laser scanning; however, instead of a very expensive 3D scanner, a 2D scanner is used to make the result more cost-effective. The two point clouds are then aligned with the Iterative Closest Point (ICP) algorithm, which estimates the transformation that moves one cloud so that it is aligned with the other. ICP calculates the transformation parameters from the relationship between corresponding (matched) points of the two point sets, iterating until a given convergence precision is satisfied; the resulting rotation and translation parameters between the two point sets complete the registration. After the 3D models have been built by photogrammetry, by 2D laser scanning, and by fusing the two with ICP, all three models, along with ModelNet40, a publicly available dataset, are classified using a deep neural network called PointNet, and their accuracies are compared to determine which model gives the most accurate results. Finally, a 3D object detector is built to extract objects from 3D point cloud data.
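As a hedged illustration of the registration step, the common point-to-point form of ICP can be written as below. The notation is assumed here and does not come from the project: P = {p_i} is the cloud being moved, Q = {q_j} is the fixed cloud, R and t are the rotation and translation, and ε is the convergence precision.

```latex
\begin{align*}
% Step 1 (matching): pair each moved point with its nearest fixed point
c_k(i) &= \arg\min_{j} \left\| R_{k-1}\, p_i + t_{k-1} - q_j \right\| \\
% Step 2 (estimation): best rigid transform for the current pairs
(R_k, t_k) &= \arg\min_{R,\, t} \; \frac{1}{N} \sum_{i=1}^{N}
              \left\| R\, p_i + t - q_{c_k(i)} \right\|^2 \\
% Residual error after iteration k
E_k &= \frac{1}{N} \sum_{i=1}^{N} \left\| R_k\, p_i + t_k - q_{c_k(i)} \right\|^2 \\
% Stopping rule: the error has stopped changing
\left| E_{k-1} - E_k \right| &< \varepsilon
\end{align*}
```

The two steps alternate until the stopping rule holds; the final R and t are the rotation and translation parameters of the registration.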

Project Objectives

Our aim is to capture an environment using a 3D camera and a laser scanner and then convert the captures into 3D models by applying reconstruction algorithms. The specific objective of laser scanning with a 2D scanner is to build a 3D mesh from 2D scans, saving the cost of a 3D scanner and improving the workflow. Each 3D point cloud, obtained by laser scanning, by photogrammetry, and by fusing both with the ICP algorithm, is then classified, and the results are compared and analyzed, with the goal of achieving a robust 3D model that performs better than the state of the art. This shall help us implement a system that achieves industrial-level accuracy.

Project Implementation Method

Four objects are used: a chair, a book, a bottle, and a ball. First, a stereo camera scans the four objects to create their respective point clouds. The stereo camera is rotated through 360 degrees in each of the three coordinates to create clear point clouds. This yields one point cloud of each object. Then a laser scanner is used to create a second point cloud of each object, which can be visualized using ROS. These two point clouds are then merged using the ICP (Iterative Closest Point) algorithm. Finally, the point clouds are classified according to their quality, and the four objects are detected individually in each point cloud.

Benefits of the Project

Binocular disparity (the difference between what our right and left eyes see) gives us a sense of depth, and stereoscopic imaging puts the same principle to use. 3D photography captures frozen-in-time moments in stills in such a way that they seem real enough to touch. Using it in our project brings our work closer to reality and should make it easier for industries to implement in the future. By avoiding the cost of an expensive 3D scanner, the work process becomes easier and more affordable for all industries in this niche, and work on prototypes becomes easier, faster, and more cost-effective. Point cloud data is irregular in format, so it is usually transformed into collections of images or 3D voxel grids; this conversion causes certain issues and makes the data unnecessarily voluminous. To avoid these issues and make the processing of 3D data more robust, a network called PointNet is selected, which takes raw point clouds as input and shows strong performance compared with the state of the art.
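The property that lets a network such as PointNet consume raw, unordered points can be illustrated with a small sketch. This is not the PointNet architecture: the layer size, the random weights, and the function names `shared_mlp` and `global_feature` are hypothetical. It only shows the underlying idea of applying one shared function to every point and then max pooling, which makes the result independent of point order.

```python
import random


def shared_mlp(point, weights, biases):
    """Apply the same tiny linear layer + ReLU to one (x, y, z) point."""
    return [
        max(0.0, sum(w * c for w, c in zip(row, point)) + b)
        for row, b in zip(weights, biases)
    ]


def global_feature(points, weights, biases):
    """Max-pool the per-point features over all points (a symmetric function)."""
    per_point = [shared_mlp(p, weights, biases) for p in points]
    return [max(column) for column in zip(*per_point)]


rng = random.Random(0)
FEATURE_DIM = 8  # hypothetical feature size, chosen only for the demo

# Random (untrained) weights: enough to demonstrate order invariance.
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(FEATURE_DIM)]
biases = [rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]

# A synthetic cloud of 100 points and a shuffled copy of the same points.
cloud = [tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(100)]
shuffled = cloud[:]
rng.shuffle(shuffled)

feat_original = global_feature(cloud, weights, biases)
feat_shuffled = global_feature(shuffled, weights, biases)

# Same set of points in a different order gives the same global feature.
print(feat_original == feat_shuffled)
```

Because the pooled feature does not depend on point order, no conversion to images or voxel grids is needed before the points are fed to the network.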

Technical Details of Final Deliverable

Structure from Motion, the technique used in this project, is in high demand in the automobile industry and in Unmanned Aerial Vehicles (UAVs) for maintaining accuracy and assuring quality. 3D point cloud classification has pronounced significance in photogrammetry, computer vision, and remote sensing. Reliable results in applications such as remote sensing and scene reconstruction rely on point cloud classification, in which each point is assigned a semantic class label. Point clouds have a vast range of applications, so classifying 3D point cloud data saves both cost and time. By registering different three-dimensional data sets of a target, the change in the target's pose relative to the measuring device can be obtained. ICP is a matching algorithm that is widely used in three-dimensional point cloud registration. It is based on iterative optimization of the transformation matrix: in each iteration, the nearest point in the reference point set is found for each point of the target point set.
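The iteration described above can be sketched in a few lines of NumPy. This is a minimal point-to-point ICP under stated assumptions, not the project's implementation: nearest neighbours are found by brute force, the rigid transform is estimated with the SVD-based (Kabsch) method, and the names `icp`, `best_rigid_transform`, `mean_sq_nn_dist`, `max_iters`, and `tol` are hypothetical. The test data is synthetic.

```python
import numpy as np


def mean_sq_nn_dist(moving, reference):
    """Mean squared distance from each moving point to its nearest reference point."""
    d2 = ((moving[:, None, :] - reference[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()


def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t is close to dst[i] (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(moving, reference, max_iters=50, tol=1e-10):
    """Align `moving` to `reference`; returns total R, total t, aligned points."""
    aligned = moving.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(max_iters):
        # Matching step: nearest reference point for every moving point.
        d2 = ((aligned[:, None, :] - reference[None, :, :]) ** 2).sum(axis=2)
        nearest = reference[d2.argmin(axis=1)]
        err = d2.min(axis=1).mean()
        if abs(prev_err - err) < tol:            # convergence precision reached
            break
        prev_err = err
        # Estimation step: best rigid transform for the current matches.
        R, t = best_rigid_transform(aligned, nearest)
        aligned = aligned @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total, aligned


# Synthetic example: a reference cloud and a slightly rotated, shifted copy.
rng = np.random.default_rng(0)
reference = rng.uniform(-1.0, 1.0, size=(200, 3))
angle = np.deg2rad(5.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
moving = reference @ R_true.T + np.array([0.05, -0.03, 0.02])

R_est, t_est, aligned = icp(moving, reference)
print("error before:", mean_sq_nn_dist(moving, reference))
print("error after: ", mean_sq_nn_dist(aligned, reference))
```

The brute-force distance matrix grows with the product of the two cloud sizes, so for real scans a spatial index such as a k-d tree would normally replace it. ICP also needs a reasonable initial alignment, because it converges to the nearest local minimum.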

Final Deliverable of the Project: Software System
Core Industry: Manufacturing
Other Industries:
Core Technology: Artificial Intelligence (AI)
Other Technologies:
Sustainable Development Goals: Decent Work and Economic Growth; Industry, Innovation and Infrastructure

Required Resources

Item Name	Type	No. of Units	Per Unit Cost (in Rs)	Total (in Rs)
Intel stereo camera	Equipment	1	20000	20000
Laser scanner	Equipment	1	50000	50000
Accessories	Miscellaneous	1	10000	10000
			Total (in Rs)	80000
