Environmental Adaptation for the Urban Scene Understanding of Autonomous Vehicles

2025-06-28 16:27:05 - Adil Khan

Project Title

Project Area of Specialization: Artificial Intelligence

Project Summary

With the advent of new technologies and improved resources, there has been extensive development in the domain of autonomous vehicles. Autonomous driving is being researched with the expectation that it will alleviate the negative externalities associated with traffic, including congestion and accidents. Autonomous vehicles are also widely researched because of the complexity of real-world traffic and the expectation that they will mimic human behavior, including ethical judgment, strategic decision-making, and traffic management.

We therefore want to work in the perception domain of autonomous vehicles. In summary, we want to do the following:

Formalize a composite dataset by amalgamating different widely known datasets
Train our model using deep learning techniques such as CNNs and their variants
Test our model using a dataset from a previously unseen geographical area
Calculate and evaluate the performance

Project Objectives

Our objectives are as follows: better autonomous driving, urban scene understanding, and environmental adaptation and generalization.

A) Autonomous Driving

The self-driving pipeline can be divided into four basic components: perception, localization, planning, and control. Autonomous cars are driverless cars that do not need human intervention to move. One of the most challenging technical aspects of autonomous vehicles is detecting obstacles reliably at high speeds and long distances. This task belongs to the “perception” component. To help address this problem, companies have released extensive, finely labeled datasets to accelerate the development of autonomous vehicles.

B) Urban Scene Understanding

It is critically important for an autonomous vehicle to perceive its environment. Urban Scene Understanding has moved from a neglected area to a much more active area of research in recent years [2]. There are three levels of Urban Scene Understanding: Object Detection, Semantic Segmentation, and Instance Segmentation.

Object Detection is the broadest category and refers to identifying and correctly labeling all the objects present in the image frame. Semantic Segmentation assigns a class label to each pixel in the given image. Instance Segmentation also associates a class label with each pixel, but it treats multiple objects of the same class as separate entities. Object Detection provides less detail but needs less computational power, while Instance Segmentation provides rich knowledge of the surroundings but is computationally expensive. As a tradeoff between detail and computational cost, we will focus on Semantic Segmentation, which gives us adequate detail about the scene without excessive computational cost.
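
The difference in detail between the three levels can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the tiny label grid, the class names, and the helper functions `to_semantic` and `to_boxes` are all made up here, and object detection is assumed to output one bounding box per object. It starts from an instance-level labeling and derives the coarser outputs from it.

```python
# Toy 3x6 "image": each pixel holds (class_name, instance_id).
# All labels are made up for illustration.
ROAD, CAR = "road", "car"
instance_map = [
    [(ROAD, 0), (CAR, 1), (CAR, 1), (ROAD, 0), (CAR, 2), (CAR, 2)],
    [(ROAD, 0), (CAR, 1), (CAR, 1), (ROAD, 0), (CAR, 2), (CAR, 2)],
    [(ROAD, 0)] * 6,
]


def to_semantic(inst_map):
    """Semantic segmentation keeps only the class of each pixel."""
    return [[cls for cls, _ in row] for row in inst_map]


def to_boxes(inst_map, target_class):
    """Object detection (as assumed here) keeps one box per object:
    (class, row_min, col_min, row_max, col_max)."""
    extents = {}
    for r, row in enumerate(inst_map):
        for c, (cls, inst) in enumerate(row):
            if cls != target_class:
                continue
            r0, c0, r1, c1 = extents.get(inst, (r, c, r, c))
            extents[inst] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
    return [(target_class, *extents[k]) for k in sorted(extents)]


semantic_map = to_semantic(instance_map)  # the two cars become one "car" class
car_boxes = to_boxes(instance_map, CAR)   # two boxes, no per-pixel shape
```

The semantic map no longer distinguishes the two cars, while the boxes keep the two cars apart but lose their per-pixel shape.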

C) Environmental Adaptation and Generalization

Advanced Driver Assistance Systems (ADAS) have become prevalent, and interest in autonomous vehicles has surged. There has been tremendous progress in Semantic Segmentation; however, the proposed models and algorithms do not generalize well to a different dataset or a different location. This issue is known as covariate shift or selection bias: the models are biased towards the training dataset, so they do not generalize well to real-world data. In adaptation, we use data from different environments to train a robust algorithm that has seen a variety of environments. In generalization, we expect the model to perform satisfactorily on previously unseen data that was not available during training.
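
One common way to measure generalization under this kind of shift is to hold out an entire domain during training and evaluate only on it. The sketch below is a minimal illustration of that idea, not the project's actual data pipeline. The function `leave_one_domain_out` and the file names are hypothetical, and only the dataset names are taken from this proposal.

```python
def leave_one_domain_out(samples_by_domain, held_out):
    """Split samples so that one whole domain is never seen in training."""
    if held_out not in samples_by_domain:
        raise KeyError(f"unknown domain: {held_out}")
    train = [
        (domain, sample)
        for domain, samples in sorted(samples_by_domain.items())
        if domain != held_out
        for sample in samples
    ]
    test = [(held_out, sample) for sample in samples_by_domain[held_out]]
    return train, test


# Hypothetical file names; only the domain names come from the proposal.
samples_by_domain = {
    "Cityscapes": ["cs_0001.png", "cs_0002.png"],
    "KITTI": ["kitti_0001.png"],
    "A2D2": ["a2d2_0001.png", "a2d2_0002.png"],
}
train_set, test_set = leave_one_domain_out(samples_by_domain, held_out="A2D2")
```

Training on several domains corresponds to the adaptation setting described above, and scoring on the held-out domain corresponds to the generalization setting.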

Project Implementation Method

In our project, the main emphasis is on Environmental Adaptation for the Urban Scene Understanding of Autonomous Vehicles. We want our model to classify correctly, adapt, and then generalize to an unknown environment.

(A) Informal Statement: We want to propose an algorithm that assists autonomous vehicles in achieving a hassle-free drive.

(B) Formal Statement:

Task (T): After supervised training on a particular dataset, the algorithm adapts to classify data from another, unknown domain.
Experience (E): The algorithm trains on a generalized, composite dataset comprising various datasets.
Performance (P): We will use measures such as Pixel Accuracy, Intersection over Union (Jaccard Index), and Dice Coefficient.
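
The three performance measures can be sketched for a single class as follows. This is a minimal pure-Python illustration on a made-up 2x4 label map, not the project's evaluation code. The function names and the toy data are assumptions, and in practice these measures would be computed over arrays and averaged across classes. For one class with true positives TP, false positives FP, and false negatives FN, IoU is TP / (TP + FP + FN) and Dice is 2TP / (2TP + FP + FN).

```python
def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class equals the ground truth."""
    pairs = [(p, t) for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row)]
    return sum(p == t for p, t in pairs) / len(pairs)


def class_counts(pred, target, cls):
    """Return (TP, FP, FN) for one class."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p == cls and t == cls:
                tp += 1
            elif p == cls:
                fp += 1
            elif t == cls:
                fn += 1
    return tp, fp, fn


def iou(pred, target, cls):
    """Intersection over Union (Jaccard Index) for one class."""
    tp, fp, fn = class_counts(pred, target, cls)
    denom = tp + fp + fn
    return tp / denom if denom else float("nan")  # class absent everywhere


def dice(pred, target, cls):
    """Dice Coefficient for one class."""
    tp, fp, fn = class_counts(pred, target, cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else float("nan")


# Made-up ground truth and prediction (0 = background, 1 = car).
target = [[0, 0, 1, 1],
          [0, 1, 1, 1]]
pred = [[0, 1, 1, 1],
        [0, 0, 1, 1]]

acc = pixel_accuracy(pred, target)  # 6 of 8 pixels match
car_iou = iou(pred, target, 1)      # TP=4, FP=1, FN=1
car_dice = dice(pred, target, 1)
```

For a single class the two overlap measures are related by Dice = 2 IoU / (1 + IoU), so they rank predictions identically but differ in scale.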

(C) Domains of Our Project:

Reconstruction and regeneration of the MSeg dataset, which comprises the following datasets: KITTI, Cityscapes, COCO + COCO Stuff, ADE20K, Mapillary, PASCAL VOC, PASCAL Context, CamVid, WildDash, and ScanNet-20.
Relabeling of the composite data using the mseg-mturk labeling scripts.
Training our specialized Graph-Adversarial Network on this data.
Testing it on a state-of-the-art dataset that was unseen during training.
Application of a few-shot learning technique.
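
Building a composite dataset requires mapping each source dataset's class ids onto one shared taxonomy. The sketch below illustrates that idea only. It is not the mseg-mturk scripts or the MSeg API, and every name in it (`UNIFIED_IDS`, `SOURCE_TO_NAME`, `NAME_ALIASES`, `remap_mask`, the dataset names, and all class ids) is hypothetical.

```python
IGNORE = 255  # assumed "unlabeled" value, to be excluded from training and scoring

# Hypothetical unified taxonomy shared by all source datasets.
UNIFIED_IDS = {"road": 0, "car": 1, "person": 2}

# Hypothetical per-dataset class ids; real datasets define their own.
SOURCE_TO_NAME = {
    "dataset_a": {7: "road", 26: "car", 24: "person"},
    "dataset_b": {0: "person", 1: "automobile", 2: "street"},
}

# Datasets may name the same concept differently.
NAME_ALIASES = {"automobile": "car", "street": "road"}


def remap_mask(mask, dataset):
    """Convert a label mask from a source dataset into unified class ids."""
    table = SOURCE_TO_NAME[dataset]
    remapped = []
    for row in mask:
        new_row = []
        for source_id in row:
            name = table.get(source_id)           # None if the id is unknown
            name = NAME_ALIASES.get(name, name)   # normalize naming differences
            new_row.append(UNIFIED_IDS.get(name, IGNORE))
        remapped.append(new_row)
    return remapped


mask_a = remap_mask([[7, 26], [24, 99]], "dataset_a")
mask_b = remap_mask([[2, 1], [0, 5]], "dataset_b")
```

Both toy masks end up in the same label space, so a single model can be trained on their union. Pixels whose class has no unified counterpart are marked with the ignore value rather than guessed.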

Benefits of the Project

As robots operate in unknown environments, they are prone to encountering specular obstacles. Autonomous cars create and maintain a map of their surroundings based on a variety of sensors situated in different parts of the vehicle. Radar sensors monitor the position of nearby vehicles, while video cameras detect traffic lights, read road signs, track other vehicles, and look for pedestrians. In Urban Scene Understanding, autonomous vehicles use this information to classify the objects around them into different categories, which helps them avoid obstacles and better understand traffic.

There has been a lot of research in this domain; however, the proposed models and techniques fail to adapt and generalize to the real world. To combat this, we will reconstruct the MSeg dataset, which recently became available. We also want to use a newer dataset that, so far, has no associated publications: the Audi Autonomous Driving Dataset (A2D2). We aim for a model that achieves domain adaptation and generalization without the need to manually relabel the testing dataset.

Technical Details of Final Deliverable

Here’s the list of deliverables that we aim to submit by the end of this project:

1. Initial and Final Reports

2. Initial, Mid, and Final Presentations

3. A generalized deep-learning model

4. Exploratory Data Analysis of the important datasets

5. Results of our technique

6. Comparison with other approaches

7. Thesis

Final Deliverable of the Project: Software System
Core Industry: IT
Other Industries: Transportation
Core Technology: Artificial Intelligence (AI)
Other Technologies:
Sustainable Development Goals: Industry, Innovation and Infrastructure

Required Resources

Item Name	Type	No. of Units	Per Unit Cost (in Rs)	Total (in Rs)
AWS a1.metal instances (hourly rate)	Equipment	973	71	69083
			Total (in Rs)	69083
