UWB based Indoor Path Planning and Navigation using Reinforcement Learning.

2025-06-28 16:36:33 - Adil Khan

Project Title

Project Area of Specialization

Artificial Intelligence

Project Summary

Reinforcement learning is a technique in which an agent learns from its environment by interacting with it. Learning from interaction mirrors natural behaviour, and we will implement it on a robot. The robot (agent) influences the environment through its actions, and in return the environment provides data that we use to train our neural network model.

The process works as follows. The agent (the moving vehicle) receives an environment state S0, where the subscript 0 denotes time t = 0. Based on this observation, the agent chooses an action A0. After the action is taken, the environment returns a new state S1 and a reward R1 to the agent. The agent then takes an action A1 at the next time step, and the cycle continues: the environment passes a reward and a state, and the agent responds with an action.
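The state, action, reward cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CorridorEnv` class, its reward values, and the `run_episode` helper are hypothetical and are not part of the project; the real environment would be the room observed through the UWB positioning system.

```python
import random


class CorridorEnv:
    """Hypothetical 1-D corridor: cells 0..length-1, goal at the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # initial state S0

    def step(self, action):
        # action: -1 = move back, +1 = move forward (clipped at the walls)
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else -0.1  # assumed rewards: step cost, goal bonus
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Roll out one episode and return the (S, A, R, S') transitions."""
    transitions = []
    state = env.reset()
    for _ in range(max_steps):
        action = policy(state)                       # agent picks A_t from S_t
        next_state, reward, done = env.step(action)  # environment answers
        transitions.append((state, action, reward, next_state))
        state = next_state
        if done:
            break
    return transitions


if __name__ == "__main__":
    random.seed(0)
    random_policy = lambda state: random.choice([-1, 1])
    for s, a, r, s_next in run_episode(CorridorEnv(), random_policy):
        print(f"S={s} A={a:+d} R={r:+.1f} S'={s_next}")
```

The recorded transitions are the kind of data a learning algorithm would consume; here the policy is random, so nothing is learned yet.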

Reinforcement Learning Model Block Diagram [1]

Reinforcement learning rests on the reward hypothesis: the goal of the agent is to maximize the expected cumulative reward. We will deploy reinforcement learning on a vehicle that moves in the synchronized environment of our indoor positioning system, which is built with ultra-wideband (UWB) transceivers.
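As a small illustration of "cumulative reward", the sketch below adds up the rewards of one episode, weighting later rewards by a discount factor. The discount factor `gamma` and the function name `discounted_return` are assumptions made for illustration; the proposal does not state how rewards are accumulated.

```python
def discounted_return(rewards, gamma=0.9):
    """Return R1 + gamma*R2 + gamma^2*R3 + ... for one episode's rewards."""
    total = 0.0
    # Work backwards so each earlier step adds its reward to the
    # discounted value of everything that follows it.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # Three steps with a cost of 1 each, then a goal reward of 10.
    print(discounted_return([-1.0, -1.0, -1.0, 10.0], gamma=0.9))
```

With `gamma=1.0` the function reduces to a plain sum; values below 1 make the agent prefer reaching the goal sooner.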

Three stationary ultra-wideband (UWB) transmitters mounted on the walls of the room act as anchors. The anchors are synchronized to emit UWB pulses together, but since the receiver can read only one pulse at a time, we multiplex the pulses with Time Division Multiple Access (TDMA), allotting a fixed time interval to each. The autonomous vehicle roams the indoor environment and carries a mounted UWB receiver. The receiver measures the time delay in receiving the three pulses and, knowing the anchors' positions, estimates its own position using the trilateration principle. From trilateration we obtain the vehicle's x-y coordinates and direction, and send this data to the PC for training the reinforcement learning model.
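The position estimate can be sketched as follows. This is a minimal example under stated assumptions, not the project's firmware: it assumes each time delay has already been turned into a one-way time of flight (so distance = speed of light × time), that the three anchor coordinates are known, and that the measurements are noise-free. The function names `tof_to_distance` and `trilaterate` are hypothetical. Subtracting the first circle equation from the other two gives two linear equations in x and y, which are solved directly.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def tof_to_distance(time_of_flight_s):
    """Convert a one-way time of flight (seconds) to a distance (metres)."""
    return SPEED_OF_LIGHT * time_of_flight_s


def trilaterate(anchors, distances):
    """Estimate (x, y) from three anchor positions and three ranges.

    anchors   : [(x1, y1), (x2, y2), (x3, y3)] in metres
    distances : [d1, d2, d3] in metres
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances

    # Circle i minus circle 1 gives: 2(xi - x1) x + 2(yi - y1) y = bi
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")

    # Cramer's rule for the 2x2 linear system
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


if __name__ == "__main__":
    anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]  # example room corners
    true_position = (1.0, 1.0)
    ranges = [math.dist(true_position, a) for a in anchors]
    print(trilaterate(anchors, ranges))  # close to (1.0, 1.0)
```

With real, noisy ranges the three circles rarely meet in one point, so a least-squares fit or a filter over successive estimates would likely be needed on top of this.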

[1] Google DeepMind research paper.

Project Objectives

The objectives of our project include:

Reinforcement learning for navigation of the vehicle
Design of an indoor positioning system using trilateration
Localization of the autonomous vehicle within the indoor environment
Data acquisition from vehicles

Project Implementation Method

The aim is to apply reinforcement learning to a vehicle moving in a synchronized environment. The synchronized environment will be created with three stationary ultra-wideband (UWB) transmitters mounted on the walls of the room, which act as anchors. The anchors will emit synchronized UWB pulses. The autonomous vehicle, which roams the indoor environment, will carry a mounted UWB receiver. The receiver will measure the time delay in receiving the three pulses and, knowing the anchors' positions, will estimate its own position using the trilateration principle. Trilateration gives the vehicle's x-y coordinates and direction, which are then sent to the PC. The reinforcement learning model will be trained on the data provided by the vehicle, and the result will be the shortest possible path to the destination.
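One way to picture how reinforcement learning can yield a shortest path is tabular Q-learning on a small grid of x-y cells. The sketch below is an assumption-laden stand-in, not the project's model: the proposal mentions a neural network, whereas this example uses a lookup table, a 4×4 obstacle-free grid, and made-up rewards (-1 per step, +10 at the goal). All names and hyperparameters are hypothetical.

```python
import random

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # +y, -y, +x, -x moves


def step(state, action, size, goal):
    """Move one cell, staying inside the room; -1 per step, +10 at the goal."""
    x = min(max(state[0] + action[0], 0), size - 1)
    y = min(max(state[1] + action[1], 0), size - 1)
    next_state = (x, y)
    if next_state == goal:
        return next_state, 10.0, True
    return next_state, -1.0, False


def train(size=4, goal=(3, 3), episodes=2000, alpha=0.5, gamma=0.9,
          epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy Q-learning, starting at (0, 0)."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {((x, y), a): 0.0
         for x in range(size) for y in range(size) for a in range(n)}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(n)                           # explore
            else:
                a = max(range(n), key=lambda i: q[(state, i)])  # exploit
            next_state, reward, done = step(state, ACTIONS[a], size, goal)
            best_next = 0.0 if done else max(q[(next_state, i)] for i in range(n))
            # Q-learning update towards reward + discounted best next value
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = next_state
            if done:
                break
    return q


def greedy_path(q, start=(0, 0), goal=(3, 3), size=4, max_steps=50):
    """Follow the highest-valued action from start until the goal is reached."""
    path, state = [start], start
    for _ in range(max_steps):
        if state == goal:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        state, _reward, _done = step(state, ACTIONS[a], size, goal)
        path.append(state)
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

In the real system the state would come from the UWB position estimate rather than from a simulated grid, and obstacles would have to be reflected in the transitions or rewards.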

Block Diagram of Project

Benefits of the Project

Accurate self-localization and path planning of a vehicle through interaction with its environment will benefit self-driving cars in indoor environments. With some amendments, the project can also benefit society in several aspects of daily life, including customer service in malls, autonomous wheelchairs, virtual reality games, asset tracking, and rescue operations.

Technical Details of Final Deliverable

The final deliverable of the project will be an integrated hardware/software system comprising:

Indoor Positioning System (H/W)
Reinforcement Learning Model (S/W)
Autonomous Vehicle (H/W)

Final Deliverable of the Project

HW/SW integrated system

Type of Industry

Manufacturing, Transportation, Others, Security

Technologies

Artificial Intelligence (AI), Robotics

Sustainable Development Goals

Industry, Innovation and Infrastructure

Required Resources

Item Name	Type	No. of Units	Per Unit Cost (in Rs)	Total (in Rs)
Arduino Mega	Equipment	1	1000	1000
Arduino Pro Mini	Equipment	4	500	2000
Vehicle case	Equipment	1	5000	5000
Magnetometer	Equipment	2	500	1000
UWB1000	Equipment	4	8000	32000
Battery	Equipment	1	5000	5000
Battery charger	Equipment	1	3000	3000
FTDI chip for programming	Equipment	4	600	2400
Miscellaneous	Miscellaneous	1	10000	10000
			Total (in Rs)	61400
