Efficient Mapping of Application using CPU and GPU Platform
In this age of modern technology, the demand for processing power is continuously growing. Across many domains, this demand is being met mainly by increasing core counts and by heterogeneous accelerators. Several kinds of accelerator are available on the market, e.g. FPGAs, DSPs and GPUs, and the GPU is among the most popular. A GPU has many simple cores that can be used in parallel in a Single Instruction Multiple Thread (SIMT) manner to run a suitable application efficiently.
Parallelizing applications and mapping them efficiently onto accelerator-based architectures such as GPUs is not trivial, and it is the focus of this project. As a case study we use Canny Edge Detection, a well-known edge detection technique. To port this application to the GPU, we use the CUDA toolkit. We start by profiling the application to identify its compute-intensive parts. We then map these parts onto the GPU, while the remaining parts of the application continue to run on the CPU.
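To illustrate the kind of work that suits SIMT execution, here is a minimal CPU-side sketch of one Canny stage, the Sobel gradient magnitude. It is not the project's code: the function names `sobel_at` and `gradient_magnitude`, the row-major `float` image layout and the zeroed border are all assumptions made for the example. Whether this stage is actually the hot spot is something only profiling can show.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Gradient magnitude at one pixel using the 3x3 Sobel operators.
// It only reads the input image, so each pixel can be computed
// independently of every other pixel.
// (Hypothetical helper; image is row-major, one float per pixel.)
float sobel_at(const std::vector<float>& img, int w, int h, int x, int y) {
    // Border pixels lack a full 3x3 neighbourhood; this sketch sets them to 0.
    if (x <= 0 || y <= 0 || x >= w - 1 || y >= h - 1) return 0.0f;
    auto p = [&](int dx, int dy) {
        return img[static_cast<std::size_t>(y + dy) * w + (x + dx)];
    };
    float gx = (p(1, -1) + 2 * p(1, 0) + p(1, 1))
             - (p(-1, -1) + 2 * p(-1, 0) + p(-1, 1));
    float gy = (p(-1, 1) + 2 * p(0, 1) + p(1, 1))
             - (p(-1, -1) + 2 * p(0, -1) + p(1, -1));
    return std::sqrt(gx * gx + gy * gy);
}

// Sequential CPU version: one loop iteration per pixel.
std::vector<float> gradient_magnitude(const std::vector<float>& img, int w, int h) {
    assert(w > 0 && h > 0 && img.size() == static_cast<std::size_t>(w) * h);
    std::vector<float> out(img.size());
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            out[static_cast<std::size_t>(y) * w + x] = sobel_at(img, w, h, x, y);
    return out;
}
```

Because no iteration of the double loop depends on another, a GPU port could in principle assign one thread to each pixel and run the body of `sobel_at` in that thread. Stages with data-dependent control flow, such as hysteresis thresholding, are usually harder to map this way.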
The main objectives of this project are to identify the compute-intensive functions that account for most of the application's execution time, and then to improve the application's performance by using the GPU as an accelerator. Mapping the compute-intensive functions onto the GPU is the most important part of the project, and it requires learning and using CUDA.
To meet these objectives, we need to set up a development platform containing both a CPU and a GPU. For this project we use Linux (Ubuntu) as the operating system, an Intel Core i5 3570 CPU, 8 GB of DDR3 RAM, an MSI GTX 960 2GB GPU and a compatible motherboard. The GTX 960 has 1024 CUDA cores that work in parallel to improve performance. After setup, we use an application profiler to find the most compute-intensive function of the application. We then map that function onto the GPU and send its output back to the CPU, which produces the final results.
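The profiling step can be approximated by hand when a dedicated profiler is not yet set up. The sketch below is a stand-in for, not a description of, the profiler used in the project: it times each stage with `std::chrono` and reports which one took longest. The names `StageTime`, `time_ms`, `profile_stages` and `slowest_index` are hypothetical, and the stages are passed in as plain callables.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Wall-clock time recorded for one named stage (hypothetical type).
struct StageTime {
    std::string name;
    double ms;
};

// Run one stage and return its elapsed wall-clock time in milliseconds.
double time_ms(const std::function<void()>& stage) {
    auto t0 = std::chrono::steady_clock::now();
    stage();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Time every stage once, in order, and collect the results.
std::vector<StageTime> profile_stages(
    const std::vector<std::pair<std::string, std::function<void()>>>& stages) {
    std::vector<StageTime> result;
    for (const auto& s : stages)
        result.push_back({s.first, time_ms(s.second)});
    return result;
}

// Index of the stage with the largest measured time:
// the first candidate for mapping onto the GPU.
std::size_t slowest_index(const std::vector<StageTime>& times) {
    assert(!times.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < times.size(); ++i)
        if (times[i].ms > times[best].ms) best = i;
    return best;
}
```

In use, each Canny stage (smoothing, gradient computation, non-maximum suppression, hysteresis) would be wrapped in a callable and passed to `profile_stages`. A single run is noisy, so repeating the measurement and averaging gives a more reliable ranking.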
This project has many benefits, both for the general public and for industry. Among the public, developers, gamers, graphic designers, video editors and business analysts can all use this method to speed up the tasks they work on. In industry, it applies to fields such as Artificial Intelligence, Robotics, Virtual Desktop Infrastructure, Image Processing, Audio and Video Signal Processing, Bioinformatics, Control Engineering, Neural Networks, Deep Learning, Mathematics, Quantum Computing and Engineering Simulations. In Pakistan, organizations such as NESCOM, FAST, NETSOL TECH, Teradata, SRG Pakistan, Medical Transcription Billing Co., HIT, POF and PMO can use these methods to improve their performance. Such techniques can be used in missile control systems to improve efficiency, and in Medical Imaging, Production Line Inspection, navigation (e.g. autonomous vehicles, to improve their decision speed), Machine Vision and Automation. Many High Performance Computers (HPCs) and supercomputers also use GPUs in this way to improve their performance.
The following deliverables are due by the final date of the project:
1. Design Document: contains the abstract, concept, design, goals and objectives, research, methods, and a summary of the prototyping and testing.
2. Analysis of the Compute Intensive Function.
3. Performance of the application on GPU.
4. Performance comparison between the CPU and GPU.
5. Simulation/Demo
6. Final Slideshow Presentation
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Intel Core i5 3570 | Equipment | 1 | 8300 | 8300 |
| 4 GB DDR3 RAM | Equipment | 2 | 3600 | 7200 |
| 500 GB SATA HDD | Equipment | 1 | 4200 | 4200 |
| 400W PSU | Equipment | 1 | 4500 | 4500 |
| MSI GTX 960 2GB Dual Fan | Equipment | 1 | 23100 | 23100 |
| Motherboard 2abf | Equipment | 1 | 4000 | 4000 |
| Logitech Wireless Keyboard and Mouse | Equipment | 1 | 3500 | 3500 |
| Dell LCD | Equipment | 1 | 3000 | 3000 |
| Miscellaneous | Miscellaneous | 1 | 4000 | 4000 |
| Total (in Rs) | | | | 61800 |