8 hours of instruction
Learn how to accelerate and optimize existing C/C++ CPU-only applications to leverage the power of GPUs using the most essential CUDA techniques and the Nsight Systems profiler. You’ll learn how to write code to be executed by a GPU accelerator, configure code parallelization with CUDA, optimize memory migration between the CPU and GPU accelerator, and apply the workflow you’ve learned to a new task: accelerating a fully functional, but CPU-only, particle simulator for massive, observable performance gains.
OBJECTIVES
- Write code to be executed by a GPU accelerator
- Expose and express data and instruction-level parallelism in C/C++ applications using CUDA
- Utilize CUDA-managed memory and optimize memory migration using asynchronous prefetching
- Utilize concurrent streams for instruction-level parallelism
- Write GPU-accelerated CUDA C/C++ applications
PREREQUISITES
None
SYLLABUS & TOPICS COVERED
- Introduction
- Meet the instructor and create an account.
- Accelerating Applications
- Write, compile, and run GPU code.
- Control parallel thread hierarchy.
- Allocate and free memory for the GPU.
- Managing Accelerated Application Memory
- Profile CUDA code with the command-line profiler.
- Go deep on unified memory.
- Optimize unified memory management.
- Asynchronous Streaming and Visual Profiling
- Profile CUDA code with the NVIDIA Visual Profiler.
- Use concurrent CUDA streams.
- Final Review
- Review key learnings and wrap up with questions.
- Complete the assessment to earn a certificate.
- Take the workshop survey.
SOFTWARE REQUIREMENTS
Each participant will be provided with dedicated access to a fully configured, GPU-accelerated workstation in the cloud.