Abdulaziz Tabbakh | GPU Programming and Architecture

Course Description:

Basics of conventional CPU architectures, their extensions for single instruction multiple data (SIMD) processing, and the generalization of SIMD to single instruction multiple thread (SIMT) processing in modern GPUs. GPU architecture basics in terms of functional units. The CUDA programming model. Architecture-specific details (such as memory access coalescing, shared memory usage, and GPU thread scheduling) that affect program performance. OpenCL and OpenACC, which can be used to program both CPUs and GPUs in a generic manner. Architecture-aware optimization techniques relevant to both CUDA and OpenCL. Application development examples in well-known GPU computing scenarios.

Lecture Slides:

Introduction to GPUs
Introduction to CUDA C
CUDA Parallelism Model
Memory and Data Locality
Performance Considerations
Parallel Patterns
Memory and Data Locality
Dynamic Parallelism
Multi-GPU
Introduction to OpenACC