COE 506: GPU Programming and Architecture

Course Description:

Basics of conventional CPU architectures, their extensions for single instruction, multiple data (SIMD) processing, and the generalization of SIMD into single instruction, multiple thread (SIMT) processing in modern GPUs. GPU architecture basics in terms of functional units. The CUDA programming model. Architecture-specific details (such as memory access coalescing, shared memory usage, and GPU thread scheduling) that affect program performance. OpenCL and OpenACC, which can be used to program both CPUs and GPUs in a generic manner. Architecture-aware optimization techniques relevant to both CUDA and OpenCL. Application development examples in well-known GPU computing scenarios.

Lecture Slides:

  • Introduction to GPUs
  • Introduction to CUDA C
  • CUDA Parallelism Model
  • Memory and Data Locality
  • Performance Considerations
  • Parallel Patterns
  • Memory and Data Locality
  • Dynamic Parallelism
  • Multi-GPU
  • Introduction to OpenACC