King Fahd University of Petroleum & Minerals
College of Computer Sciences and Engineering
Computer Engineering Department

COE 420 Parallel Computing

Parallel Processing Architectures (COE 502)


Cluster Computing (ICS 446)

Course Syllabus

Course Description: 

Introduction to parallel processing: sequential, parallel, pipelined, and dataflow architectures. Vectorization methods, optimization, and performance. Interconnection networks, routing, complexity, and performance. Small-scale, medium-scale, and large-scale multiprocessors. The data-parallel paradigm and techniques. Multithreaded architectures and programming. Students are expected to carry out research projects in related fields of study. (Prerequisite: COE 301 Computer Organization or equivalent)

Course Outline:

  1. The Need and Feasibility of Parallel Computing, Technology Trends, Microprocessor Performance Attributes, Goals of Parallel Computing. Computing Elements, Programming Models, Flynn's Classification, Multiprocessors vs. Multicomputers. Current Trends in Parallel Architectures, Communication Architecture. (PCA Chapters 1.1, 1.2) (Chapters 1 and 2).
  2. Convergence of Parallel Architectures: Communication Architecture, Communication Abstraction. Naming, Operations, Ordering, Replication. Communication Cost Model. (PCA Chapters 1.2, 1.3)
  3. Parallel Programs: Conditions of Parallelism. Asymptotic Notations for Algorithm Analysis, PRAM. Levels of Parallelism, Hardware vs. Software Concurrency. Data vs. Functional Parallelism. Amdahl's Law, Degree of Parallelism (DOP), Concurrency Profile. Steps in Creating Parallel Programs: Decomposition, Assignment, Orchestration, Mapping. (PCA Chapters 2.1, 2.2)
  4. Parallelization of an Example Program: Ocean Simulation, Iterative Equation Solver (2D Grid). (PCA Chapter 2.3)
  5. Cluster Computing: Origins, Broad Issues in Heterogeneous Computing (HC). Message-Passing Programming. Overview of Message Passing Interface (MPI 1.2). (PP Chapter 2, Appendix A, MPI and HC References Below)
  6. Considerations in Parallel Program Creation Steps for Performance. (PCA Chapter 3)
  7. Basic Parallel Programming Techniques and Examples. Massively Parallel Computations: Pixel-based Image Processing. Divide-and-conquer Problem Partitioning: Parallel Bucket Sort, Numerical Integration, Gravitational N-Body Problem. Pipelined Computations: Addition, Insertion Sort, Solving Upper-triangular System of Linear Equations. Synchronous Iteration: Barriers, Iterative Solution of Linear Equations. Dynamic Load Balancing: Centralized, Distributed, Moore's Shortest Path Algorithm. (PP Chapters 3-7, 12)
  8. Network Properties and Requirements For Parallel Processing. Static Point-to-point Connection Network Topologies. Network Embeddings. Dynamic Connection Networks. (PP Chapter 1.3, PCA Chapter 10, handout)
  9. Parallel System Performance: Evaluation & Scalability. Workload Selection. Parallel Performance Metrics Revisited. Application/Workload Scaling Models of Parallel Computers. Parallel System Scalability.
    (PP Chapter 1, PCA Chapter 4, handout)
  10. The Cache Coherence Problem in Shared Memory Multiprocessors. Cache Coherence Approaches. Bus-Snooping (Snoopy) Cache Coherence Protocols: Write-Invalidate: MSI, MESI; Write-Update: Dragon.
    (PCA Chapter 5, handout)
  11. Cache Coherence in Scalable Distributed Memory Machines: Hierarchical Snooping, Directory-based cache coherence. (PCA Chapter 8)
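Illustrative example for outline item 3: Amdahl's Law bounds the speedup achievable on p processors when a fraction f of the work is inherently serial. A brief worked instance (the numbers f = 0.1 and p = 16 are chosen only for illustration):

```latex
% Amdahl's Law: speedup with serial fraction f on p processors.
S(p) = \frac{1}{f + \dfrac{1-f}{p}}, \qquad
\lim_{p \to \infty} S(p) = \frac{1}{f}
% Example: f = 0.1 (10% serial), p = 16 processors:
%   S(16) = 1 / (0.1 + 0.9/16) = 1 / 0.15625 = 6.4
% As p grows without bound, S approaches 1/f = 10:
% the serial fraction alone caps the attainable speedup.
```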

Note: PCA stands for the book on Parallel Computer Architecture and PP stands for the book on Parallel Programming.
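Illustrative example for outline item 5: the message-passing model underlying MPI has each process (rank) exchange explicit messages rather than shared memory. Since a working MPI installation is needed to run MPI_Send/MPI_Recv directly, the following is a minimal sketch of the same send/receive pattern using Python's standard multiprocessing module; the names worker and run are illustrative, not part of any MPI API.

```python
# A minimal sketch of the message-passing model: each worker process
# acts like an MPI rank, receiving a message, computing, and sending
# a reply back over an explicit channel (here a multiprocessing Pipe).
from multiprocessing import Process, Pipe

def worker(conn, rank):
    # Receive one number, square it, and send (rank, result) back,
    # analogous to an MPI_Recv followed by an MPI_Send.
    n = conn.recv()
    conn.send((rank, n * n))
    conn.close()

def run(numbers):
    # "Scatter" one number to each worker, then "gather" the replies.
    conns, procs = [], []
    for rank, n in enumerate(numbers):
        parent, child = Pipe()
        p = Process(target=worker, args=(child, rank))
        p.start()
        parent.send(n)
        conns.append(parent)
        procs.append(p)
    results = [c.recv() for c in conns]
    for p in procs:
        p.join()
    return results
```

In MPI proper, the same scatter/gather pattern would use MPI_Send/MPI_Recv (or collectives such as MPI_Scatter/MPI_Gather) with ranks assigned by the MPI runtime rather than created by hand.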