Computer Architecture  (COE 501)

TEACHING MATERIAL

  1. Computing System Fundamentals/Trends, Review of Performance Evaluation, and ISA Design (Fourth Edition: Chapter 1, Appendix B; Third Edition: Chapters 1, 2). Reading material:

    1. Designing for Power: Intel Leadership in Power Efficient Silicon and System Design, www.intel.com/technology.

    2. Practical SIMD Vectorization Techniques for Intel® Xeon Phi™ Coprocessors

    3. General-Purpose Graphics Processing Units in Service-Oriented Architectures

    4. GPU-Accelerated Scalable Solver for Banded Linear Systems

    5. Exploration of Automatic Optimization for CUDA Programming

  2. Review of Instruction Pipelining (Appendix A)

  3. Exploiting Instruction-Level Parallelism (Fourth Edition: Chapter 2.1, 2.2; Third Edition: Chapter 3.1, 4.1)

  4. Dynamic Hardware-Based Instruction Pipeline Scheduling (Fourth Edition: Appendix A.7, Chapter 2.4, 2.5; Third Edition: Appendix A.8, Chapter 3.2, 3.3). A Tomasulo Simulator.

  5. Multiple Instruction Issue and Scheduling, Superscalar Dynamic Execution, Speculative Execution, and Dual-Core Chip-Multiprocessor (CMP) Architectures (Fourth Edition: Chapter 2.6-2.8; Third Edition: Chapter 3.6, 3.7, 4.3)

  6. Reduction of Control Hazard (Branch) Stalls with Dynamic Branch Prediction (Fourth Edition: Chapter 2.3, 2.9; Third Edition: Chapter 3.4, 3.5, 4.2)

  7. Static Compiler Optimization Techniques and Vector Processing (Fourth Edition: Appendix G.1-3; Third Edition: Chapter 4.4)
    Vector Processing: Appendix G (3rd ed.), Appendix F (4th ed.)

  8. The Memory Hierarchy & Cache (Fourth Edition: Chapter 5.1, Appendix C.1-C.3; Third Edition: Chapter 5.1-5.4)

  9. Input/Output & System Performance Issues (Fourth Edition: Chapter 6.1, 6.2, 6.4, 6.5; Third Edition: Chapter 7.1-7.3, 7.7, 7.8)

  10. Virtual Memory

  11. Multiprocessors

    1. Cache Coherency. Getting Started with OpenMP; Work-Sharing with OpenMP; Advanced OpenMP

    2. Directory Coherency. Cluster Computing and MPI

    3. Multiprocessor Synchronization

  12. Some CUDA Programming Rules 

  13. Example of data-parallel programming using CUDA: CUDA-lite paper and Program Analysis

  14. Review

GRADING

 

 

Reference on  and Parallel Processing

REFERENCE PRESENTATIONS

  1. Introduction to computer architecture

  2. Performance

  3. Instruction set architecture

  4. Instruction set examples

  5. Pipelining and hazards

  6. MIPS R4000 and ILP

  7. Superscalar and VLIW

  8. Compiler and hardware support for ILP

  9. Memory Hierarchy and Cache Design

  10. Reducing Cache Misses

  11. Reduce Miss Penalty and Hit Time

  12. Main Memory

  13. Virtual Memory and the Alpha 21064 Memory Hierarchy

  14. Networks and Interconnect

  15. Interconnection Networks

 

REVIEWING MATERIAL FROM UNDERGRADUATE LEVEL

  1. Introduction to Computer Architecture
  2. Performance of Computers
  3. Tutorial on SPIM Processor Simulator
  4. MIPS ISA I
  5. MIPS ISA II
  6. MIPS ISA III
  7. MIPS ISA IV
  8. Computer Arithmetic I (Signed and unsigned representation)
  9. Computer Arithmetic II (Integer multiply, divide, and floating-point)
  10. Introduction to DataPath
  11. Single Cycle DataPath
  12. Multi-Cycle DataPath
  13. Instruction Pipelining I
  14. Instruction Pipelining II
  15. Instruction Pipelining III
  16. Memory System I
  17. Memory System II
  18. Bus systems
  19. I/O system
  20. Multiprocessors    

 

COE 301 Course material