# COE 308 – Computer Architecture Syllabus for Term 091 – Fall 2009

Computer Engineering Department, College of Computer Sciences & Engineering, King Fahd University of Petroleum & Minerals

**Professor:** Muhamed Mudawar, Room 22/328, Phone 4642

**Office Hours:** SMW 11 am - 1 pm.

**Course URL:** <http://faculty.kfupm.edu.sa/coe/mudawar/coe308/>

**Email:** mudawar@kfupm.edu.sa

#### **Catalog Description**

Memory hierarchy and cache memory. Integer and floating point arithmetic. Instruction and arithmetic pipelining, superscalar architecture. Reduced Instruction Set Computers. Parallel architectures and interconnection networks.

Prerequisites: COE 205.

#### **Textbook**

David A. Patterson and John L. Hennessy, *Computer Organization and Design: The Hardware/Software Interface*, Third Edition, Morgan Kaufmann Publishers, 2005. ISBN 1-55860-604-1.

#### **Course Objectives**

- In-depth understanding of the inner workings of modern computer systems, their evolution, and the tradeoffs present at the hardware-software interface.
- Understanding the design process of a modern computer system. This includes the design of the processor datapath and control, the memory system, and the I/O subsystem.

#### **Grading Policy**

| Component            | Weight |
|----------------------|--------|
| Homework and Quizzes | 10%    |
| Projects             | 25%    |
| Major I Exam         | 20%    |
| Major II Exam        | 20%    |
| Final Exam           | 25%    |

- Homework must be submitted at the beginning of class on the specified due date. Late homework is not accepted, especially once the solution has been posted online.
- Late projects are accepted, but will be penalized 5% for each late day, up to a maximum of five late days.
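As an illustration of how the weights and the late-project penalty combine, here is a minimal Python sketch. It is not an official grade calculator. The function names and dictionary keys are hypothetical, every score is assumed to be on a 0 to 100 scale, and the 5% penalty is assumed to mean five points off the project score per late day; the syllabus does not say which interpretation applies.

```python
# Weights taken from the grading table (percent of the course grade).
WEIGHTS = {
    "homework_quizzes": 10,
    "projects": 25,
    "major1": 20,
    "major2": 20,
    "final": 25,
}

LATE_PENALTY_PER_DAY = 5  # percent per late day, from the policy above
MAX_LATE_DAYS = 5         # the policy only covers up to five late days


def late_project_score(score, late_days):
    """Apply the late penalty to a project score given on a 0-100 scale.

    ASSUMPTION: each late day removes 5 points (5% of the maximum score).
    """
    if not 0 <= late_days <= MAX_LATE_DAYS:
        raise ValueError("the stated policy only covers 0 to 5 late days")
    return max(0.0, score - LATE_PENALTY_PER_DAY * late_days)


def course_grade(scores):
    """Weighted sum of the component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
```

For example, a project that earns 90 but is submitted two days late would count as `late_project_score(90, 2)`, which is 80 under the assumption above.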

#### **Software Tools used in Mini-Projects**

- MARS simulator: runs MIPS-32 assembly language programs.
- Logisim simulator: educational tool for designing and simulating CPUs.

#### **Course Topics and Lecture Breakdown by Week**

| Week  | Course Topics | Reading |
|-------|---------------|---------|
| 1     | Introduction to computer architecture, ISA versus organization, components, abstraction, technology improvements, chip manufacturing process. | Chapter 1 |
| 2-4   | Instruction set design, RISC design principles, MIPS registers, instruction formats, arithmetic instructions, immediate operands, bit manipulation, load and store instructions, byte ordering, addressing modes, flow control instructions, pseudo-instructions, procedures and runtime stack, call and return, MIPS register conventions, alternative IA-32 architecture. | Sections 2.1 – 2.9<br>Sections 2.13, 2.15 – 2.18<br>Sections 3.2 – 3.3<br>Appendix A.9 – A.10 |
| 5     | CPU performance and metrics, CPI, performance equation, MIPS as a metric, Amdahl's law, benchmarks and performance of recent Intel processors. | Chapter 4 |
| 6-7   | Integer multiplication, integer division, floating point representation, IEEE 754 standard, normalized and denormalized numbers, zero, infinity, NaN, FP comparison, FP addition, FP multiplication, rounding and accurate arithmetic, FP instructions in MIPS. | Sections 3.4 – 3.6 |
| 8-9   | Designing a processor, register transfer logic, datapath components, clocking methodology, single-cycle datapath, main control signals, ALU control, single-cycle delay, multi-cycle instruction execution, multi-cycle versus single-cycle execution. | Sections 5.1 – 5.5 |
| 10    | Pipelining versus serial execution, MIPS 5-stage pipeline, pipelined datapath, pipelined control, pipeline performance. | Sections 6.1 – 6.3 |
| 11    | Pipeline hazards, structural hazards, data hazards, stalling pipeline, forwarding, load delay, compiler scheduling, hazard detection, stall and forwarding unit, control hazards, branch delay, dynamic branch prediction, branch target and prediction buffer. | Sections 6.4 – 6.6 |
| 12-13 | Cache memory design, locality of reference, memory hierarchy, DRAM and SRAM, direct-mapped, fully-associative, and set-associative caches, handling cache miss, write policy, write buffer, replacement policy, cache performance, CPI with memory stall cycles, AMAT, two-level caches and their performance, main memory organization and performance. | Sections 7.1 – 7.3<br>Sections 7.5 – 7.6 |
| 14    | Introduction to Parallel Architectures | |