COE 501 Assignments - Fall 2014
Computer Architecture

 

Muhamed F. Mudawar

mudawar@kfupm.edu.sa

Office: Building 22, Room 328, Phone: 4642


 

Problem Set 1: Due Monday, September 22, at 5 PM

Fundamentals of Quantitative Design and Analysis

Problem Set 2: Due Monday, October 20, at 5 PM

Instruction Set Principles

Problem Set 3: Due Monday, November 10, at 5 PM

Pipelining: Basic and Intermediate Concepts

Problem Set 4: Due Wednesday, December 3, at 5 PM

Caches and Virtual Memory

Problem Set 5

Complex Pipelining

Problem Set 6

Cache Coherence

 

Project: Due Thursday, December 25, at 12 noon

Select one of the projects listed below.

Form your group, carry out your survey, decide what will be implemented in the project, and write a proposal document for your project.

The deadline to submit your proposal document is Wednesday, November 12, at 5 PM.

1. Using Gem5 to simulate a detailed processor

A dynamic execution processor pipeline consists of multiple functional units (such as an integer ALU, floating-point unit, multiplier, and divider) with an issue unit that issues multiple instructions to multiple functional units and resolves hazards.

Use the gem5 simulator to evaluate the performance of real benchmarks on different architectures (such as x86 and ARM). Use different CPU models (in-order versus out-of-order execution) with different functional units and cache memory hierarchies. Use different configurations to simulate many benchmarks. You may also study and modify the gem5 simulator to introduce new features.

Produce statistics about the program being simulated. Report the instruction count of different classes of instructions, the total number of clock cycles, the average CPI, the accuracy of the branch predictor, cache misses, as well as other related statistics.
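As a rough illustration of how the derived statistics above follow from raw counts, here is a minimal Python sketch. The `RawCounts` fields and the `summarize` function are hypothetical names chosen for this example; they are not gem5 statistic names, and the counter values would have to be taken from whatever your simulation run reports.

```python
from dataclasses import dataclass


@dataclass
class RawCounts:
    """Hypothetical raw counters collected from one simulation run."""
    instructions: int        # committed instructions
    cycles: int              # total clock cycles
    branches: int            # executed branch instructions
    branch_mispredicts: int  # branches the predictor got wrong
    cache_accesses: int      # accesses to one cache (e.g. the D-cache)
    cache_misses: int        # misses in that same cache


def summarize(c: RawCounts) -> dict:
    """Derive the headline metrics from the raw counters."""
    return {
        # Average cycles per instruction
        "cpi": c.cycles / c.instructions,
        # Fraction of branches predicted correctly
        "branch_accuracy": 1.0 - c.branch_mispredicts / c.branches,
        # Fraction of cache accesses that miss
        "miss_rate": c.cache_misses / c.cache_accesses,
        # Misses per thousand instructions
        "mpki": 1000.0 * c.cache_misses / c.instructions,
    }


if __name__ == "__main__":
    # Made-up numbers, for illustration only
    counts = RawCounts(instructions=1000, cycles=1500, branches=200,
                       branch_mispredicts=10, cache_accesses=400,
                       cache_misses=20)
    for name, value in summarize(counts).items():
        print(f"{name}: {value:.3f}")
```

Reporting misses per thousand instructions alongside the miss rate makes it easier to compare configurations whose access counts differ.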

Carry out a literature survey of other related computer architecture simulators, using the IEEE and ACM digital libraries as well as the Internet.

Write a paper (similar to a conference or journal paper) with a Title, Abstract, Introduction, Gem5 Simulator, Other Simulators, Methodology, Configurations, Benchmarks, Results, Conclusion, and References.

The paper should be submitted along with a PowerPoint presentation on Thursday, December 25 by 12 noon.

2. Using Gem5 to simulate a Multicore System

The idea is to use gem5 to simulate a multicore system. Use the gem5 simulator to evaluate the performance of real parallel benchmarks that run on multiple cores.

The cores can be multithreaded. Each thread should have its own state, such as a copy of the program counter and the register file. The pipeline stages, functional units, and caches are shared by all threads.

Each core has an instruction cache (I-cache) and a data cache (D-cache), and possibly an L2 cache that can be configured with different parameters. Multiple cores can share either an L2 or an L3 cache. Different cache coherence protocols can be simulated.
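When choosing cache parameters to sweep, it can help to check what each configuration implies for the cache organization. The sketch below is a small Python helper, not part of gem5; the function name `cache_geometry` and the default 32-bit address width are assumptions made for this example. It derives the number of sets and the tag, index, and offset widths from the capacity, block size, and associativity.

```python
def cache_geometry(size_bytes: int, block_bytes: int, associativity: int,
                   address_bits: int = 32) -> dict:
    """Derive set count and address field widths for a set-associative cache.

    Assumes block size and number of sets are powers of two, and that
    addresses are `address_bits` wide (32 is an assumption of this sketch).
    """
    num_blocks = size_bytes // block_bytes
    num_sets = num_blocks // associativity
    for name, value in (("block_bytes", block_bytes), ("num_sets", num_sets)):
        if value <= 0 or value & (value - 1):
            raise ValueError(f"{name} must be a power of two, got {value}")
    offset_bits = block_bytes.bit_length() - 1  # log2 of the block size
    index_bits = num_sets.bit_length() - 1      # log2 of the number of sets
    tag_bits = address_bits - index_bits - offset_bits
    return {
        "num_sets": num_sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


if __name__ == "__main__":
    # Example: a 32 KiB, 4-way cache with 64-byte blocks
    print(cache_geometry(32 * 1024, 64, 4))
```

For the example configuration, this gives 128 sets, a 6-bit block offset, a 7-bit index, and a 19-bit tag.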

Carry out a literature survey of other related multicore system simulators, using the IEEE and ACM digital libraries as well as the Internet.

Write a paper (similar to a conference or journal paper) with a Title, Abstract, Introduction, Gem5 Simulator, Other Simulators, Methodology, Configurations, Benchmarks, Results, Conclusion, and References.

The paper should be submitted along with a PowerPoint presentation on Thursday, December 25 by 12 noon.

3. Designing a Dual-Issue pipeline for a MIPS-like processor

Design and implement a 2-issue pipeline for a MIPS-like instruction set that includes integer ALU instructions, load, store, jump, branch, call, and return instructions. Two instructions can be fetched and completed each cycle. The core should support multiple functional units such as an ALU, a multiplier, and a divider. The ALU is one stage in the pipeline. However, the multiplier should be pipelined and consist of multiple stages. Finally, the divider is not pipelined internally. Each divide instruction takes multiple cycles to compute.
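Before writing the hardware description, the issue logic can be prototyped in software. The following Python sketch models one possible pairing check for two instructions fetched in the same cycle. It is only an illustration under stated assumptions: the `Instr` type, the `UNIT` table, and the choice of two ALUs with a single memory port, multiplier, and divider are all hypothetical, and the hardwired zero register is ignored for simplicity. Your own design may issue under different rules.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical decoded instruction used only for this model."""
    op: str                # mnemonic, e.g. "add", "lw", "mul", "div"
    dest: Optional[int]    # destination register number, or None
    srcs: Tuple[int, ...]  # source register numbers


# Assumed mapping from opcode to functional unit
UNIT = {"add": "alu", "sub": "alu", "lw": "mem", "sw": "mem",
        "mul": "mul", "div": "div"}

# Assumed number of instructions each unit can accept per cycle
CAPACITY = {"alu": 2, "mem": 1, "mul": 1, "div": 1}


def can_dual_issue(first: Instr, second: Instr) -> bool:
    """Return True if `second` may issue in the same cycle as `first`."""
    # RAW hazard: the second instruction reads what the first one writes
    if first.dest is not None and first.dest in second.srcs:
        return False
    # WAW hazard: both instructions write the same register
    if first.dest is not None and first.dest == second.dest:
        return False
    # Structural hazard: both need a unit that accepts only one per cycle
    unit = UNIT[first.op]
    if unit == UNIT[second.op] and CAPACITY[unit] < 2:
        return False
    return True


if __name__ == "__main__":
    add = Instr("add", 3, (1, 2))   # r3 = r1 + r2
    sub = Instr("sub", 4, (3, 1))   # r4 = r3 - r1, depends on add
    mul = Instr("mul", 5, (1, 2))   # r5 = r1 * r2, independent of add
    print(can_dual_issue(add, sub))  # dependent pair
    print(can_dual_issue(add, mul))  # independent pair
```

A software model like this can also serve as a reference when checking the issue decisions made by the Verilog implementation in simulation.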

As a bonus, you may also provide a floating-point unit that executes floating-point instructions. Floating-point instructions can be pipelined, except for floating-point divide, which is not pipelined and takes multiple cycles to compute.

The I-cache and D-cache can be implemented as simple static RAM with a latency of just one cycle. Initially, the program should be loaded into the instruction memory and its data should be loaded into the data memory.

The design and implementation should be done in a hardware description language such as Verilog, synthesized, and tested on an FPGA board.

Carry out a literature survey of related processors and their implementation, using the IEEE and ACM digital libraries as well as the Internet.

Write a paper (similar to a conference or journal paper) with a Title, Abstract, Introduction, Instruction Set Design, Datapath, Control, Hazards, Testing, Benchmarks, Synthesis to an FPGA chip, Performance Results, Conclusion, and References.

The paper should be submitted along with a PowerPoint presentation on Thursday, December 25 by 12 noon.

 

  Last Updated: Monday, December 22, 2014, by Dr. Muhamed Mudawar