King Fahd University of Petroleum & Minerals
College of Computer Sciences & Engineering
Department of Computer Engineering
COE 421: Fault Tolerant Computing (3-0-3)
Syllabus
Catalog Description
Introduction to fault tolerant computing (FTC). Goals of fault tolerance
(FT). Design techniques to achieve FT. Evaluation of FT systems.
Reliability modeling and analysis of FT systems. Availability modeling.
Design of FT VLSI/WSI circuits. Introduction to testing.
Prerequisite: COE 308 or equivalent.
Textbook:
D. Siewiorek and R. Swarz, "Reliable Computer Systems: Design and Evaluation",
Digital-Press, 3rd Edition, 1998.
Course Objectives:
- (1) Master the fundamental concepts in fault-tolerant computing.
- (2) Master the application of the theory of reliability modeling and evaluation.
- (3) Master the design of reliable and fault-tolerant computer systems.
- (4) Appreciate the basic issues in yield enhancement of VLSI/WSI circuits.
Learning Outcomes:
- (1) To introduce students to the fundamental concepts in fault-tolerant computing.
- (2) To expose students to the theory of reliability modeling and evaluation.
- (3) To introduce students to the basic principles for designing reliable computer systems.
- (4) To expose students to some of the commercially available fault-tolerant/highly available systems.
- (5) To introduce students to the basic issues in yield enhancement of VLSI/WSI circuits.
Topics:
- Module 1: Introduction and Fundamental Concepts
(Chapters 1 and 2)
Origins of FTC, Goals of FT, Applications of FTC, Faults, Errors,
Failures, Fault characterization, Fault modeling.
- Module 2: Design Techniques to Achieve Fault Tolerance
(Chapter 7 and Appendix B)
Design issues, Hardware redundancy, Information redundancy, Time
redundancy, Software redundancy.
- Module 3: Evaluation Techniques
()
Quantitative evaluation methods, Reliability modeling, Safety modeling,
Availability modeling, Maintainability modeling.
- Module 4: Design of Practical Fault-Tolerant Systems
(Chapters 7-10)
The design process, Fault avoidance, Long-life applications,
Critical-computation applications, High-availability applications.
- Module 5: FT Design of VLSI/WSI Circuits
(Chapter 3 and Appendix A)
Failure modes, Self-checking circuits, Reliability & Yield enhancement
of array processors.
- Module 6: Introduction to Testing
(Chapter 4 and Appendix C)
Test pattern generation methods, Design-for-Testability, Testability
analysis.
Computer Usage:
Use of available reliability modeling and evaluation tools.
Laboratory Experiments:
None.
Grading Policy (Tentative):
30% Assignments & Quizzes
15% Major Exam I (Tentatively during week 5)
20% Major Exam II (Tentatively during week 10)
35% Final Exam (Scheduled by the Registrar)
ABET Category content:
Engineering Science: 50%
Engineering Design: 50%
Prepared by: Prof. Mostafa Abd-El-Barr.
Date: November 2002.