From week 1 to week 4, students are expected to independently carry out a literature survey on their selected topics, compile information on the relevant papers in a progress report (PR-1), draw up an action plan describing their intended contribution and, more specifically, their project implementation, and prepare a presentation covering all of the above for the class and instructor. During this period, the student is expected to interact with the instructor regarding the major issues of the task while becoming familiar with the topic and its details. A 15-minute presentation is to be given in week 5 in front of the class and instructor. This part counts for 25% of the project grade, covering both the presentation and the timely submission of PR-1.
From week 5 to week 10, students are expected to carry out the implementation part of their project, taking into account: (1) their presented or revised action plan, and (2) the instructor's observations made during or after their presentation. During this period, the student is expected to interact with the instructor regarding major design issues, difficulties in addressing some parts, resource problems, or any other problem that requires specific attention to avoid a negative impact on the project. Report PR-2 is prepared by adding the implementation aspects described above to PR-1. A 15-minute presentation covering the successes and failures of the design and implementation is to be given in week 11 in front of the class and instructor. This part counts for 35% of the project grade, covering both the presentation and the timely submission of PR-2.
From week 11 to week 14, students are expected to: (1) debug and test their implementation, referring to the instructor in case of problems, (2) carry out a performance evaluation of the project with data collection, (3) revise PR-1 by including the performance evaluation and a detailed analysis of the collected results, and (4) prepare the final project presentation. During this period, the student is expected to interact with the instructor regarding debugging issues and the interpretation of performance results. A presentation covering the overall project will be delivered by the team (or student) before the last day of classes (week 15). The emphasis here should be on the method and results. This part counts for 40% of the project grade, covering both the presentation and the timely submission of PR-3, which is the final report. The student is to submit a zipped folder including all of the original Word reports (not PDF), well-documented source code, evaluation data, and reference papers. A demo to the instructor will also be required.
List of projects for T091
The KFUPM-ITC has a High-Performance Computing (HPC) system consisting of a 128-node dual-boot cluster with the following resources (see http://www.kfupm.edu.sa/hpc/):
- 128 compute-node e1350 IBM eServer cluster.
- The cluster is unique in its dual-boot capability with Microsoft Windows HPC Server 2008 and Red Hat Enterprise Linux 5 operating systems.
- The cluster has 3 master nodes, one for Red Hat Linux, one for Windows HPC Server 2008 and one for cluster management.
- The cluster has 128 compute nodes.
- Each compute node of the cluster is a dual-processor x3550 server with two 2.0 GHz quad-core Xeon E5405 processors.
- The total number of cores in the cluster is 1024.
- Each master node has 1 TB of hard disk space.
- Each compute node has 500 GB of hard disk space.
- Each master node has 8 GB of RAM.
- Each compute node has 4 GB of RAM.
- The interconnect is 10GBASE-SR.
Other important features:
The technical staff of the HPC can be contacted at hpc@kfupm.edu.sa
Background:
IBM Cluster Compiling Systems: An HPC cluster is typically provided with the Portland Group compilers for FORTRAN 77 (pgf77), Fortran 90 (pgf90), C (pgcc), and C++ (pgCC), along with the Intel and/or Portland Group suites of optimizing compilers, which generally produce faster code than the GNU compilers. The Portland Group and Intel compilers each produce their own Linux executables. Compiler optimization options are also available to produce more highly optimized executables.
Parallelization over one Multi-Core Node Acting as a Shared-Memory Multiprocessor: Users can have the compiler automatically parallelize single-node sequential programs for shared-memory execution using the Portland Group -Mconcur or Intel -parallel compiler option (for example, pgcc -O2 -Mconcur sample.c or icc -O2 -parallel sample.c). Both the Fortran and C/C++ compilers understand the OpenMP set of directives, which give the programmer finer control over the parallelization. The -mp (Portland Group) and -openmp (Intel) compiler options activate translation of source-level OpenMP directives and pragmas. This allows a program to be compiled for OpenMP threaded execution; the executable then runs with a number of threads on one multi-core node, each thread on a separate core.
Parallelization over Distributed-Memory System using the Message Passing Interface (MPI): The system uses the MPICH implementation of the Message Passing Interface (MPI), generally optimized for the high-speed InfiniBand interconnect. MPI is a standard library for performing parallel processing using a distributed-memory model. Each program file using MPI must include the MPI header file near the beginning of the C or Fortran source. Generally, an MPI wrapper script (for the Portland Group or Intel compilers) is loaded before the compilation command is executed. The MPI compilers take the same options as the compilers they wrap. By default, the mpiexec command spawns one MPI process per CPU in a batch job; its -pernode command-line option requests instead that one MPI process be spawned per node. This option is intended for codes that mix MPI message passing with some form of shared-memory programming model, such as OpenMP or POSIX threads. Finally, the GNU debugger gdb is recommended for interactive or post-mortem analysis of sequential programs.
Resources on HPC:
The Ohio Supercomputer Center: http://www.osc.edu/supercomputing/training/ (see the documentation and courses).
Papers selected by the instructor: Download zipped folder
An online article explains how to begin parallel programming with OpenMP in MS Visual Studio Professional C++ 2005/2008 or later. MS Visual Studio C++ supports the OpenMP 2.0 standard. The number of threads can be controlled through the OMP_NUM_THREADS environment variable, and run-time functions are provided, for example to get the number of processors in the system [omp_get_num_procs()] and to set the number of threads [omp_set_num_threads(int number)]. The OpenMP C and C++ application program interface lets you write applications that effectively use multiple processors. With OpenMP, performance gains on multi-core systems can often be obtained with little coding beyond a line or two, so the benefits come at a low cost. The article is available at http://www.codeproject.com/KB/cpp/BeginOpenMP.aspx
For the directives, clauses, and constructs of the OpenMP API, see the OpenMP reference page in the Microsoft MSDN library: http://msdn.microsoft.com/en-us/library/tt15eb9t.aspx
The project is to:
Task-1: General study of the IBM e1350 (20%)
Explore the e1350 IBM eServer cluster system architecture, the processor node architecture and resources, the inter-node interconnection, and inter-node communication.
Write a document for other users on how to access the ITC-HPC from a KFUPM desktop, including the needed configuration, and show some demos. You may contact the technical staff of the ITC-KFUPM: Mr. Tariq Maghribi (Project head with phone 3979), Mr. Frahan (7325), and Mr. Nabeel (3910). See student report (PDF).
- Review the C++ (pgCC) compiler along with the Intel and/or Portland Group (PG) suites of optimizing compilers (which produce faster code than the GNU compilers) and report to the course. Make sure that the automatic optimization of single-node sequential programs for shared-memory parallel execution, as well as the OpenMP set of directives, is available under the PG or Intel compiler options, and report to the course. In this case, one student is to study in detail the OpenMP set of directives in C programs. See folder-1 and folder-2.
- Review the MPICH implementation of MPI (and determine which interconnect it targets, InfiniBand or otherwise) so that it can be used within each C or Fortran source. Make sure documentation is available. In this case, one student is to study in detail the MPICH library and its application in C programs.
Task-2: Implementation (80%).
Here the student is expected to parallelize one scientific application using one of the packages listed below:
1. Octave: GNU Octave is a high-level language, primarily intended for numerical computations. It provides a convenient command line interface for solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly compatible with Matlab. It may also be used as a batch-oriented language. Octave has extensive tools for solving common numerical linear algebra problems, finding the roots of nonlinear equations, integrating ordinary functions, manipulating polynomials, and integrating ordinary differential and differential-algebraic equations. It is easily extensible and customizable via user-defined functions written in Octave's own language, or using dynamically loaded modules written in C++, C, Fortran, or other languages. See:
Octave and Python- High-Level Scripting Languages Productivity and ...
Octave. High-level language for numerical computations — A Free ...
2. EGSnrc: The EGSnrc system is a package for the Monte Carlo (MC) simulation of coupled electron-photon transport. Its current energy range of applicability is considered to be 1 keV to 10 GeV. EGSnrc is an extended and improved version of the EGS4 package originally developed at SLAC. It incorporates many improvements in the implementation of the condensed history technique for the simulation of charged particle transport and better low-energy cross sections.
3. Gaussian with Linda: Gaussian 03 is used by chemists, chemical engineers, biochemists, physicists and others for research in established and emerging areas of chemical interest. Starting from the basic laws of quantum mechanics, Gaussian predicts the energies, molecular structures, and vibrational frequencies of molecular systems, along with numerous molecular properties derived from these basic computation types. It can be used to study molecules and reactions under a wide range of conditions, including both stable species and compounds that are difficult or impossible to observe experimentally, such as short-lived intermediates and transition structures. Gaussian with Linda is the HPC version of the Gaussian 03 software, able to run jobs in parallel on HPC clusters:
4. MPICH: It is a freely available, portable implementation of MPI, a standard for message-passing for distributed-memory applications used in parallel computing. MPICH is free software and is available for most flavors of Unix (including Linux and Mac OS X) and Microsoft Windows. Moreover, MPICH is a developed program library
5. pMatlab: Investigate parallel programming with pMatlab (very expensive) and develop one application. Search for an open-access compiler for Matlab (if any), install it on the IBM e1350, and use it to parallelize some applications (is it available at KFUPM? can we find an open-source version?):
6. DL_POLY: It is a general-purpose serial and parallel molecular dynamics simulation package developed at Daresbury Laboratory by W. Smith, T.R. Forester and I.T. Todorov. The original package was developed by the Molecular Simulation Group (now part of the Computational Chemistry Group, MSG) at Daresbury Laboratory under the auspices of the Engineering and Physical Sciences Research Council (EPSRC) for the EPSRC's Collaborative Computational Project for the Computer Simulation of Condensed Phases (CCP5). The package is the property of the Central Laboratory of the Research Councils:
7. Other application packages
Working Student Groups: