39. SCALABLE SYSTEM SOFTWARE FOR PETASCALE COMPUTER SYSTEMS

High-performance computing (HPC) research in the Office of Science at the U.S. Department of Energy supports work that contributes to comprehensive, scalable, and robust computing to enable scientific discoveries.  The HPC program currently supports research and development focused on petascale computing systems: computers that operate 1,000 times faster than today’s terascale systems.  The primary areas of research include scalable system software, scientific visualization systems, data management tools, programming models, and related issues.  Grant applications addressing these issues are sought in the following subtopics:

a. Petascale System Software: Emerging large-scale science endeavors increasingly call for extreme-scale supercomputing systems.  These systems, which will exploit tens to hundreds of thousands of processors, will be based on a variety of challenging architectures, from distributed memory clusters of unprecedented scale to radically different architectural concepts such as PIMs, FPGAs, and complex memory hierarchies.  The I/O requirements of such systems can be met by internal parallel I/O subsystems made up of dedicated I/O nodes, each with its own processor, memory, and disks.  Massively parallel processors (MPPs), encompassing from tens to thousands of processors, are emerging as a major architecture for high-performance computers.  The new supercomputing systems will differ greatly in scale and complexity from today’s systems, placing new and challenging demands on system software and related supporting hardware subsystems.  Grant applications for these proposed system software components and hardware subsystems must address the needs for:  (1) optical transceiver development to improve CPU-to-CPU and CPU-to-memory bandwidth over copper-based solutions; (2) operating system tools and support for the effective management of terascale systems and beyond; (3) effective tools for feature identification; (4) parallel and network I/O; (5) schedulers, lightweight communication mechanisms, and queue management tools; and (6) FPGA algorithm accelerator development that maximizes the performance of specific algorithms through a direct connection to the network infrastructure.

Questions – contact Thomas Ndousse-Fetter (tndousse@science.doe.gov)

b. Petascale File Systems: Global parallel file systems such as GPFS and Lustre are widely used in the Office of Science on its major computer systems, which have a few thousand processors.  This subtopic supports the development of file systems that can scale to thousands of processors, either by scaling existing file systems or by developing new ones.  It is well understood that the bandwidth to storage devices is not keeping pace with computational trends and that the gap will continue to widen.  A balanced petascale computer with 100,000 processors will require on the order of 1 terabyte per second (TB/s) of bandwidth.  To use petascale computing resources efficiently and enable breakthrough science, grant applications are sought to:  (1) develop scalable parallel file systems that explore the use of clustered metadata and metadata checksum/mirroring to handle up to one trillion files in a file system; (2) address the scaling, performance, and/or stability of an existing or new global parallel file system; and (3) develop I/O disk and client services that bind the global file systems to storage systems and petascale computing systems.
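As a rough illustration of the balance figure quoted above, the following sketch divides the aggregate bandwidth evenly across processors.  The function name and the use of decimal units (1 TB = 10^12 bytes) are assumptions made for this example only; real I/O load is rarely spread this evenly.

```python
def per_processor_bandwidth(aggregate_bytes_per_s: float, processors: int) -> float:
    """Return the average I/O bandwidth (bytes/s) available to each processor.

    Hypothetical helper: assumes the aggregate bandwidth is shared evenly.
    """
    if processors <= 0:
        raise ValueError("processors must be positive")
    return aggregate_bytes_per_s / processors


TB = 10**12  # decimal terabyte, an assumption for this sketch
MB = 10**6   # decimal megabyte

# Figures from the text: 1 TB/s aggregate, 100,000 processors.
share = per_processor_bandwidth(1 * TB, 100_000)
print(f"{share / MB:.0f} MB/s per processor")  # prints "10 MB/s per processor"
```

Under these assumptions each processor sees about 10 MB/s on average, which gives a sense of how tight the storage budget is at this scale.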

Questions – contact Thomas Ndousse-Fetter (tndousse@science.doe.gov)

c. Debugging and Performance Monitoring of Petascale Systems: Current supercomputing systems consisting of thousands of nodes cannot meet the demands of emerging high-performance scientific applications.  As a result, a new generation of supercomputing systems consisting of hundreds of thousands of nodes is being proposed.  However, these systems are likely to experience far more frequent failures than today's systems, and such failures must be tackled effectively.  Coordinated check-pointing is a common technique for dealing with failures in petascale computing systems, but most existing check-pointing models do not take into account failures during check-pointing and recovery, or correlated failures.  Today's parallel debugging solution, TotalView, does not work for users above 1,000 tasks and works beyond 1,000 nodes on only one HPC system.  Debugging large-scale scientific applications with up to 100,000 interdependent parallel tasks requires renewed exploration of alternative approaches to debugging at massive concurrency.  Grant applications are sought for:  (1) relative debugging development, which provides a promising new debugging paradigm for large systems and tens of thousands of processes; and (2) fixed block disk development to accelerate performance, improve reliability, and reduce cost.
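To make the idea of relative debugging concrete: the state of a suspect run is compared against a trusted reference run at corresponding points, and the first divergence is reported.  The sketch below is a minimal single-process illustration, not a scalable tool; the function name, the snapshot layout (a list of per-step dictionaries mapping variable names to lists of numbers), and the tolerances are all assumptions made for this example.

```python
import math


def first_divergence(reference, suspect, rel_tol=1e-9, abs_tol=1e-12):
    """Return (step, variable, index) of the first mismatch, or None.

    Hypothetical helper: `reference` and `suspect` are lists of snapshots,
    one per step, each a dict mapping a variable name to a list of floats.
    An index of None means the variable is missing or has a different length.
    """
    for step, (ref_state, sus_state) in enumerate(zip(reference, suspect)):
        for name, ref_vals in ref_state.items():
            sus_vals = sus_state.get(name)
            if sus_vals is None or len(sus_vals) != len(ref_vals):
                return step, name, None
            for i, (r, s) in enumerate(zip(ref_vals, sus_vals)):
                if not math.isclose(r, s, rel_tol=rel_tol, abs_tol=abs_tol):
                    return step, name, i
    return None


# Toy data: the suspect run drifts in variable "u" at step 1, element 1.
reference_run = [{"u": [1.0, 2.0]}, {"u": [1.5, 2.5]}]
suspect_run = [{"u": [1.0, 2.0]}, {"u": [1.5, 2.6]}]
print(first_divergence(reference_run, suspect_run))  # prints (1, 'u', 1)
```

At the scale discussed here, the comparison itself would have to be distributed across tasks; this sketch only shows the comparison logic.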

Questions – contact Thomas Ndousse-Fetter (tndousse@science.doe.gov)

References:

1. Aguilera, M. K., et al., “Failure Detection and Consensus in the Crash-Recovery Model,” Distributed Computing, 13(2): 99-125, April 2000.  (Summary available at:  http://www.liafa.jussieu.fr/web9/manifsem/description_en.php?idcongres=129)

2. Butler, G., et al., “GUPFS: The Global Unified Parallel File System Project at NERSC,” Proceedings of the 21st IEEE/12th NASA Goddard Conference on Mass Storage Systems and Technologies, pp. 361-371, April 2004.  (Full text available at:  http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20040121020_2004117345.pdf)

3. Williams, E., et al., “The Characterization of Two Scientific Workloads Using the Cray X-MP Performance Monitor,” Proceedings of Supercomputing '90, pp. 142-152, IEEE, 1990.  (See:  http://portal.acm.org/citation.cfm?id=110382.110420)

4. Fagg, G. E., et al., “Scalable Fault Tolerant MPI:  Extending the Recovery Algorithm,” Lecture Notes in Computer Science, Volume 3666 – Recent Advances in Parallel Virtual Machine and Message Passing Interface, Users' Group Meeting Euro PVM/MPI 2005, pp. 67-75, Springer Heidelberg, 2005.  (ISSN:  0302-9743) (Full text available at:  http://icl.cs.utk.edu/projectsfiles/rib/pubs/sftmpi-europvm-mpi-2005.pdf)

5. “National Leadership Computing Facility:  A Partnership in Computational Sciences,” U.S. DOE Oak Ridge National Laboratory Website, at http://www.ccs.ornl.gov/nlcf/

 
