40. HIGH-PERFORMANCE MIDDLEWARE
a. Scientific Data
Management and Understanding—Modern science is increasingly becoming a
data-intensive activity, with experiments in science areas such as high-energy
and nuclear physics, climate modeling, computational biology, and fusion energy
estimated to generate petabytes of unstructured domain science data.
Given the projected wave of data and information, managing scientific data
and information is recognized as being on the critical path of modern
scientific endeavor. Accordingly, grant applications are sought to develop: (1)
workflow technologies for unstructured data management that aid the construction
and automation of scientific problem-solving processes; (2) metadata and data
description services to describe and track data within and across different
communities; (3) efficient data access and query technologies to handle the
organization of complex scientific data that is not based on simple relational
tables, as used in commercial systems; (4) scalable data storage and
distribution services and tools for data transmission over switched optical
links, data replication, and data discovery; (5) high-speed data storage and
caching services to deal with high-performance data access, random I/O, and
dynamic data storage and caching; and (6) data analysis services to enable
next-generation scientific visualization, feature identification, and tracking.
Applications addressing commercial database systems and their variants dealing
with structured data are beyond the scope of these subtopics and will be
rejected without peer review.
b. Scalable
I/O Sub-Systems for Petascale Data Distribution—Moving
data into and out of petascale systems quickly is critical to achieving high
performance. At the petascale, this involves many hundreds to thousands of
I/O channels from the compute nodes, connected by a high-speed switch fabric,
to file servers. Although switch performance is evolving rapidly,
high-performance communications switches are not yet optimized for the kinds
of loads that petascale computers place on them. In
petascale applications, each switch port has a very high duty cycle (so
non-blocking architectures are preferred). Also, the data flow is highly
directional, i.e., a set of ports "A" is always exchanging data with a
disjoint set "B", and the "A" ports do not exchange data with other "A" ports.
Traffic management is also a problem at the petascale.
For example, there are typically more ports connected to the compute nodes
("A" ports) than to file servers ("B" ports), so when data is being dumped
from the petaflops system to files, it backs up on the input side of the
switch. The switch must then throttle throughput evenly across the input ports
under high aggregate input load, another condition that differs from the usual
application of these switches.
In summary, there is a great need for switch hardware and software
optimization for petascale applications.
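The even-throttling condition described above can be illustrated with a short calculation. The following is a minimal sketch only, not a description of any particular switch: it assumes that every port runs at the same line rate, that the fabric itself is non-blocking, and that all "A" ports send to the "B" ports at once. The function name and its parameters are hypothetical and chosen for illustration.

```python
def even_throttle_rate(num_a_ports, num_b_ports, port_rate_gbps):
    """Return the rate (Gb/s) each "A" port may send at when the load is
    shared evenly across all "A" ports.

    Assumptions (illustrative, not from the solicitation): identical line
    rate on every port, a non-blocking fabric, and every "A" port sending
    to the "B" ports at the same time.
    """
    if num_a_ports <= 0 or num_b_ports <= 0 or port_rate_gbps <= 0:
        raise ValueError("port counts and port rate must be positive")

    # Total bandwidth the file-server side can absorb.
    egress_capacity = num_b_ports * port_rate_gbps
    # Total bandwidth the compute side can offer.
    offered_load = num_a_ports * port_rate_gbps

    if offered_load <= egress_capacity:
        # No backup on the input side: each "A" port runs at line rate.
        return port_rate_gbps

    # More "A" ports than the "B" side can serve: divide the egress
    # capacity equally so that no input port is starved.
    return egress_capacity / num_a_ports


if __name__ == "__main__":
    # Hypothetical configuration: 1000 compute-side ports, 100 file-server
    # ports, 10 Gb/s per port.
    print(even_throttle_rate(1000, 100, 10.0))
```

Under these assumptions, the per-port rate falls in proportion to the ratio of "B" ports to "A" ports once the input side is oversubscribed, which is the load pattern that conventional switch deployments rarely see.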
Questions – contact Thomas Ndousse-Fetter (tndousse@science.doe.gov)
c. Scalable and
Secure Services for Large-Scale Scientific Collaborations Scalable Middleware
Technologies—Grant applications are sought for the development and
maintenance of scalable middleware technologies that will (1) enable universal,
ubiquitous, easy access to remote computing resources and scientific
instruments; (2) facilitate collaboration among distributed science teams; and
(3) enable a new generation of distributed high-end applications of interest to
the DOE. Current interests in this area include, but are not limited to, (1)
long-term enhancement and maintenance of Access Grid facilities and grid
software, (2) scalable scientific workflows for large-scale science projects,
(3) scalable authentication/authorization services, (4) deployable
Questions – contact Thomas Ndousse-Fetter (tndousse@science.doe.gov)