
Summer 2002
Volume 6
Number 2

 


Demonstrating the TeraGrid: A Distributed Supercomputer Machine Room

by David Koester

In the future, distributed computing may move toward advanced grid computing: an infrastructure that enables the integrated, collaborative use of high-end computers, networks, databases, and scientific instruments owned and managed by multiple organizations at different locations. Grid computing has been a topic of academic research for many years because the amount of data collected by today's scientific instruments, including satellites and remote sensors, far outstrips the capacity of even our fastest computers to analyze it. MITRE's sponsors may benefit from this research because distributed sensor correlation and data fusion applications will require similarly massive amounts of computing power drawing on multiple, spatially diverse data sources. MITRE has therefore been exploring the technologies used in grid computing and had an opportunity to take part in developing a demonstration of the most capable grid technology ever funded.
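As a purely illustrative sketch of the grid idea (not the TeraGrid's actual middleware), the Python below fans independent analysis tasks out to compute resources standing in for several sites. The site names match the DTF partners, but the analyze function and the task splitting are hypothetical stand-ins; real grid software must also handle authentication, scheduling, and data movement across organizations.

```python
# Toy illustration of grid computing: independent analysis tasks
# fanned out to compute resources at several sites. Real grid
# middleware also handles cross-organization authentication,
# scheduling, and data movement; none of that is modeled here.
from concurrent.futures import ThreadPoolExecutor

SITES = ["NCSA", "SDSC", "Argonne", "Caltech"]   # the four DTF partners

def analyze(site: str, chunk: range) -> float:
    # Hypothetical stand-in for a computation run at a remote site.
    return sum(x * x for x in chunk) / len(chunk)

# Split one large job into per-site chunks and run them in parallel.
chunks = [range(i * 1000, (i + 1) * 1000) for i in range(len(SITES))]
with ThreadPoolExecutor(max_workers=len(SITES)) as pool:
    results = dict(zip(SITES, pool.map(analyze, SITES, chunks)))
print(results)
```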

Figure 1: National Map of the TeraGrid

A recent $53 million award from the National Science Foundation to four U.S. institutions will finance research into expanding grid technology to terascale levels: computing at speeds in excess of one trillion operations per second! The Distributed Terascale Facility (DTF), known as the TeraGrid, will be the largest, most powerful infrastructure ever deployed for unclassified scientific research. The original DTF proposal includes the National Center for Supercomputing Applications (NCSA) at the University of Illinois, the San Diego Supercomputer Center (SDSC), which heads the National Partnership for Advanced Computational Infrastructure (NPACI), Argonne National Laboratory, and the California Institute of Technology (Caltech) (see Figure 1). The TeraGrid will feature spatially distributed Linux cluster technology interconnected by very-high-bandwidth networking to form a single virtual supercomputer machine room spanning the four DTF sites. It will be the world's most powerful distributed computing system, with a total of 13.6 teraflops (trillions of floating-point operations per second) of computing power. Its infrastructure will include facilities capable of managing and storing more than 450 terabytes of data, ultra-high-speed networks, high-resolution visualization environments, and toolkits for simplified grid computing.

The network infrastructure connecting these sites will include multiple 10-gigabit connections spanning more than half of the country. For perspective, that gives the TeraGrid over 20,000 times the capacity of a common T1 wide-area network (WAN) connection.
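As a rough check on that figure: a T1 line carries 1.544 megabits per second, while each TeraGrid link carries 10 gigabits per second. The sketch below assumes four parallel links purely for illustration, since the text says only that there are "multiple" connections.

```python
# Back-of-the-envelope check of the "over 20,000 times a T1" claim.
# NUM_LINKS is an assumption for illustration; the text says only
# that there are "multiple" 10-gigabit connections.
T1_BPS = 1.544e6        # T1 line rate, bits per second
LINK_BPS = 10e9         # one 10-gigabit Ethernet link
NUM_LINKS = 4           # assumed number of parallel links

aggregate = NUM_LINKS * LINK_BPS
print(f"Aggregate capacity: {aggregate / 1e9:.0f} Gb/s")
print(f"Ratio to one T1:    {aggregate / T1_BPS:,.0f}x")    # ~25,907x
# Time to move the TeraGrid's full 450 TB store over this aggregate:
print(f"450 TB transfer:    {450e12 * 8 / aggregate / 3600:.0f} hours")
```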

As distributed terascale computing proves itself an effective tool for scientific research, more organizations will become part of the TeraGrid structure and connect via experimental networks. Many anticipate that the TeraGrid will rapidly evolve into the PetaGrid, an architecture that will support more than one quadrillion (1,000 trillion) operations per second.

Any long and interesting journey commences with a single step, and the first step for the TeraGrid was to build a technology demonstration. MITRE has supported networking efforts at the annual International Conference for High Performance Networking and Computing for many years. In November 2001, a MITRE scientist headed the SC2001 Xnet (eXtreme Networks) effort, a specialized part of the overall networking efforts at the conference. Xnet showcases bleeding-edge, developmental networking technologies and experimental networking applications. MITRE engineers worked with researchers from the Department of Energy and academia to design and implement a demonstration of the TeraGrid distributed supercomputer machine room. MITRE's participation in the Xnet TeraGrid demonstration showed that distributed machine rooms and distributed grid computing technologies have the potential to become an important source of computing resources in the near future.

The demonstration of the proposed TeraGrid was one of the major attractions on the conference show floor at SC2001. Separate computer clusters located in the four DTF partners' exhibit booths were connected with 10-gigabit Ethernet and dense wavelength division multiplexing (DWDM) technology. Two vendors supplied 10-gigabit Ethernet technology for test and evaluation (Figures 2 and 3). The Xnet researchers used commercial network monitoring gear to collect performance measurements on the 10-gigabit Ethernet links.

Figure 2: Computer Clusters with Gigabit Ethernet Interconnections (Cisco)

Figure 3: Computer Clusters with Gigabit Ethernet Interconnections (Nortel)
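The performance measurements themselves were collected with commercial monitoring gear, but the underlying idea of an active throughput probe can be sketched in a few lines of Python. This is an illustration only, not the tooling used at SC2001; the receiver host, port, and transfer size are hypothetical placeholders.

```python
# Minimal active throughput probe: stream a fixed volume of bytes
# to a receiver and report the achieved rate. Illustrative only;
# the SC2001 measurements used commercial monitoring hardware.
import socket
import time

HOST, PORT = "receiver.example.org", 5001   # hypothetical endpoint
CHUNK = b"\x00" * (1 << 20)                 # 1 MiB send buffer
TOTAL_BYTES = 1 << 30                       # stream 1 GiB in total

def measure() -> None:
    with socket.create_connection((HOST, PORT)) as sock:
        sent = 0
        start = time.time()
        while sent < TOTAL_BYTES:
            sock.sendall(CHUNK)
            sent += len(CHUNK)
        elapsed = time.time() - start
    print(f"{sent * 8 / elapsed / 1e9:.2f} Gb/s over {elapsed:.1f} s")

if __name__ == "__main__":
    measure()
```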

In the actual DTF, an optical network will carry the individual network connections between the Illinois and California sites on separate wavelengths. At SC2001, WAN PHY standards-based physical-layer interfaces in the switches enabled us to demonstrate the feasibility of optically multiplexing multiple 10-gigabit Ethernet data streams on legacy hardware, because the WAN PHY interface standard is compatible with existing OC-192 SONET physical-layer optical multiplexing interfaces.
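That compatibility comes down to line-rate arithmetic: the 10-gigabit Ethernet WAN PHY frames its payload into a SONET STS-192c envelope and clocks it at the OC-192 line rate, rather than at the LAN PHY's 10.3125 Gb/s serial rate. A quick sketch of the numbers:

```python
# Line-rate arithmetic behind WAN PHY / OC-192 SONET compatibility.
STS1_MBPS = 51.84                    # SONET STS-1 base line rate
OC192_GBPS = 192 * STS1_MBPS / 1e3   # OC-192 = 192 x STS-1
LAN_PHY_GBPS = 10.3125               # 10GbE LAN PHY serial rate

print(f"OC-192 line rate: {OC192_GBPS:.5f} Gb/s")   # 9.95328 Gb/s
print(f"10GbE LAN PHY:    {LAN_PHY_GBPS} Gb/s")
# Because the WAN PHY runs at the OC-192 rate, its signal can ride
# existing SONET OC-192 optical multiplexing interfaces unchanged.
```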

The number of 10-gigabit Ethernet interfaces available for the Xnet TeraGrid demonstration was limited because the switch modules used were pre-production and beta-test units. The actual TeraGrid will have 10-gigabit Ethernet-capable switches at each of the four sites, but for the demonstration we collapsed the network to just two such switches. To uphold the eXtreme networking focus of the demonstration, we used optical multiplexing gear to connect clustered computers in the Argonne and Caltech research exhibit booths to the Alliance and NPACI research exhibit booths, respectively.

Among the applications that ran successfully during the demonstration were an SDSC toolkit for geographic modeling and analysis of biodiversity data, an Argonne application that enables researchers to visualize simulations that typically produce more than a terabyte of data, and an NCSA code used to study airflow during storms and hurricanes. These applications were developed to run on general grid architectures and further demonstrated the near-term viability of the TeraGrid concept.

In spite of the short (three-day) time frame available to demonstrate the Xnet TeraGrid, we succeeded in modeling this distributed terascale grid computing technology, and in fact encountered few problems with the proposed computing and networking technologies. Significant obstacles remain before the actual TeraGrid can be productive, but most of them appear to involve scaling software to exploit the TeraGrid's massive computing potential. The hardware, on the other hand, appears to be nearly ready for full-scale implementation of the TeraGrid and terascale grid computing, making it likely that the DTF can be built within a year of the Xnet demonstration.

The 10-gigabit Ethernet technology is available today for determined, early-adopting organizations that want to experiment with high-bandwidth optical Ethernet. The hardware will be expensive at first, but prices should fall significantly over the next several years, especially as new vendors enter the market.

In the longer term, as more toolkits and applications become available, TeraGrid technology may provide a distributed computing infrastructure solution for MITRE's sponsors in the military, intelligence, and aviation areas. We already envision applying it to programs such as the Joint Battlespace Infosphere, Distributed Sensor Correlation and Fusion, and homeland defense efforts, which depend on the availability of significant distributed computing resources, distributed storage, and spatially distributed specialized sensors.


For more information, please contact David Koester using the employee directory.

