INVITED TALK (Friday, June 30, 9:00AM--10:00AM) Palais du Grand Large
Daniel A. Reed
Renaissance Computing Institute
Legend says that Archimedes remarked, on the discovery of the lever, "Give me a place to stand, and I can move the world." Today, computing pervades all aspects of science and engineering. "Science" and "computational science" have become largely synonymous, and computing is the intellectual lever that opens the pathway to discovery. As new discoveries increasingly lie at the interstices of traditional disciplines, computing is also the enabler for scholarship in the arts, humanities, creative practice and public policy. Equally important, computing is an enabler of our critical infrastructure, from monetary and communication systems to the electric power grid.
With such pervasive dependence, computing system reliability and performance are ever more critical. Although the mean time between failures (MTBF) of commodity hardware components (i.e., processors, disks, memories, power supplies and networks) is high, their use in highly parallel, mission-critical systems can still lead to systemic failures. In contrast, distributed software for networks, whether transport protocols or web/Grid services, is designed to be resilient to component failures. Our thesis is that these "two worlds" of software -- distributed systems and parallel systems -- must meet, embodying ideas from each, if we are to build resilient systems. This talk surveys some of these challenges and presents possible approaches for high-performance, resilient design, ranging from intelligent hardware monitoring and adaptation, through low-overhead recovery schemes, statistical sampling and differential scheduling, to alternative models of system software, including evolutionary adaptation.
Dan Reed is the Chancellor's Eminent Professor at the University of
North Carolina at Chapel Hill, as well as the Director of the
Renaissance Computing Institute (RENCI), a venture supported by
three universities (the University of North Carolina at Chapel Hill,
Duke University and North Carolina State University) that is exploring
the interactions of computing technology with the sciences, arts and
humanities. Reed also serves as Vice-Chancellor for Information
Technology and Chief Information Officer for the University of North
Carolina at Chapel Hill.
Dr. Reed is chair of the board of directors for the Computing Research
Association, which represents the interests of the major academic
departments and industrial research laboratories. He recently completed
a term of service as a member of President George W. Bush's Information
Technology Advisory Committee (PITAC), where he chaired the subcommittee
on computational science. He was previously Director of the National
Center for Supercomputing Applications (NCSA) at the University of
Illinois at Urbana-Champaign, where he also led the National Computational
Science Alliance, a consortium of roughly fifty academic institutions
and national laboratories that is developing next-generation software
infrastructure for scientific computing. He was also one of the
principal investigators and chief architect for the NSF TeraGrid. He
received his PhD in computer science in 1983 from Purdue University.