Project Title: High performance iterative methods on parallel
computers
and distributed shared environments
Abstract:
The numerical solution of large, sparse systems of linear equations and
eigenvalue problems is central to many scientific and
engineering applications.
Iterative methods, which do not modify the matrix, often provide
the only means of solving these problems.
Parallel computing is a powerful way of improving execution time and
solvable problem size for these applications.
Traditionally, implementations of iterative methods on parallel computers
have adopted a fine grain allocation of equations to different processors.
Recent architectural and computational advances suggest that current
fine-grain implementations may be inadequate.
Specifically, clusters of workstations (COWs) are not well suited for
fine grain methods, because of high network latencies.
Even on massively parallel processors (MPPs) with fast proprietary networks,
the synchronization overheads increase with the number of processors.
In addition, partitioning a problem to a large number of processors
reduces the local problem sizes, leading to load imbalances,
and thus processor idling.
These problems are exacerbated as resource intensive parallel applications
are increasingly making use of Computational Grids consisting of
heterogeneous networks.
In such environments, the resource contention from external, variable load
makes traditional fine grain, static implementations inefficient
and unpredictable.
Achieving high performance, requires new levels of sophistication
in parallel algorithms and in
the interaction of the implementation with the runtime system.
The primary goal of the proposed research is to promote the
state-of-the-art in high performance, parallel iterative methods
by exploring algorithms that combine coarse and fine granularity,
and dynamic resource utilization schemes.
These objectives can be summarized as follows:
- Build a new multigrain implementation level
on top of traditional parallelization methods and preconditioners.
This level exploits multivector (block) methods and some storage
redundancy to introduce coarse granularity during the preconditioning
step. The resulting codes will be made available to the community.
- Exploit runtime system information to dynamically balance
the available resources from within the application.
The novelty of this approach is the identification of system aware
iterative algorithms and algorithmic patterns that enable dynamic
load balancing.
For more information email me at: My_First_Name @cs.wm.edu
Related publications
- J. R. McCombs, R. T. Mills and A. Stathopoulos
"Dynamic Load Balancing of an iterative eigensolver on
Grids of heterogeneous clusters",
submitted to International Conference on Distributed Computing Systems,
(ICDCS 2002).
- R. T. Mills, A. Stathopoulos and E. Smirni
"Algorithmic modifications to the Jacobi-Davidson parallel
eigensolver to dynamically balance external CPU and memory load",
in Proceed. International Conference on Supercomputing 2001,
Sorrento, Italy, June 18-22, (2001), 454--463.
- A. Stathopoulos and J.R. McCombs,
"A Parallel, Block, Jacobi-Davidson Implementation for Solving Large
Eigenproblems on Coarse Grain Environments", in Proceed.
International Conference on Parallel and Distributed
Processing Techniques and Applications, Vol. VI, pp. 2920-26,
(CSREA Press, 1999).
- A. Stathopoulos and A. Ynnerman,
"Dynamic Load Balancing of Atomic
Structure Programs on a PVM cluster,"
Lecture Notes in Computer Science, 919 (1995) 384-391.
- A. Stathopoulos and C. F. Fischer,
"Reducing Synchronization on the
Parallel Davidson Method for the Large, Sparse, Eigenvalue Problem",
in: Proceed. Supercomputing '93, pp. 172-180 (ACM Press, Portland 1993).
Back to my home page