Project Title: High performance iterative methods on parallel computers
and distributed shared environments

Abstract:
The numerical solution of large, sparse systems of linear equations and eigenvalue problems is central to many scientific and engineering applications. Iterative methods, which do not modify the matrix, often provide the only means of solving these problems. Parallel computing is a powerful way of improving execution time and solvable problem size for these applications. Traditionally, implementations of iterative methods on parallel computers have adopted a fine grain allocation of equations to different processors.

Recent architectural and computational advances suggest that current fine-grain implementations may be inadequate. Specifically, clusters of workstations (COWs) are not well suited for fine grain methods, because of high network latencies. Even on massively parallel processors (MPPs) with fast proprietary networks, the synchronization overheads increase with the number of processors. In addition, partitioning a problem to a large number of processors reduces the local problem sizes, leading to load imbalances, and thus processor idling.

These problems are exacerbated as resource intensive parallel applications are increasingly making use of Computational Grids consisting of heterogeneous networks. In such environments, the resource contention from external, variable load makes traditional fine grain, static implementations inefficient and unpredictable. Achieving high performance, requires new levels of sophistication in parallel algorithms and in the interaction of the implementation with the runtime system.

The primary goal of the proposed research is to promote the state-of-the-art in high performance, parallel iterative methods by exploring algorithms that combine coarse and fine granularity, and dynamic resource utilization schemes. These objectives can be summarized as follows:

Build a new multigrain implementation level on top of traditional parallelization methods and preconditioners. This level exploits multivector (block) methods and some storage redundancy to introduce coarse granularity during the preconditioning step. The resulting codes will be made available to the community.
Exploit runtime system information to dynamically balance the available resources from within the application. The novelty of this approach is the identification of system aware iterative algorithms and algorithmic patterns that enable dynamic load balancing.

For more information email me at: My_First_Name @cs.wm.edu

Related publications

J. R. McCombs, R. T. Mills and A. Stathopoulos "Dynamic Load Balancing of an iterative eigensolver on Grids of heterogeneous clusters", submitted to International Conference on Distributed Computing Systems, (ICDCS 2002).
R. T. Mills, A. Stathopoulos and E. Smirni "Algorithmic modifications to the Jacobi-Davidson parallel eigensolver to dynamically balance external CPU and memory load", in Proceed. International Conference on Supercomputing 2001, Sorrento, Italy, June 18-22, (2001), 454--463.
A. Stathopoulos and J.R. McCombs, "A Parallel, Block, Jacobi-Davidson Implementation for Solving Large Eigenproblems on Coarse Grain Environments", in Proceed. International Conference on Parallel and Distributed Processing Techniques and Applications, Vol. VI, pp. 2920-26, (CSREA Press, 1999).
A. Stathopoulos and A. Ynnerman, "Dynamic Load Balancing of Atomic Structure Programs on a PVM cluster," Lecture Notes in Computer Science, 919 (1995) 384-391.
A. Stathopoulos and C. F. Fischer, "Reducing Synchronization on the Parallel Davidson Method for the Large, Sparse, Eigenvalue Problem", in: Proceed. Supercomputing '93, pp. 172-180 (ACM Press, Portland 1993).

Back to my home page

Project Title: High performance iterative methods on parallel computers and distributed shared environments

Related publications

Project Title: High performance iterative methods on parallel computers
and distributed shared environments