Welcome to Teddy's homepage!
RESEARCH PROJECTS

  • A Preconditioned Hybrid SVD Method for Large-Scale Problems

The computation of a few singular triplets of large, sparse matrices is a challenging task, especially when the singular values of smallest magnitude are needed to high accuracy. Most recent efforts address this problem through variations of the Lanczos bidiagonalization method, but these are still challenged even for medium-sized matrices because of the difficulty of the problem.

We propose a novel SVD approach that can take advantage of preconditioning and of any well-designed eigensolver to compute both the largest and the smallest singular triplets. Accuracy and efficiency are achieved through a hybrid, two-stage meta-method, PHSVDS. In the first stage, PHSVDS solves the normal equations up to the best achievable accuracy. If further accuracy is required, the method switches automatically to an eigenvalue problem with the augmented matrix, thereby combining the advantages of the two stages: fast convergence and high accuracy, respectively. For the augmented matrix, computing the interior eigenvalues is facilitated by a proper use of the good initial guesses from the first stage and by an efficient implementation of the refined projection method. We also discuss how to precondition PHSVDS and how to cope with some issues that arise. Numerical experiments illustrate the efficiency and robustness of the method.
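As a rough illustration of the two stages, here is a minimal SciPy sketch in which ARPACK's shift-invert eigensolver stands in for PRIMME's preconditioned eigensolvers; the test matrix, the shift, and the warm start are illustrative assumptions, not the actual PHSVDS logic.

```python
# Minimal sketch of the two-stage idea behind PHSVDS; SciPy's ARPACK wrapper
# stands in for PRIMME. Matrix, shift, and warm start are illustrative only.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

A = sp.random(500, 300, density=0.01, format='csc', random_state=0) \
    + sp.eye(500, 300)           # keep the smallest singular value away from 0
m, n = A.shape
k = 3                            # number of smallest singular triplets wanted

# Stage 1: eigenvalues of the normal equations C = A^T A nearest zero.
# Convergence is fast, but accuracy for small singular values is limited
# to roughly ||A||^2 times machine epsilon.
C = (A.T @ A).tocsc()
lam, V = spla.eigsh(C, k=k, sigma=0, which='LM')    # shift-invert about 0
svals = np.sqrt(np.maximum(lam, 0.0))
U = (A @ V) / svals                                  # left singular vectors

# Stage 2 (if more accuracy is needed): interior eigenvalues of the
# augmented matrix B = [[0, A], [A^T, 0]], whose eigenvalues are +/- sigma,
# warm-started with the stage-1 triplet as the initial guess.
B = sp.bmat([[None, A], [A.T, None]], format='csc')
x0 = np.concatenate([U[:, 0], V[:, 0]]) / np.sqrt(2.0)
lam2, X = spla.eigsh(B, k=1, sigma=svals[0], which='LM', v0=x0)
u, v = np.sqrt(2.0) * X[:m, 0], np.sqrt(2.0) * X[m:, 0]
print("refined smallest singular value:", lam2[0])
```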

Papers: SIAM SISC (flagship journal in scientific computing, 5-year impact factor 2.40), SC15 Poster, SC15 Doctoral Showcase
Software: PRIMME_SVDS (GitHub), providing C, Python, MATLAB, and Fortran interfaces
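For those trying the software, a hedged usage sketch of the Python bindings follows; the call mirrors the scipy.sparse.linalg.svds-style interface shown in the PRIMME README, so consult the repository for the authoritative signature and options.

```python
# Hedged usage sketch for the PRIMME_SVDS Python bindings; the svds call
# below follows the scipy.sparse.linalg.svds-style interface advertised in
# the PRIMME README. Check the GitHub repository for the exact signature.
import primme
import scipy.sparse as sp

A = sp.random(10000, 100, density=1e-3, format='csr', random_state=0)
# Three smallest singular triplets, computed to high accuracy.
svecs_left, svals, svecs_right = primme.svds(A, 3, which='SM', tol=1e-12)
print(svals)
```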

  • Fast Trace Estimator for Large, Sparse Matrix Inverses
Determining the trace of a matrix that is available only implicitly through a function is a computationally challenging task that arises in a number of applications. For the common case of the inverse of a large, sparse matrix, the standard approach is based on Monte Carlo methods, which converge slowly.

We present a different approach that exploits the pattern correlation between the diagonal of the inverse of the matrix and the diagonal of some approximate inverse that can be computed inexpensively. We leverage various sampling and fitting techniques to fit the diagonal of the approximation to the diagonal of the inverse. Based on a dynamic evaluation of the variance, the proposed method can be used as a variance reduction technique for Monte Carlo in some cases. Furthermore, it can serve as a standalone kernel that provides a fast trace estimate from a small number of samples. An extensive set of experiments with various combinations of techniques demonstrates the effectiveness of our method in several real applications.
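To make the idea concrete, here is a small self-contained sketch under simplifying assumptions: the cheap approximate inverse diagonal is the Jacobi approximation 1/diag(A), the fitting model is linear, and the optional refinement is a plain Monte Carlo average over freshly sampled diagonal indices, rather than the paper's dynamic variance evaluation.

```python
# Simplified sketch of the diagonal-fitting idea. Assumptions: the cheap
# approximate inverse diagonal is the Jacobi approximation 1/diag(A), the
# fitting model is linear, and the optional correction is a plain Monte
# Carlo average of the fit residual over freshly sampled diagonal indices.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(0)
n = 500
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format='csc')
lu = spla.splu(A)                        # stand-in for solves with A

def inv_diag_entry(i):
    # (A^{-1})_{ii} via one solve A x = e_i; this is the expensive oracle.
    e = np.zeros(n)
    e[i] = 1.0
    return lu.solve(e)[i]

# Cheap approximate inverse diagonal that correlates with diag(A^{-1}).
d_approx = 1.0 / A.diagonal()

# Fit the approximation to a few sampled true entries (linear model).
idx = rng.choice(n, size=20, replace=False)
coeffs = np.polyfit(d_approx[idx], [inv_diag_entry(i) for i in idx], deg=1)
d_fit = np.polyval(coeffs, d_approx)

# Standalone estimate: the sum of the fitted diagonal.
estimate = d_fit.sum()

# Optional Monte Carlo refinement: unbiased correction from fresh samples.
idx2 = rng.choice(n, size=20, replace=False)
residual = np.mean([inv_diag_entry(i) - d_fit[i] for i in idx2])
print("trace estimate:", estimate + n * residual)
```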

Papers: JCOMP (5-year impact factor 3.120), second-round review


  • Real-Time Blob-Filament Detection and Tracking in Fusion Plasma Big Data
Magnetic fusion could provide an inexhaustible, clean, and safe solution to global energy needs. The success of magnetically confined fusion reactors demands steady-state plasma confinement, which is challenged by blob-filaments driven by edge turbulence. Real-time analysis can be used to monitor the progress of fusion experiments and to prevent catastrophic events, but fusion experiments generate terabytes of data over short periods of time, and timely access to and analysis of this amount of data requires addressing extreme-scale computing and big data challenges.

In this work, we apply outlier detection techniques to tackle the fusion blob detection problem on extremely large parallel machines. We present a real-time region outlier detection algorithm that efficiently finds blobs in fusion experiments and simulations, and we propose an efficient scheme to track the movement of these region outliers over time. We implemented our algorithms with hybrid MPI/OpenMP and demonstrated the accuracy and efficiency of the proposed blob detection and tracking methods on data from the XGC1 fusion simulation code. Our tests show that we achieve linear speedup and complete blob detection within two to three milliseconds on Edison, a Cray XC30 system at NERSC.
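Setting the parallel implementation aside, the core detection-and-tracking loop can be sketched serially as follows; the synthetic data, the 3-sigma outlier threshold, the minimum region size, and the nearest-centroid matching are illustrative stand-ins rather than the settings used in the paper.

```python
# Simplified serial sketch of the detection-and-tracking loop. Assumptions:
# synthetic snapshots, a 3-sigma threshold defining region outliers, a
# minimum region size to drop isolated noise pixels, and nearest-centroid
# matching for tracking; the MPI/OpenMP parallelization is not reproduced.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(1)
frames = rng.normal(size=(5, 128, 128))        # synthetic field snapshots
frames[:, 60:68, 40:48] += 6.0                 # plant a persistent "blob"

prev = None
for t, f in enumerate(frames):
    fluct = (f - f.mean()) / f.std()           # normalized fluctuation
    mask = fluct > 3.0                         # region-outlier threshold
    labels, nreg = ndimage.label(mask)         # connected regions
    sizes = np.asarray(ndimage.sum(mask, labels, range(1, nreg + 1)))
    keep = np.nonzero(sizes >= 8)[0] + 1       # drop isolated noise pixels
    cents = np.array(ndimage.center_of_mass(mask, labels, keep))
    if prev is not None and len(cents) and len(prev):
        # Track: match each blob to the nearest centroid in the last frame.
        dist = np.linalg.norm(cents[:, None] - prev[None], axis=2)
        match = dist.argmin(axis=1)            # predecessor of each blob
    prev = cents
    print(f"frame {t}: {len(cents)} blob(s) kept")
```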

Papers: IEEE Transactions on Big Data (TBD), second-round review; BDAC-14; SC14 Poster
Software: Big data analytics component in ICEE
HPC demo: SC14

  • Scaling Up Large-Scale Kernel Machines Using Random Features for Speech Recognition
Recent evidence suggests that the performance of kernel methods may match that of deep neural networks (DNNs), which have been the state-of-the-art approach for speech recognition.

In this work, we present an improvement of kernel ridge regression and show that our proposal is computationally advantageous. Our approach performs classification using the one-vs-one scheme, which, under certain assumptions, asymptotically reduces the cost of the one-vs-rest scheme by a factor of c in both training time and memory consumption, where c is the number of classes, typically on the order of hundreds to thousands for speech recognition. We demonstrate empirical results on the benchmark corpus TIMIT. In particular, the classification accuracy is one to two percentage points higher (in absolute terms) than the best of the kernel methods and of the DNNs, and the speech recognition accuracy is highly comparable.
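To illustrate the ingredients named above, the toy sketch below combines random Fourier features for a Gaussian kernel with one-vs-one ridge classification; the data, feature dimension, bandwidth, and regularization are made up for illustration and do not reproduce the paper's TIMIT pipeline.

```python
# Toy sketch: random Fourier features (Rahimi & Recht) approximating a
# Gaussian kernel, ridge regression in the feature space, and one-vs-one
# voting. Data, feature dimension D, bandwidth, and regularization are
# illustrative assumptions, not the paper's tuned TIMIT pipeline.
import numpy as np

rng = np.random.default_rng(0)
n, dim, c, D = 600, 20, 4, 256

y = rng.integers(0, c, size=n)
X = rng.normal(size=(n, dim)) + y[:, None]     # class-dependent mean shift

W = rng.normal(size=(dim, D))                  # Gaussian kernel, sigma = 1
b = rng.uniform(0.0, 2.0 * np.pi, size=D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)       # random Fourier features

def ridge_fit(Zs, t, lam=1e-3):
    # Ridge regression in the D-dimensional feature space.
    return np.linalg.solve(Zs.T @ Zs + lam * np.eye(D), Zs.T @ t)

# One-vs-one: c(c-1)/2 binary problems, each on roughly 2n/c samples,
# instead of c one-vs-rest problems on all n samples (the factor-of-c saving).
models = {}
for i in range(c):
    for j in range(i + 1, c):
        m = (y == i) | (y == j)
        t = np.where(y[m] == i, 1.0, -1.0)
        models[(i, j)] = ridge_fit(Z[m], t)

def predict(Znew):
    votes = np.zeros((len(Znew), c))
    for (i, j), w in models.items():
        winner = np.where(Znew @ w > 0, i, j)
        votes[np.arange(len(Znew)), winner] += 1
    return votes.argmax(axis=1)

print("training accuracy:", (predict(Z) == y).mean())
```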

Papers: ICASSP 2016, KDD16 (in submission)
Software: coming soon.