S3 - Searching, Selecting, and Synthesizing Source Code

Software developers rely on reusing source code snippets from existing libraries or applications to develop software features on time and within budget. In practice, most previously implemented features are embedded in billions of lines of scattered source code. State-of-the-art code search engines provide no guarantee that the code snippets they retrieve actually implement these features. Even when relevant code fragments are located, developers face the complex task of selecting these fragments and moving them into their applications. This research program proposes an integrated model for addressing the fundamental problems of searching, selecting, and synthesizing (S3) source code. The S3 model integrates program analysis and information retrieval to automatically search, select, and synthesize relevant source code fragments. It will directly support new methodologies for software change and automated tools that assist programmers with various development, reuse, and maintenance activities.

Thus far, we have built three source code search engines. Exemplar is a search engine that combines information retrieval and program analysis techniques to reliably link high-level concepts to the source code of software applications via the standard API calls that these applications use. CLAN is an engine for computing similarities among software applications in large software repositories. Finally, Portfolio is a source code search engine that retrieves and visualizes relevant functions and their uses across 18,203 C/C++ software projects, totaling over 260 million lines of code, in FreeBSD Ports. This project is sponsored by the NSF.

Collaborative Research: Creating and Evolving Software via Searching, Selecting and Synthesizing Relevant Source Code
National Science Foundation Award CCF-0916260
Denys Poshyvanyk (PI @ W&M)
Mark Grechanik (PI @ UIC)
Dates: September 2009 - August 2012

SE2 - Software Evolution Based on Semantic and Evolutionary Information

Software maintenance and evolution is a vital and resource-consuming phase of the software lifecycle. Introducing software changes is particularly complex in the case of long-lived, large-scale, and globally distributed systems. Years of research have identified three core tasks that support developers during software maintenance: feature location (finding the starting point of a change in source code), impact analysis (identifying other software entities that are also likely to change), and expert developer recommendation (identifying appropriate developers to implement changes). The project will develop a novel one-stop solution for these tasks by integrating and mining the latent information scattered across the structured and unstructured software artifacts that are produced and constantly changed as software systems evolve; current solutions leave this information largely untapped. This project has three main goals: 1) define a new integrated framework, SE2, for a comprehensive analysis of software evolution based on conceptual and evolutionary information, under a single umbrella; 2) define new methodologies for software maintenance tasks based on SE2; and 3) perform empirical studies to evaluate SE2 and the methodologies it supports. Central to our solution are state-of-the-art data mining, information retrieval, and program analysis methods. This project is sponsored by the NSF.

Collaborative Research: An Inductive Framework to Support Software Maintenance
National Science Foundation Award CCF-1016868
Denys Poshyvanyk (PI @ W&M)
Huzefa Kagdi (PI @ WSU)
Dates: August 2010 - July 2013

TraceLab - Traceability Instrument to Facilitate and Empower Traceability Research and Technology Transfer

The work will support a critical research agenda of the software engineering community and facilitate technology transfer of traceability solutions to business and industry. The traceability instrument, TraceLab, will contain a library of reusable trace algorithms and utilities; a benchmarked repository of trace-related datasets, tasks, metrics, and experimental results; a plug-and-play environment for conducting trace-related experiments; and predefined experimental templates representing common types of empirical traceability experiments. The traceability instrument will also facilitate the application of traceability solutions across a broad range of software engineering activities, including requirements analysis, architectural design, maintenance, reverse engineering, and independent verification and validation. This is a collaborative effort led by Jane Cleland-Huang (PI), DePaul University; Jonathan Maletic (co-PI), Kent State University; and Denys Poshyvanyk (co-PI), William and Mary. This project is sponsored by the NSF.

MRI-R2: Development of a Software Traceability Instrument to Facilitate and Empower Traceability Research and Technology Transfer
National Science Foundation Award CNS-0959924
Jane Cleland-Huang (PI @ DePaul)
Jonathan I. Maletic (co-PI @ KSU)
Denys Poshyvanyk (co-PI @ W&M)
Dates: June 2010 - May 2013

We gratefully acknowledge financial support from the NSF on this research project.