### Suggested Research Projects

Each UMSA student was required to research a topic of interest under the guidance of one of the faculty members and submit a written report at the conclusion of the research. Each faculty member provided one or more research topics from which the students could select. Following is a compilation of research topics suggested by the faculty. Links to the associated lecture, reading, presentation, and/or report (if any) accompany each project.

 Key to Associated Links: [lec] Lecture    [rdg] Reading [prs] Student Presentation    [rep] Final Student Report

### Computer Science

Algorithms
We have seen how two genome or protein sequences are evaluated for alignment potential. With multiple strings, the problem is much more complex since the "base" string need not be any one of the given strings. Develop and test algorithms for this problem, perhaps using simulated annealing or some algorithms from the literature. Possible summer extension: solve the K-best version of this problem (in which a rank-ordered list of the K-best solutions are given).
[lec] [rdg] [prs] [rep]
In this project, you will implement the simulated annealing algorithm, together with a heuristic of your own to solve the delivery-truck problem. In this problem, you are given customer demands, a delivery schedule, and a map. The goal is to schedule truck routes so that all deliveries can be made by their delivery times.
[lec] [rdg] [prs] [rep]
Implement the two algorithms for the facility location problem described in the paper by Simha et al. The general facility location problem is an optimization problem often found in urban planning, e.g., K-Mart wants to open k new stores in Virginia and wants to know, based on customer demand information, where to place these stores. Study the performance of the new heuristic for various parameters. Confirm the results in the paper. Try new heuristics (e.g., TABU search). Possible extension into summer project: use methods to solve a related "dispersion" problem.
[lec] [rdg] [prs] [rep]
A well-known problem in networks is involves the fair assignment of resources among users. In some situations, users that are separated by long distances receive less than their fair share. Construct a simulation of a simple network and then implement two or three mechanisms (faculty help will be provided) to ensure fairness and compare them. Possible summer extension: develop a multi-path version of the methods.
[lec] [rdg] [prs] [rep]
Apply simulated annealing to the converter-placement problem in optical networks (Note: No background in networks is needed to work on this problem). The placement problem involves deciding where to place expensive performance-enhancing "wavelength converters" around a network. Compare solutions obtained to published heuristics. Possible summer extension: develop a multi-path version of the problem.
[lec] [rdg] [prs] [rep]
In this project, you will implement two algorithms for the well-known Traveling Salesman problem. The first is the well-known Lin-Kernighan heuristic whereas the second is an unusual algorithm in that it uses ideas from the physics of elastic bands in deriving an algorithm.
[lec] [rdg] [prs] [rep]
In this project, you will implement the simulated annealing algorithm and the the simulated n-body algorithm for a location problem. These algorithms are described in a paper. Then, you will implement a visualization of the algorithm's progress in Java for a particular type of input data.
[lec] [rdg] [prs] [rep]
Discrete-state models
B-trees are a very useful data structure used to provide efficient access to a large number of items (for example, in data-base manipulation systems). Parallel algorithms for B-tree manipulation have been defined and are in use, but they require to lock the nodes on a path from the root to the leaf where the insertion should occur if the searched key is not found. We have defined a new algorithm that does not require locks on searches; only if a key needs to be inserted are locks performed. Since most operations in a B-tree are searches for existing keys, this has the potential to greatly speed-up the execution. The project is involved with providing a formal verification of the correctness of our algorithm using methods discussed in class.
[lec] [rdg] [prs] [rep]
Simulation
Similar to the research paper by Jones, study the performance of select appropriate priority queue implementations for discrete-event simulation. In this study, the student will relax the constraints of the hold model used in the paper in order to examine performance when the event set size is not constant.
[lec] [rdg] [prs] [rep]
This research project is intimately related to the paper by Riska et al. The student will simulate web server allocation policies that take into consideration Quality of Service.
[lec] [rdg] [prs] [rep]

### Mathematics

Applications
Many problems in probability can be solved easily by hand. When the problems are slightly modified, however, they become computationally difficult. The purpose of this project is to identify such probability problems and solve them with the computer algebra system Maple with the aid of a new language known as APPL (A Probability Programming Language).
[lec] [rdg] [prs] [rep]
The goal of this research project is to calculate the exact distribution of the time to traverse a stochastic activity network. The Critical Path Method (CPM) is used to determine the critical path in a deterministic activity network, and the Program Review and Evaluation Technique (PERT) is used to determine the completion time distribution in a stochastic activity network. PERT assumes that the activity durations each have a beta distribution, and uses the central limit theorem to approximate the project duration. The goal is to use APPL to determine the exact distribution of the time to complete any stochastic activity network with arbitrary continuous activity durations.
[lec] [rdg] [prs] [rep]
Estimation
The sample coefficient of variation is defined as the ratio of the sample standard deviation to the sample mean. It is an interesting quantity because it is unitless and its population value is equal to one when the population is exponentially distributed. This project considers the generation of algorithms for determining confidence intervals for the population coefficient of variation for various population distributions.
[lec] [rdg] [prs] [rep]
Design an algorithm for efficiently determining the form of a probability density function. Identifying characteristics would include the type of random variable of interest (discrete vs. continuous), support, and functional form.
[lec] [rdg] [prs] [rep]
Simulation-based estimation involves comparing a sample generated by an actual stochastic process with samples generated by stochastic simulation. The comparison is accomplished by a discrepancy function that quantifies how much one sample resembles another. Various functions are possible; for some examples, it may require some ingenuity to develop an appropriate one. This project will involve performing numerical experiments that compare the efficacy of various discrepancy functions and/or developing appropriate discrepancy functions for some unusual applications.
[lec] [rdg] [prs] [rep]
Extend the estimator described in the paper by Leemis to handle the situation associated with several realizations of a process with different ending times. For example, on Monday you record arrival times to a Burger King drive-up window from 11:00 to 1:00. On Tuesday, you record arrival times between 11:30 and 2:00. What is the appropriate way to estimate and simulate the process?
[lec] [rdg] [prs] [rep]
Develop general algorithms for determining confidence regions (a generalization of the notion of confidence intervals in more than one dimension) for unknown parameters given a data set of n observations. This project would require the use of Maple to generate code which would determine point and interval estimates for unknown parameters for a given data set.
[lec] [rdg] [prs] [rep]
Optimization
The numerical optimization community often recommends direct search algorithms for optimization in the presence of noise. However, recent theoretical results challenge the propriety of this recommendation. This project will explore the efficiency of direct search algorithms when function evaluation is uncertain. It will involve implementing (or modifying existing implementations of) several direct search algorithms and performing numerical experiments that document their performance.
[lec] [rdg] [prs] [rep]
An important theme of recent research on stochastic approximation is the premise that gradients should be estimated from as little information as possible. In contrast, it seems plausible that repeated sampling of the objective function might result in better estimates and more efficient algorithms, albeit at greater expense per iteration. Such an algorithm was proposed by Dupuis and Simha. The central issue is whether it is more efficient to do many inexpensive iterations or fewer iterations of greater expense. This project will involve implementing several stochastic approximation algorithms and performing numerical experiments intended to shed light on this issue.
[lec] [rdg] [prs] [rep]
Reliability
Right-censoring complicates the analysis of a data set of survival times. There is no known exact method for determining an interval estimate for a Type I (time) censored sample drawn from an exponential population. This project evaluates the various approximate methods via a large-scale Monte Carlo simulation.
[lec] [rdg] [prs] [rep]
Consider a single-server queue with exponential time between arrivals and exponential service times, i.e., an M/M/1 queue. Steady-state and transient results are well known for measures of performance such as the expected delay in queue when the mean time between arrivals and the mean service times are known, fixed constants. In most practical situations, however, these means are estimated from data. This research project investigates the effect of sampling error on simulation output measures of interest.
[lec] [rdg] [prs] [rep]

### Interdisciplinary Topics

Applications
Design and implement a multi-player cellular game. The objective is to create a game with agents in the 2D landscape of a cellular world in which each player's agents vie for dominance. Create a game in which winning strategies are not obvious (or even predictable). Possible summer extension: develop the game for public distribution.
[lec] [rdg] [prs] [rep]
The topic of this research is based on the paper by Bouras and Frayssé. The goal of this project is to understand the interaction between inner and outer iterative methods for linear systems of equations and/or eigenvalue problems, and when one must be stopped to achieve better performance/convergence.
[lec] [rdg] [prs] [rep]
Straight-forward algorithms for drawing a planar 2-D triangulation mesh from only angle information have proved numerically (and graphically) unstable. A student will investigate the source of this instability and device new, multilevel methods, that solve the problem.
[lec] [rdg] [prs] [rep]
Physical science
Beginning with "The Percolation Problem" chapter in the book by Gould and Tobochnik, extend the percolation model to 3-d, paying particular attention to the associated "cluster labeling algorithm". (See section 12.3).
[lec] [rdg] [prs] [rep]
Social science
Study disease transmission in the artificial society. Each agent has an immune system, baby agents inherent their immune system from their parents, there is a fixed "pool" of diseases that circulate within the artificial society, diseases are spread when the agents are in close contact, etc. When compared to a disease-free society, what is the impact of disease on the agent population?
[lec] [rdg] [prs] [rep]
Extend the agent model to include two types of agents, "hunters" and "gatherers". The gatherers roam the landscape collecting resource; the hunters roam the landscape looking for gatherers from which they will steal resource via "combat" (which can produce agent death). Under what conditions will the society be able to achieve a stable (non-zero) population of both hunters and gatherers?
[lec] [rdg] [prs] [rep]
Within the context of the artificial society model presented in class, study the effects of the landscape (including boundary conditions, resource distributions, discontinuities, and overall shape) on carrying capacity for a model with reproducing agents.
[lec] [rdg] [prs] [rep]
Extend the landscape model to include a 2nd resource, then study "trade" (economic interaction) within the society. That is, agents that have lots of one resource but little of the other will engage in trade with other agents that have an opposite distribution of resource wealth. What is the steady-state "price" at which this trade occur?
[lec] [rdg] [prs] [rep]