Each UMSA student was required to research a topic of interest under the
guidance of one of the faculty members and submit a written report at the
conclusion of the research.
Each faculty member provided one or more research topics from which the
students could select.
Following is a compilation of research topics suggested by the faculty.
Links to the associated lecture, reading, presentation, and/or report (if any)
accompany each project.

Key to Associated Links:[lec] Lecture [rdg] Reading [prs] Student Presentation [rep] Final Student Report |

Algorithms

We have seen how two genome or protein sequences are evaluated for alignment
potential. With multiple strings, the problem is much more complex since the
"base" string need not be any one of the given strings. Develop and test
algorithms for this problem, perhaps using simulated annealing or some
algorithms from the literature. Possible summer extension: solve the K-best
version of this problem (in which a rank-ordered list of the K-best
solutions are given).

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

In this project, you will implement the simulated annealing algorithm,
together with a heuristic of your own to solve the delivery-truck problem.
In this problem, you are given customer demands, a delivery schedule, and a
map. The goal is to schedule truck routes so that all deliveries can be made
by their delivery times.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Implement the two algorithms for the facility location problem described in
the paper by Simha et al. The general facility location problem is an
optimization problem often found in urban planning, e.g., K-Mart wants to
open k new stores in Virginia and wants to know, based on customer demand
information, where to place these stores. Study the performance of the new
heuristic for various parameters. Confirm the results in the paper. Try new
heuristics (e.g., TABU search). Possible extension into summer project: use
methods to solve a related "dispersion" problem.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

A well-known problem in networks is involves the fair assignment of
resources among users. In some situations, users that are
separated by long distances receive less than their fair share. Construct a
simulation of a simple network and then implement two or three mechanisms
(faculty help will be provided) to ensure fairness and compare them.
Possible summer extension: develop a multi-path version of the methods.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Apply simulated annealing to the converter-placement problem in optical
networks (Note: No background in networks is needed to work on this
problem). The placement problem involves deciding where to place expensive
performance-enhancing "wavelength converters" around a network. Compare
solutions obtained to published heuristics. Possible summer extension:
develop a multi-path version of the problem.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

In this project, you will implement two algorithms for the well-known
Traveling Salesman problem. The first is the well-known Lin-Kernighan
heuristic whereas the second is an unusual algorithm in that it uses ideas
from the physics of elastic bands in deriving an algorithm.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

In this project, you will implement the simulated annealing algorithm and
the the simulated n-body algorithm for a location problem. These algorithms
are described in a paper. Then, you will implement a visualization of the
algorithm's progress in Java for a particular type of input data.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Discrete-state models

B-trees are a very useful data structure used to provide efficient access to
a large number of items (for example, in data-base manipulation systems).
Parallel algorithms for B-tree manipulation have been defined and are in
use, but they require to lock the nodes on a path from the root to the leaf
where the insertion should occur if the searched key is not found. We have
defined a new algorithm that does not require locks on searches; only if a
key needs to be inserted are locks performed. Since most operations in a
B-tree are searches for existing keys, this has the potential to greatly
speed-up the execution. The project is involved with providing a formal
verification of the correctness of our algorithm using methods discussed in
class.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Simulation

Similar to the research paper by Jones, study the performance of select
appropriate priority queue implementations for discrete-event simulation.
In this study, the student will relax the constraints of the hold model used
in the paper in order to examine performance when the event set size is not
constant.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

This research project is intimately related to the paper by Riska et al.
The student will simulate web server allocation policies that take into
consideration Quality of Service.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Applications

Many problems in probability can be solved easily by hand. When the problems
are slightly modified, however, they become computationally difficult. The
purpose of this project is to identify such probability problems and solve
them with the computer algebra system Maple with the aid of a new language
known as APPL (A Probability Programming Language).

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

The goal of this research project is to calculate the exact distribution of
the time to traverse a stochastic activity network. The Critical Path
Method (CPM) is used to determine the critical path in a deterministic
activity network, and the Program Review and Evaluation Technique (PERT) is
used to determine the completion time distribution in a stochastic activity
network. PERT assumes that the activity durations each have a beta
distribution, and uses the central limit theorem to approximate the project
duration. The goal is to use APPL to determine the exact distribution of
the time to complete any stochastic activity network with arbitrary
continuous activity durations.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Estimation

The sample coefficient of variation is defined as the ratio of the sample
standard deviation to the sample mean. It is an interesting quantity
because it is unitless and its population value is equal to one when the
population is exponentially distributed. This project considers the
generation of algorithms for determining confidence intervals for the
population coefficient of variation for various population distributions.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Design an algorithm for efficiently determining the form of a probability
density function. Identifying characteristics would include the type of
random variable of interest (discrete vs. continuous), support, and
functional form.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Simulation-based estimation involves comparing a sample generated by an
actual stochastic process with samples generated by stochastic simulation.
The comparison is accomplished by a discrepancy function that quantifies how
much one sample resembles another. Various functions are possible; for some
examples, it may require some ingenuity to develop an appropriate one. This
project will involve performing numerical experiments that compare the
efficacy of various discrepancy functions and/or developing appropriate
discrepancy functions for some unusual applications.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Extend the estimator described in the paper by Leemis to handle the
situation associated with several realizations of a process with different
ending times. For example, on Monday you record arrival times to a Burger
King drive-up window from 11:00 to 1:00. On Tuesday, you record arrival
times between 11:30 and 2:00. What is the appropriate way to estimate and
simulate the process?

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Develop general algorithms for determining confidence regions (a
generalization of the notion of confidence intervals in more than one
dimension) for unknown parameters given a data set of *n* observations.
This project would require the use of Maple to generate code which would
determine point and interval estimates for unknown parameters for a given
data set.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Optimization

The numerical optimization community often recommends direct search
algorithms for optimization in the presence of noise. However, recent
theoretical results challenge the propriety of this recommendation. This
project will explore the efficiency of direct search algorithms when
function evaluation is uncertain. It will involve implementing (or
modifying existing implementations of) several direct search algorithms and
performing numerical experiments that document their performance.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

An important theme of recent research on stochastic approximation is the
premise that gradients should be estimated from as little information as
possible. In contrast, it seems plausible that repeated sampling of the
objective function might result in better estimates and more efficient
algorithms, albeit at greater expense per iteration. Such an algorithm was
proposed by Dupuis and Simha. The central issue is whether it is more
efficient to do many inexpensive iterations or fewer iterations of greater
expense. This project will involve implementing several stochastic
approximation algorithms and performing numerical experiments intended to
shed light on this issue.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Reliability

Right-censoring complicates the analysis of a data set of survival times.
There is no known exact method for determining an interval estimate for a
Type I (time) censored sample drawn from an exponential population. This
project evaluates the various approximate methods via a large-scale Monte
Carlo simulation.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Consider a single-server queue with exponential time between arrivals and
exponential service times, i.e., an M/M/1 queue. Steady-state and transient
results are well known for measures of performance such as the expected
delay in queue when the mean time between arrivals and the mean service
times are known, fixed constants. In most practical situations, however,
these means are estimated from data. This research project investigates the
effect of sampling error on simulation output measures of interest.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Applications

Design and implement a multi-player cellular game. The objective is to
create a game with agents in the 2D landscape of a cellular world in which
each player's agents vie for dominance. Create a game in which winning
strategies are not obvious (or even predictable). Possible summer
extension: develop the game for public distribution.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

The topic of this research is based on the paper by Bouras and Frayssé.
The goal of this project is to understand the interaction between inner and
outer iterative methods for linear systems of equations and/or eigenvalue
problems, and when one must be stopped to achieve better
performance/convergence.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Straight-forward algorithms for drawing a planar 2-D triangulation mesh from
only angle information have proved numerically (and graphically) unstable. A
student will investigate the source of this instability and device new,
multilevel methods, that solve the problem.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Physical science

Beginning with "The Percolation Problem" chapter in the book by Gould and
Tobochnik, extend the percolation model to 3-d, paying particular attention
to the associated "cluster labeling algorithm". (See section 12.3).

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Social science

Study disease transmission in the artificial society. Each agent has an
immune system, baby agents inherent their immune system from their parents,
there is a fixed "pool" of diseases that circulate within the artificial
society, diseases are spread when the agents are in close contact, etc. When
compared to a disease-free society, what is the impact of disease on the
agent population?

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Extend the agent model to include two types of agents, "hunters" and
"gatherers". The gatherers roam the landscape collecting resource; the
hunters roam the landscape looking for gatherers from which they will steal
resource via "combat" (which can produce agent death). Under what conditions
will the society be able to achieve a stable (non-zero) population of both
hunters and gatherers?

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Within the context of the artificial society model presented in class, study
the effects of the landscape (including boundary conditions, resource
distributions, discontinuities, and overall shape) on carrying capacity for
a model with reproducing agents.

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]

Extend the landscape model to include a 2nd resource, then study "trade"
(economic interaction) within the society. That is, agents that have lots of
one resource but little of the other will engage in trade with other agents
that have an opposite distribution of resource wealth. What is the
steady-state "price" at which this trade occur?

[lec] [rdg] [prs] [rep]

[lec] [rdg] [prs] [rep]