Virginia Torczon and Michael Trosset
Optimization problems that arise in engineering design are often characterized by several features that hinder the use of standard nonlinear optimization techniques. Foremost among these features is that the functions used to define the engineering optimization problem usually require the solution of differential equations, a process which is itself computationally intensive. Within a standard nonlinear optimization algorithm, the solution of these differential equations is required for each iteration of the algorithm. To mitigate such expense, an attractive alternative is to replace the computationally intensive objective with a less expensive surrogate.
In conformance with engineering practice, we draw a crucial distinction between surrogate models and surrogate approximations. Surrogate models are auxiliary simulations that are less physically faithful, but also less computationally expensive, than the expensive simulation that is regarded as ``truth.'' An instructive example is the use of an equivalent-plate analysis method in lieu of a finite element analysis, e.g. to analyze a wing-box of a high-speed civil transport. Surrogate models exist independently of the expensive simulation and can provide new information about the physical phenomenon of interest without requiring additional runs of the expensive simulation.
Surrogate approximations are algebraic summaries obtained from previous runs of the expensive simulation. Examples include the low-order polynomials favored in response surface methodology (RSM) and the kriging estimates employed in the design and analysis of computer experiments (DACE). Once the approximation has been constructed, it is typically inexpensive to evaluate.
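As a concrete illustration of a surrogate approximation, the following sketch fits an ordinary kriging interpolant to a handful of one-dimensional samples of an expensive function. The Gaussian correlation function and the fixed value of `theta` are assumptions made for illustration, not choices taken from the source; a production DACE code would estimate such parameters from the data.

```python
import math

def solve(A, b):
    # Gaussian elimination with partial pivoting (small dense systems only).
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def krige(xs, ys, theta=10.0):
    # Ordinary kriging in one dimension with an assumed Gaussian correlation
    # R_ij = exp(-theta * (x_i - x_j)^2); theta is a hypothetical fixed value.
    n = len(xs)
    R = [[math.exp(-theta * (xs[i] - xs[j]) ** 2) for j in range(n)]
         for i in range(n)]
    # Generalized least-squares estimate of the constant trend mu.
    Rinv_y = solve(R, list(ys))
    Rinv_1 = solve(R, [1.0] * n)
    mu = sum(Rinv_y) / sum(Rinv_1)
    w = solve(R, [y - mu for y in ys])
    def predict(x):
        # Predictor mu + r(x)^T R^{-1} (y - mu*1); it interpolates the data.
        r = [math.exp(-theta * (x - xs[i]) ** 2) for i in range(n)]
        return mu + sum(r[i] * w[i] for i in range(n))
    return predict
```

Because the predictor reproduces the sampled values exactly, each new run of the expensive simulation can be folded into the approximation simply by refitting with the enlarged data set.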
When surrogates are available, be they models or approximations, the optimizer hopes to use them to facilitate the search for a solution to the engineering optimization problem. Our ultimate goal is to design robust optimization algorithms, but we would like to do so in ways that allow us to make effective use of the information that good surrogates can provide. Toward this end, we adopt the perspective that the surrogate can be used to accelerate the optimization technique by exploiting the trends that such surrogates tend to identify. We do not worry about accuracy in the surrogate until it becomes clear either that the optimization technique is in the neighborhood of a minimizer or that the surrogate is not doing a sufficiently good job of identifying trends in the objective.
We consider a methodology that constructs a sequence of approximations to the objective. We concentrate on approaches, such as DACE, that krige known values of the objective, but our general strategy is also amenable to other classes of approximations. We make use of pattern search techniques to handle the optimization, though other approaches are possible. We choose pattern search techniques because they can easily be amended to exploit surrogates, can handle functions that are nondifferentiable or for which derivatives are difficult or too expensive to obtain, and can easily be adapted to either a parallel or a distributed computing environment. Pattern search methods are also less likely to be trapped by non-global minimizers than are traditional nonlinear optimization algorithms. Furthermore, recent analysis extends their applicability to optimization problems with general nonlinear constraints.
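To make the optimization side concrete, here is a minimal compass (coordinate) pattern search, one of the simplest members of the pattern search family. The step-halving rule, stopping tolerance, and evaluation budget are illustrative choices of ours, not the specific algorithm of the source; note that the method uses only function comparisons, never derivatives.

```python
def pattern_search(f, x0, step=1.0, tol=1e-6, max_evals=1000):
    # Compass pattern search: poll the 2n trial points x +/- step*e_i.
    # Accept the first trial point that improves on f(x); if none does,
    # halve the step length and poll again.
    x = list(x0)
    fx = f(x)
    evals = 1
    n = len(x)
    while step > tol and evals < max_evals:
        improved = False
        for i in range(n):
            for s in (step, -step):
                trial = x[:]
                trial[i] += s
                ft = f(trial)
                evals += 1
                if ft < fx:
                    x, fx = trial, ft
                    improved = True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5
    return x, fx
```

Since the 2n poll points are known in advance of any evaluation, they can be evaluated concurrently, which is the property that makes these methods easy to adapt to parallel or distributed environments; a surrogate can be exploited by reordering, or replacing, the poll points it predicts to be most promising.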
Our approach synthesizes recent ideas from both the numerical optimization and the computer experiments literature. Given a limited budget of expensive function evaluations that are to be used to solve an engineering optimization problem, we consider how to manage the trade-off between the expense of approximation and the expense of optimization. We believe that one should invest only a modest proportion of the original budget in constructing the initial approximation and that one should use the optimization procedure to help decide when and where further sampling should occur.
An obvious concern is the relation between the upper bound on the number of expensive function evaluations allowed, which we will call V, and the dimension n of the problem to be solved. When V is large relative to n, say n = 10 and V = 5000, then the expense of function evaluation is not necessarily an issue and we are content to rely on traditional optimization techniques. When V is not large relative to n, say n = 10 and V = 20, then the emphasis must be on trying to construct a simple approximation from what little information one has about the objective--there can be no hope of successfully applying an iterative optimization technique to the problem. (If V is less than n, then the methodologies we consider are inappropriate.) We are concerned with intermediate situations.
Our experience shows that managing approximations, which should be updated as optimization progresses, engenders a somewhat different set of issues than managing models, each of which remains static as optimization progresses. Earlier related work did not envision the sequential updating of a surrogate approximation, yet this sequential updating has proved to be the defining characteristic of much of our current research. We now understand that the sequential updating of surrogate approximations may create difficulties that are not encountered when either conventional optimization or the construction of the approximation is employed separately. Furthermore, the introduction of general constraints (as opposed to the special case of bound constraints) complicates not only the optimization, but also the construction of approximations.
We will conclude by discussing some of the issues that remain to be addressed. Currently, we advocate the introduction of merit functions that explicitly recognize the desirability of improving the current approximation to the expensive simulation. We also include suggestions for addressing the ill-conditioning that plagues kriging approximations constructed from sequential designs generated by optimization algorithms. In addition, we propose methods for obtaining space-filling designs for nonrectangular regions and propose pattern search algorithms for such regions. Ultimately, we wish to develop techniques for handling problems in which the constraints are implicitly defined by expensive simulations, but our investigation to date has focused on problems for which the constraints are defined by algebraic expressions.
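One simple form such a merit function might take, sketched here under our own assumptions rather than as the formulation of the source, augments the surrogate prediction with a bonus for moving away from previously sampled points, so that candidate iterates are credited both for predicted improvement and for the new information they would contribute to the approximation:

```python
import math

def merit(predict, samples, rho):
    # Hypothetical merit function: the surrogate prediction minus rho times
    # the distance to the nearest previously sampled point. Minimizing it
    # favors points that either look good to the surrogate or would most
    # improve it; rho >= 0 is an illustrative weight, and rho = 0 recovers
    # pure exploitation of the current approximation.
    def m(x):
        d = min(math.dist(x, s) for s in samples)
        return predict(x) - rho * d
    return m
```

Rewarding distance from existing sample sites also tends to counteract the ill-conditioning noted above, since it discourages the tightly clustered designs that optimization algorithms otherwise generate.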