optim — non-linear optimization routine

Calling Sequence

[f,xopt]=optim(costf,x0)
[f [,xopt [,gradopt [,work]]]]=optim(costf [,<contr>],x0 [,algo] [,df0 [,mem]] [,work] [,<stop>] [,<params>] [,imp=iflag])

Parameters

costf
    external, i.e. a Scilab function, a list or a string (costf is the
    cost function: a Scilab script, a Fortran 77 routine or a C
    function).

x0
    real vector (initial value of the variable to be minimized).

f
    value of the optimal cost (f=costf(xopt)).

xopt
    best value of x found.

<contr>
    keyword representing the following sequence of arguments:
    'b',binf,bsup, where binf and bsup are real vectors with the same
    dimension as x0. binf and bsup are the lower and upper bounds on x.

algo
    'qn' : quasi-Newton (this is the default solver)
    'gc' : conjugate gradient
    'nd' : non-differentiable.
    Note that the conjugate gradient solver does not accept bounds on x.

df0
    real scalar. Guessed decrease of f at the first iteration
    (df0=1 is the default value).

mem
    integer, number of variables used to approximate the Hessian.
    The default value is 10. This feature is available for the
    conjugate gradient algorithm "gc" without constraints and the
    non-smooth algorithm "nd" without constraints.

<stop>
    keyword representing the sequence of optional parameters
    controlling the convergence of the algorithm:
    'ar',nap [,iter [,epsg [,epsf [,epsx]]]]

    "ar"
        reserved keyword for stopping rule selection, defined as follows:
    nap
        maximum number of calls to costf allowed (default is 100).
    iter
        maximum number of iterations allowed (default is 100).
    epsg
        threshold on the gradient norm.
    epsf
        threshold controlling the decrease of f.
    epsx
        threshold controlling the variation of x. This vector (possibly
        matrix) of the same size as x0 can be used to scale x.

<params>
    keyword representing the method to initialize the arguments
    ti, td passed to the objective function, provided as a C or Fortran
    routine. This option has no meaning when the cost function is a
    Scilab script. <params> can be set to only one of the following
    values.

"in"
    This mode allocates memory in the internal Scilab workspace so that
    the objective function can get arrays of the required size without
    allocating the memory itself. "in" stands for "initialization". In
    this mode, before the value and derivative of the objective function
    are computed, there is a dialog between the optim Scilab primitive
    and the objective function. In this dialog, the objective function
    is called twice, with particular values of the "ind" parameter. The
    first time, ind is set to 10 and the objective function is expected
    to set the nizs, nrzs and ndzs integer parameters of the "nird"
    common. This allows Scilab to allocate memory inside its internal
    workspace. The second time the objective function is called, ind is
    set to 11 and the objective function is expected to set the ti, tr
    and tz arrays. After this initialization phase, each time it is
    called, the objective function is guaranteed that the ti, tr and tz
    arrays passed to it have the values that were previously
    initialized.

"ti",valti
    In this mode, valti is expected to be a Scilab vector
    variable containing integers. Whenever the objective function is
    called, the ti array it receives contains the values of the Scilab
    variable.

"td",valtd
    In this mode, valtd is expected to be a Scilab vector variable
    containing double values. Whenever the objective function is called,
    the td array it receives contains the values of the Scilab variable.

"ti",valti,"td",valtd
    This mode combines the two previous ones.

The ti, td arrays may be used so that the objective function can be
computed. For example, if the objective function is a polynomial, the
ti array may be used to store the coefficients of that polynomial.

Users should choose carefully between the "in" mode and the "ti" and
"td" modes, depending on whether the arrays are Scilab variables or
not. If the data is available as Scilab variables, the "ti", valti,
"td", valtd mode should be chosen. If the data is available directly
from the objective function, the "in" mode should be chosen. Notice
that there is no "tr" mode since, in Scilab, all real values are of
"double" type.

If neither the "in" mode nor the "ti", "td" mode is chosen, that is,
if <params> is not present as an option of the optim primitive, the
user should not assume that the ti, tr and td arrays can be used:
reading or writing these arrays may produce unpredictable results.

"imp=iflag"
    named argument used to set the trace mode. The possible values
    for iflag are 0, 1, 2 and >2. Use this option with caution: most of
    these reports are written on the Scilab standard output.

    iflag=0: nothing (except errors) is reported (this is the default),
    iflag=1: initial and final reports,
    iflag=2: adds a report per iteration,
    iflag>2: adds reports on the linear search,
    iflag<0: calls the cost function with ind=1 every -imp iterations.

gradopt
    gradient of costf at xopt.

work
    working array for hot restart for the quasi-Newton method. This
    array is automatically initialized by optim when optim is invoked.
    It can be used as an input parameter to speed up the calculations.

Description

Non-linear optimization routine for programs without constraints or
with bound constraints.

costf is an "external", i.e. a Scilab function, a list or a string
giving the name of a C or Fortran routine (see "external"). This
external must return the value f of the cost function at the point x
and the gradient g of the cost function at the point x.

- Scilab function case

If costf is a Scilab function, the calling
sequence for costf must be:

[f, g, ind] = costf(x, ind)

Here, costf is a function which returns f, the value (real number) of
the cost function at x, and g, the gradient vector of the cost
function at x. The variable ind is described below.

- List case

If costf is a list, it should be of the
form list(real_costf, arg1,...,argn), with real_costf a Scilab
function with calling sequence [f,g,ind]=real_costf(x,ind,arg1,...,argn).
The x, f, g, ind arguments have the same meaning as above. The argi
arguments can be used to pass function parameters.

- String case

If costf is a character string, it refers
to the name of a C or Fortran routine which must be linked to Scilab.

* Fortran case

The interface of the Fortran subroutine computing the objective must be:

subroutine costf(ind, n, x, f, g, ti, tr, td)

with the following declarations:

integer ind, n, ti(*)
real tr(*)
double precision x(n), f, g(n), td(*)

The argument ind is described below.

If ind = 2, 3 or 4, the inputs of the routine are: x, ind, n, ti, tr, td.
If ind = 2, 3 or 4, the outputs of the routine are: f and g.

* C case

The interface of the C function computing the objective must be:

void costf(int *ind, int *n, double *x, double *f, double *g,
           int *ti, float *tr, double *td)

The argument ind is described below. The inputs and outputs of the
function are the same as in the Fortran case.

If ind=2 (resp. 3, 4), costf must provide f (resp. g, f and g).
If ind=1, nothing is computed (this value is used for display purposes only).
On output, ind<0 means that f cannot be evaluated at x and
ind=0 interrupts the optimization.

Example #1 : Scilab function

The following is an example with a Scilab function. Notice that, for
simplicity, the Scilab function "cost" below computes the objective
function f and its derivative whatever the value of ind. This keeps
the example simple. In practical situations, though, the computation
of "f" and "g" may raise performance issues, so a direct optimization
may be to use the value of "ind" to compute "f" and "g" only when
needed.

xref=[1;2;3];
x0=[1;-1;1];
function [f,g,ind] = cost(x,ind)
    f=0.5*norm(x-xref)^2;
    g=x-xref;
endfunction
[fopt,xopt]=optim(cost,x0)

Example #2 : C function

The following is an example with a C function, where a C source code
is written into a file, dynamically compiled and loaded into Scilab, and
then used by the "optim" solver. The interface of the "rosenc" function is
fixed, even if the arguments are not really used in the cost function.
This is because the underlying optimization solvers must assume that the
objective function has a known, constant interface. In the following
example, the arrays ti and tr are not used, only the array "td" is used,
as a parameter of the Rosenbrock function. Notice that the contents of
the arrays ti and td are the same as the contents of the Scilab
variables, as expected.

C = [
'double sq(double x)'
'{ return x*x;}'
'void rosenc(int *ind, int *n, double *x, double *f, double *g, '
' int *ti, float *tr, double *td)'
'{'
' double p;'
' int i;'
' p=td[0];'
' if (*ind==2||*ind==4) {'
' *f=1.0;'
' for (i=1;i<*n;i++)'
' *f+=p*sq(x[i]-sq(x[i-1]))+sq(1.0-x[i]);'
' }'
' if (*ind==3||*ind==4) {'
' g[0]=-4.0*p*(x[1]-sq(x[0]))*x[0];'
' for (i=1;i<*n-1;i++)'
' g[i]=2.0*p*(x[i]-sq(x[i-1]))-4.0*p*(x[i+1]-sq(x[i]))*x[i]-2.0*(1.0-x[i]);'
' g[*n-1]=2.0*p*(x[*n-1]-sq(x[*n-2]))-2.0*(1.0-x[*n-1]);'
' }'
'}'];
mputl(C,TMPDIR+'/rosenc.c')
// compile the C code
l=ilib_for_link('rosenc','rosenc.o',[],'c',TMPDIR+'/Makefile');
// incremental linking
link(l,'rosenc','c')
//solve the problem
x0=[40;10;50];
p=100;
[f,xo,go]=optim('rosenc',x0,'td',p)
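The <stop> parameters documented above can be combined with any of the
examples. The following is a minimal sketch of the "ar" stopping rule,
using a hypothetical quadratic cost function (quadcost is illustrative
and not part of the original examples):

```scilab
// Sketch: control convergence with the "ar" stopping rule.
// quadcost is a hypothetical quadratic cost, for illustration only.
xref = [1; 2; 3];
function [f, g, ind] = quadcost(x, ind)
    f = 0.5 * norm(x - xref)^2;
    g = x - xref;
endfunction
x0 = [1; -1; 1];
// At most 50 calls to quadcost, at most 50 iterations,
// stop when the gradient norm falls below 1e-6.
[fopt, xopt] = optim(quadcost, x0, "ar", 50, 50, 1e-6)
```

The remaining thresholds epsf and epsx may be appended in the same way,
following the order given in the parameter list.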
Example #3 : Fortran function

The following is an example with a Fortran function.

Example #4 : Fortran function with initialization

The following is an example with a Fortran function in which the
"in" option is used to allocate memory inside the Scilab environment. In
this mode, there is a dialog between Scilab and the objective function.
The goal of this dialog is to initialize the parameters of the objective
function. Each part of this dialog is based on a specific value of the
"ind" parameter.At the beginning, Scilab calls the objective function, with the ind
parameter equals to 10. This tells the objective function to initialize
the sizes of the arrays it needs by setting the nizs, nrzs and ndzs
integer parameters of the "nird" common. Then the objective function
returns. At this point, Scilab creates internal variables and allocates
memory for the variables izs, rzs and dzs. Scilab calls the objective
function back again, this time with ind equal to 11. This tells the
objective function to initialize the arrays izs, rzs and dzs. When the
objective function has done so, it returns. Then Scilab enters the
actual optimization mode and calls the optimization solver that the user requested.
Whenever the objective function is called, the izs, rzs and dzs arrays
have the values that have been previously initialized.

Example #5 : Fortran function with initialization on Windows with Intel Fortran Compiler

Under the Windows operating system with the Intel Fortran Compiler, one
must carefully design the Fortran source code so that the dynamic link
works properly. On Scilab's side, the optimization component is
dynamically linked and the symbol "nird" is exported out of the
optimization DLL. On the cost function's side, which is also dynamically
linked, the "nird" common must be imported in the cost function
DLL.

The following example is a rewrite of the previous example, with
special attention to the Windows operating system with the Intel
Fortran compiler. In that case, we introduce additional compiling
instructions, which allow the compiler to import the "nird"
symbol.

Example #6 : Logging features

The imp flag may take negative integer values, say k.
In that case, the cost function is called once every -k iterations.
This makes it possible to plot the function value or to write a log file.
In the following example, we solve the Rosenbrock test case.
For each iteration of the algorithm, we print the value of x, f and g.
xref=[1;2;3];
x0=[1;-1;1];
function [f,g,ind] = cost(x,ind)
f=0.5*norm(x-xref)^2;
g=x-xref;
endfunction
function [f,g,ind] = cost(x,ind)
    // f is needed whenever ind is 1 (logging), 2 or 4
    if ind == 1 | ind == 2 | ind == 4 then
        f=0.5*norm(x-xref)^2;
    end
    // g is needed whenever ind is 1 (logging), 3 or 4
    if ind == 1 | ind == 3 | ind == 4 then
        g=x-xref;
    end
    if ind == 1 then
        mprintf("x = %s\n", strcat(string(x)," "))
        mprintf("f = %e\n", f)
        mprintf("g = %s\n", strcat(string(g)," "))
    end
endfunction
[f,xopt]=optim(cost,x0,imp=-1)
In the following example, we solve the Rosenbrock test case.
For each iteration of the algorithm, we plot the current value of x
into a 2D graph containing the contours of Rosenbrock's function.
This makes it possible to watch the progress of the algorithm while it
is running. We could just as well write the values of x, f and g into
a log file if needed.
// 1. Define rosenbrock
function [ f , g , ind ] = rosenbrock ( x , ind )
if ((ind == 1) | (ind == 2) | (ind == 4)) then
f = 100.0 *(x(2)-x(1)^2)^2 + (1-x(1))^2;
end
if ((ind == 1) | (ind == 3) | (ind == 4)) then
g(1) = - 400. * ( x(2) - x(1)**2 ) * x(1) -2. * ( 1. - x(1) )
g(2) = 200. * ( x(2) - x(1)**2 )
end
if (ind == 1) then
plot ( x(1) , x(2) , "g." )
end
endfunction
x0 = [-1.2 1.0];
xopt = [1.0 1.0];
// 2. Draw the contour of Rosenbrock's function
xmin = -2.0
xmax = 2.0
ymin = -1.0
ymax = 2.0
nx = 100
ny = 100
stepx = (xmax - xmin)/nx;
xdata = xmin:stepx:xmax;
stepy = (ymax - ymin)/ny;
ydata = ymin:stepy:ymax;
for ix = 1:length(xdata)
for iy = 1:length(ydata)
x = [xdata(ix) ydata(iy)];
f = rosenbrock ( x , 2 );
zdata ( ix , iy ) = f;
end
end
contour ( xdata , ydata , zdata , [1 10 100 500 1000])
plot(x0(1) , x0(2) , "b.")
plot(xopt(1) , xopt(2) , "r*")
// 3. Plot the optimization process, during optimization
[ fopt , xopt ] = optim ( rosenbrock , x0 , imp = -1)
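The algo parameter documented above selects the underlying solver. As a
sketch, reusing the rosenbrock function defined just above, the
conjugate gradient solver "gc" can be requested instead of the default
quasi-Newton method (recall that "gc" does not accept bounds on x):

```scilab
// Sketch: select the conjugate gradient solver "gc" instead of the
// default quasi-Newton solver "qn". rosenbrock is defined above.
x0 = [-1.2 1.0];
[fopt, xopt] = optim(rosenbrock, x0, "gc")
```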
Example #7 : Optimizing without derivatives

It is possible to optimize a problem without explicit
knowledge of the derivative of the cost function.
For this purpose, we can use the numdiff or derivative function
to compute a numerical derivative of the cost function.
In the following example, we use the numdiff function to
solve Rosenbrock's problem.
function f = rosenbrock ( x )
f = 100.0 *(x(2)-x(1)^2)^2 + (1-x(1))^2;
endfunction
function [ f , g , ind ] = rosenbrockCost ( x , ind )
if ((ind == 1) | (ind == 2) | (ind == 4)) then
f = rosenbrock ( x );
end
if ((ind == 1) | (ind == 3) | (ind == 4)) then
g= numdiff ( rosenbrock , x );
end
endfunction
x0 = [-1.2 1.0];
[ fopt , xopt ] = optim ( rosenbrockCost , x0 )
In the following example, we use the derivative function to
solve Rosenbrock's problem. Given that the step computation strategy
is not the same in numdiff and derivative, this might lead to improved
results.
function f = rosenbrock ( x )
f = 100.0 *(x(2)-x(1)^2)^2 + (1-x(1))^2;
endfunction
function [ f , g , ind ] = rosenbrockCost2 ( x , ind )
if ((ind == 1) | (ind == 2) | (ind == 4)) then
f = rosenbrock ( x );
end
if ((ind == 1) | (ind == 3) | (ind == 4)) then
g= derivative ( rosenbrock , x.' , order = 4 );
end
endfunction
x0 = [-1.2 1.0];
[ fopt , xopt ] = optim ( rosenbrockCost2 , x0 )
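Bound constraints, documented in the <contr> parameter, are passed with
the 'b' keyword placed before the initial guess x0, as shown in the
calling sequence. A minimal sketch follows (the quadratic cost function
qcost is illustrative and not part of the original examples):

```scilab
// Sketch: bound-constrained minimization with the "b" keyword.
xref = [1; 2; 3];
function [f, g, ind] = qcost(x, ind)
    f = 0.5 * norm(x - xref)^2;
    g = x - xref;
endfunction
binf = [-1; -1; -1];   // lower bounds on x
bsup = [ 2;  2;  2];   // upper bounds on x
x0 = [0; 0; 0];
// "b", binf, bsup come before x0, per the calling sequence.
[fopt, xopt] = optim(qcost, "b", binf, bsup, x0)
```

Since the conjugate gradient solver does not accept bounds, this mode
should be used with the "qn" (default), "gc" with bounds via gcbd, or
"nd" solvers as listed in the References section.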
See Also

external, qpsolve, datafit, leastsq, numdiff, derivative, NDcost

References

The following is a map from the various options to the underlying
solvers, with some comments about the algorithms, when available.

"qn" without constraints
    n1qn1 : a quasi-Newton method with a Wolfe-type line search

"qn" with bounds constraints
    qnbd : a quasi-Newton method with projection
    RR-0242 - A variant of a projected variable metric method for
    bound constrained optimization problems, Bonnans Frederic, Rapport
    de recherche de l'INRIA - Rocquencourt, Octobre 1983

"gc" without constraints
    n1qn3 : a conjugate gradient method with BFGS

"gc" with bounds constraints
    gcbd : a BFGS-type method with limited memory and projection

"nd" without constraints
    n1fc1 : a bundle method

"nd" with bounds constraints
    not available

Author

The Modulopt library: J. Frederic Bonnans, Jean-Charles Gilbert,
Claude Lemarechal

The interfaces to the Modulopt library: J. Frederic Bonnans

This help: Michael Baudin