1 <?xml version="1.0" encoding="UTF-8"?>
3 * Scilab ( http://www.scilab.org/ ) - This file is part of Scilab
4 * Copyright (C) 2008 - INRIA
5 * Copyright (C) 2008 - 2009 - INRIA - Michael Baudin
6 * Copyright (C) 2010 - 2011 - DIGITEO - Michael Baudin
8 * This file must be used under the terms of the CeCILL.
9 * This source file is licensed as described in the file COPYING, which
10 * you should have received as part of this distribution. The terms
11 * are also available at
12 * http://www.cecill.info/licences/Licence_CeCILL_V2-en.txt
15 <refentry xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg" xmlns:ns3="http://www.w3.org/1999/xhtml" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:db="http://docbook.org/ns/docbook" xmlns:scilab="http://www.scilab.org" xml:id="optim" xml:lang="en">
17 <refname>optim</refname>
18 <refpurpose>non-linear optimization routine</refpurpose>
21 <title>Calling Sequence</title>
23 fopt = optim(costf, x0)
fopt = optim(costf [,<contr>], x0 [,algo] [,df0 [,mem]] [,work] [,<stop>] [,<params>] [,imp=iflag])
25 [fopt, xopt] = optim(...)
26 [fopt, xopt, gopt] = optim(...)
27 [fopt, xopt, gopt, work] = optim(...)
31 <title>Arguments</title>
36 <para>a function, a list or a string, the objective function.</para>
<para>real vector, the initial guess for <literal>x</literal>.
48 <term><contr></term>
50 <para>an optional sequence of arguments containing the lower and
51 upper bounds on <literal>x</literal>. If bounds are required, this
52 sequence of arguments must be <literal>"b",binf,bsup</literal> where
53 <literal>binf</literal> and <literal>bsup</literal> are real vectors
54 with same dimension as <literal>x0</literal>.
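For instance, assuming a cost function <literal>mycost</literal> and an
initial guess <literal>x0</literal> of dimension 2 are already defined, a
minimal sketch of a bounded call is:
<screen>binf = [-1; -1];
bsup = [1; 1];
[fopt, xopt] = optim(mycost, "b", binf, bsup, x0)</screen>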
61 <para>a string, the algorithm to use (default
62 <literal>algo="qn"</literal>).
64 <para>The available algorithms are:</para>
68 <literal>"qn"</literal>: Quasi-Newton with BFGS
73 <literal>"gc"</literal>: limited memory BFGS
78 <literal>"nd"</literal>: non-differentiable.
81 The <literal>"nd"</literal> algorithm does not accept
82 bounds on <literal>x</literal>.
real scalar, an estimate of the expected decrease of <literal>f</literal>
at the first iteration (default <literal>df0=1</literal>).
<para>integer, the number of variables used to approximate the
Hessian (default <literal>mem=10</literal>). This feature is
available for the <literal>"gc"</literal> algorithm without
constraints and for the non-smooth algorithm <literal>"nd"</literal>.
109 <term><stop></term>
111 <para>a sequence of arguments containing the parameters controlling
112 the convergence of the algorithm. The following sequences are
available: <screen>"ar", nap
"ar", nap, iter
"ar", nap, iter, epsg
"ar", nap, iter, epsg, epsf
"ar", nap, iter, epsg, epsf, epsx</screen>
126 maximum number of calls to <literal>costf</literal>
127 allowed (default <literal>nap=100</literal>).
134 <para>maximum number of iterations allowed (default
135 <literal>iter=100</literal>).
142 <para>threshold on gradient norm (default
143 <literal>epsg= %eps</literal>).
threshold controlling the decrease of <literal>f</literal>
(default <literal>epsf=0</literal>).
threshold controlling the variation of <literal>x</literal>
(default <literal>epsx=0</literal>). This vector (possibly a
matrix) with the same size as <literal>x0</literal> can be used to
scale <literal>x</literal>.
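For instance, a hedged sketch combining these parameters (assuming a cost
function <literal>mycost</literal> and an initial guess
<literal>x0</literal>) may read:
<screen>// at most 200 calls, 100 iterations, default epsg and epsf,
// and a componentwise threshold of 1e-4 on the variation of x
[fopt, xopt] = optim(mycost, x0, "ar", 200, 100, %eps, 0, 1e-4 * ones(x0))</screen>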
171 <term><params></term>
173 <para>in the case where the objective function is a C or Fortran
174 routine, a sequence of arguments containing the method to
175 communicate with the objective function. This option has no meaning
176 when the cost function is a Scilab script.
<para>The available values for <params> are the following:
184 <literal>"in"</literal>
<para>This mode allows memory to be allocated in the internal Scilab
workspace so that the objective function can access arrays of the
required size without allocating the memory itself. The
<literal>"in"</literal> value stands for "initialization". In
that mode, before the value and derivative of the objective
function are computed, there is a dialog between the
<literal>optim</literal> Scilab primitive and the objective
function <literal>costf</literal>. In this dialog, the objective
function is called twice, with particular values of the
<literal>ind</literal> parameter. The first time,
<literal>ind</literal> is set to 10 and the objective function
is expected to set the <literal>nizs</literal>,
<literal>nrzs</literal> and <literal>ndzs</literal> integer
parameters of the <literal>nird</literal> common, which is
202 <screen>common /nird/ nizs,nrzs,ndzs </screen>
<para>This allows Scilab to allocate memory inside its internal
workspace. The second time the objective function is called,
<literal>ind</literal> is set to 11 and the objective function
is expected to set the <literal>ti</literal>,
<literal>tr</literal> and <literal>td</literal> arrays. After
this initialization phase, each time it is called, the objective
function is guaranteed that the <literal>ti</literal>,
<literal>tr</literal> and <literal>td</literal> arrays
passed to it hold the values that have been previously initialized.
217 <literal>"ti",valti</literal>
220 In this mode, <literal>valti</literal> is expected to be a
221 Scilab vector variable containing integers. Whenever the
222 objective function is called, the <literal>ti</literal> array it
223 receives contains the values of the Scilab variable.
228 <literal>"td", valtd</literal>
231 In this mode, <literal>valtd</literal> is expected to be a
232 Scilab vector variable containing double values. Whenever the
233 objective function is called, the <literal>td</literal> array it
234 receives contains the values of the Scilab variable.
239 <literal>"ti",valti,"td",valtd</literal>
241 <para>This mode combines the two previous modes.</para>
The <literal>ti</literal> and <literal>td</literal> arrays can be used to store the data
needed to evaluate the objective function. For example, if the objective
function is a polynomial, the ti array may be used to store the
coefficients of that polynomial.
<para>Users should choose carefully between the
<literal>"in"</literal> mode and the <literal>"ti"</literal> and
<literal>"td"</literal> modes, depending on whether the data is
available as Scilab variables or not. If the data is available as Scilab
variables, then the <literal>"ti", valti, "td", valtd</literal> mode
should be chosen. If the data is only available from within the
objective function, the <literal>"in"</literal> mode should be
chosen. Notice that there is no <literal>"tr"</literal> mode, since,
in Scilab, all real values are doubles.
<para>If neither the "in" mode nor the "ti", "td" mode is chosen,
that is, if <params> is not present as an option of the optim
primitive, the user should not assume that the ti, tr and td
arrays can be used: reading or writing these arrays may produce
unpredictable results.
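As a hedged sketch, assuming a compiled cost routine named
<literal>"mycostf"</literal> has already been linked to Scilab, the
<literal>"td"</literal> mode may be used as follows:
<screen>td = [2.0; 3.0]; // data made available to the routine in its td array
[fopt, xopt] = optim("mycostf", x0, "td", td)</screen>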
269 <term>"imp=iflag"</term>
<para>named argument used to set the trace mode (default
<literal>imp=0</literal>, which prints no messages). If <varname>imp</varname>
is greater than or equal to 1, more information is printed, depending on the
algorithm chosen. More precisely:
278 <para><literal>"qn"</literal> without constraints: from <literal>iflag=1</literal>
279 to <literal>iflag=3</literal>.</para>
282 <para><literal>iflag>=1</literal>: initial and final print,</para>
285 <para><literal>iflag>=2</literal>: one line per iteration (number of iterations,
286 number of calls to f, value of f),</para>
289 <para><literal>iflag>=3</literal>: extra information on line searches.</para>
294 <para><literal>"qn"</literal> with bounds constraints: from <literal>iflag=1</literal>
295 to <literal>iflag=4</literal>.</para>
298 <para><literal>iflag>=1</literal>: initial and final print,</para>
301 <para><literal>iflag>=2</literal>: one line per iteration (number of iterations,
302 number of calls to f, value of f),</para>
305 <para><literal>iflag>=3</literal>: extra information on line searches.</para>
310 <para><literal>"gc"</literal> without constraints: from <literal>iflag=1</literal>
311 to <literal>iflag=5</literal>.</para>
314 <para><literal>iflag>=1</literal> and <literal>iflag>=2</literal>: initial and final print,</para>
317 <para><literal>iflag=3</literal>: one line per iteration (number of iterations,
318 number of calls to f, value of f),</para>
<para><literal>iflag>=4</literal>: extra information on line searches.</para>
326 <para><literal>"gc"</literal> with bounds constraints: from <literal>iflag=1</literal>
327 to <literal>iflag=3</literal>.</para>
330 <para><literal>iflag>=1</literal>: initial and final print,</para>
333 <para><literal>iflag>=2</literal>: one print per iteration,</para>
336 <para><literal>iflag=3</literal>: extra information.</para>
341 <para><literal>"nd"</literal> with bounds constraints: from <literal>iflag=1</literal>
342 to <literal>iflag=8</literal>.</para>
345 <para><literal>iflag>=1</literal>: initial and final print,</para>
348 <para><literal>iflag>=2</literal>: one print on each convergence,</para>
351 <para><literal>iflag>=3</literal>: one print per iteration,</para>
354 <para><literal>iflag>=4</literal>: line search,</para>
357 <para><literal>iflag>=5</literal>: various tolerances,</para>
360 <para><literal>iflag>=6</literal>: weight and information on the computation of direction.</para>
370 <para>the value of the objective function at the point
371 <literal>xopt</literal>
379 best value of <literal>x</literal> found.
386 <para>the gradient of the objective function at the point
387 <literal>xopt</literal>
<para>working array for hot restart of the quasi-Newton method. This
array is automatically initialized by <literal>optim</literal> when
<literal>optim</literal> is invoked. It can be used as an input
parameter to speed up the calculations.
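The following hot-restart sketch is an illustration under stated
assumptions, not a prescription: it assumes a cost function
<literal>mycost</literal> and relies on the positional
<literal>work</literal> argument of the calling sequence shown above.
<screen>// first run, limited to 10 calls, also returning the working array
[fopt, xopt, gopt, work] = optim(mycost, x0, "ar", 10);
// restart from the returned state to refine the solution
[fopt, xopt] = optim(mycost, xopt, work)</screen>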
404 <title>Description</title>
<para>This function solves unconstrained nonlinear optimization problems of the form:
408 <screen>min f(x) </screen>
410 where <literal>x</literal> is a vector and <literal>f(x)</literal>
411 is a function that returns a scalar. This function can also solve bound
412 constrained nonlinear optimization problems:
415 binf <= x <= bsup
418 where <literal>binf</literal> is the lower bound and
419 <literal>bsup</literal> is the upper bound on <literal>x</literal>.
422 The <literal>costf</literal> argument can be a Scilab function, a
423 list or a string giving the name of a C or Fortran routine (see
424 "external"). This external must return the value <literal>f</literal> of
425 the cost function at the point <literal>x</literal> and the gradient
426 <literal>g</literal> of the cost function at the point
427 <literal>x</literal>.
431 <term>Scilab function case</term>
434 If <literal>costf</literal> is a Scilab function, its calling
437 <screen>[f, g, ind] = costf(x, ind) </screen>
439 where <literal>x</literal> is the current point,
440 <literal>ind</literal> is an integer flag described below,
441 <literal>f</literal> is the real value of the objective function at
442 the point <literal>x</literal> and <literal>g</literal> is a vector
443 containing the gradient of the objective function at
<literal>x</literal>. The variable <literal>ind</literal> is described below.
450 <term>List case</term>
<para>It may happen that the objective function requires extra
arguments. In this case, we can use the following feature. The
454 <literal>costf</literal> argument can be the list
455 <literal>(real_costf, arg1,...,argn)</literal>. In this case,
456 <literal>real_costf</literal>, the first element in the list, must
457 be a Scilab function with calling sequence: <screen> [f,g,ind]=real_costf(x,ind,arg1,...,argn) </screen>
458 The <literal>x</literal>, <literal>f</literal>,
459 <literal>g</literal>, <literal>ind</literal> arguments have the same
460 meaning as before. In this case, each time the objective function is
461 called back, the arguments <literal>arg1,...,argn</literal> are
462 automatically appended at the end of the calling sequence of
463 <literal>real_costf</literal>.
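A minimal hedged sketch, assuming <literal>real_costf</literal> and a
parameter <literal>p</literal> are defined, reads:
<screen>costf = list(real_costf, p);
[fopt, xopt] = optim(costf, x0)</screen>
The complete mechanism is demonstrated in the "Passing extra parameters"
example below.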
468 <term>String case</term>
If <literal>costf</literal> is a string, it refers to the name
of a C or Fortran routine which must be linked to Scilab.
476 <term>Fortran case</term>
478 <para>The calling sequence of the Fortran subroutine computing
479 the objective must be:
481 <screen>subroutine costf(ind,n,x,f,g,ti,tr,td) </screen>
482 <para>with the following declarations:</para>
<screen>integer ind, n, ti(*)
double precision x(n), f, g(n), td(*)
real tr(*)</screen>
The argument <literal>ind</literal> is described below.
<para>If ind = 2, 3 or 4, the inputs of the routine are:
<literal>x, ind, n, ti, tr, td</literal>.
<para>If ind = 2, 3 or 4, the outputs of the routine are:
<literal>f</literal> and <literal>g</literal>.
<para>The calling sequence of the C function computing the objective must be:
505 <screen>void costf(int *ind, int *n, double *x, double *f, double *g, int *ti, float *tr, double *td) </screen>
The argument <literal>ind</literal> is described below.
<para>The inputs and outputs of the function are the same as in the Fortran case.
520 On output, <literal>ind<0</literal> means that
521 <literal>f</literal> cannot be evaluated at <literal>x</literal> and
522 <literal>ind=0</literal> interrupts the optimization.
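<para>For instance, a Scilab cost function may signal an infeasible point as
in the following hedged sketch, where the hypothetical objective is undefined
for non-positive <literal>x</literal>:
<programlisting role="example">function [f, g, ind] = mycost(x, ind)
    if x(1) <= 0 then
        ind = -1 // tell optim that f cannot be evaluated here
        f = 0
        g = 0
        return
    end
    f = x(1) - log(x(1))
    g = 1 - 1 / x(1)
endfunction</programlisting>
</para>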
526 <title>Termination criteria</title>
527 <para>Each algorithm has its own termination criteria, which may use the
528 parameters given by the user, that is <literal>nap</literal>,
529 <literal>iter</literal>, <literal>epsg</literal>, <literal>epsf</literal>
530 and <literal>epsx</literal>. Not all the parameters are taken into
531 account. In the table below, we present the specific termination
532 parameters which are taken into account by each algorithm. The
533 unconstrained solver is identified by "UNC" while the bound constrained
534 solver is identified by "BND". An empty entry means that the parameter is
535 ignored by the algorithm.
538 <informaltable border="1">
548 <td>optim/"qn" UNC</td>
556 <td>optim/"qn" BND</td>
564 <td>optim/"gc" UNC</td>
572 <td>optim/"gc" BND</td>
580 <td>optim/"nd" UNC</td>
591 <title>Example: Scilab function</title>
<para>The following is an example with a Scilab function. Notice that, for
simplicity, the Scilab function "cost" of the following
example computes the objective function f and its derivative regardless of
the value of ind. This keeps the example simple. In practical
situations though, the computation of "f" and "g" may be costly,
so an obvious optimization is to use the value of "ind" to
compute "f" and "g" only when needed.
600 <programlisting role="example">function [f, g, ind] = cost(x, ind)
602 f = 0.5 * norm(x - xref)^2;
608 [fopt, xopt] = optim(cost, x0)
610 // Use "gc" algorithm
611 [fopt, xopt, gopt] = optim(cost, x0, "gc")
613 // Use "nd" algorithm
614 [fopt, xopt, gopt] = optim(cost, x0, "nd")
616 // Upper and lower bounds on x
617 [fopt, xopt, gopt] = optim(cost, "b", [-1;0;2], [0.5;1;4], x0)
619 // Upper and lower bounds on x and setting up the algorithm to "gc"
620 [fopt, xopt, gopt] = optim(cost, "b", [-1; 0; 2], [0.5; 1; 4], x0, "gc")
// Bound the number of calls to the objective function
623 [fopt, xopt, gopt] = optim(cost, "b", [-1; 0; 2], [0.5; 1; 4], x0, "gc", "ar", 3)
// Set the maximal number of calls to the objective function (3),
// the maximal number of iterations (100),
// the stopping threshold on the value of f (1e-6),
// the threshold on the norm of the gradient of the objective function (1e-6),
// and the threshold on the improvement of the parameters x_opt (1e-3; 1e-3; 1e-3)
630 [fopt, xopt, gopt] = optim(cost, "b", [-1; 0; 2], [0.5; 1; 4], x0, "gc", "ar", 3, 100, 1e-6, 1e-6, [1e-3; 1e-3; 1e-3])
// Additional messages are printed in the console.
633 [fopt, xopt] = optim(cost, x0, imp = 3)
637 <title>Example: Print messages</title>
The <literal>imp</literal> flag may take negative integer values,
say k. In that case, the cost function is called once every -k iterations.
This makes it possible, for example, to plot the function value or to write a log file.
644 This feature is available only with the <literal>"qn"</literal>
645 algorithm without constraints.
<para>In the following example, we solve a simple quadratic test case. For
each iteration of the algorithm, we print the value of f and the norm of g.
<programlisting role="example">function [f, g, ind] = cost(x, ind)
    f = 0.5 * norm(x - xref)^2;
    g = x - xref;
    if ind == 1 then
        // called back by optim once every -imp iterations
        mprintf("f(x) = %s, |g(x)|=%s\n", string(f), string(norm(g)))
    end
endfunction

xref = [1; 2; 3];
x0 = [1; -1; 1];
[fopt, xopt] = optim(cost, x0, imp = -1)
662 <para>The previous script produces the following output.</para>
663 <screen>-->[fopt, xopt] = optim(cost, x0, imp = -1)
664 f(x) = 6.5, |g(x)|=3.6055513
665 f(x) = 2.8888889, |g(x)|=2.4037009
666 f(x) = 9.861D-31, |g(x)|=1.404D-15
668 Norm of projected gradient lower than 0.0000000D+00.
<para>In the following example, we solve the Rosenbrock test case. For
each iteration of the algorithm, we plot the current value of x in a 2D
graph containing the contours of Rosenbrock's function. This makes it
possible to watch the progress of the algorithm while it runs. We could
just as well write the values of x, f and g to a log file if needed.
<programlisting role="example">// 1. Define Rosenbrock for optimization
function [f, g, ind] = rosenbrock(x, ind)
    f = 100.0 * (x(2) - x(1)^2)^2 + (1 - x(1))^2;
    g(1) = -400. * (x(2) - x(1)^2) * x(1) - 2. * (1. - x(1))
    g(2) = 200. * (x(2) - x(1)^2)
endfunction

// 2. Define rosenbrock for contouring
function f = rosenbrockC(x1, x2)
    x = [x1 x2];
    ind = 4;
    [f, g, ind] = rosenbrock(x, ind)
endfunction

// 3. Define Rosenbrock for plotting
function [f, g, ind] = rosenbrockPlot(x, ind)
    [f, g, ind] = rosenbrock(x, ind)
    if ind == 1 then
        plot(x(1), x(2), "g.")
    end
endfunction

// 4. Draw the contour of Rosenbrock's function
x0 = [-1.2 1.0];
xopt = [1 1]; // known optimum of the Rosenbrock function
xdata = linspace(-2, 2, 100);
ydata = linspace(-2, 2, 100);
contour(xdata, ydata, rosenbrockC, [1 10 100 500 1000])
plot(x0(1), x0(2), "b.")
plot(xopt(1), xopt(2), "r*")

// 5. Plot the optimization process, during optimization
[fopt, xopt] = optim(rosenbrockPlot, x0, imp = -1)
748 <title>Example: Optimizing with numerical derivatives</title>
<para>It is possible to optimize a problem without explicit knowledge
of the derivative of the cost function. For this purpose, we can use the
numdiff or derivative function to compute a numerical derivative of the
cost function.
754 <para>In the following example, we use the numdiff function to solve
755 Rosenbrock's problem.
<programlisting role="example">function f = rosenbrock(x)
    f = 100.0 * (x(2) - x(1)^2)^2 + (1 - x(1))^2;
endfunction

function [f, g, ind] = rosenbrockCost(x, ind)
    f = rosenbrock(x);
    g = numdiff(rosenbrock, x);
endfunction

x0 = [-1.2 1.0];
[fopt, xopt] = optim(rosenbrockCost, x0)
<para>In the following example, we use the derivative function to solve
Rosenbrock's problem. Given that the step computation strategy is not the
same in numdiff and derivative, this might lead to improved results.
<programlisting role="example">function f = rosenbrock(x)
    f = 100.0 * (x(2) - x(1)^2)^2 + (1 - x(1))^2;
endfunction

function [f, g, ind] = rosenbrockCost2(x, ind)
    f = rosenbrock(x);
    g = derivative(rosenbrock, x.', order = 4);
endfunction

x0 = [-1.2 1.0];
[fopt, xopt] = optim(rosenbrockCost2, x0)
<title>Example: Counting function evaluations and number of iterations</title>
The <literal>imp</literal> option can take negative values. If
<literal>imp</literal> is equal to <literal>m</literal> where
<literal>m</literal> is a negative integer, then the cost function is
evaluated every -<literal>m</literal> iterations, with the
<literal>ind</literal> input argument equal to 1. The following example
uses this feature to count the number of iterations. The global variable
<literal>_MYDATA_</literal> is used to store the number of function
evaluations as well as the number of iterations.
<programlisting role="example">function [f, g, ind] = cost(x, ind, xref)
    global _MYDATA_
    if ind == 1 then
        _MYDATA_.niter = _MYDATA_.niter + 1
    end
    _MYDATA_.nfevals = _MYDATA_.nfevals + 1
    f = 0.5 * norm(x - xref)^2;
    g = x - xref;
endfunction
xref = [1; 2; 3]; x0 = [1; -1; 1];
global _MYDATA_
_MYDATA_ = tlist(["T_MYDATA", "niter", "nfevals"])
_MYDATA_.niter = 0; _MYDATA_.nfevals = 0;
[f, xopt] = optim(list(cost, xref), x0, imp = -1)
mprintf("Number of function evaluations:%d\n", _MYDATA_.nfevals)
mprintf("Number of iterations:%d\n", _MYDATA_.niter)
<para>While the previous example works perfectly, there is a risk that the
same variable <literal>_MYDATA_</literal> is also used by some internal
function called by <literal>optim</literal>. In this case, the value may be
wrong. This is why a sufficiently unusual variable name has been chosen.
<title>Example: Passing extra parameters</title>
830 <para>In most practical situations, the cost function depends on extra
831 parameters which are required to evaluate the cost function. There are
832 several methods to achieve this goal.
<para>In the following example, the cost function uses 4 parameters
<literal>a, b, c</literal> and <literal>d</literal>. We define the cost
function with additional input arguments, which are declared after the
index argument. Then we pass a list as the first input argument of the
<literal>optim</literal> solver. The first element of the list is the cost
function. The additional variables are directly passed to the cost
function.
<programlisting role="example">function [f, g, ind] = costfunction(x, ind, a, b, c, d)
    f = a * (x(1) - c)^2 + b * (x(2) - d)^2
    g(1) = 2 * a * (x(1) - c)
    g(2) = 2 * b * (x(2) - d)
endfunction

// example parameter values and initial guess
a = 1.0; b = 2.0; c = 3.0; d = 4.0;
x0 = [1 1];
costf = list(costfunction, a, b, c, d);
[fopt, xopt] = optim(costf, x0, imp = 2)
856 <para>In complex cases, the cost function may have so many parameters,
857 that having a function which takes all arguments as inputs is not
858 convenient. For example, consider the situation where the cost function
859 needs 12 parameters. Then, designing a function with 14 input arguments
860 (x, index and the 12 parameters) is difficult to manage. Instead, we can
861 use a more complex data structure to store our data. In the following
862 example, we use a tlist to store the 4 input arguments. This method can
863 easily be expanded to an arbitrary number of parameters.
<programlisting role="example">function [f, g, ind] = costfunction(x, ind, parameters)
    // Get the parameters
    a = parameters.a
    b = parameters.b
    c = parameters.c
    d = parameters.d
    f = a * (x(1) - c)^2 + b * (x(2) - d)^2
    g(1) = 2 * a * (x(1) - c)
    g(2) = 2 * b * (x(2) - d)
endfunction

// Store the parameters (tlist type name and values chosen for illustration)
parameters = tlist(["T_MYPARAMS", "a", "b", "c", "d"], 1.0, 2.0, 3.0, 4.0);
x0 = [1 1];
costf = list(costfunction, parameters);
[fopt, xopt] = optim(costf, x0, imp = 2)
897 <para>In the following example, the parameters are defined before the
898 optimizer is called. They are directly used in the cost function.
<programlisting role="example">// The example NOT to follow
function [f, g, ind] = costfunction(x, ind)
    f = a * (x(1) - c)^2 + b * (x(2) - d)^2
    g(1) = 2 * a * (x(1) - c)
    g(2) = 2 * b * (x(2) - d)
endfunction

// the parameters are simply defined in the calling context
a = 1.0; b = 2.0; c = 3.0; d = 4.0;
x0 = [1 1];
[fopt, xopt] = optim(costfunction, x0, imp = 2)
<para>While the previous example works perfectly, there is a risk that the
same variables are also used by some internal function called by
<literal>optim</literal>. In this case, the values of the parameters may
not be what is expected and the optimization can fail or, worse, give a
wrong result. It is also difficult to manage such a function, which requires
that all the parameters are defined in the calling context.
920 <para>In the following example, we define the cost function with the
921 classical header. Inside the function definition, we declare that the
922 parameters <literal>a, b, c</literal> and <literal>d</literal> are global
923 variables. Then we declare and set the global variables.
<programlisting role="example">// Another example NOT to follow
function [f, g, ind] = costfunction(x, ind)
    global a b c d
    f = a * (x(1) - c)^2 + b * (x(2) - d)^2
    g(1) = 2 * a * (x(1) - c)
    g(2) = 2 * b * (x(2) - d)
endfunction

global a b c d
a = 1.0; b = 2.0; c = 3.0; d = 4.0;
x0 = [1 1];
[fopt, xopt] = optim(costfunction, x0, imp = 2)
<para>While the previous example works perfectly, there is a risk that the
same variables are also used by some internal function called by
<literal>optim</literal>. In this case, the values of the parameters may
not be what is expected and the optimization can fail or, worse, give a
wrong result.
<title>Example: Checking that derivatives are correct</title>
<para>Many optimization problems can be avoided if the derivatives are
950 computed correctly. One common reason for failure in the step-length
951 procedure is an error in the calculation of the cost function and its
952 gradient. Incorrect calculation of derivatives is by far the most common
955 <para>In the following example, we give a false implementation of
956 Rosenbrock's gradient. In order to check the computation of the
957 derivatives, we use the <literal>derivative</literal> function. We define
958 the <literal>simplified</literal> function, which delegates the
959 computation of <literal>f</literal> to the rosenbrock function. The
960 <literal>simplified</literal> function is passed as an input argument of
961 the <literal>derivative</literal> function.
<programlisting role="example">function [f, g, index] = rosenbrock(x, index)
    f = 100.0 * (x(2) - x(1)^2)^2 + (1 - x(1))^2;
    // Exact gradient:
    // g(1) = -400. * (x(2) - x(1)^2) * x(1) - 2. * (1. - x(1))
    // Wrong gradient (deliberate error):
    g(1) = -1200. * (x(2) - x(1)^2) * x(1) - 2. * (1. - x(1))
    g(2) = 200. * (x(2) - x(1)^2)
endfunction

function f = simplified(x)
    index = 4;
    [f, g, index] = rosenbrock(x, index)
endfunction

x0 = [-1.2 1.0];
index = 4;
[f, g, index] = rosenbrock(x0, index);
gnd = derivative(simplified, x0.');
mprintf("Exact derivative:[%s]\n", strcat(string(g), " "));
mprintf("Numerical derivative:[%s]\n", strcat(string(gnd), " "));
<para>The previous script produces the following output. Obviously, the
difference between the two gradients is enormous, which shows that the
wrong formula has been used in the gradient.
988 <programlisting role="example"> Exact derivative:[-638 -88]
989 Numerical derivative:[-215.6 -88]
993 <title>Example: C function</title>
994 <para>The following is an example with a C function, where a C source code
995 is written into a file, dynamically compiled and loaded into Scilab, and
996 then used by the "optim" solver. The interface of the "rosenc" function is
997 fixed, even if the arguments are not really used in the cost function.
998 This is because the underlying optimization solvers must assume that the
999 objective function has a known, constant interface. In the following
example, the arrays ti and tr are not used; only the array "td" is used,
as a parameter of the Rosenbrock function. Notice that the content of the
td array is the same as the content of the corresponding Scilab variable, as
expected.
1005 <programlisting role="example">// External function written in C (C compiler required)
1006 // write down the C code (Rosenbrock problem)
C = ['#include <math.h>'
     'double sq(double x)'
     '{ return x*x; }'
     'void rosenc(int *ind, int *n, double *x, double *f, double *g,'
     '            int *ti, float *tr, double *td)'
     '{'
     '  double p;'
     '  int i;'
     '  p = *td;'
     '  if (*ind==2 || *ind==4) {'
     '    *f = 1.0;'
     '    for (i=1; i<*n; i++)'
     '      *f += p*sq(x[i]-sq(x[i-1])) + sq(1.0-x[i]);'
     '  }'
     '  if (*ind==3 || *ind==4) {'
     '    g[0] = -4.0*p*(x[1]-sq(x[0]))*x[0];'
     '    for (i=1; i<*n-1; i++)'
     '      g[i] = 2.0*p*(x[i]-sq(x[i-1])) - 4.0*p*(x[i+1]-sq(x[i]))*x[i] - 2.0*(1.0-x[i]);'
     '    g[*n-1] = 2.0*p*(x[*n-1]-sq(x[*n-2])) - 2.0*(1.0-x[*n-1]);'
     '  }'
     '}'];
mputl(C, TMPDIR + '/rosenc.c')

// compile the C code
l = ilib_for_link('rosenc', 'rosenc.c', [], 'c');

// incremental linking
link(l, 'rosenc', 'c')

// initial guess and parameter p passed through the td array
x0 = 1.2 * ones(1, 5);
p = 100;
[f, xo, go] = optim('rosenc', x0, 'td', p)
1044 <title>Example: Fortran function</title>
1045 <para>The following is an example with a Fortran function.</para>
1046 <programlisting role="example">// External function written in Fortran (Fortran compiler required)
1047 // write down the Fortran code (Rosenbrock problem)
F = ['      subroutine rosenf(ind, n, x, f, g, ti, tr, td)'
     '      integer ind,n,ti(*)'
     '      double precision x(n),f,g(n),td(*)'
     '      double precision p'
     '      p=td(1)'
     '      if (ind.eq.2.or.ind.eq.4) then'
     '        f=1.0d0'
     '        do i=2,n'
     '          f=f+p*(x(i)-x(i-1)**2)**2+(1.0d0-x(i))**2'
     '        enddo'
     '      endif'
     '      if (ind.eq.3.or.ind.eq.4) then'
     '        g(1)=-4.0d0*p*(x(2)-x(1)**2)*x(1)'
     '        do i=2,n-1'
     '          g(i)=2.0d0*p*(x(i)-x(i-1)**2)-4.0d0*p*(x(i+1)-x(i)**2)*x(i)'
     '     &         -2.0d0*(1.0d0-x(i))'
     '        enddo'
     '        g(n)=2.0d0*p*(x(n)-x(n-1)**2)-2.0d0*(1.0d0-x(n))'
     '      endif'
     '      end'];
mputl(F, TMPDIR + '/rosenf.f')

// compile the Fortran code
l = ilib_for_link('rosenf', 'rosenf.f', [], 'f');

// incremental linking
link(l, 'rosenf', 'f')

// initial guess and parameter p passed through the td array
x0 = 1.2 * ones(1, 5);
p = 100;
[f, xo, go] = optim('rosenf', x0, 'td', p)
1089 <title>Example: Fortran function with initialization</title>
1090 <para>The following is an example with a Fortran function in which the
1091 "in" option is used to allocate memory inside the Scilab environment. In
1092 this mode, there is a dialog between Scilab and the objective function.
1093 The goal of this dialog is to initialize the parameters of the objective
function. Each part of this dialog is based on a specific value of the
ind parameter.
<para>At the beginning, Scilab calls the objective function with the ind
parameter equal to 10. This tells the objective function to initialize
the sizes of the arrays it needs, by setting the nizs, nrzs and ndzs
integer parameters of the "nird" common. Then the objective function
returns. At this point, Scilab creates internal variables and allocates
memory for the variables izs, rzs and dzs. Scilab calls the objective
function back again, this time with ind equal to 11. This tells the
objective function to initialize the arrays izs, rzs and dzs. When the
objective function has done so, it returns. Then Scilab enters the actual
optimization mode and calls the optimization solver the user requested.
Whenever the objective function is called, the izs, rzs and dzs arrays
have the values that have been previously initialized.
1110 <programlisting role="example">//
1111 // Define a fortran source code and compile it (fortran compiler required)
1113 fortransource = [' subroutine rosenf(ind,n,x,f,g,izs,rzs,dzs)'
1114 'C -------------------------------------------'
1115 'c Example of cost function given by a subroutine'
1116 'c if n<=2 returns ind=0'
1117 'c f.bonnans, oct 86'
1118 ' implicit double precision (a-h,o-z)'
1120 ' double precision dzs(*)'
1121 ' dimension x(n),g(n),izs(*)'
1122 ' common/nird/nizs,nrzs,ndzs'
1127 ' if(ind.eq.10) then'
1133 ' if(ind.eq.11) then'
1139 ' if(ind.eq.2)go to 5'
1140 ' if(ind.eq.3)go to 20'
1141 ' if(ind.eq.4)go to 5'
1147 '10 f=f + dzs(2)*(x(i)-x(im1)**2)**2 + (1.0d+0-x(i))**2'
1148 ' if(ind.eq.2)return'
1149 '20 g(1)=-4.0d+0*dzs(2)*(x(2)-x(1)**2)*x(1)'
1154 ' g(i)=2.0d+0*dzs(2)*(x(i)-x(im1)**2)'
1155 '30 g(i)=g(i) -4.0d+0*dzs(2)*(x(ip1)-x(i)**2)*x(i) - '
1156 ' & 2.0d+0*(1.0d+0-x(i))'
1157 ' g(n)=2.0d+0*dzs(2)*(x(n)-x(nm1)**2) - 2.0d+0*(1.0d+0-x(n))'
1161 mputl(fortransource, TMPDIR + '/rosenf.f')
// compile the Fortran code
1164 libpath = ilib_for_link('rosenf', 'rosenf.f', [], 'f');
1166 // incremental linking
1167 linkid = link(libpath, 'rosenf', 'f');
1169 x0 = 1.2 * ones(1, 5);
1171 // Solve the problem
1173 [f, x, g] = optim('rosenf', x0, 'in');
<title>Example: Fortran function with initialization on Windows with Intel Fortran Compiler</title>
<para>Under the Windows operating system with the Intel Fortran Compiler, one
must carefully design the Fortran source code so that the dynamic link
works properly. On Scilab's side, the optimization component is
dynamically linked and the symbol "nird" is exported out of the
optimization dll. On the cost function's side, which is also dynamically
linked, the "nird" common must be imported into the cost function dll.
<para>The following example is a rewriting of the previous example, with
special attention to the Windows operating system and the Intel Fortran
compiler. In that case, we introduce additional compiler
directives, which allow the compiler to import the "nird" symbols.
1194 <programlisting role="example">fortransource = ['subroutine rosenf(ind,n,x,f,g,izs,rzs,dzs)'
1195 'cDEC$ IF DEFINED (FORDLL)'
1196 'cDEC$ ATTRIBUTES DLLIMPORT:: /nird/'
1198 'C -------------------------------------------'
1199 'c Example of cost function given by a subroutine'
1200 'c if n<=2 returns ind=0'
1201 'c f.bonnans, oct 86'
1202 ' implicit double precision (a-h,o-z)'
1206 <refsection role="see also">
1207 <title>See Also</title>
1208 <simplelist type="inline">
1210 <link linkend="external">external</link>
1213 <link linkend="qpsolve">qpsolve</link>
1216 <link linkend="datafit">datafit</link>
1219 <link linkend="leastsq">leastsq</link>
1222 <link linkend="numdiff">numdiff</link>
1225 <link linkend="derivative">derivative</link>
1228 <link linkend="NDcost">NDcost</link>
1233 <title>References</title>
<para>The following is a map from the various options to the underlying solvers:
1239 <term>"qn" without constraints</term>
<para>n1qn1: a quasi-Newton method with a Wolfe-type line search.</para>
1247 <term>"qn" with bounds constraints</term>
<para>qnbd: a quasi-Newton method with projection</para>
<para>RR-0242 - A variant of a projected variable metric method for
bound constrained optimization problems, Bonnans Frederic, Rapport
de recherche de l'INRIA - Rocquencourt, October 1983
1257 <term>"gc" without constraints</term>
<para>n1qn3: a quasi-Newton limited-memory method with BFGS.</para>
1263 <term>"gc" with bounds constraints</term>
<para>gcbd: a BFGS-type method with limited memory and projection.</para>
1271 <term>"nd" without constraints</term>
<para>n1fc1: a bundle method</para>
1277 <term>"nd" with bounds constraints</term>
1279 <para>not available</para>