Application Center - Maplesoft

# The Nelder-Mead Method for Optimization in Two Dimensions

You can switch back to the summary page by clicking here.

The Nelder-Mead Method in Two Dimensions

by Greg Spradlin, Embry-Riddle Aeronautical University, USA,

NOTE: This worksheet demonstrates the use of Maple for finding a local minimum of a function of two variables, using the Nelder-Mead method.

Introduction: The Nelder-Mead Method

This worksheet illustrates the Nelder-Mead Method of unconstrained nonlinear optimization.  The Nelder-Mead method does not require the objective function f  to be differentiable.  Therefore it is well-suited to problems involving a non-differentiable objective function of a small number of decision variables.  This worksheet applies the method to any function of two variables.    The user enters the function, the initial simplex (triangle), and the desired accuracy.  The minimum value and its location are returned, along with a list of all simplices produced during the method and the actions taken, plus a plot of all the simplices.

 > restart;

The Nelder-Mead Method: a summary

The method used here is described in Rardin (see References) for an arbitrary number of decision variables and repeated here for the special case of two decision variables.  Begin with an initial simplex, or triangle, of three non-conlinear points y (1), y (2), and y (3).  Arrange them in "nonimproving" order, so if searching for a minimum, f ( y (1)) <= f ( y (2)) <= f ( y (3)) .  The idea is to move the "worst" point y (3) towards the "better" points y (1) and y (2) in the hopes of improving the objective function.

Step 1: Centroid

Let x  = ( y (1) + y (2))/2, the mean of the best two points.

Step 2: Direction

Let the away-from-worst-move direction x = x - y (3), the vector from the worst point y (3) to x .

Step 3: Attempt Reflection

Compute f ( x + x ) .   If f ( x + x ) <= f ( y (1)), that is, if x + x  is at least as good as y (1), then proceed to Step 4 : Attempt Expansion.  If f ( y (1)) < f ( x + x ) <= f ( y (2)), then replace   y (3) with  x +   x . This is called " reflection ," since the new triangle is the previous triangle reflected over the side y (1)- y (2). If f ( x + x ) > f ( y (2)), proceed to Step 5: Attempt Contraction.

Step 4: Attempt Expansion

Compute f ( x + 2 x ).  If f ( x + 2 x ) <=  f ( x + x ), that is, if x +  2 x  is at least as good as x + x  , then replace y (3) with  x+  2 x .  This is called " expansion ", for the new simplex (triangle) is larger than the previous one.  If, however, f ( x + 2 x ) >  f ( x + x ), that is, x +  2 x  is no better than x + x  , replace y (3) with  x +   x . This is called " reflection ," since the new triangle is the previous triangle reflected over the side y (1)- y (2).

Step 5: Attempt Contraction

If f ( x + x /2) < f ( y (3)), that is, x + x /2 is better than the current worst y (3), replace y (3) with  x +   x/ 2.  Following Lagarias et al (see References) we call this step outside contraction , because the new simplex is half the size of the previous one and outside of it.  If f ( x + x /2) >= f ( y (3)), then evaluate f ( x - x /2).  If f ( x - x /2) < f ( y (3)), then replace y (3) with  x -   x/ 2.  We call this inside contraction , because the new simplex is half the size of the previous one and inside it.  If both x + x /2 and x - x /2 are worse than current worst y (3), that is, f ( x + x /2) > f ( y (3)) and

f ( x - x /2) > f ( y (3)) ,  then shrink  the simplex toward y (1) as described below.

Step 5: Shrinking

Shrink the simplex toward the best point y (1) by replacing y (2) with ( y (1) + y (2))/2 and

y (3) with ( y (1) + y (3))/2.  All three sides of the triangle have been halved in length.

Termination:

After Reflection, Expansion, Contraction, or Shrinking has been performed, reorder the y 's if necessary so f ( y (1)) <= f ( y (2)) <= f ( y (3)) .  Repeat the process until   f ( y (1)),   f ( y (2)), and   f ( y (3)) are all within a predetermined tolerance from   f ( x + x ) (described in Step 3).  Report the smallest of   f ( y (1)),   f ( y (2)),   f ( y (3)) and f ( x ) as the minimum and report its location.  Note that there are five ways to get from one simplex to the next: reflection, expansion, outside contraction, inside contraction, and shrinking.

The Algorithm

Enter, plot objective function, viewing window

 > f:=(x,y)->evalf(abs(sin(x)-y^3+1)+x^2+y^4/10);  #function of 2 variables to be minimized

 > xmin:= -2; xmax:= 2; ymin:= -2; ymax:= 2;

 > with(plots):

```Warning, the name changecoords has been redefined

```

 > plot3d(f(x,y),x=xmin..xmax,y=ymin..ymax,axes=normal,labels=[x,y,z]);

 > contourplot(f(x,y),x=xmin..xmax,y=ymin..ymax,scaling=constrained,contours=50, numpoints=5000);

Enter initial simplex, stopping tolerance

 > y1:= [1.5, 0]; y2:= [2, 0]; y3:=[2,0.5]; #define initial simplex.  Need not be ordered

 > epsilon := 1.0 * 10^(-5); #stopping tolerance (see method summary: Termination)

Create function acting on points

 > ff:=proc(y1)    f(y1[1],y1[2]); end:

Procedure for ordering points by objective function values

 > order3:=proc(g,y1,y2,y3)  #returns list of the 3 points in order of increasing g local x1, x2, x3,f1, f2, f3:   f1:=g(y1);f2:=g(y2);f3:=g(y3);   if (f1 < f2) then     if (f2 < f3) then       RETURN([y1, y2,y3]);     else       if (f1 < f3) then  #order is f1, f3, f2         RETURN([y1, y3, y2]);       else #order is is f3, f1, f2         RETURN([y3, y1, y2]);       end if:    end if: else    if (f1 < f3) then #order is f2, f1, f3      RETURN([y2, y1, y3]);    else      if (f3 < f2) then #order is f3, f2, f1        RETURN([y3, y2, y1]);      else #order is f2, f3, f1        RETURN([y2, y3, y1]);      end if:    end if: end if: end: #end procedure

Create first simplex

 > xlist:=order3(ff,y1, y2, y3): x1:=xlist[1]: x2:=xlist[2]: x3:=xlist[3]:

Procedure for deciding which action to perform on simplex

 > nm:=proc(ff,y1,y2,y3,ep)   #takes func., ordered simplex, tol., outputs lambda only (or 'shrink', or 'STOP')   local x, deltax, fy1, fy2, fy3, fminus, fzero, fplus, fone, ftwo:   ######   x := (y1+y2)/2.0:     deltax := x - y3:   fy1 := ff(y1):   fy2 := ff(y2):   fy3 := ff(y3):   fminus := ff(x + (-1/2)*deltax):   fzero := ff(x):   fplus := ff(x + (1/2)*deltax):   fone := ff(x + deltax):   ftwo := ff(x + 2*deltax):   ######   if (max(abs(fzero - fy1), abs(fzero - fy3)) < ep) then     return 'STOP':   elif (fone < fy1) then     if (ftwo < fone) then       return 2;     else       return 1;     end if:   elif (fone < fy2) then     return 1;   elif (fone < fy3) then #try contraction w/ lambda = +1/2     if (fplus < fy3) then       return 0.5:     else       return 'shrink':     end if:   else #try contraction w/lambda = -1/2     if (fminus < fy3) then       return -0.5:     else       return 'shrink':     end if:   end if: end:   #end procedure

Execution of algorithm

 > simplexlist[0]:=[x1,x2,x3]:

 > lambda:=0:

 > N:=0: #  # of iterations

 > for i from 0 while ((i < 50) and not(lambda = 'STOP')) do   N:=i:   #note the i<50 above.  Otherwise roundoff error can cause infinite loop.   lambda:=nm(ff, x1, x2, x3, epsilon):   lambdalist[i]:=lambda:   if (lambda = 'STOP') then     actionlist[i]:="STOP":     x := (x1 + x2)/2.0:     if (ff(x) < ff(x1)) then       lastpoint:=x:       minf := ff(x):     else       lastpoint:= x1:       minf := ff(x1):     end if:   elif (lambda = 'shrink') then     actionlist[i]="Shrink":     x2:=(x1+x2)/2.0:     x3:=(x1+x3)/2.0:     newsimplex := order3(ff, x1, x2, x3):     x1:=newsimplex[1]:     x2:=newsimplex[2]:     x3:=newsimplex[3]:     simplexlist[i+1]:=[x1,x2,x3]:   else     if (lambda = 1) then       actionlist[i]:="Reflect":     elif (lambda = 1/2) then       actionlist[i]:="Outside Contraction":     else       actionlist[i]:="Inside Contraction":     end if:     x := (x1 + x2)/2.0:     deltax := x - x3:     x3 := x + lambda*deltax:     newsimplex := order3(ff, x1, x2, x3):     x1:=newsimplex[1]:     x2:=newsimplex[2]:     x3:=newsimplex[3]:     simplexlist[i+1]:=[x1,x2,x3]:      end if:   end do:

Print list of simplices, actions

 > for i from 0 while (i < N) do printf("Simplex Number %d: (%f,%f),(%f,%f),(%f,%f) %s\n", i, simplexlist[i][1][1],simplexlist[i][1][2], simplexlist[i][2][1],simplexlist[i][2][2], simplexlist[i][3][1],simplexlist[i][3][2], actionlist[i]); end do:

`Simplex Number 0: (1.500000,0.000000),(2.000000,.500000),(2.000000,0.000000) Inside Contraction`

`Simplex Number 1: (1.250000,.750000),(1.500000,0.000000),(2.000000,.500000) Inside Contraction`

`Simplex Number 2: (.125000,.125000),(1.250000,.750000),(1.500000,0.000000) Reflect`

`Simplex Number 3: (-.125000,.875000),(.125000,.125000),(1.250000,.750000) Outside Contraction`

`Simplex Number 4: (-.125000,.875000),(-.625000,.375000),(.125000,.125000) Inside Contraction`

`Simplex Number 5: (-.125000,.875000),(-.625000,.375000),(-.125000,.375000) Reflect`

`Simplex Number 6: (-.125000,.875000),(-.625000,.875000),(-.625000,.375000) Inside Contraction`

`Simplex Number 7: (-.125000,.875000),(-.500000,.625000),(-.625000,.875000) Inside Contraction`

`Simplex Number 8: (-.468750,.812500),(-.125000,.875000),(-.500000,.625000) Outside Contraction`

`Simplex Number 9: (-.195313,.953125),(-.468750,.812500),(-.125000,.875000) Inside Contraction`

`Simplex Number 10: (-.195313,.953125),(-.228516,.878906),(-.468750,.812500) Reflect`

`Simplex Number 11: (.044922,1.019531),(-.195313,.953125),(-.228516,.878906) Inside Contraction`

`Simplex Number 12: (.044922,1.019531),(-.151855,.932617),(-.195313,.953125) Inside Contraction`

`Simplex Number 13: (-.124390,.964600),(.044922,1.019531),(-.151855,.932617) Inside Contraction`

`Simplex Number 14: (-.095795,.962341),(-.124390,.964600),(.044922,1.019531) Inside Contraction`

`Simplex Number 15: (-.032585,.991501),(-.095795,.962341),(-.124390,.964600) Inside Contraction`

`Simplex Number 16: (-.032585,.991501),(-.094290,.970760),(-.095795,.962341) Inside Contraction`

`Simplex Number 17: (-.079616,.971736),(-.032585,.991501),(-.094290,.970760) Reflect`

`Simplex Number 18: (-.079616,.971736),(-.017911,.992476),(-.032585,.991501) Inside Contraction`

`Simplex Number 19: (-.040674,.986804),(-.079616,.971736),(-.017911,.992476) Outside Contraction`

`Simplex Number 20: (-.081262,.972666),(-.040674,.986804),(-.079616,.971736) Inside Contraction`

`Simplex Number 21: (-.070292,.975735),(-.081262,.972666),(-.040674,.986804) Outside Contraction`

`Simplex Number 22: (-.070292,.975735),(-.093329,.967900),(-.081262,.972666) Inside Contraction`

`Simplex Number 23: (-.070292,.975735),(-.081536,.972242),(-.093329,.967900) Reflect`

`Simplex Number 24: (-.058500,.980078),(-.070292,.975735),(-.081536,.972242) Inside Contraction`

`Simplex Number 25: (-.072966,.975074),(-.058500,.980078),(-.070292,.975735) Inside Contraction`

`Simplex Number 26: (-.072966,.975074),(-.058500,.980078),(-.068013,.976656) Outside Contraction`

`Simplex Number 27: (-.072966,.975074),(-.064593,.978036),(-.058500,.980078) Inside Contraction`

`Simplex Number 28: (-.063640,.978317),(-.072966,.975074),(-.064593,.978036) Inside Contraction`

`Simplex Number 29: (-.066448,.977366),(-.063640,.978317),(-.072966,.975074) Inside Contraction`

`Simplex Number 30: (-.066448,.977366),(-.069005,.976458),(-.063640,.978317) Inside Contraction`

`Simplex Number 31: (-.065683,.977614),(-.066448,.977366),(-.069005,.976458) Outside Contraction`

`Simplex Number 32: (-.064596,.978006),(-.065683,.977614),(-.066448,.977366) Inside Contraction`

`Simplex Number 33: (-.065794,.977588),(-.064596,.978006),(-.065683,.977614) Inside Contraction`

Draw simplices

 > with(plottools):

```Warning, the name arrow has been redefined

```

 > for i from 0 while (i < N+1) do   triangles[i] := polygon([simplexlist[i][1], simplexlist[i][2],  simplexlist[i][3]], labels=[x,y],scaling=constrained): end do:

 > xcoords1:= seq(simplexlist[i][1][1], i = 0..N):  #define plotting window xcoords2:= seq(simplexlist[i][2][1], i = 0..N): xcoords3:= seq(simplexlist[i][3][1], i = 0..N): xmin:= min(min(xcoords1), min(xcoords2), min(xcoords3)): xmax:= max(max(xcoords1), max(xcoords2), max(xcoords3)): ycoords1:= seq(simplexlist[i][1][2], i = 0..N): ycoords2:= seq(simplexlist[i][2][2], i = 0..N): ycoords3:= seq(simplexlist[i][3][2], i = 0..N): ymin:= min(min(ycoords1), min(ycoords2), min(ycoords3)): ymax:= max(max(ycoords1), max(ycoords2), max(ycoords3)):

 > xplotmin:=xmin-(xmax-xmin)/10.0: xplotmax:=xmax+(xmax-xmin)/10.0: yplotmin:=ymin-(ymax-ymin)/10.0: yplotmax:=ymax+(ymax-ymin)/10.0:

 > x:='x': y:='y': #unassign x and y

 > fcontour:=contourplot(f(x,y),x=xplotmin..xplotmax,y=yplotmin..yplotmax,                              scaling=constrained,contours=100, numpoints=5000):

 > bestpointplot:= circle(lastpoint, radius, color=blue,thickness=10):

 > display(seq(triangles[i],i=0..N),fcontour,bestpointplot);

The minimum value,

 > ff(lastpoint);

occurs at

 > lastpoint;

 >

 >

References

Lagarias, J.C., Reeds, J.A., Wright, M.H., and Wright, P.E. (1998). Convergence Properties of the Nelder-Mead

Simplex Method in Low Dimensions.   SIAM J. Optim ., 9 , 112-147

Nelder, J.A. and Mead, R. (1965).  A simplex method for function optimization, Computer J ., 7 , 308-313.

Rardin, R.L. (1997).   Optimization in Operations Research ,  Prentice-Hall.

Walters, F.H., Parker, L.R., Morgan, S.L., and Deming, S.N. (1991).   Sequential Simplex Optimization , CRC Press.

Conclusion

This worksheet shows the Nelder-Mead in action and gives a list of simplices constructed, actions taken, and a plot of all simplices, automatically scaled to display the simplices.  Maple is ideal for running the algorithm due to its convenient graphics capabilities.  Running the program reveals typical behavior of the algorithm.  For example, if the initial simplex is far from a minimum, there may be several expansion steps during which the simplex moves toward the minimum point.  The shrinking step rarely occurs; rather, the simplex converges to a point via contraction.

A natural extension of this work would be to generalize the algorithm to alter the change in size of a simplex during the reflection, expansion, and contraction steps (see Lagarias et al ). Another would be to run the algorithm with more than two decision variables.  In the case of three decision variables, Maple's interactive three-dimensional graphics can display the tetrahedral simplices.

Disclaimer:  While every effort has been made to validate the solutions in this worksheet, Waterloo Maple Inc. and the contributors are not responsible for any errors contained and are not liable for any damages resulting from the use of this material.