
codegen[GRADIENT] - compute the Gradient of a Maple procedure

Calling Sequence

GRADIENT(F)

GRADIENT(F, X)

Parameters

F - Maple procedure

X - (optional) list of symbols (parameters of F)

Description

• The first argument F is a Maple procedure which computes a function of x1, x2, ..., xn.  The GRADIENT command outputs a new procedure G which, when executed at given values of x1, x2, ..., xn, returns the vector of partial derivatives of F w.r.t. x1, ..., xn at those values.  This is known as automatic differentiation.  It often leads to a more efficient computation than symbolic differentiation, that is, what you would obtain from using linalg[grad]. For example, given

F := proc(x,y) local t;

     t := exp(-x);

     y*t+t

end proc;

  

The output of G := GRADIENT(F); is the Maple procedure

G := proc(x, y) local df, t;

     t := exp(-x);

     df := array(1 .. 1);

     df[1] := y + 1;

     return -df[1]*exp(-x), t

  end proc

  

When G is called with arguments 1.0, 1.0, the output is the vector -0.7357588824, 0.3678794412. G can subsequently be optimized with optimize(G), so that the exponential is computed only once, to obtain

proc(x, y)

local df, t;

     t := exp(-x);

     df := array(1 .. 1);

     df[1] := y + 1;

     return -df[1]*t, t

end proc
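As a cross-check outside Maple, the two partial derivatives above (dF/dx = -(y+1)*exp(-x) and dF/dy = exp(-x)) can be evaluated numerically. This Python sketch merely mirrors the generated procedure G; the names are ours for illustration, not part of codegen:

```python
import math

def F(x, y):
    # the original procedure: F(x, y) = y*exp(-x) + exp(-x)
    t = math.exp(-x)
    return y*t + t

def G(x, y):
    # mirrors the optimized procedure produced by GRADIENT:
    # dF/dx = -(y + 1)*exp(-x),  dF/dy = exp(-x)
    t = math.exp(-x)
    df1 = y + 1
    return -df1*t, t

gx, gy = G(1.0, 1.0)
# gx is approximately -0.7357588824 and gy approximately 0.3678794412,
# matching the vector returned by the generated Maple procedure.
```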

  

One can obtain derivatives of any function that can be represented by a computation built by composing arithmetic operations and known mathematical functions, including procedures containing loops and making subroutine calls.
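To see why loops pose no difficulty, here is a minimal forward-mode sketch in Python (illustrative only; the Dual class is our invention, not part of codegen): each value carries its derivative along with it, so an ordinary loop differentiates itself via the product rule.

```python
class Dual:
    """Minimal forward-mode value: carries f(x) and f'(x) together."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __mul__(self, other):
        if not isinstance(other, Dual):
            other = Dual(other)
        # product rule: (fg)' = f'g + fg'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def power(x, n):
    # an ordinary loop; nothing inside it knows about derivatives
    result = Dual(1.0)
    for _ in range(n):
        result = result * x
    return result

p = power(Dual(3.0, 1.0), 4)   # seed dx/dx = 1
print(p.val, p.der)            # 81.0 108.0  (3^4 and 4*3^3)
```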

  

The remaining arguments to GRADIENT are optional; they are described below.  There are also a number of limitations on the Maple procedures F that the GRADIENT command can handle.  These are likewise discussed below.

• By default, GRADIENT computes the partial derivatives of F w.r.t. all formal parameters of F. The optional argument X, a list of symbols, may be used to specify which formal parameters to differentiate w.r.t. For example, the call GRADIENT(F,[x]); computes the partial derivative of F with respect to the formal parameter x only. The result is

proc(x, y)

local df, t;

    t := exp(-x);

    df := array(1 .. 1);

    df[1] := y + 1;

    return -df[1]*exp(-x)

end proc

  

The procedure F may not refer to its parameters using args[i]. For example, you may not define F as follows:

  

 

F := proc() local t;

     t := exp(-args[1]);

     args[2]*t+t

end proc;

  

Nor is it possible to differentiate w.r.t. an array of parameters. For example, you cannot presently define F as

  

 

F := proc(x::array(1..2)) local t;

     t := exp(-x[1]);

     x[2]*t+t

end proc;

  

and compute the gradient w.r.t. the array x.

• Two algorithms are supported, the so-called forward and reverse modes. By default, GRADIENT tries to use the reverse mode, since it usually leads to more efficient code.  If it is unable to use the reverse mode, the forward mode is used.  The user may specify which algorithm is used by giving the optional argument mode=forward or mode=reverse.

• The advantage of the reverse mode is that the cost of computing the gradient is usually lower than with the forward mode, especially when the number of partial derivatives n is large.  It is also cheaper than symbolic differentiation (Maple's diff command) and divided differences.  Specifically, if the cost of computing F is m arithmetic operations, then the cost of computing the gradient of F using the forward and reverse modes is O(m n) and O(m+n) respectively.  However, the best result is usually obtained by pre-optimizing the input procedure F and post-optimizing the output procedure of GRADIENT. Also, before applying GRADIENT it is best to split up long products in F.  This can be done with the codegen[split] command.
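The O(m+n) behaviour of the reverse mode comes from recording the computation once and then propagating adjoints backwards in a single sweep, so all n partials are obtained from one pass. The following Python sketch illustrates the idea (the Var and gradient names are ours, not codegen's) on a product of three factors:

```python
class Var:
    """One recorded node of the computation: value plus an adjoint slot."""
    def __init__(self, val):
        self.val, self.adj = val, 0.0
        self.parents = []              # list of (parent, local partial)
    def __mul__(self, other):
        out = Var(self.val * other.val)
        # d(uv)/du = v, d(uv)/dv = u
        out.parents = [(self, other.val), (other, self.val)]
        return out

def gradient(output, inputs):
    # One backward sweep yields ALL partials: O(m + n) operations.
    # (This simple traversal is correct for expression trees, where
    # each intermediate result is used only once, as it is here.)
    output.adj = 1.0
    stack = [output]
    while stack:
        node = stack.pop()
        for parent, partial in node.parents:
            parent.adj += node.adj * partial
            stack.append(parent)
    return [v.adj for v in inputs]

xs = [Var(v) for v in (1.0, 2.0, 3.0)]
y = xs[0] * xs[1] * xs[2]
print(gradient(y, xs))   # [6.0, 3.0, 2.0] = (vw, uw, uv)
```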

• The vector of partial derivatives is, by default, returned as a sequence. The optional argument result_type=list, result_type=array, or result_type=seq specifies that the vector of derivatives returned by G is to be a Maple list, array, or sequence, respectively. For example, the call GRADIENT(F,result_type=array); yields

proc(x, y)

local df, grd, t;

    t := exp(-x);

    df := array(1 .. 1);

    df[1] := y + 1;

    grd := array(1 .. 2);

    grd[1] := - df[1]*exp(-x);

    grd[2] := t;

    return grd

end proc

• The optional argument function_value=true causes GRADIENT to compute the function value F(x1,...,xn) at the same time as the gradient. It is returned as the first value in the vector.  For example, the call GRADIENT(F,function_value=true); yields

proc(x, y)

local df, t;

    t := exp(-x);

    df := array(1 .. 1);

    df[1] := y + 1;

    return y*t + t, - df[1]*exp(-x), t

end proc

• See also codegen[HESSIAN] and codegen[JACOBIAN] for routines that compute Hessians and Jacobians using automatic differentiation of functions represented by Maple procedures. The derivative operator D in Maple can also be used to compute the derivative of a Maple procedure in one variable.

• The command with(codegen,GRADIENT) allows the use of the abbreviated form of this command.

Examples

with(codegen):

F := proc(x,y) local t; t := x*y; x+t-y*t; end proc;

F := proc(x, y) local t; t := x*y; x + t - y*t end proc

(1)

F(x, y)

-x*y^2 + x*y + x

(2)

G := GRADIENT(F)

G := proc(x, y) local df, t; t := x*y; df := array(1 .. 1); df[1] := -y + 1; return y*df[1] + 1, x*df[1] - t end proc

(3)

G(x, y)

y*(-y + 1) + 1, x*(-y + 1) - x*y

(4)

G := GRADIENT(F, mode=forward)

G := proc(x, y) local dt, t; dt := array(1 .. 2); dt[1] := y; dt[2] := x; t := x*y; return dt[1]*(-y + 1) + 1, dt[2]*(-y + 1) - t end proc

(5)

G(x, y)

y*(-y + 1) + 1, x*(-y + 1) - x*y

(6)

G := GRADIENT(F, function_value=true)

G := proc(x, y) local df, t; t := x*y; df := array(1 .. 1); df[1] := -y + 1; return -t*y + t + x, y*df[1] + 1, x*df[1] - t end proc

(7)

G(x, y)

-x*y^2 + x*y + x, y*(-y + 1) + 1, x*(-y + 1) - x*y

(8)

G := GRADIENT(F, result_type=array)

G := proc(x, y) local df, grd, t; t := x*y; df := array(1 .. 1); df[1] := -y + 1; grd := array(1 .. 2); grd[1] := y*df[1] + 1; grd[2] := x*df[1] - t; return grd end proc

(9)

G := GRADIENT(F, result_type=list)

G := proc(x, y) local df, t; t := x*y; df := array(1 .. 1); df[1] := -y + 1; return [y*df[1] + 1, x*df[1] - t] end proc

(10)

H := GRADIENT(G, result_type=list)

H := proc(x, y) local df, df1, dfr0, t; t := x*y; df1 := -y + 1; df := array(1 .. 2); dfr0 := array(1 .. 2); df[2] := y; dfr0[2] := x; dfr0[1] := -1; return [0, df1 - df[2], y*dfr0[1] + df1, x*dfr0[1] - dfr0[2]] end proc

(11)

optimize(H)

proc(x, y) local t1; t1 := -2*y + 1; [0, t1, t1, -2*x] end proc

(12)

This next example illustrates pre- and post-optimization. The gradient is computed with respect to phi and omega only. Since the torus procedure returns three values, the gradient is really the Jacobian matrix of partial derivatives.

torus  := proc(phi,omega,R,r) local x,y,z;
     x := cos(phi)*(R+r*cos(omega));
     y := sin(phi)*(R+r*cos(omega));
     z := r*sin(omega);
     [x,y,z]
end proc:

torus := optimize(torus);

torus := proc(phi, omega, R, r) local t1, t2, t4, t5, t6; t1 := cos(phi); t2 := cos(omega); t4 := r*t2 + R; t5 := sin(phi); t6 := sin(omega); [t1*t4, t5*t4, r*t6] end proc

(13)

G := GRADIENT(torus, [phi, omega], result_type=list)

G := proc(phi, omega, R, r) local df, dfr0, dfr1, t1, t2, t4, t5, t6; t1 := cos(phi); t2 := cos(omega); t4 := r*t2 + R; t5 := sin(phi); t6 := sin(omega); df := array(1 .. 5); dfr0 := array(1 .. 5); dfr1 := array(1 .. 5); df[3] := t1; df[2] := df[3]*r; df[1] := t4; dfr0[4] := t4; dfr0[3] := t5; dfr0[2] := dfr0[3]*r; dfr1[5] := r; return [-df[1]*sin(phi), -df[2]*sin(omega), dfr0[4]*cos(phi), -dfr0[2]*sin(omega), 0, dfr1[5]*cos(omega)] end proc

(14)

optimize(G)

proc(phi, omega, R, r) local df, dfr0, t1, t2, t3, t4, t5, t6; t1 := cos(phi); t2 := cos(omega); t3 := r*t2; t4 := t3 + R; t5 := sin(phi); t6 := sin(omega); df := array(1 .. 5); dfr0 := array(1 .. 5); df[2] := t1*r; dfr0[2] := t5*r; [-t5*t4, -df[2]*t6, t1*t4, -dfr0[2]*t6, 0, t3] end proc

(15)

We redo the same example, but this time using the makeproc command to first build the torus procedure, returning an array instead of a list.

A := array([cos(phi)*(R + r*cos(omega)), sin(phi)*(R + r*cos(omega)), r*sin(omega)]):

torus := optimize( makeproc(A,[phi,omega,R,r]) );

torus := proc(phi, omega, R, r) local A, t1, t2, t4, t5, t6; A := array(1 .. 3); t1 := cos(phi); t2 := cos(omega); t4 := t2*r + R; A[1] := t1*t4; t5 := sin(phi); A[2] := t5*t4; t6 := sin(omega); A[3] := r*t6; A end proc

(16)

G := optimize(GRADIENT(torus, [phi, omega], result_type=array));

G := proc(phi, omega, R, r) local df, dfr0, grd, t1, t2, t4, t5, t6; t1 := cos(phi); t2 := cos(omega); t4 := r*t2 + R; t5 := sin(phi); t6 := sin(omega); df := array(1 .. 8); dfr0 := array(1 .. 8); df[2] := r*t1; dfr0[2] := t5*r; grd := array(1 .. 3, 1 .. 2); grd[1,1] := -t5*t4; grd[1,2] := -df[2]*t6; grd[2,1] := t1*t4; grd[2,2] := -dfr0[2]*t6; grd[3,1] := 0; grd[3,2] := r*t2; grd end proc

(17)

This example shows that local procedures are handled.

f := proc(x) local s, t;
  s := proc(x) local e; e := exp(x); (e-1/e)/2 end proc;
  t := s(x^2);
  1-x*t+x^2*t;
end proc:

GRADIENT(f)

proc(x) local dfr0, ds, lf1, s, t; ds := proc(x) local df, e; e := exp(x); df := array(1 .. 1); df[1] := 1/2 + 1/(2*e^2); return df[1]*exp(x) end proc; s := proc(x) local e; e := exp(x); 1/2*e - 1/(2*e) end proc; t := s(x^2); dfr0 := array(1 .. 1); lf1 := ds(x^2); dfr0[1] := x^2 - x; return 2*x*dfr0[1]*lf1[1] + 2*t*x - t end proc

(18)

This final example illustrates the need for breaking up large products.

F := proc(u,v,w,x,y,z) u*v*w*x*y*z end proc;

F := proc(u, v, w, x, y, z) u*v*w*x*y*z end proc

(19)

G := GRADIENT(F)

G := proc(u, v, w, x, y, z) return v*w*x*y*z, u*w*x*y*z, u*v*x*y*z, u*v*w*y*z, u*v*w*x*z, u*v*w*x*y end proc

(20)

cost(G)

24 multiplications

(21)

F := split(F)

F := proc(u, v, w, x, y, z) local s0, s1, s2, s3; s0 := u*v; s1 := w*x; s2 := y*z; s3 := s0*s1; return s3*s2 end proc

(22)

G := GRADIENT(F)

G := proc(u, v, w, x, y, z) local df, s0, s1, s2, s3; s0 := u*v; s1 := w*x; s2 := y*z; s3 := s0*s1; df := array(1 .. 4); df[4] := s2; df[3] := s3; df[2] := df[4]*s0; df[1] := df[4]*s1; return v*df[1], u*df[1], df[2]*x, df[2]*w, df[3]*z, df[3]*y end proc

(23)

cost(G)

8 storage + 9 assignments + 12 multiplications + 12 subscripts

(24)

See Also

codegen[cost], codegen[HESSIAN], codegen[JACOBIAN], codegen[makeproc], codegen[optimize], codegen[split]

