> restart;
Calculus and Analytic Geometry III, Kawski, Spring 1996
The chain rule:
1. Lifting a curve to the graph of a function:
2. In general, matrix multiplication
>
1111111111111111111111111111111111111111111111111111111111111111
The following example may help clarify the various derivatives involved.
Composing a function z=f(x,y) with a parameterized curve (x,y)=(g(t),h(t))
yields a function z=F(t).
We use pure function notation.
One commonly uses x=t, as we do here, although this often leads to confusion.
In the pictures it helps, since we have only three axes to view!
> g:=t->t;
> h:=t->1-t^2;
> r:=[g,h];
> f:=(x,y)->1+3/(1+x^2+y^2);
> F:=t->f(g(t),h(t));
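As a quick sanity check (a sketch, not part of the original worksheet), the
composite should depend on t alone. Substituting x=t and y=1-t^2 by hand gives
F(t) = 1+3/(2-t^2+t^4), so the difference below should simplify to 0.
> simplify(F(t)-(1+3/(2-t^2+t^4)));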
The individual plots of the four functions:
> with(plots):
> plot(g,-2..2,labels=[t,x]);
> plot(h,-2..2,labels=[t,y]);
> plot3d(f,-2..2,-3..2,labels=[`t=x`,y,z],shading=Z);
> plot(F,-2..2,labels=[t,z]);
A natural way to look at all plots combined:
For better visibility we "drop" the xy-plane to a lower level.
> surf:=plot3d(f,-2..2,-3..2):
> drop:=plot3d(0,-2..2,-3..2,color=green,grid=[5,6]):
> base:=spacecurve([g,h,.01],-2..2,color=blue,thickness=2):
> lift:=spacecurve([g,h,F*1.01],-2..2,color=magenta,thickness=2):
> display({drop,surf,base,lift},axes=frame,labels=[`t=x`,y,z],
> orientation=[56,69],shading=Z);
Go back to the last combined plot.
Change the style to WIREFRAME and rotate into the side views and the top view.
Make sure you understand what you see: the same simple graphs as earlier!
Once more go back to the same plot, change the style to PATCHCONTOUR, and
look at the view from above. Where do the local maxima and minima of
z as a function of t occur? Describe the characteristic feature of these
locations in concise mathematical language. This will be the basis for the
class on Friday on Lagrange multipliers.
> display({drop,surf,base,lift},view=[-1..1,0..1.2,0..4],
> axes=frame,labels=[`t=x`,y,z],
> style=patchcontour,orientation=[-90,0],shading=Z);
Next, the derivatives of all the functions. Note that Df has to be evaluated
at g(t) in the chain rule DF = (Df)@g . Dg
> recall:=x=g(t),y=h(t),z=f(x,y),z=F(t);
> veloc:=[diff(g(t),t),diff(h(t),t)];
> gradf:=[diff(f(x,y),x),diff(f(x,y),y)];
> gradft:=subs(x=g(t),y=h(t),gradf);
> oneway:=linalg[dotprod](gradft,veloc,'orthogonal');  # plain sum of products, no complex conjugation
> otherway:=diff(F(t),t);
> simplify(oneway-otherway);
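The same check can be written without the linalg package, spelling out each
term of the chain rule dz/dt = f_x(g(t),h(t)) g'(t) + f_y(g(t),h(t)) h'(t).
This is only a sketch. The names fx, fy, and byhand are introduced here for
illustration and do not appear elsewhere in the worksheet. The last line
should return 0.
> fx:=diff(f(x,y),x):   # partial derivative of f with respect to x
> fy:=diff(f(x,y),y):   # partial derivative of f with respect to y
> byhand:=subs(x=g(t),y=h(t),fx)*diff(g(t),t)
>        +subs(x=g(t),y=h(t),fy)*diff(h(t),t):
> simplify(byhand-diff(F(t),t));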
To locate the extrema of z=F(t) along the curve, solve the equation
F'(t) = 0, which is very easy here. Make sure that you know where each
number in what follows fits into the picture!
> critpts:=solve(oneway,t);
> for i from 1 to 3 do
> t||i:=critpts[i]:
> [x||i,y||i,z||i]=[g(t||i),h(t||i),F(t||i)];
> od;
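One possible way to classify these critical points is the second-derivative
test. The sketch below assumes critpts holds the three solutions found above.
The names F2 and c are introduced only for this illustration. Working by hand,
F''(0) = 3/2 > 0 (a local minimum of z along the curve), while at
t = 1/sqrt(2) and t = -1/sqrt(2) one gets F''(t) = -192/49 < 0 (local maxima).
> F2:=diff(F(t),t,t):                     # second derivative of z=F(t)
> seq(evalf(subs(t=c,F2)),c=[critpts]);   # its value at each critical point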
2222222222222222222222222222222222222222222222222222222222222222
In general, when composing functions of several variables, the total
derivatives are matrices of various dimensions. Practically speaking,
the chain-rule then boils down to matrix multiplication.
Note that the matrices automatically have compatible sizes for
multiplication.
Example: Consider a parameterized curve (u,v)=g(t), and a parameterized
surface (x,y,z)=f(u,v). Composing these two, we obtain a parameterized
curve in 3-space (x,y,z)=F(t)=f(g(t)).
In this case the various matrices have the following sizes:
> with(linalg):
> Df:=jacobian([x(u,v),y(u,v),z(u,v)],[u,v]);
> Dg:=jacobian([u(t),v(t)],[t]);
This is the velocity vector in the (u,v)-parameter plane.
> DfDg:=multiply(Df,Dg);
> DF:=jacobian([x(t),y(t),z(t)],[t]);
This is the velocity vector in 3-space!
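The sizes can also be read off directly. As a small sketch, rowdim and coldim
from the linalg package report the number of rows and columns. With the
definitions above this should give 3,2 for Df, 2,1 for Dg, 3,1 for the
product DfDg, and 3,1 for DF.
> rowdim(Df),coldim(Df);
> rowdim(Dg),coldim(Dg);
> rowdim(DfDg),coldim(DfDg);
> rowdim(DF),coldim(DF);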
In this case the chain rule says: DF(t) = (Df)(g(t)) . (Dg)(t), or, in function notation,
D(f@g) = (Df)@g . (Dg)
Writing out the components the rules look as follows:
> for i to 3 do DF[i,1]=DfDg[i,1] od;
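To see the rule with actual formulas, here is a sketch using a made-up example
that is not part of the original worksheet: the curve (u,v)=(t,t^2) and the
surface (x,y,z)=(u cos v, u sin v, v). The names gex, fex, Dfex, Dgex, Dfatg,
viachain, and direct are all hypothetical. The final command should print a
3x1 matrix of zeros.
> gex:=[t,t^2]:                  # example curve (u,v)=g(t)
> fex:=[u*cos(v),u*sin(v),v]:    # example surface (x,y,z)=f(u,v)
> Dfex:=jacobian(fex,[u,v]):     # 3x2 matrix, still in terms of u and v
> Dgex:=jacobian(gex,[t]):       # 2x1 matrix
> Dfatg:=map(e->subs(u=gex[1],v=gex[2],e),Dfex):   # Df evaluated at g(t)
> viachain:=multiply(Dfatg,Dgex):                   # (Df)(g(t)) . (Dg)(t)
> direct:=jacobian(subs(u=gex[1],v=gex[2],fex),[t]):   # DF computed directly
> map(simplify,evalm(direct-viachain));
Note the substitution step: without it the product would still contain u and
v, which is exactly the "Df has to be evaluated at g" point of the chain rule.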
Feel free to take a look at what the matrices look like in other cases.
Simply edit the following statements, e.g. replace t -> (u,v) -> (x,y,z)
by t -> (u,v,w) -> (x,y,z) -> (theta, phi, rho).
To make it a better learning experience, FIRST predict what size matrices
you will get (and what the components will be). THEN execute the commands,
and compare!
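> r:='r':   # r was assigned as [g,h] in part 1; clear it before using r as a function name
> Df:=jacobian([r(x,y,z),theta(x,y,z),z],[x,y,z]);
> Dg:=jacobian([x(u,v,w),y(u,v,w),z(u,v,w)],[u,v,w]);
> DfDg:=multiply(Df,Dg);   # recompute the product for the new matrices
> DF:=jacobian([r(u,v,w),theta(u,v,w),z(u,v,w)],[u,v,w]);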
The following is a fancy shorthand notation to print out all the individual
equations corresponding to the matrix chain rule for this case:
> for alpha in indices(DF) do DF[op(alpha)]=DfDg[op(alpha)] od;