LM Mech.mws

__Fitting a Strength Function by the Levenberg-Marquardt Method__

*By J.M. R*
*edwood*

**Synopsis**

Procedure
*mnlfit*
, based on the Levenberg-Marquardt method, is used to fit a function relating adhesive bond strength to three variables. The maximum strength and values of the variables to produce it are estimated.

**Copyright**

The procedure used in this worksheet is an adaptation of Dr. D.E. Holmgren's procedure
*mnlfit*
2000. His permission to reproduce and adapt the procedure is gratefully acknowledged.

**Procedure **
**mnlfit**

**Original Procedure**

This worksheet implements the Levenberg-Marquardt method of fitting a non-linear function to a set of data. The general form of the fitting function is:

y = f(a[0] ...a[n] ;x[0] ...x[m] )

i.e., it is possible to have more than one independent variable. The user must supply a Maple routine to compute the function
*f*
(see examples below). The procedure
*mnlfit*
will then compute the required derivatives. The user must also supply the data in the form of vectors of
*x *
and
*y *
values (or a matrix of
*x*
values in the case of more than one independent variable), as well as weights for the data points. The weights are usually the reciprocals of the standard deviations for each of the observed data points, but if not known, they may be set to unity. The routine below will iterate the solution either until the tolerance
* tol*
is reached or until the maximum number of iterations (50) is reached.

If the procedure requires a full 50 iterations, the data and formulation of the problem should be re-examined. In any case, the best way of learning how to use
*mnlfit*
is to run through the example below.

If there is sufficient interest and/or demand, a future version of this code will include automatic differentiation for more complicated model functions.

All of the usual qualifications concerning free software apply here (i.e., no claims of suitability for any particular application). Responsibility lies with the user to ensure that correct results are obtained!

Any questions about
*mnlfit*
should be addressed to D.E. Holmgren, together with any suggestions for modifications.

This procedure was developed jointly by:

D.E. Holmgren (holmgren@brandonu.ca) Brandon University, Brandon, MB, Canada

J.F. Ogilvie (ogilvie@munkebo.chem.ou.dk)

M. Monagan (monagan@cecm.sfu.ca) Center for Experimental and Constructive Mathematics, Simon Fraser University, Vancouver BC, Canada.

**Adaptations of Original Procedure**

Dr. D.E. Holmgren's original worksheet was downloaded from www.maplesoft.com/apps/categories/data_analysis_stats/data/html/genfit_6.html, and was adapted with Maple V Release 5.1. It can be used with both Maple V R5.1 and Maple 6.

The adaptations were:

(a) Removal of the redundant procedure
*fitstats*
.

(b) Removal of statements that attempted to calculate the covariance ratio,
*F*
(referred to as
*fstat*
in the original).

(c) Removal of all commands preceded by the # sign.

(d) Removal of the square root from weighting operations.

(e) Provision of means to control the output.

(f) Provision of an analysis of covariance and tests of the regression.

(g) Improved presentation of output.

Tests of the regression include G.W. Snedecor's test of the covariance ratio,
*F*
. This test is only valid for data drawn from a (multi-variable) Normal distribution. Users must satisfy themselves that this is the case, before placing any credence in this test. Naturally, the same applies to any other tests based, like Snedecor's, on Sir Ronald Fisher's
*z*
distribution and his transforms of
*z*
.

**Definitions**

*mnlfit := proc(f, X, x::array, y::vector, wt::vector, P::{name,list(name)},p::vector,tol,Prin,digi::integer)*

*f*
is the name of the function to be fitted to the data,

*X*
is the name of the independent variable appearing in this function,

*x*
is the array of observed values of
*X*
,

*y*
is the array of observed values of the dependent variable or function,

*wt*
is an array of weights (usually the inverses of the standard deviation of each value of dependent variable, but can be set to unity if weights are not known),

*P*
is the name of the parameters of
*X*
as a vectorial quantity,

*p*
is a vector of initial estimates of values of the parameters of
*X*
(rough guesses are usually good enough - the Levenberg-Marquardt method is not very sensitive to the initial estimates),

*tol*
is a tolerance that serves as a criterion of convergence of the solution,

*Prin*
controls the output of supplementary information - insert:

* S*
in place of
*Prin*
if the fit statistics are required,

* R*
if a table of residuals is required,

* B*
if both statistics and residuals are required,

* It*
if the residuals after each iteration are required,

* All*
causes all the above items to be printed,

Inserting any other name, or leaving it as
*Prin*
, prevents their output,

*digi*
is an integer that controls the number of digits printed by
*mnlfit*
.

*mnlfit*
prints the number of iterations, the
*Std Deviation of Residuals*
(Goodness of Fit) statistic, the supplementary information required, and returns
*f*
.

**Procedure **
**mnlfit**

`> `
**restart: Digits := 40:**

`> `
**mnlfit := proc(f, X, x::array, y::vector, wt::vector, P::{name,list(name)},p::vector,**

tol,Prin,digi::integer)

local gradf,nobs,npar,k,yck,xk,r,a,drv,niter,j,n,corr,corrprint,alpha,beta,cvm,b,p1,errors,dp,

res,nu,res1,sigma1,reso,nxvar,yc,finals,hdg,ssr,ssyc,ssy,cf,msr,msyc,msy,F,pF,Rsq,Tests, anova,resi,rout;

# Initialize.

if type(P,name) then

RETURN(mnlfit(f,X,x,y,wt,[seq(P[k],k=1..linalg[vectdim](p))],p,tol,Prin,digi));

fi;

gradf := linalg[grad](f,P);

reso := 1.0*10^9;

nobs := linalg[vectdim](y);

npar := linalg[vectdim](p);

nxvar := linalg[coldim](x);

xk := vector(nxvar);

r := vector(nobs);

a := matrix(nobs,npar,0.0);

corr := matrix(npar,npar,0.0);

drv := vector(npar);

p1 := vector(npar);

errors := vector(npar);

niter := 50;

nu := 1000;

# Begin iterations.

for n from 1 to niter do

# Compute initial fit and residuals ( = observed - calculated).

for k from 1 to nobs do

for j from 1 to nxvar do xk[j] := x[k,j]

od; yck,drv := gry(f,gradf,X,P,npar,xk,p);

r[k] := evalf(wt[k]*(y[k] - yck));

for j from 1 to npar do

a[k,j] := wt[k]*drv[j];

od:

od:

# Sum of squares of residuals

res1 := evalf(linalg[dotprod](r,r));

if Prin = It or Prin = All then print(` Iteration No `, n ,` initial SSR = `, evalf(res1,digi));

fi; # Compute matrix alpha of normal equations.

alpha := evalf(evalm(linalg[transpose](a) &* a));

# Compute vector beta for right sides of normal equations.

beta := evalf(evalm(linalg[transpose](a) &* r));

# Note use of factor nu in this next loop; this increases as one approaches solution.

for j from 1 to npar do

alpha[j,j] := evalf(alpha[j,j]*(1. + 1./nu));

od:

# Solve a system of normal equations to correct parameters.

dp := evalf(linalg[linsolve](alpha, beta));

# Compute sum of squares of residuals at new point p1 = p + dp.

p1 := evalf(evalm(p + dp));

for k from 1 to nobs do

for j from 1 to nxvar do xk[j] := x[k,j] od;

yck,drv := gry(f,gradf,X,P,npar,xk,p1);

r[k] := evalf(wt[k]*(y[k]-yck));

# Store yc[k] ie calculated y.

yc[k] := yck;

od:

res := evalf(linalg[dotprod](r,r));

if Prin = It or Prin = All then print(` new SSR = `, evalf(res,digi));

fi; # Modify nu according to whether the sum of squares increases or decreases after this correction. # For the wrong direction, elements are not corrected.

if res1 <= res then

nu := nu/10.;

else

# For the right direction, correct the elements.

nu := nu*10.;

p := evalf(evalm(p + dp));

fi:

# Test for convergence.

if abs(res - reso) <= tol then

break

fi;

reso := res;

od:

# Covariance matrix

cvm := evalf(linalg[inverse](alpha));

# Standard deviation of residuals

sigma1 := sqrt(res/(nobs - npar));

# Errors of parameters

for j from 1 to npar do

errors[j] := evalf(sigma1*sqrt(cvm[j,j]));

od:

for j from 1 to npar do

for k from 1 to npar do

corr[j,k] := cvm[j,k]/sqrt(cvm[j,j]*cvm[k,k])

od:

od: # Print results.

print (array([[`No of Iterations`,`Std Deviation of Residuals`],[n,evalf(sigma1, digi)]]));

print();

if Prin = S or Prin = B or Prin = All then

print();

# Calculate Sums of Squares, Covariances & Tests.

cf := (add(wt[j]*y[j],j=1..nobs))^2/nobs;

ssyc := add((wt[j]*yc[j])^2,j=1..nobs) - cf;

ssr := add(wt[j]^2*(yc[j] - y[j])^2,j=1..nobs);

ssy := add((wt[j]*y[j])^2,j=1..nobs) - cf;

msyc := ssyc/(npar-1);

msy := ssy/(nobs - 1);

msr := ssr/(nobs - npar);

Rsq := ssyc/ssy;

pF := 1 - stats[statevalf,cdf,fratio[(npar - 1),(nobs - npar)]](msyc/msr);

anova := array([[`Source`,Sigma(`Squares`),`DF`,`Mean Square`], [`Regression`,trunc(ssyc),(npar-1),evalf(msyc,digi)], [`Residuals`,trunc(ssr),(nobs-npar),evalf(msr,digi)],[`Total`,trunc(ssy),(nobs - 1),` `]]);

Tests := array([[R^2,`F`,`Prob of F by chance`],[` `,` `,`for Normal data`],

[evalf(Rsq,3),evalf(msyc/msr,3),evalf(pF,3)]]);

# Print Anova and Tests

print(`Analysis of Covariance`);

print();

print(anova);

print();

print(`Tests of Covariance`);

print();

print(Tests);

print();

print (`Final values of parameters`); hdg := array([`Parameter`,`Value`,`Standard Error`]); finals := linalg[augment]([seq(j,j=1..npar)], convert(p,list), [seq(errors[j],j=1..npar)]); finals := evalf(evalm(finals),digi):

finals := linalg[stackmatrix](hdg,finals);

print(finals);

print();

corr := evalf(evalm(corr),digi):

cvm := evalf(evalm(cvm),digi):

print(`Matrix of Covariances`);

print();

print(cvm);

print();

print(` Matrix of Correlation Coefficients `);

print();

print(corr);

print();

fi;

if Prin =R or Prin = B or Prin = All then print(` Table of residuals `);

print();

resi := convert([seq([(seq(evalf(x[j,k],digi),k=1..nxvar),evalf(y[j],digi),evalf(yc[j],digi),

evalf(y[j]-yc[j],digi), evalf(wt[j]*(y[j]-yc[j])/sigma1, digi))],j=1..nobs)],matrix);

hdg := array([(seq(`x obs`[k],k=1..nxvar),` y obs `,` y calc `,` difference `,`wt*difference / sd of residuals`)]);

resi := linalg[stackmatrix](hdg,resi);

print(resi);

print();

fi;

RETURN(subs(seq(P[j]=evalf(p[j],digi),j=1..npar),f));

end:

gry := proc(f,G,x,a,n,xval,p::vector)

# Procedure to evaluate derivatives and formula

local k,y,i,drv,point;

point := {x=xval, seq(a[i]=p[i], i=1..n)};

drv := vector(n);

for k from 1 to n do

drv[k] := evalf(eval(G[k],point))

od:

y := evalf(eval(f,point));

RETURN(y,drv);

end:

**Problem**

A manufacturer wanted to join two parts with an adhesive instead of welding, but needed to know the maximum strength that could be obtained. The strength of a bond depends upon the cure time (mins), curing temperature (deg C) and quantity of accelerator added to the adhesive (% by weight). Although the relationship between strength and the variables was believed to follow that of a second order polynomial, the coefficients were unknown. Identical joints were therefore broken in a test machine and the force (kN) needed was measured. Fifteen combinations of cure time, temperature and accelerator were tested in a random order in accordance with a planned design. Each combination was tested 3 times in a random order.

Given the test results, the problem reduces to:

a. Fitting the functional relationship to the test data,

b. Estimating those values that produced the maximum strength,

c. Estimating the maximum strength.

**Solution**

**General**

In this example, the coefficients of the polynomial relating strength to the variables are assumed to be linear, and so a procedure for fitting a linear curve could be used. However, procedure
*mnlfit*
will be used to illustrate its versatility, although it is intended for fitting non-linear equations to data.

**Data**

First, the test data are imported,

`> `
**test_data := [[1, 45, 125, 3, 19.01], [2, 45, 125, 3, 19.97], [3, 45, 125, 3, 19.49], [4, 45, 125, 7, 23.95], [5, 45, 125, 7, 23.03], [6, 45, 125, 7, 23.49], [7, 45, 175, 3, 24.59], [8, 45, 175, 3, 23.59], [9, 45, 175, 3, 24.09], [10, 45, 175, 7, 49.63], [11, 45, 175, 7, 50.55], [12, 45, 175, 7, 50.09], [13, 75, 125, 3, 27.69], [14, 75, 125, 3, 28.09], [15, 75, 125, 3, 28.49], [16, 75, 125, 7, 32.45], [17, 75, 125, 7, 31.73], [18, 75, 125, 7, 32.09], [19, 75, 175, 3, 30.89], [20, 75, 175, 3, 30.09], [21, 75, 175, 3, 30.49], [22, 75, 175, 7, 36.11], [23, 75, 175, 7, 36.49], [24, 75, 175, 7, 36.87], [25, 30, 150, 5, 18.41], [26, 30, 150, 5, 17.37], [27, 30, 150, 5, 17.89], [28, 90, 150, 5, 32.55], [29, 90, 150, 5, 32.89], [30, 90, 150, 5, 33.23], [31, 60, 100, 5, 20.31], [32, 60, 100, 5, 19.47], [33, 60, 100, 5, 19.89], [34, 60, 200, 5, 28.47], [35, 60, 200, 5, 29.31], [36, 60, 200, 5, 28.89], [37, 60, 150, 1, 19.29], [38, 60, 150, 1, 18.83], [39, 60, 150, 1, 19.75], [40, 60, 150, 9, 28.89], [41, 60, 150, 9, 29.29], [42, 60, 150, 9, 29.69], [43, 60, 150, 5, 37.67], [44, 60, 150, 5, 38.51], [45, 60, 150, 5, 38.09]]:**

Next the data are displayed as a matrix, and then converted into a vector of strengths and an array of conditions (sets of variables).

`> `
**with(linalg):**

data := convert(test_data,matrix);

`Warning, the protected names norm and trace have been redefined and unprotected`

The first column in the data above gives the test number, the second the curing time, the third the curing temperature, the fourth the percentage of accelerator added to the adhesive and the fifth shows strength of the joint (force to break it). Test numbers 1 - 3 refer to one set of conditions (time, temperature and accelerator), test numbers 4 - 6 to refer to another set, 7 - 9 to a third set, and so on.

The conditions and strengths are separated from the data array thus,

`> `
**conds := delcols(delcols(data,5..5),1..1):**

strengths := col(data,5):

The function relating strength and the three conditions,
, is written as:

`> `
**str := c[1] + c[2]*x[1] + c[3]*x[2] + c[4]*x[3] + c[5]*x[1]^2 + c[6]*x[2]^2 + c[7]*x[3]^2 + c[8]*x[1]*x[2] + c[9]*x[1]*x[3] + c[10]*x[2]*x[3];**

A literature search suggested very approximate initial values for the 10 constants as follows. Unity weighting is assumed for the observed strengths.

`> `
**ini := vector(10,[-100,.1,.1,1,.001,.1,100,.001,.1,.1]):**

wt:= vector(45,1):

**Curve Fit**

The test data are inserted in procedure
*mnlfit*
to obtain estimates of the constants in the strength equation. Choosing
*tol*
as
,
*R*
to obtain the table of residuals, and the number of digits to be shown as 5,
*mnlfit*
is run.

`> `
**y := mnlfit(str,x,conds,strengths,wt,c,ini,10^(-7),R,5);**

Mostly, the residuals are fairly small, the largest being 7.8 kN, or 2.16 times the
*Std Deviation of Residuals*
. The residuals are now plotted to check whether any trends are apparent.

`> `
**Y := unapply(y,[x[1],x[2],x[3]]):**

cond := convert(conds,listlist):

resids :=[seq(wt[i]*(strengths[i]-Y(cond[i,1],cond[i,2],cond[i,3]))/3.6268,i=1..45)]:

Noting that the data comprises sets of 3 strengths for each set of conditions, there is a set of 3 (weighted) residuals for each set of conditions. These are plotted accordingly.

`> `
**cond_set := [seq(iquo(i,3),i=3..47)]:**

pts := zip((x,y)->[x,y],[seq(cond_set[i],i=1..45)],[seq(resids[i],i=1..45)]):

plots[pointplot](pts,labels=["Set of Conditions no.","Weighted(Observed Strength - Calculated)/sd of residuals"]);

The plot shows no clear trend. As already seen from the
*Table of Residuals*
, all the weighted residuals fall within about 2 std deviations, and so are considered satisfactory. In practice, though, it would be worth checking the data obtained for set number 4, where the residuals are around 2 standard deviations.

**Maximum Strength**

Up to this point, the precision used by Maple has been 40 digits, but this is unnecessary for the answers in the remainder of this worksheet and makes them difficult to read; 8 digits are adequate.

The Hessian of
*y*
is calculated to confirm that it is negative-definite, and thus the stationary value is a local maximum. The values of the variables
that produce the stationary value of
*y*
are obtained by equating the gradient of
*y*
to zero, and solving the three equations. Substituting the values obtained into the expression for
*y*
gives the maximum strength.

`> `
**Digits := 8:**

definite(hessian(y,[x[1],x[2],x[3]]),'negative_def');

e := grad(y,[x[1],x[2],x[3]]):

sols := solve(convert(e,set),{x[1],x[2],x[3]});

`max strength` := subs(sols,y);

The values of
that produce the maximum strength are seen to be 51 minutes cure time at 180 deg C, with 7.4% accelerator added to the adhesive.

In practice, the strength obtained with this choice of conditions will vary about the maximum calculated above, in much the same way as the residuals vary about zero (see the plot above). Provided that the distribution of strengths calculated from the polynomial does not depart unduly from the Normal, the bond strength will be within about twice the
*Std Deviations of Residuals*
from that calculated above, that is within about 2 x 3.5747 kN of 43.9 kN. Thus, in about 95% of joints, the strength will lie between roughly 36.7 and 51 kN.

**Yield Plots**

Since the strength is a function of three variables, it can be plotted only by holding one of them constant, while the other two are varied. The three plots are shown below.

`> `
**plot3d(Y(51,z[2],z[3]),z[2]=100..200,z[3]=1..9,orientation=[58,58],axes=FRAME,**

title="Time = 51 minutes",labels=["Temperature","Accelerator","Strength"]);

`> `
**plot3d(Y(z[1],180,z[3]),z[1]=30..90,z[3]=1..9,orientation=[60,58],axes=FRAME,**

title="Temperature = 180 deg C",labels=["Time","Accelerator","Strength"]);

`> `
**plot3d(Y(z[1],z[2],7.37),z[1]=30..90,z[2]=100..200,orientation=[57,60],axes=FRAME,**

title="Accelerator = 7.37%",labels=["Time","Temperature","Strength"]);

`> `

`> `

Unsurprisingly, the plots show that the strength depends strongly on curing temperature and accelerator added, but less so on time.

**Observations**

a. Procedure
*mnlfit*
easily found the ten coefficients for the strength polynomial. That it performs so well with both linear and non-linear curve fitting, suggests that it is the only curve-fitting tool that many users will need.

b. The maximum strength of bond between the parts concerned was found to be 43.9 kN, with a curing time of 51 minutes at 180 deg C, and 7.4% of accelerator added to the adhesive. About 95% of bonds made under these conditions would have a strength between roughly 36.7 and 51 kN. (In practice, the actual limits would be found from samples made during normal production.)

c. With 10 variables in the strength polynomial, the statistics output (
*Prin*
set to
*S*
) from
*mnlfit*
is somewhat overwhelming. For that reason it is not shown here, but readers who run
*mnlfit*
with the
*S*
setting will note that
is 85.9%, indicating that the regression accounts for much of the covariation, and that the value of
*F*
suggests the regression is a reasonable fit. A
test (not shown here) of the standardised residuals suggests that they do not depart unduly from Normal, and thus some credence can be given to the test of
*F*
.

**Disclaimer**
: While every effort has been made to validate the solutions in this worksheet, Waterloo Maple Inc. and the contributors are not responsible for any errors contained and are not liable for any damages resulting from the use of this material.