Overview - Maple Help

Regression Commands

The Statistics package provides commands for fitting linear and nonlinear models to data points and for performing regression analysis. The fitting algorithms are based on least-squares methods, which minimize the sum of the squared residuals.
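
As a formula, the least-squares criterion can be written as follows. The notation is introduced here for illustration only and is not Maple's own: $f$ is the model function, $\theta$ the vector of model parameters, and $(x_i, y_i)$, $i = 1, \ldots, n$, the data points.

```latex
\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{n} r_i(\theta)^{2},
\qquad
r_i(\theta) = y_i - f(x_i; \theta)
```

Each $r_i$ is the residual at one data point: the difference between the observed value and the value predicted by the model.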

Available Commands

ExponentialFit	fit an exponential function to data
Fit	fit a model function to data
LeastTrimmedSquares	robust linear regression
LinearFit	fit a linear model function to data
LogarithmicFit	fit a logarithmic function to data
Lowess	produce lowess smoothed functions
NonlinearFit	fit a nonlinear model function to data
OneWayANOVA	generate a one-way ANOVA table
PolynomialFit	fit a polynomial to data
PowerFit	fit a power function to data
PredictiveLeastSquares	fit a predictive linear model function to data
RepeatedMedianEstimator	robust linear regression

Linear Fitting

•	A number of commands are available for fitting a model function that is linear in the model parameters to given data. For example, the model function $b t^{2} + a t$ is linear in the parameters a and b, though it is nonlinear in the independent variable t.

•	The LinearFit command is available for multiple general linear regression. For certain classes of model functions involving only one independent variable, the PolynomialFit, LogarithmicFit, PowerFit, and ExponentialFit commands are available. The PowerFit and ExponentialFit commands use a transformed model function that is linear in the parameters.
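
A minimal sketch of these commands follows. The data in T and Y are hypothetical values chosen only for illustration (all positive, which is assumed to matter for the transformed fits); they are not part of the original help page.

```maple
with(Statistics):

# Hypothetical data, for illustration only.
T := Vector([1.0, 2.0, 3.0, 4.0, 5.0]):
Y := Vector([1.9, 6.1, 12.2, 19.8, 30.1]):

# The model a*t + b*t^2 is nonlinear in t but linear in the parameters a and b.
LinearFit(a*t + b*t^2, T, Y, t);

# The same model, given as a list of component functions.
LinearFit([t, t^2], T, Y, t);

# Convenience commands for models in one independent variable.
PolynomialFit(2, T, Y, t);
LogarithmicFit(T, Y, t);
ExponentialFit(T, Y, t);
PowerFit(T, Y, t);
```

Because PowerFit and ExponentialFit fit a transformed model, their results generally differ from what NonlinearFit returns for the untransformed model on the same data.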

Nonlinear Fitting

•	The NonlinearFit command is available for nonlinear fitting. An example model function is $a x + e^{b y}$, where a and b are the parameters and x and y are the independent variables.

•	This command relies on local nonlinear optimization solvers available in the Optimization package. The LSSolve and NLPSolve commands in that package can also be used directly for least-squares and general nonlinear minimization.
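
As a sketch of the direct route, the following calls LSSolve on a list of residuals for the single-variable model $a x + e^{b x}$, using the data from the Examples section of this page. The name res and the starting point are assumptions made for this illustration. Note that LSSolve minimizes one half of the sum of squared residuals, so its objective value is not directly comparable to a residual sum of squares.

```maple
with(Optimization):

# Data from the Examples section, given as lists.
X := [1.2, 2.1, 3.1, 4.0, 5.7, 6.6, 7.2, 7.9, 9.1, 10.3]:
Y := [4.6, 7.7, 11.5, 15.4, 22.2, 33.1, 48.1, 70.6, 109.0, 168.4]:

# One residual expression per data point, in terms of the parameters a and b.
res := [seq(a*X[i] + exp(b*X[i]) - Y[i], i = 1 .. nops(X))]:

# Local least-squares solve; the starting point is an assumed guess.
LSSolve(res, initialpoint = {a = 1, b = 0.5});
```

Since the solver is local, a different starting point may lead to a different local minimum.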

Other Commands

•	The general Fit command allows you to provide either a linear or nonlinear model function. It then determines the appropriate regression solver to use.

•	The OneWayANOVA command generates the standard ANOVA table for one-way classification, given two or more groups of observations.
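
What "linear in the parameters" means can be illustrated with Maple's general-purpose type check linear. This is only a sketch of the concept: the test that Fit applies internally is not described on this page, and the name params is introduced here for illustration.

```maple
params := {a, b, c}:

# Linear in a, b, and c, although nonlinear in x and y.
type(a*x + b*x^2/y + c*y*z, linear(params));

# Here the parameter a appears as an exponent, so the model
# is not linear in the parameters.
type(x^a + b*x^2/y + c*y*z, linear(params));
```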

Using the Regression Commands

•	Various options can be provided to the regression commands. For example, the weights option allows you to specify weights for the data points and the output option allows you to control the format of the results. The options available for each command are described briefly in the command's help page and in greater detail in the Statistics/Regression/Options help page.

•	The format of the solutions returned by the regression commands is described in the Statistics/Regression/Solution help page.

•	Most of the regression commands use methods implemented in a built-in library provided by the Numerical Algorithms Group (NAG). The underlying computation is done in floating-point. Either hardware or software (arbitrary precision) floating-point computation can be specified.

•	The model function and data sets may be provided in different ways. Full details are available in the Statistics/Regression/InputForms help page. The regression routines work primarily with Vectors and Matrices. In most cases, lists (both flat and nested) and Arrays are also accepted and automatically converted to Vectors or Matrices. Consequently, all output, including error messages, uses these data types.
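
The following sketch combines the points above: data supplied as plain lists, a weights option, and an output option that selects specific results. The data and weights are hypothetical, chosen only for illustration.

```maple
with(Statistics):

# Hypothetical data given as lists; they are converted to Vectors automatically.
X := [1, 2, 3, 4, 5]:
Y := [2.1, 3.9, 6.2, 7.8, 10.1]:

# Hypothetical weights: the last observation counts twice as much as the others.
W := Vector([1, 1, 1, 1, 2]):

# Fit a straight line and return the parameter values and the residual
# sum of squares instead of the fitted function.
LinearFit([1, t], X, Y, t, weights = W,
          output = [parametervalues, residualsumofsquares]);
```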

Examples

>	$with (Statistics) :$

Define Vectors X and Y, containing values of an independent variable x and a dependent variable y.

>	$X ≔ Vector ([1.2, 2.1, 3.1, 4.0, 5.7, 6.6, 7.2, 7.9, 9.1, 10.3]) :$

>	$Y ≔ Vector ([4.6, 7.7, 11.5, 15.4, 22.2, 33.1, 48.1, 70.6, 109.0, 168.4]) :$

Find the values of a and b that minimize the least-squares error when the model function $a x + b e^{x}$ is used.

>	$Fit (a x + b \exp (x), X, Y, x)$

$6.02861839712210 x + 0.00380375570529786 e^{x}$

(1)

It is also possible to return a summary of the regression model using the summarize option:

>	$Fit (a x + b \exp (x), X, Y, x, summarize = embed)$

$6.02861839712210 x + 0.00380375570529786 e^{x}$

(2)

Model:

$6.0286184 x + 0.0038037557 e^{x}$

Coefficients	Estimate	Standard Error	t-value	P(>|t|)
a	$6.02862$	$0.761415$	$7.91765$	$0.0000470413$
b	$0.00380376$	$0.000494423$	$7.69332$	$0.0000577943$

R-squared:

$0.978977$

Adjusted R-squared:

$0.973721$

Residuals

Residual Sum of Squares	Residual Mean Square	Residual Standard Error	Degrees of Freedom
$1042.23$	$130.279$	$11.4140$	$8$

Five Point Summary

Minimum	First Quartile	Median	Third Quartile	Maximum
$−13.2999$	$−8.96906$	$−5.89077$	$0.691999$	$20.0758$

Fit a polynomial of degree 3 through this data.

>	$PolynomialFit (3, X, Y, x)$

$- 3.37372868459017 + 9.90059487215674 x - 2.79612412098216 x^{2} + 0.336249676048196 x^{3}$

(3)

Use the output option to see the residual sum of squares and the standard errors.

>	$PolynomialFit (3, X, Y, x, output = [residualsumofsquares, standarderrors])$

$[47.8471318673565, [\begin{array}{cccc} 6.32596510474709 & 4.47306023272056 & 0.861783833283665 & 0.0487355015438641 \end{array}]]$

(4)

Fit the model function $a x + e^{b x}$, which is nonlinear in the parameters.

>	$NonlinearFit (a x + \exp (b x), X, Y, x)$

$2.12883148575966 x + e^{0.486510105685615 x}$

(5)

Consider now an experiment in which quantities $x$, $y$, and $z$ influence a quantity $w$ according to the approximate relationship

$w = x^{a} + \frac{b x^{2}}{y} + c y z$

with unknown parameters $a$, $b$, and $c$. Six data points are given by the following matrix, with respective columns for $x$, $y$, $z$, and $w$.

>	$ExperimentalData ≔ ⟨⟨1, 1, 1, 2, 2, 2⟩ | ⟨1, 2, 3, 1, 2, 3⟩ | ⟨1, 2, 3, 4, 5, 6⟩ | ⟨0.531, 0.341, 0.163, 0.641, 0.713, - 0.040⟩⟩$

$ExperimentalData ≔ [\begin{array}{cccc} 1 & 1 & 1 & 0.531 \\ 1 & 2 & 2 & 0.341 \\ 1 & 3 & 3 & 0.163 \\ 2 & 1 & 4 & 0.641 \\ 2 & 2 & 5 & 0.713 \\ 2 & 3 & 6 & -0.040 \end{array}]$

(6)

As an initial guess, we assume that the first term is approximately quadratic in $x$ and that $b$ is approximately $1$. For $c$ we do not even know whether it will be positive or negative, so we guess $c = 0$. We compute both the model function and the residuals. We also select more verbose operation by setting $infolevel$.

>	$infolevel [Statistics] ≔ 2 :$

>	$NonlinearFit (x^{a} + \frac{b x^{2}}{y} + c y z, ExperimentalData, [x, y, z], initialvalues = [a = 2, b = 1, c = 0], output = [leastsquaresfunction, residuals])$

In NonlinearFit (algebraic form)

$[x^{1.14701973996968} - \frac{0.298041864889394 x^{2}}{y} - 0.0982511893429762 y z, [\begin{array}{cccccc} 0.0727069457676300 & 0.116974310183398 & -0.146607992383251 & -0.0116127470057686 & -0.0770361532848388 & 0.0886489085642805 \end{array}]]$

(7)

We note that Maple selected the nonlinear fitting method. Furthermore, the exponent on $x$ is only about $1.15$, and the other guesses were not very good either. However, this problem is conditioned well enough that Maple finds a good fit anyway.

Now suppose that the relationship that is used to model the data is altered as follows:

$w = a x + \frac{b x^{2}}{y} + c y z$

We adapt the calling sequence very slightly:

>	$Fit (a x + \frac{b x^{2}}{y} + c y z, ExperimentalData, [x, y, z], initialvalues = [a = 2, b = 1, c = 0], output = [leastsquaresfunction, residuals])$

In Fit

In LinearFit (container form)

final value of residual sum of squares: .0537598869493245

Summary:
----------------
Model: .82307292*x-.16791011*x^2/y-.75802268e-1*y*z
----------------
Coefficients:
    Estimate Std. Error t-value P(>|t|)
a    0.8231    0.1898      4.3374   0.0226
b   -0.1679    0.0940     -1.7862   0.1720
c   -0.0758    0.0182     -4.1541   0.0254
----------------
R-squared: 0.9600, Adjusted R-squared: 0.9201

$[0.823072918385878 x - \frac{0.167910114211606 x^{2}}{y} - 0.0758022678386438 y z, [\begin{array}{cccccc} -0.0483605363356285 & -0.0949087899254999 & 0.0781175302268541 & -0.0302963085707583 & 0.160697070037893 & -0.0978248634499976 \end{array}]]$

(8)

>	$infolevel [Statistics] ≔ 0 :$

This time, Maple could select the linear fitting method, because the expression is linear in the parameters. In addition, as the infolevel is greater than 0 and the expression is linear in the parameters, a summary for the regression is displayed. The initial values for the parameters are not used.

Finally, consider a situation where an ordinary differential equation leads to results that need to be fitted. The system is given by

$[x (0) = - a, \frac{d}{d t} x (t) = z {x (t)}^{- b} + 1]$

where $a$ and $b$ are parameters that we want to find, $z$ is a variable that we can vary between experiments, and $x (t)$ is a quantity that we can measure at $t = 1$. We perform 10 experiments at $z = 0.1, 0.2, ..., 1.0$, and the results are as follows.

>	$Input ≔ [seq (0.1 .. 1, 0.1)]$

$Input ≔ [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]$

(9)

>	$Output ≔ [1.932, 2.092, 2.090, 2.416, 2.544, 2.638, 2.894, 3.188, 3.533, 3.822]$

$Output ≔ [1.932, 2.092, 2.090, 2.416, 2.544, 2.638, 2.894, 3.188, 3.533, 3.822]$

(10)

We now need to set up a procedure that NonlinearFit can call to obtain the value for a given input value $z$ and a given pair of parameters $a$ and $b$. We do this using dsolve/numeric.

>	$ODE ≔ [x (0) = - a, diff (x (t), t) = z {x (t)}^{- b} + 1]$

$ODE ≔ [x (0) = - a, \frac{d}{d t} x (t) = z {x (t)}^{- b} + 1]$

(11)

>	$ODE_Solution ≔ dsolve (ODE, numeric, parameters = [a, b, z])$

$ODE_Solution ≔ proc (x_rkf45) ... end proc$

(12)

We now have a procedure ODE_Solution that can compute the correct value, but we need to write a wrapper that has the form that NonlinearFit expects. We first need to call ODE_Solution once to set the parameters, then another time to obtain the value of $x (t)$ at $t = 1$, and then return this value (for more information about how this works, see dsolve/numeric). By hand, we can do this as follows:

>	$ODE_Solution (parameters = [a = - 1, b = - 0.5, z = 1])$

$[a = −1., b = −0.5, z = 1.]$

(13)

>	$ODE_Solution (1)$

$[t = 1., x (t) = 3.44630585135012]$

(14)

>	$ODE_Solution (parameters = [a = 1, b = 1, z = 1])$

$[a = 1., b = 1., z = 1.]$

(15)

>	$ODE_Solution (1)$

Error, (in ODE_Solution) cannot evaluate the solution past the initial point, problem may be complex, initially singular or improperly set up

Note that for some settings of the parameters, we cannot obtain a solution. We need to take care of this in the procedure we create (which we call f), by returning a value that is very far from all output points, leading to a very bad fit for these erroneous parameter values.

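>	f := proc(zValue, aValue, bValue)
	    global ODE_Solution, a, b, z, x, t;
	    # Set the ODE parameters for this evaluation.
	    ODE_Solution('parameters' = [a = aValue, b = bValue, z = zValue]);
	    try
	        # Integrate to t = 1 and extract the value of x(t) there.
	        return eval(x(t), ODE_Solution(1));
	    catch:
	        # No solution for these parameters: return a value far from the data.
	        return 100;
	    end try;
	end proc;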

$f ≔ proc (zValue, aValue, bValue) global ODE_Solution, a, b, z, x, t ; ODE_Solution ('parameters' = [a = aValue, b = bValue, z = zValue]) ; try return eval (x (t), ODE_Solution (1)) catch : return 100 end try end proc$

(16)

>	$f (1, - 1, - 0.5)$

$3.44630585135012$

(17)

We need to provide an initial estimate for the parameter values, because the underlying optimization is local: it searches for a minimum near the starting point only. We go with the values that provided a solution above: $a = -1, b = -0.5$.

>	$NonlinearFit (f, Input, Output, output = parametervector, initialvalues = [- 1, - 0.5])$

$[\begin{array}{c} −0.739350291476489 \\ −1.03096957779941 \end{array}]$

(18)

References

Draper, Norman R., and Smith, Harry. Applied Regression Analysis. 3rd ed. New York: Wiley, 1998.

Applications

Parameter Estimation for an N-Channel Enhancement MOSFET
