Statistics - Maple Programming Help


Statistics[LinearFit] - fit a linear model function to data

 


Calling Sequence

LinearFit(flst, X, Y, v, options)

LinearFit(flst, XY, v, options)

LinearFit(falg, X, Y, v, options)

LinearFit(falg, XY, v, options)

LinearFit(fop, X, Y, options)

LinearFit(fop, XY, options)

Parameters

flst    - list(algebraic) or Vector(algebraic); component functions in algebraic form

X       - Vector or Matrix; values of independent variable(s)

Y       - Vector; values of dependent variable

XY      - Matrix; values of independent and dependent variables

v       - name or list(names); name(s) of independent variables in the component functions

falg    - algebraic expression, linear in all its variables except the ones in v; model

fop     - list(procedure) or Vector(procedure); component functions in operator form

options - (optional) equation(s) of the form option=value where option is one of output, summarize, svdtolerance, or weights; specify options for the LinearFit command

Description

• The LinearFit command fits a model function that is linear in the model parameters to data by minimizing the least-squares error.  It performs both simple and multiple linear regression.  This command accepts the model function in algebraic form in two variants, and in operator form, and data for independent and dependent variables can be specified together or separately.  For more information about the input forms, see the Input Forms help page.

• Consider the model $y = f(x_1, x_2, \ldots, x_n)$, where $y$ is the dependent variable and $f$ is the model function of $n$ independent variables $x_1, x_2, \ldots, x_n$.  This function is a linear combination $a_1 f_1 + a_2 f_2 + \cdots + a_m f_m$ of component functions $f_j(x_1, x_2, \ldots, x_n)$, for $j$ from 1 to $m$.  Given $k$ data points, where each data point is an $(n+1)$-tuple of numerical values for $(x_1, x_2, \ldots, x_n, y)$, the LinearFit command finds values of the model parameters $a_1, a_2, \ldots, a_m$ such that the sum of the $k$ squared residuals is minimized.  The $i$th residual is the value of $y - f(x_1, x_2, \ldots, x_n)$ evaluated at the $i$th data point.

• In the first two calling sequences, the first parameter flst is a list or Vector of component functions in algebraic form.  Each component is an algebraic expression in the independent variables $x_1, x_2, \ldots, x_n$.

• In the second pair of calling sequences, the first parameter falg is an algebraic expression for $f(x_1, x_2, \ldots, x_n)$, including the parameters $a_1, a_2, \ldots, a_m$.

• In the last two calling sequences, the first parameter fop is a list or Vector of component functions in operator form.  The $j$th component is a procedure having $n$ input parameters representing the independent variables $x_1, x_2, \ldots, x_n$ and returning the single value $f_j(x_1, x_2, \ldots, x_n)$.

• The parameter X is a Matrix containing the values of the independent variables.  Row $i$ in the Matrix contains the $n$ values for the $i$th data point, while column $j$ contains all values of the single variable $x_j$.  If there is only one independent variable, X can be either a Vector or a $k$-by-1 Matrix.  The parameter Y is a Vector containing the $k$ values of the dependent variable $y$.  The parameter XY is a Matrix consisting of the $n$ columns of X and, as the last column, Y.  For X, Y, and XY, one can also use lists or Arrays; for details, see the Input Forms help page.

• The parameter v is a list of the independent variable names used in flst or falg.  If there is only one independent variable, then v can be a single name.  The order of the names in the list must match exactly the order in which the independent variable values are placed in the columns of X.

• By default, either the model function with the final parameter values or a Vector containing the parameter values is returned, depending on the input form.  Additional results or a solution module that allows you to query for various settings and results can be obtained with the output option.  For more information, see the Statistics/Regression/Solution help page.

• Weights for the data points can be supplied through the weights option.
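
As a rough illustration of the description above, the following sketch builds the matrix of component-function values by hand and solves the same least-squares problem with LinearAlgebra:-LeastSquares, then calls LinearFit in operator form for comparison.  The names fs, A, and params are illustrative only, and the data is borrowed from the Examples section.  This is a sketch of the underlying problem, not a description of how LinearFit is implemented internally.

```maple
with(Statistics):

# Six data points with one independent variable (same data as in the Examples).
X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype = float):

# Component functions f[1], f[2], f[3] in operator form (illustrative names).
fs := [t -> 1, t -> t, t -> t^2]:

# Entry (i, j) is f[j] evaluated at the ith data point.
A := Matrix(6, 3, (i, j) -> fs[j](X[i]), datatype = float):

# Parameter values minimizing the sum of squared residuals of A . params - Y.
params := LinearAlgebra:-LeastSquares(A, Y);

# The operator form of LinearFit returns a Vector of parameter values.
LinearFit(fs, X, Y);
```

Both results should agree, up to rounding, with the parameter values shown in the Examples section.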

Options

  

The options argument can contain one or more of the options shown below.  These options are described in more detail on the Statistics/Regression/Options help page.

• output = name or string -- Specify the form of the solution.  The output option can take as a value the name solutionmodule, or one of the following names (or a list of these names): AtkinsonTstatistic, confidenceintervals, CookDstatistic, degreesoffreedom, externallystandardizedresiduals, internallystandardizedresiduals, leastsquaresfunction, leverages, parametervalues, parametervector, residuals, residualmeansquare, residualstandarddeviation, residualsumofsquares, rsquared, rsquaredadjusted, standarderrors, tprobability, tvalue, variancecovariancematrix.  For more information, see the Statistics/Regression/Solution help page.

• summarize = identical( true, false, embed ) -- Display a summary of the regression model.

• svdtolerance = realcons(nonnegative) -- Set the tolerance that determines whether a singular-value decomposition is performed.

• weights = Vector -- Provide weights for the data points.
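
A minimal sketch of how the weights and output options might be combined is given below.  The weight values are arbitrary and chosen only for illustration (the fourth data point is given less influence), and the names W, wfit, and res are hypothetical.  The data is the same as in the Examples section.

```maple
with(Statistics):

X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype = float):

# One weight per data point; the values here are arbitrary.
W := Vector([1, 1, 1, 0.25, 1, 1], datatype = float):

# Weighted fit, returned as the fitted model function.
wfit := LinearFit([1, t, t^2], X, Y, t, weights = W);

# Request several results at once by passing a list of names to output.
res := LinearFit([1, t, t^2], X, Y, t, weights = W,
                 output = [parametervalues, standarderrors, rsquared]);
```

When output is given as a list of names, the results come back as a list in the same order.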

Notes

• The underlying computation is done in floating-point; therefore, all data points must have type realcons and all returned solutions are floating-point, even if the problem is specified with exact values.  For more information about numeric computation in the Statistics package, see the Statistics/Computation help page.

• The LinearFit command uses various methods implemented in a built-in library provided by the Numerical Algorithms Group (NAG).  Normally, a method using QR decomposition is applied.  If it is determined that the system does not have full rank, then a singular-value decomposition (SVD) is performed.  The svdtolerance option allows you to specify when an SVD should be performed.  See the Statistics/Regression/Options help page for additional details.

• To obtain more details as the least-squares problem is being solved, set infolevel[Statistics] to 2 or higher.
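
The notes above can be tried out with a short sketch like the following.  The data is made up for illustration: it is given as exact integers and rationals lying on the line 1/2 + 3/2 t, so the fitted coefficients should be floating-point values close to 0.5 and 1.5.  The name fit is hypothetical, and the data is passed as lists, which the Description says is permitted.

```maple
with(Statistics):

# Print additional information while the least-squares problem is solved.
infolevel[Statistics] := 2:

# Exact input data; the returned model still has floating-point coefficients.
fit := LinearFit([1, t], [1, 2, 3, 4], [2, 7/2, 5, 13/2], t);

# Turn the extra output off again.
infolevel[Statistics] := 0:
```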

Examples

> with(Statistics):

A simple example using the first form for the first argument, flst:

> X := Vector([1, 2, 3, 4, 5, 6], datatype = float):

> Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype = float):

> LinearFit([1, t, t^2], X, Y, t)

    1.96000000000000 + 0.164999999999999*t + 0.110714285714286*t^2    (1)

The summarize option returns a summary for the regression:

> ls := LinearFit([1, t, t^2], X, Y, t, summarize = true):

Summary:
----------------
Model: 1.9600000+.16500000*t+.11071429*t^2
----------------
Coefficients:
              Estimate  Std. Error  t-value  P(>|t|)
Parameter 1    1.9600    1.1720      1.6724   0.1930
Parameter 2    0.1650    0.7667      0.2152   0.8434
Parameter 3    0.1107    0.1072      1.0325   0.3778
----------------
R-squared: 0.9252, Adjusted R-squared: 0.8753

> ls

    1.96000000000000 + 0.164999999999999*t + 0.110714285714286*t^2    (2)

Here is the same example using the second form for the first argument, falg:

> LinearFit(a + b*t + c*t^2, X, Y, t)

    1.96000000000000 + 0.164999999999999*t + 0.110714285714286*t^2    (3)
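
The calling sequences that take a single XY Matrix can be sketched in the same way.  The example below is only an illustration: it packs the same six data points into a two-column Matrix (independent variable first, dependent variable last) and uses the falg form; the names XY, fitXY, and pv are hypothetical.

```maple
with(Statistics):

# Each row is one data point: [t, y].
XY := Matrix([[1, 2], [2, 3], [3, 4], [4, 3.5], [5, 5.8], [6, 7]],
             datatype = float):

# falg form: the parameters a, b, c must be unassigned names.
fitXY := LinearFit(a + b*t + c*t^2, XY, t);

# flst form with the same Matrix, asking for the parameter Vector instead.
pv := LinearFit([1, t, t^2], XY, t, output = parametervector);
```

Both calls should reproduce, up to rounding, the coefficients shown in (3).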

The summary can also be returned as an embedded table:

> LinearFit([1, t, t^2], X, Y, t, summarize = embed)

    1.96000000000000 + 0.164999999999999*t + 0.110714285714286*t^2    (4)

Model: 1.9600000 + 0.16500000*t + 0.11071429*t^2

Coefficients:
              Estimate   Standard Error   t-value    P(>|t|)
Parameter 1   1.96000    1.17199          1.67237    0.193045
Parameter 2   0.165000   0.766748         0.215194   0.843415
Parameter 3   0.110714   0.107226         1.03253    0.377769

R-squared: 0.925169, Adjusted R-squared: 0.875282

Residuals:
Residual Sum of Squares   Residual Mean Square   Residual Standard Error   Degrees of Freedom
1.28771                   0.429238               0.655163                  3

Five Point Summary:
Minimum     First Quartile   Median     Third Quartile   Maximum
-0.891429   -0.290357        0.155714   0.290595         0.548571

And finally using the third form, fop:

> constant_function := t -> 1

    constant_function := t -> 1    (5)

> linear_function := t -> t

    linear_function := t -> t    (6)

> quadratic_function := t -> t^2

    quadratic_function := t -> t^2    (7)

> LinearFit([constant_function, linear_function, quadratic_function], X, Y)

    [ 1.96000000000000  ]
    [ 0.164999999999999 ]    (8)
    [ 0.110714285714286 ]

Use the output=solutionmodule option to see the full results.

> m := LinearFit([1, t, t^2], X, Y, t, output = solutionmodule)

    m := module() ... end module    (9)

> m:-Results()

    [residualmeansquare = 0.429238095238095,
     residualsumofsquares = 1.28771428571429,
     residualstandarddeviation = 0.655162647926525,
     degreesoffreedom = 3,
     parametervalues = Vector([1.96000000000000, 0.164999999999999, 0.110714285714286]),
     parametervector = Vector([1.96000000000000, 0.164999999999999, 0.110714285714286]),
     leastsquaresfunction = 1.96000000000000 + 0.164999999999999*t + 0.110714285714286*t^2,
     standarderrors = Vector([1.17199057366598, 0.766748258006800, 0.107226158093964]),
     confidenceintervals = Vector([-1.76980025750745 .. 5.68980025750746,
                                   -2.27513724548285 .. 2.60513724548284,
                                   -0.230527496478056 .. 0.451956067906628]),
     rsquared = 0.925169145624351,
     rsquaredadjusted = 0.875281909373919,
     residuals = Vector([-0.235714285714286, 0.267142857142857, 0.548571428571429,
                         -0.891428571428571, 0.247142857142857, 0.0642857142857140]),
     leverages = Vector([0.821428571428572, 0.307142857142857, 0.371428571428571,
                         0.371428571428571, 0.307142857142857, 0.821428571428572]),
     variancecovariancematrix = Matrix([[ 1.37356190476190,  -0.837014285714286,   0.107309523809524],
                                        [-0.837014285714286,  0.587902891156463,  -0.0804821428571428],
                                        [ 0.107309523809524, -0.0804821428571428,  0.0114974489795918]]),
     internallystandardizedresiduals = Vector([-0.851394397843296, 0.489860687557646, 1.05610411939691,
                                               -1.71616919401998, 0.453186625387555, 0.232198472139079]),
     externallystandardizedresiduals = Vector([-0.798257317827523, 0.416994351642331, 1.08794530190289,
                                               -10.3712309269066, 0.383380972298048, 0.191316224811661]),
     CookDstatistic = Vector([1.11147104504106, 0.0354585230523070, 0.219691315804433,
                              0.580122380796079, 0.0303479692422574, 0.0826714000443752]),
     AtkinsonTstatistic = Vector([1.71207121030052, 0.277637760733747, 0.836310206125244,
                                  7.97242863136875, 0.255257737275190, 0.410327588921895]),
     tvalue = [1.67236839957606, 0.215194489556356, 1.03253056607013],
     tprobability = [0.193045057943908, 0.843415034784358, 0.377768512636454]]    (10)

Consider now an experiment where quantities x, y, and z influence a quantity w according to the approximate relationship

$$w = a x + b \frac{x^2}{y} + c y z$$

with unknown parameters a, b, and c. Six data points are given by the following matrix, with respective columns for x, y, z, and w.

> ExperimentalData := <<1, 1, 1, 2, 2, 2> | <1, 2, 3, 1, 2, 3> | <1, 2, 3, 4, 5, 6> | <0.531, 0.341, 0.163, 0.641, 0.713, -0.040>>

    ExperimentalData := [ 1  1  1   0.531 ]
                        [ 1  2  2   0.341 ]
                        [ 1  3  3   0.163 ]    (11)
                        [ 2  1  4   0.641 ]
                        [ 2  2  5   0.713 ]
                        [ 2  3  6  -0.040 ]

We can find the fitted model function as follows:

> LinearFit([x, x^2/y, y*z], ExperimentalData, [x, y, z])

    0.823072918385878*x - 0.167910114211606*x^2/y - 0.0758022678386438*y*z    (12)

Alternatively, if we have the input and output data separately, we can use the following calling sequence.

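> Input := ExperimentalData[.., ..3]

    Input := [ 1  1  1 ]
             [ 1  2  2 ]
             [ 1  3  3 ]    (13)
             [ 2  1  4 ]
             [ 2  2  5 ]
             [ 2  3  6 ]

> Output := ExperimentalData[.., 4]

    Output := [  0.531 ]
              [  0.341 ]
              [  0.163 ]    (14)
              [  0.641 ]
              [  0.713 ]
              [ -0.040 ]

> LinearFit([x, x^2/y, y*z], Input, Output, [x, y, z])

    0.823072918385878*x - 0.167910114211606*x^2/y - 0.0758022678386438*y*z    (15)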

We might want to know the residuals and the parameter values instead of just the model function.

> LinearFit([x, x^2/y, y*z], ExperimentalData, [x, y, z], output = [parametervector, residuals])

    [Vector([0.823072918385878, -0.167910114211606, -0.0758022678386438]),
     Vector([-0.0483605363356285, -0.0949087899254999, 0.0781175302268541,
             -0.0302963085707583, 0.160697070037893, -0.0978248634499976])]    (16)

Compatibility

• The XY parameter was introduced in Maple 15.

• For more information on Maple 15 changes, see Updates in Maple 15.

• The falg parameter was introduced in Maple 17.

• For more information on Maple 17 changes, see Updates in Maple 17.

• The Statistics[LinearFit] command was updated in Maple 2016.

• The summarize option was introduced in Maple 2016.

• For more information on Maple 2016 changes, see Updates in Maple 2016.

See Also

CurveFitting

Statistics

Statistics/Computation

Statistics/Fit

Statistics/Regression

Statistics/Regression/InputForms

Statistics/Regression/Options

Statistics/Regression/Solution

 

