Statistics[LinearFit] - fit a linear model function to data

Calling Sequence

LinearFit(flst, X, Y, v, options)

LinearFit(flst, XY, v, options)

LinearFit(falg, X, Y, v, options)

LinearFit(falg, XY, v, options)

LinearFit(fop, X, Y, options)

LinearFit(fop, XY, options)

Parameters

flst - list(algebraic) or Vector(algebraic); component functions in algebraic form

X - Vector or Matrix; values of independent variable(s)

Y - Vector; values of dependent variable

XY - Matrix; values of independent and dependent variables

v - name or list(names); name(s) of independent variables in the component functions

falg - algebraic expression, linear in all its variables except the ones in v; model

fop - list(procedure) or Vector(procedure); component functions in operator form

options - (optional) equation(s) of the form option=value where option is one of output, svdtolerance or weights; specify options for the LinearFit command

Description

• The LinearFit command fits a model function that is linear in the model parameters to data by minimizing the least-squares error. It performs both simple and multiple linear regression. The model function can be given in either of two algebraic forms or in operator form, and the data for the independent and dependent variables can be specified together or separately. For more information about the input forms, see the Input Forms help page.

• Consider the model y = f(x1, x2, ..., xn), where y is the dependent variable and f is the model function of n independent variables x1, x2, ..., xn. This function is a linear combination a1*f1 + a2*f2 + ... + am*fm of component functions fj(x1, x2, ..., xn), for j from 1 to m. Given k data points, where each data point is an (n+1)-tuple of numerical values for (x1, x2, ..., xn, y), the LinearFit command finds values of the model parameters a1, a2, ..., am such that the sum of the k squared residuals is minimized. The ith residual is the value of y - f(x1, x2, ..., xn) evaluated at the ith data point.

• In the first two calling sequences, the first parameter flst is a list or Vector of component functions in algebraic form. Each component is an algebraic expression in the independent variables x1, x2, ..., xn.

• In the second pair of calling sequences, the first parameter falg is an algebraic expression for f(x1, x2, ..., xn), including the parameters a1, a2, ..., am.

• In the last two calling sequences, the first parameter fop is a list or Vector of component functions in operator form. The jth component is a procedure having n input parameters representing the independent variables x1, x2, ..., xn and returning the single value fj(x1, x2, ..., xn).

• The parameter X is a Matrix containing the values of the independent variables. Row i in the Matrix contains the n values for the ith data point, while column j contains all values of the single variable xj. If there is only one independent variable, X can be either a Vector or a k-by-1 Matrix. The parameter Y is a Vector containing the k values of the dependent variable y. The parameter XY is a Matrix consisting of the n columns of X and, as its last column, Y. For X, Y, and XY, one can also use lists or Arrays; for details, see the Input Forms help page.

• The parameter v is a list of the independent variable names used in flst or falg. If there is only one independent variable, then v can be a single name. The order of the names in the list must match exactly the order in which the independent variable values are placed in the columns of X.

• By default, either the model function with the final parameter values or a Vector containing the parameter values is returned, depending on the input form. Additional results, or a solution module that allows you to query for various settings and results, can be obtained with the output option. For more information, see the Statistics/Regression/Solution help page.

• Weights for the data points can be supplied through the weights option.
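
The least-squares criterion described above can be cross-checked against a generic linear least-squares solve. The following is a minimal sketch only: it assumes a single independent variable and the component functions 1, t, and t^2, builds the k-by-m design matrix by hand, and calls LinearAlgebra:-LeastSquares. The names A and a are hypothetical, and nothing here is meant to describe how LinearFit is implemented internally.

```maple
with(Statistics):

# Sample data: k = 6 points, one independent variable.
X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype = float):

# Design matrix (hypothetical name A): entry (i, j) is the jth component
# function, t^(j-1), evaluated at the ith data point.
A := Matrix(6, 3, (i, j) -> X[i]^(j - 1), datatype = float):

# Parameter values a1, a2, a3 that minimize the sum of squared residuals.
a := LinearAlgebra:-LeastSquares(A, Y);

# The same fit through LinearFit, returned as a Vector for comparison.
LinearFit([1, t, t^2], X, Y, t, output = parametervector);
```

The two results should agree up to floating-point rounding, since both minimize the same sum of squared residuals.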

Options

  

The options argument can contain one or more of the options shown below.  These options are described in more detail on the Statistics/Regression/Options help page.

• output = name or string -- Specify the form of the solution. The output option can take as a value the name solutionmodule, or one of the following names (or a list of these names): AtkinsonTstatistic, confidenceintervals, CookDstatistic, degreesoffreedom, externallystandardizedresiduals, internallystandardizedresiduals, leastsquaresfunction, leverages, parametervalues, parametervector, residuals, residualmeansquare, residualstandarddeviation, residualsumofsquares, standarderrors, variancecovariancematrix. For more information, see the Statistics/Regression/Solution help page.

• svdtolerance = realcons(nonnegative) -- Set the tolerance that determines whether a singular-value decomposition is performed.

• weights = Vector -- Provide weights for the data points.
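
As an illustration of how these options are passed, the sketch below supplies a weights Vector and requests more than one result through a list of output names. The weight values and the name W are made up for this example; they are not taken from any data set on this page.

```maple
with(Statistics):

X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype = float):

# Hypothetical weights: the last two observations count twice as much.
W := Vector([1, 1, 1, 1, 2, 2], datatype = float):

# Weighted fit, returning the fitted model function.
LinearFit([1, t, t^2], X, Y, t, weights = W);

# Same weighted fit, returning several results at once.
LinearFit([1, t, t^2], X, Y, t, weights = W,
          output = [parametervector, residualsumofsquares]);
```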

Notes

• The underlying computation is done in floating-point; therefore, all data points must have type realcons and all returned solutions are floating-point, even if the problem is specified with exact values. For more information about numeric computation in the Statistics package, see the Statistics/Computation help page.

• The LinearFit command uses various methods implemented in a built-in library provided by the Numerical Algorithms Group (NAG). Normally, a method using QR decomposition is applied. If it is determined that the system does not have full rank, then a singular-value decomposition (SVD) is performed. The svdtolerance option allows you to specify when an SVD should be performed. See the Statistics/Regression/Options help page for additional details.

• To obtain more details as the least-squares problem is being solved, set infolevel[Statistics] to 1 or higher.
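
The notes above can be tried out with a short session such as the following sketch. The data values are invented for illustration, and the exact wording of the extra messages printed at infolevel 1 is not reproduced here.

```maple
with(Statistics):

# Print additional information while the problem is being solved.
infolevel[Statistics] := 1:

# Exact (integer and rational) input data, given as lists.
# The returned model is still expressed with floating-point coefficients.
LinearFit([1, t], [1, 2, 3, 4], [1/2, 3/2, 2, 7/2], t);

# Turn the extra output off again.
infolevel[Statistics] := 0:
```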

Examples

with(Statistics):

A simple example using the first form for the first argument, flst:

X := Vector([1, 2, 3, 4, 5, 6], datatype = float):

Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype = float):

LinearFit([1, t, t^2], X, Y, t)

1.96000000000000 + 0.164999999999999 t + 0.110714285714286 t^2

(1)

Here is the same example using the second form for the first argument, falg:

LinearFit(a + b*t + c*t^2, X, Y, t)

1.96000000000000 + 0.164999999999999 t + 0.110714285714286 t^2

(2)
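
Because the algebraic input forms return the fitted model as an expression in t, the result can be evaluated at new points with standard Maple tools. A brief sketch follows; the names model and predict and the evaluation point t = 7 are chosen here purely for illustration.

```maple
with(Statistics):

X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype = float):

# Fitted model as an algebraic expression in t.
model := LinearFit(a + b*t + c*t^2, X, Y, t):

# Evaluate the fitted model at a new value of t.
eval(model, t = 7);

# Convert the expression to a procedure for repeated evaluation.
predict := unapply(model, t):
predict(7);
```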

And finally using the third form, fop:

constant_function := t -> 1

constant_function := t → 1

(3)

linear_function := t -> t

linear_function := t → t

(4)

quadratic_function := t -> t^2

quadratic_function := t → t^2

(5)

LinearFit([constant_function, linear_function, quadratic_function], X, Y)

[1.96000000000000, 0.164999999999999, 0.110714285714286]

(6)

Use the output=solutionmodule option to see the full results.

m := LinearFit([1, t, t^2], X, Y, t, output = solutionmodule)

m := module() export Results, Settings; end module

(7)

m:-Results()

residualmeansquare = 0.429238095238095,
residualsumofsquares = 1.28771428571429,
residualstandarddeviation = 0.655162647926525,
degreesoffreedom = 3,
parametervalues = [1.96000000000000, 0.164999999999999, 0.110714285714286],
parametervector = [1.96000000000000, 0.164999999999999, 0.110714285714286],
leastsquaresfunction = 1.96000000000000 + 0.164999999999999 t + 0.110714285714286 t^2,
standarderrors = [1.17199057366598, 0.766748258006800, 0.107226158093964],
confidenceintervals = [-1.76980025750745 .. 5.68980025750746, -2.27513724548285 .. 2.60513724548284, -0.230527496478056 .. 0.451956067906628],
residuals = [-0.235714285714286, 0.267142857142857, 0.548571428571429, -0.891428571428571, 0.247142857142857, 0.0642857142857140],
leverages = [0.821428571428572, 0.307142857142857, 0.371428571428571, 0.371428571428571, 0.307142857142857, 0.821428571428572],
variancecovariancematrix =
    [ 1.37356190476190   -0.837014285714286    0.107309523809524 ]
    [-0.837014285714286   0.587902891156463   -0.0804821428571428]
    [ 0.107309523809524  -0.0804821428571428   0.0114974489795918],
internallystandardizedresiduals = [-0.851394397843296, 0.489860687557646, 1.05610411939691, -1.71616919401998, 0.453186625387555, 0.232198472139079],
externallystandardizedresiduals = [-0.798257317827523, 0.416994351642331, 1.08794530190289, -10.3712309269066, 0.383380972298048, 0.191316224811661],
CookDstatistic = [1.11147104504106, 0.0354585230523070, 0.219691315804433, 0.580122380796079, 0.0303479692422574, 0.0826714000443752],
AtkinsonTstatistic = [1.71207121030052, 0.277637760733747, 0.836310206125244, 7.97242863136875, 0.255257737275190, 0.410327588921895]

(8)

Consider now an experiment where quantities x, y, and z influence a quantity w according to the approximate relationship

w = a*x + b*x^2/y + c*y*z

with unknown parameters a, b, and c. Six data points are given by the following matrix, with respective columns for x, y, z, and w.

ExperimentalData := <<1,1,1,2,2,2>|<1,2,3,1,2,3>|<1,2,3,4,5,6>|<0.531,0.341,0.163,0.641,0.713,-0.040>>

ExperimentalData :=
    [1  1  1   0.531]
    [1  2  2   0.341]
    [1  3  3   0.163]
    [2  1  4   0.641]
    [2  2  5   0.713]
    [2  3  6  -0.040]

(9)

We can find the fitted model function as follows:

LinearFit([x, x^2/y, y*z], ExperimentalData, [x, y, z])

0.823072918385878 x - 0.167910114211606 x^2/y - 0.0758022678386438 y z

(10)

Alternatively, if we have the input and output data separately, we can use the following calling sequence.

Input := ExperimentalData[.., ..3]

Input :=
    [1  1  1]
    [1  2  2]
    [1  3  3]
    [2  1  4]
    [2  2  5]
    [2  3  6]

(11)

Output := ExperimentalData[.., 4]

Output := [0.531, 0.341, 0.163, 0.641, 0.713, -0.040]

(12)

LinearFit([x, x^2/y, y*z], Input, Output, [x, y, z])

0.823072918385878 x - 0.167910114211606 x^2/y - 0.0758022678386438 y z

(13)

We might want to know the residuals and the parameter values instead of just the model function.

LinearFit([x, x^2/y, y*z], ExperimentalData, [x, y, z], output = [parametervector, residuals])

[0.823072918385878, -0.167910114211606, -0.0758022678386438], [-0.0483605363356285, -0.0949087899254999, 0.0781175302268541, -0.0302963085707583, 0.160697070037893, -0.0978248634499976]

(14)

See Also

CurveFitting, Statistics, Statistics/Computation, Statistics/Fit, Statistics/Regression, Statistics/Regression/InputForms, Statistics/Regression/Options, Statistics/Regression/Solution

