Efficient Computation in the Statistics Package
This help page provides details about how numeric computation is performed by commands in the Statistics package. Included are suggestions on how to get the most efficient performance from the solvers.
The Floating-Point Environment
Data Sets
Solving Regression Problems
<Text-field style="Heading 2" layout="Heading 2" bookmark="bkmrk0">The Floating-Point Environment</Text-field>
Most of the Statistics package commands rely on external libraries that perform their computation in floating-point. These libraries include statistics and optimization routines provided by the Numerical Algorithms Group (NAG).
The solvers used by the Statistics package will choose either the hardware floating-point environment or the arbitrary-precision software floating-point environment to perform the computations. To maximize efficiency, the solvers attempt to use hardware floats whenever possible. Software floats are used only when the environment variable UseHardwareFloats is set to 'false', or when UseHardwareFloats is 'deduced' and Digits is greater than the value of evalhf(Digits).
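The settings that drive this choice can be inspected and changed directly in a Maple session. The following is a minimal sketch of that check; the value Digits := 30 is only an illustrative choice of a setting larger than evalhf(Digits) on typical platforms.

```maple
# Inspect the settings that determine which floating-point environment is used.
UseHardwareFloats;     # 'deduced' unless it has been changed
Digits;                # current working precision
evalhf(Digits);        # number of digits available in hardware floats

# With UseHardwareFloats left at 'deduced', raising Digits above
# evalhf(Digits) causes software floats to be used.
Digits := 30:

# Return to a precision at which hardware floats can be used.
Digits := 10:

# Alternatively, force software floats regardless of Digits ...
UseHardwareFloats := false:

# ... and restore the deduced behavior afterwards.
UseHardwareFloats := deduced:
```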
<Text-field style="Heading 2" layout="Heading 2" bookmark="datasets">Data Sets</Text-field>
A data set is generally provided as an Array or a Vector. Most Statistics commands also accept a DataSeries, list, or array in place of an Array or Vector. For greatest efficiency, provide data sets as Vectors or Arrays with datatype float.
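As an illustration, the same data can be passed as a list or as a Vector with datatype float. This is a small sketch with made-up sample values; for data sets this small the difference in speed is not noticeable, and the Vector form matters mainly for large data sets.

```maple
with(Statistics):

# Sample values chosen only for illustration.
L := [1.5, 2.0, 4.5, 3.0]:

# A list is accepted directly.
Mean(L);

# The recommended form: a Vector whose entries are stored as hardware floats.
V := Vector(L, datatype = float):
Mean(V);
```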
Two-dimensional data can often be supplied as a Matrix or DataFrame. In some cases, such as when supplying dependent and independent data for Statistics[Fit], the nature of the command makes two-dimensional data essential. For commands that generally work on one-dimensional data sets, such as Statistics[Mean], supplying a Matrix or DataFrame applies the command to each column of data separately. This usage is referred to as Matrix data sets and is explained on the DataFrames in Statistics help page.
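The column-wise behavior can be seen with a small Matrix. The values below are made up for illustration; each column is treated as its own data set.

```maple
with(Statistics):

# Three observations (rows) of two variables (columns), stored as hardware floats.
M := Matrix([[1.0, 10.0],
             [2.0, 20.0],
             [3.0, 30.0]], datatype = float):

# Mean is applied to each column separately, giving one result per column.
Mean(M);
```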
<Text-field style="Heading 2" layout="Heading 2" bookmark="bkmrk1">Solving Regression Problems</Text-field>
To ensure maximum efficiency, special care must be taken when specifying regression problems. The regression commands, described in the Statistics/Regression help page, allow problems to be specified in one of three forms: algebraic form, operator form, and Matrix form. The algebraic and operator forms are easier to use. However, the Matrix form is most similar to the internal representation used by the solvers and leads to greatest efficiency. The Statistics/Regression/InputForms help page provides a summary of all three forms and links to more information.
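For orientation, here is a sketch of the same straight-line fit in algebraic form and in operator form, using made-up data. The algebraic call follows the usual LinearFit calling sequence; the operator-form call assumes that LinearFit accepts a list of procedures as its component functions, so check the Statistics/Regression/InputForms help page before relying on it. The Matrix form is not shown here.

```maple
with(Statistics):

# Illustrative data, stored as Vectors with datatype float.
X := Vector([1, 2, 3, 4, 5], datatype = float):
Y := Vector([2.1, 3.9, 6.2, 7.8, 10.1], datatype = float):

# Algebraic form: component functions are expressions in the variable t.
LinearFit([1, t], X, Y, t);

# Operator form (assumed calling sequence): component functions are procedures.
LinearFit([t -> 1, t -> t], X, Y);
```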
When using operator or Matrix form with the regression commands, a procedure must be provided. If possible, write the procedure so that it can be evaluated by the evalhf command. Procedures that contain any Maple constructs not supported by evalhf are evaluated using the slower evalf command. For more information on the constructs that evalhf supports, see the evalhf and evalhf/procedure help pages.
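One way to check a procedure before handing it to a regression command is to call it under evalhf yourself. The sketch below uses a hypothetical model procedure f, whose name, parameter order, and argument values are chosen only for illustration. It uses nothing but floating-point arithmetic and a standard mathematical function, which evalhf can handle.

```maple
# Hypothetical model procedure: x is the independent variable,
# a and b are model parameters.
f := proc(x, a, b)
    a*exp(b*x);
end proc:

# If this call returns a float without raising an error, the procedure
# can be evaluated in hardware floats for these inputs.
evalhf(f(1.5, 2.0, 0.3));

# The same call under software floats, for comparison.
evalf(f(1.5, 2.0, 0.3));
```

If the evalhf call raises an error, the procedure uses a construct that evalhf does not support, and the regression commands would evaluate it with evalf instead.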
See Also
Digits, evalhf, evalhf/procedure, Matrix, Statistics, Statistics/Regression, Statistics/Regression/InputForms, Vector