FivePointSummary - Maple Help

Statistics

 FivePointSummary
 compute the five-point summary for a data sample

Calling Sequence

 FivePointSummary(data, options)

Parameters

 data - data set (e.g., a Vector) or Matrix data set
 options - (optional) equation(s) of the form option=value, where option is one of ignore, output, summarize, tableweights, or weights; specify options for the FivePointSummary function

Options

 The options argument can contain one or more of the options shown below. Some of these options are described in more detail in the Statistics[DescriptiveStatistics] help page.
 • ignore : truefalse; This option controls how missing data is handled by the FivePointSummary command. Missing items are represented by undefined or Float(undefined). If ignore=false and data contains missing items, most Statistics commands return undefined. If ignore=true, all missing items in data are ignored. The default value is false.
 • output : default or quantity, where quantity is any of minimum, lowerhinge, median, upperhinge, or maximum; This option indicates which quantities are to be calculated. The value of this option can also be a list of quantities, in which case the FivePointSummary command returns a list of the specified quantities in the specified order.
 • summarize : false or embed; Display an embedded summary table. The default is false.
 • tableweights : list(integer); Relative weights for the column widths of the embedded table. By default, all columns have equal weight.
 • weights : Vector; Vector of data weights. The number of elements in the weights Vector must equal the number of elements in the original data sample. By default, all elements in data are assigned weight $1$.
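
The ignore and weights options described above can be tried on a small sample. The following is a minimal sketch, not taken from the original page: the names B, C, and W and their values are made up for illustration, and no output is shown because it has not been verified against a Maple session.

```maple
with(Statistics):

# Hypothetical sample with one missing entry (undefined).
B := Vector([3.0, 1.0, undefined, 4.0, 2.0]):

# Default (ignore=false): the missing entry is not skipped.
FivePointSummary(B);

# ignore=true: the summary is computed from the four remaining values.
FivePointSummary(B, ignore = true);

# weights: one weight per data element, so W has the same length as C.
C := Vector([1.0, 2.0, 3.0, 4.0, 5.0]):
W := Vector([1.0, 1.0, 2.0, 1.0, 1.0]):
FivePointSummary(C, weights = W);
```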

Description

 • The FivePointSummary command computes the minimum, lower hinge, median, upper hinge and the maximum of a data sample. By default, the FivePointSummary command returns a column vector of equations of the form quantity=value where quantity is one of minimum, lowerhinge, median, upperhinge, or maximum.
 • The first parameter data is a data set (e.g., a Vector) or a Matrix data set.

Computation

 • All computations involving data are performed in floating-point; therefore, all data provided must have type realcons, and all returned values are floating-point, even if the input is specified with exact values.
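
As an illustration of the floating-point behavior, the sketch below passes exact rational data. The Vector E is a made-up sample, not part of the original page, and the output is omitted because it has not been verified.

```maple
with(Statistics):

# Hypothetical exact input: every entry is a rational of type realcons.
E := Vector([1/2, 3/4, 1, 5/4, 3/2]):

# The values on the right-hand sides are returned as floats, not rationals.
FivePointSummary(E);
```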

Examples

 > $\mathrm{with}\left(\mathrm{Statistics}\right):$
 > $X≔\mathrm{RandomVariable}\left(\mathrm{Normal}\left(10,3\right)\right):$
 > $A≔\mathrm{Sample}\left(X,{10}^{4}\right):$

The FivePointSummary command returns a Vector containing the summary statistics.

 > $\mathrm{FivePointSummary}\left(A\right)$
 $\left[\begin{array}{c}\mathrm{minimum}=-1.6116642566301707\\ \mathrm{lowerhinge}=7.915108301856062\\ \mathrm{median}=9.943061233370731\\ \mathrm{upperhinge}=11.916576893244121\\ \mathrm{maximum}=20.78301180186928\end{array}\right]$ (1)
 > $\mathrm{FivePointSummary}\left(A,\mathrm{output}=\left[\mathrm{lowerhinge},\mathrm{upperhinge}\right]\right)$
 $\left[{7.91510830185606}{,}{11.9165768932441}\right]$ (2)
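
Because the default return value is a Vector of equations in the order minimum, lowerhinge, median, upperhinge, maximum, individual values can be extracted with standard Maple tools. The following is a small sketch; the name v is introduced here for illustration, and the numeric results depend on the random sample.

```maple
with(Statistics):
A := Sample(RandomVariable(Normal(10, 3)), 10^4):
v := FivePointSummary(A):

# The third entry is the equation median = value; rhs extracts the value.
rhs(v[3]);

# Alternatively, evaluate the name median against the list of equations.
eval(median, convert(v, list));
```

When only a few quantities are needed, the output option shown above avoids this extraction step.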

Note that the hinges computed by the FivePointSummary command can differ slightly from the quartiles computed by the Quantile command, whereas the medians agree.

 > $\mathrm{FivePointSummary}\left(A,\mathrm{output}=\left[\mathrm{lowerhinge},\mathrm{median},\mathrm{upperhinge}\right]\right)$
 $\left[{7.91510830185606}{,}{9.94306123337073}{,}{11.9165768932441}\right]$ (3)
 > $\mathrm{Quantile}\left(A,\left[0.25,0.5,0.75\right]\right)$
 $\left[{7.91509444943781}{,}{9.94306123337073}{,}{11.9168353743767}\right]$ (4)

Consider the following Matrix data set.

 > $M≔\mathrm{Matrix}\left(\left[\left[3,1130,114694\right],\left[4,1527,127368\right],\left[3,907,88464\right],\left[2,878,96484\right],\left[4,995,128007\right]\right]\right)$
 $\left[\begin{array}{rrr}3& 1130& 114694\\ 4& 1527& 127368\\ 3& 907& 88464\\ 2& 878& 96484\\ 4& 995& 128007\end{array}\right]$ (5)

For Matrix input, the FivePointSummary command returns a Vector whose entries are the summary statistics for the corresponding columns.

 > $\mathrm{results}≔\mathrm{FivePointSummary}\left(M\right)$
 $\left[\begin{array}{ccc}\left[\begin{array}{c}\mathrm{minimum}=2.0\\ \mathrm{lowerhinge}=3.0\\ \mathrm{median}=3.0\\ \mathrm{upperhinge}=4.0\\ \mathrm{maximum}=4.0\end{array}\right]& \left[\begin{array}{c}\mathrm{minimum}=878.0\\ \mathrm{lowerhinge}=907.0\\ \mathrm{median}=995.0\\ \mathrm{upperhinge}=1130.0\\ \mathrm{maximum}=1527.0\end{array}\right]& \left[\begin{array}{c}\mathrm{minimum}=88464.0\\ \mathrm{lowerhinge}=96484.0\\ \mathrm{median}=1.146940\times {10}^{5}\\ \mathrm{upperhinge}=1.273680\times {10}^{5}\\ \mathrm{maximum}=1.280070\times {10}^{5}\end{array}\right]\end{array}\right]$ (6)

To display the summary for one of the columns:

 > $\mathrm{results}\left[1\right]$
 $\left[\begin{array}{c}\mathrm{minimum}=2.0\\ \mathrm{lowerhinge}=3.0\\ \mathrm{median}=3.0\\ \mathrm{upperhinge}=4.0\\ \mathrm{maximum}=4.0\end{array}\right]$ (7)
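
The hinge values in (6) are consistent with Tukey's definition, in which each hinge is the median of one half of the sorted data and the two halves share the middle value when the sample size is odd. This help page does not state which rule FivePointSummary uses internally, so the procedure below is only an illustrative sketch under that assumption. TukeyHinges is a hypothetical helper and is not part of the Statistics package.

```maple
# Hypothetical helper (not part of Statistics): hinges as the medians of the
# lower and upper halves of the sorted data. When the sample size is odd,
# both halves include the middle value.
TukeyHinges := proc(V)
    local S, n, h;
    S := sort(convert(V, list));   # sorted copy of the data as a list
    n := nops(S);
    h := ceil(n/2);                # number of elements in each half
    return [Statistics:-Median(S[1 .. h]), Statistics:-Median(S[n - h + 1 .. n])];
end proc:

# Second column of M from the example above.
TukeyHinges([1130, 1527, 907, 878, 995]);
```

For this input the two halves are [878, 907, 995] and [995, 1130, 1527]. Their medians, 907 and 1130, agree with the lowerhinge and upperhinge reported for the second column in (6).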

If the input is a DataFrame object, then the result is a DataFrame that has the same column labels as the original input, and the row labels correspond to the output quantities requested.

 > $\mathrm{df}≔\mathrm{DataFrame}\left(M,\mathrm{columns}=\left[a,b,c\right]\right)$
 ${\mathrm{DataFrame}}{}\left(\left[\begin{array}{rrr}3& 1130& 114694\\ 4& 1527& 127368\\ 3& 907& 88464\\ 2& 878& 96484\\ 4& 995& 128007\end{array}\right]{,}{\mathrm{rows}}{=}\left[{1}{,}{2}{,}{3}{,}{4}{,}{5}\right]{,}{\mathrm{columns}}{=}\left[{a}{,}{b}{,}{c}\right]\right)$ (8)
 > $\mathrm{df\_results}≔\mathrm{FivePointSummary}\left(\mathrm{df}\right)$
 ${\mathrm{DataFrame}}{}\left(\left[\begin{array}{ccc}2.0& 878.0& 88464.0\\ 3.0& 907.0& 96484.0\\ 3.0& 995.0& 1.146940\times {10}^{5}\\ 4.0& 1130.0& 1.273680\times {10}^{5}\\ 4.0& 1527.0& 1.280070\times {10}^{5}\end{array}\right]{,}{\mathrm{rows}}{=}\left[{\mathrm{minimum}}{,}{\mathrm{lowerhinge}}{,}{\mathrm{median}}{,}{\mathrm{upperhinge}}{,}{\mathrm{maximum}}\right]{,}{\mathrm{columns}}{=}\left[{a}{,}{b}{,}{c}\right]\right)$ (9)
 > $\mathrm{df\_results}\left[b\right]$
 ${\mathrm{DataSeries}}{}\left(\left[\begin{array}{ccccc}878.0& 907.0& 995.0& 1130.0& 1527.0\end{array}\right]{,}{\mathrm{labels}}{=}\left[{\mathrm{minimum}}{,}{\mathrm{lowerhinge}}{,}{\mathrm{median}}{,}{\mathrm{upperhinge}}{,}{\mathrm{maximum}}\right]{,}{\mathrm{datatype}}{=}{\mathrm{anything}}\right)$ (10)

The summarize option makes it possible to display an embedded table containing the results. Note that the embedded table is only for display and that the returned value of the FivePointSummary command is unchanged.

 > $\mathrm{results}≔\mathrm{FivePointSummary}\left(\mathrm{df},\mathrm{summarize}=\mathrm{embed}\right):$

 $\begin{array}{lrrr} & a& b& c\\ \mathrm{minimum}& 2.& 878.& 88464.\\ \mathrm{lowerhinge}& 3.& 907.& 96484.\\ \mathrm{median}& 3.& 995.& 114694.\\ \mathrm{upperhinge}& 4.& 1130.& 127368.\\ \mathrm{maximum}& 4.& 1527.& 128007.\end{array}$

As in the previous example, the returned value is a DataFrame, so indexing it returns the summary for a single column:

 > $\mathrm{results}\left[1\right]$
 ${\mathrm{DataSeries}}{}\left(\left[\begin{array}{ccccc}2.0& 3.0& 3.0& 4.0& 4.0\end{array}\right]{,}{\mathrm{labels}}{=}\left[{\mathrm{minimum}}{,}{\mathrm{lowerhinge}}{,}{\mathrm{median}}{,}{\mathrm{upperhinge}}{,}{\mathrm{maximum}}\right]{,}{\mathrm{datatype}}{=}{\mathrm{anything}}\right)$ (11)

The tableweights option controls the width of columns in an embedded table.

 > $\mathrm{FivePointSummary}\left(\mathrm{df},\mathrm{summarize}=\mathrm{embed},\mathrm{tableweights}=\left[4,2,2,2\right]\right):$

 $\begin{array}{lrrr} & a& b& c\\ \mathrm{minimum}& 2.& 878.& 88464.\\ \mathrm{lowerhinge}& 3.& 907.& 96484.\\ \mathrm{median}& 3.& 995.& 114694.\\ \mathrm{upperhinge}& 4.& 1130.& 127368.\\ \mathrm{maximum}& 4.& 1527.& 128007.\end{array}$

References

 Stuart, Alan, and Ord, Keith. Kendall's Advanced Theory of Statistics. 6th ed. London: Edward Arnold, 1998. Vol. 1: Distribution Theory.

Compatibility

 • The data parameter was updated in Maple 16.
 • The summarize option was introduced in Maple 2016.