Statistics - Maple Programming Help


Statistics

 Biplot
 generate biplots

Calling Sequence

 Biplot(dataset, options, plotoptions)

Parameters

 dataset - data set, DataFrame, or PCA record
 options - (optional) equation(s) of the form option=value where option is one of arrows, arrowlabels, components, dimension, pcbiplot, points, pointlabels, or scale; specify options for generating the biplot
 plotoptions - options to be passed to the plots[display] command

Options

The options argument can contain one or more of the options shown below. All unrecognized options will be passed to the plots[display] command. See plot[options] for details.

 • arrows : truefalse or list; controls the display of arrows corresponding to each principal component. The default is true. If the arrows option is given as a list, the arrows are shown and any elements of the list are passed as plot options to the arrow constructor.
 • arrowlabels : truefalse or list; specifies the labels shown on the arrows corresponding to each column of the data. The default is true. If the dataset is a DataFrame, then the biplot will automatically use the column names from the dataframe as labels. If the dataset is a Matrix, then the arrowlabels must be provided as a list, otherwise no labels are shown. The default arrow labels can be overridden by specifying a list containing the new values.
 • components : list; specifies the principal components used in the biplot. By default, Biplot uses the first two principal components ([1,2]) for 2-D plots and the first three principal components for 3-D plots.
 • dimension : integer; specifies the number of dimensions of the resulting biplot, either 2 or 3. The default is 2.
 • pcbiplot : truefalse; if true, generates what Gabriel (1971) refers to as a "principal component biplot": with lambda = 1, observations are scaled up by $\sqrt{n}$ and variables are scaled down by $\sqrt{n}$.
 • points : truefalse or list; controls the display of points corresponding to the individual rows of the principal components. The default is true. If the points option is given as a list, the points are shown and any elements of the list are passed as plot options to the plot constructor.
 • pointlabels : truefalse or list; controls the display of point labels. The default is false. If the dataset is a DataFrame, the row names from the DataFrame are used. If the dataset is a Matrix, the numbers 1 through $n$ are used, where $n$ is the number of rows of the Matrix. The default point labels can be overridden by specifying a list containing the new values.
 • scale : numeric value between 0 and 1; specifies the scaling exponent: the variables are scaled by ${\mathrm{\lambda }}^{\mathrm{scale}}$ and the observations are scaled by ${\mathrm{\lambda }}^{1-\mathrm{scale}}$, where lambda denotes the singular values computed by the principal component analysis. The default is 1.
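
The scale and pcbiplot options can be read as two ways of splitting the singular values between the point markers and the arrows. The following is a sketch in the usual biplot notation, not a statement of how Biplot is implemented: the symbols $X$, $U$, $V$, $\Lambda$, $G$, $H$, and $s$ are introduced here for illustration, and it is assumed that lambda are the singular values of the centered data matrix and that "lambda = 1" in the pcbiplot description corresponds to scale = 1.

```latex
\begin{aligned}
X &= U \Lambda V^{\mathsf T}, \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_p), \\
G &= U \Lambda^{1-s} \ \text{(observations)}, \qquad
H = V \Lambda^{s} \ \text{(variables)}, \qquad s = \mathrm{scale} \in [0, 1], \\
G H^{\mathsf T} &= U \Lambda^{1-s} \Lambda^{s} V^{\mathsf T} = X, \\
\text{pcbiplot, } s = 1: \quad
G' &= \sqrt{n}\, U, \qquad H' = \tfrac{1}{\sqrt{n}}\, V \Lambda, \qquad
G' H'^{\mathsf T} = X .
\end{aligned}
```

In both cases the product of the two sets of coordinates is unchanged; only the relative lengths of the arrows and the spread of the points differ. When only the selected components are kept, the product is the corresponding low-rank approximation of $X$ rather than $X$ itself.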

Description

 • The Biplot command generates a biplot for the specified set of data. A biplot is a method of data visualization suitable for the results of a principal component analysis.
 • The first parameter, dataset, can be a numeric Matrix or DataFrame with 2 or more columns, or a record generated by a principal component analysis. If dataset is a Matrix or a DataFrame, a principal component analysis is run on the dataset and the results are used for the biplot.
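
As a minimal sketch of the Matrix case described above, the following passes a numeric Matrix directly to Biplot and supplies arrow labels as a list, since a Matrix carries no column names. The data values and the label strings are hypothetical and chosen only for illustration; the call uses only the options documented on this page and has not been checked against a particular Maple version.

```maple
with(Statistics):

# Hypothetical 6 x 3 numeric data: rows are observations, columns are variables.
M := Matrix([[170.0, 65.2, 31.0],
             [182.5, 80.1, 45.0],
             [165.3, 58.7, 27.0],
             [175.8, 72.4, 52.0],
             [160.1, 54.9, 38.0],
             [188.0, 90.3, 29.0]], datatype = float[8]):

# Biplot runs the principal component analysis on M itself.
# arrowlabels must be given as a list for Matrix input;
# pointlabels = true labels the points with the row numbers 1 through 6.
Biplot(M, arrowlabels = ["Height", "Weight", "Age"], pointlabels = true);
```

Passing the Matrix directly is convenient for a quick look; computing the PCA record first, as in the examples below, avoids repeating the analysis when several biplots of the same data are drawn.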

Examples

 > $\mathrm{with}\left(\mathrm{Statistics}\right):$

Generate a biplot for the Iris dataset.

 > $\mathrm{IrisDF}≔\mathrm{Import}\left("datasets/iris.csv",\mathrm{base}=\mathrm{datadir}\right)$
 ${\mathrm{IrisDF}}{≔}\left[\begin{array}{cccccc}{}& {\mathrm{Sepal Length}}& {\mathrm{Sepal Width}}& {\mathrm{Petal Length}}& {\mathrm{Petal Width}}& {\mathrm{Species}}\\ {1}& {5.1}& {3.5}& {1.4}& {0.2}& {"setosa"}\\ {2}& {4.9}& {3}& {1.4}& {0.2}& {"setosa"}\\ {3}& {4.7}& {3.2}& {1.3}& {0.2}& {"setosa"}\\ {4}& {4.6}& {3.1}& {1.5}& {0.2}& {"setosa"}\\ {5}& {5}& {3.6}& {1.4}& {0.2}& {"setosa"}\\ {6}& {5.4}& {3.9}& {1.7}& {0.4}& {"setosa"}\\ {7}& {4.6}& {3.4}& {1.4}& {0.3}& {"setosa"}\\ {8}& {5}& {3.4}& {1.5}& {0.2}& {"setosa"}\\ {9}& {4.4}& {2.9}& {1.4}& {0.2}& {"setosa"}\\ {10}& {4.9}& {3.1}& {1.5}& {0.1}& {"setosa"}\\ {\mathrm{:}}& {\mathrm{:}}& {\mathrm{:}}& {\mathrm{:}}& {\mathrm{:}}& {"150 x 5 DataFrame"}\end{array}\right]$ (1)
 > $\mathrm{pca}≔\mathrm{PCA}\left({\mathrm{IrisDF}}_{\left[\mathrm{Sepal Length},\mathrm{Sepal Width},\mathrm{Petal Length},\mathrm{Petal Width}\right]}\right):$

A biplot shows the first two principal components and the observations on the same diagram. The first principal component is plotted on the x-axis and the second on the y-axis.

 > $\mathrm{Biplot}\left(\mathrm{pca},\mathrm{size}=\left[600,"golden"\right]\right)$

The biplot shows that petal width and petal length are highly correlated and that their variability is attributed primarily to the first component. The first component also explains a large part of the variability in sepal length, whereas the variability in sepal width is attributed more to the second component.

It is also possible to generate a biplot displaying other principal components using the components option. For example, here is a plot of the third and fourth principal components:

 > $\mathrm{Biplot}\left(\mathrm{pca},\mathrm{components}=\left[3..4\right],\mathrm{scale}=0.5\right)$

To view the first three components, use the dimension option. In the following command, the colorscheme option colors the points according to the levels in the "Species" column.

 > $\mathrm{Biplot}\left(\mathrm{pca},\mathrm{dimension}=3,\mathrm{points}=\left[\mathrm{colorscheme}=\left["valuesplit",{\mathrm{IrisDF}}_{\mathrm{Species}}\right]\right],\mathrm{lightmodel}=\mathrm{none},\mathrm{orientation}=\left[-50,50,0\right]\right)$

The canada_crimes.csv dataset contains information on types of crimes committed per 100000 people:

 > $\mathrm{CCdata}≔\mathrm{Import}\left("datasets/canada_crimes.csv",\mathrm{base}=\mathrm{datadir}\right)$
 ${\mathrm{CCdata}}{≔}\left[\begin{array}{cccccc}{}& {\mathrm{Violent Crime}}& {\mathrm{Property Crime}}& {\mathrm{Other Criminal Code}}& {\mathrm{Criminal Code Traffic}}& {\mathrm{Federal Statute}}\\ {\mathrm{Newfoundland and Labrador}}& {1276.15}& {3317.03}& {1010.67}& {348.97}& {267.94}\\ {\mathrm{Prince Edward Island}}& {824.43}& {3294.3}& {572.18}& {348.64}& {215.34}\\ {\mathrm{Nova Scotia}}& {1241.05}& {3307.85}& {902.76}& {368.42}& {375.11}\\ {\mathrm{New Brunswick}}& {1164.32}& {2611.17}& {712.02}& {298.71}& {283.45}\\ {\mathrm{Quebec}}& {940.52}& {2100.84}& {450.29}& {511.18}& {314.74}\\ {\mathrm{Ontario}}& {786.62}& {2292.66}& {476.48}& {211.57}& {258.15}\\ {\mathrm{Manitoba}}& {1712.97}& {4311.48}& {1689.72}& {276.28}& {362.78}\\ {\mathrm{Saskatchewan}}& {1963.46}& {5627.55}& {2913.78}& {886.34}& {692.9}\\ {\mathrm{Alberta}}& {1243.83}& {4308.67}& {1497.54}& {466.12}& {371.43}\\ {\mathrm{British Columbia}}& {1148.42}& {4886.1}& {1564.03}& {350.85}& {682.27}\\ {\mathrm{:}}& {\mathrm{:}}& {\mathrm{:}}& {\mathrm{:}}& {\mathrm{:}}& {"13 x 5 DataFrame"}\end{array}\right]$ (2)

The pointlabels option controls whether the points in the biplot are labeled. Additional options such as axes or size are passed to the plots:-display command.

 > $\mathrm{Biplot}\left(\mathrm{PCA}\left(\mathrm{CCdata},\mathrm{scale}=\mathrm{true}\right),\mathrm{points}=\mathrm{false},\mathrm{pointlabels}=\mathrm{true},\mathrm{arrows}=\left[\mathrm{color}="Crimson"\right],\mathrm{axes}=\mathrm{normal},\mathrm{size}=\left[800,"golden"\right],\mathrm{view}=\left[-1..1,-0.5..0.5\right]\right)$

References

 Gabriel, K.R. (1971). The biplot graphical display of matrices with applications to principal component analysis. Biometrika, 58, 453-467.

Compatibility

 • The Statistics[Biplot] command was introduced in Maple 2016.