DataSets[Builtin] - Maple Programming Help

DataSets[Builtin]

 Reference
 create a reference to built-in data

Calling Sequence

 Reference(id)
 DataSets:-Reference("Builtin", id)
 ref := Reference(id)
 ref[rows]
 ref[rows, columns]
 ref[boolexps]
 ref[boolexps, columns]

Parameters

 id - string; identifier of the data set
 rows - row index or list of row indices, where each row index is a string, range, or integer
 columns - column index or list of column indices, where each column index is a string, name, range, or integer
 boolexps - boolean expression or list of boolean expressions

Description

 • The Reference command creates a reference to a built-in data set (that is, a country or city data set that comes with Maple).
 • The reference object retrieves data only when necessary, and caches data retrieved from non-local sources.
 • Reference objects can also be created without this command, by using DataSets[Search] or DataSets[InsertSearchBox].
 • Time series data, including Quandl references, inside a built-in data reference will be displayed in a condensed form, showing only that it is a time series and its newest value. Other data types are displayed as normal.

Indexing

 • A reference object ref can be indexed to get a new reference representing a subset of the data, or just one entry in the data set.
 – A single entry from the data set is returned if and only if the row index is a string or integer, and the column index is a string, name, or integer.
 – Every other calling sequence will return a reference object.
 • Row indices are strings, integers, or ranges.
 – A string will select the row whose name equals that string. Row names are always unique within a data set.
 – Integers and ranges work similarly to indexing a Matrix: positive integers or endpoints count forward from the first row, and negative integers or endpoints count backward from the last row.
 – Missing start and end endpoints in a range will be interpreted as $1$ and $-1$, respectively.
 – The ordering of the rows is not guaranteed, so integers or ranges may not always select the same rows, even if two references contain the same set of rows.
 – Multiple row indices may be combined in a list to select the union of the rows selected by the individual indices. Note that selecting the same row more than once results in an error.
 • Column indices are strings, names, integers, or ranges.
 – A string or name will select the column whose name equals that string or name converted to a string. Column names are always unique within a data set.
 – The name column will remain the first column even if it does not appear among the indices. If name exists among the indices, it must be the first one.
 – Integers and ranges start from the second column, but otherwise work the same way as row indices. The name column cannot be selected with integers or ranges.
 – The ordering of the columns is not guaranteed, except that the row name column is always the first column.
 – Multiple column indices may also be combined in the same way as row indices, and selecting the same column more than once will result in an error.
 • A second method to select rows is to filter by the values in the rows with a boolean expression or a list of boolean expressions.
 – The indeterminates in each boolean expression must be the names of the columns, as Maple names.
 – For each row, each indeterminate is replaced by the value in the corresponding column of that row. If the boolean expression, or the conjunction of the list of boolean expressions, is satisfied, then that row is retained. Otherwise, that row is removed.
 – The newest value will be used for time series data. Other data are used normally.
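
The row and column rules above can be tried on the "Country" data set used in the Examples section. The following is a minimal sketch rather than output from a verified session: it uses only the indexing forms described above, and the rows actually selected by integers or ranges may vary because row order is not guaranteed.

```maple
with(DataSets):
ref := Reference("Builtin", "Country"):

# Negative endpoints count back from the last row: the last three rows,
# with the first two data columns (column integers skip the row name column).
ref[-3 .. -1, 1 .. 2];

# A missing start endpoint is read as 1: rows 1 through 3.
ref[.. 3, "CPI Change"];

# A missing end endpoint is read as -1: from row 2 through the last row.
ref[2 .., ["CPI Change", "CO2 Emissions"]];

# A list of row indices selects the union of the rows. The indices must not
# overlap, because selecting the same row more than once is an error.
ref[["Canada", "Albania"], 1];
```

Selecting rows by name, as in the last call, avoids depending on the row order.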

Examples

 > $\mathrm{with}\left(\mathrm{DataSets}\right):$

Create a reference to a data set with all the countries in the world.

 > $\mathrm{ref}≔\mathrm{Reference}\left("Builtin","Country"\right)$
 ${\mathrm{ref}}{≔}\left[\begin{array}{ccccc}{\mathrm{Country \left(Name\right)}}& {\mathrm{CO2 Emissions}}& {\mathrm{CPI Change}}& {\dots }& {\mathrm{\left(124 more\right)}}\\ {\mathrm{Afghanistan}}& {\mathrm{TimeSeries 12251.447}}& {\mathrm{TimeSeries 7.6476076230378}}& {\dots }& {}\\ {\mathrm{Albania}}& {\mathrm{TimeSeries 4668.091}}& {\mathrm{TimeSeries 1.6285401210224}}& {\dots }& {}\\ {\mathrm{Algeria}}& {\mathrm{TimeSeries 121755.401}}& {\mathrm{TimeSeries 2.9164064126589}}& {\dots }& {}\\ {⋮}& {⋮}& {⋮}& {\ddots }& {}\\ {\mathrm{\left(182 more\right)}}& {}& {}& {}& {}\end{array}\right]$ (1)

Create the same reference with a different calling sequence.

 > $\mathrm{Builtin}:-\mathrm{Reference}\left("Country"\right)$
 $\left[\begin{array}{ccccc}{\mathrm{Country \left(Name\right)}}& {\mathrm{CO2 Emissions}}& {\mathrm{CPI Change}}& {\dots }& {\mathrm{\left(124 more\right)}}\\ {\mathrm{Afghanistan}}& {\mathrm{TimeSeries 12251.447}}& {\mathrm{TimeSeries 7.6476076230378}}& {\dots }& {}\\ {\mathrm{Albania}}& {\mathrm{TimeSeries 4668.091}}& {\mathrm{TimeSeries 1.6285401210224}}& {\dots }& {}\\ {\mathrm{Algeria}}& {\mathrm{TimeSeries 121755.401}}& {\mathrm{TimeSeries 2.9164064126589}}& {\dots }& {}\\ {⋮}& {⋮}& {⋮}& {\ddots }& {}\\ {\mathrm{\left(182 more\right)}}& {}& {}& {}& {}\end{array}\right]$ (2)

Select a subset of the data with the index operator.

 > ${\mathrm{ref}}_{2..4,2..4}$
 $\left[\begin{array}{cccc}{\mathrm{Country \left(Name\right)}}& {\mathrm{CPI Change}}& {\mathrm{Child Mortality}}& {\mathrm{Currency Code}}\\ {\mathrm{Albania}}& {\mathrm{TimeSeries 1.6285401210224}}& {\mathrm{TimeSeries 14.9}}& {"ALL"}\\ {\mathrm{Algeria}}& {\mathrm{TimeSeries 2.9164064126589}}& {\mathrm{TimeSeries 25.2}}& {"DZD"}\\ {\mathrm{Angola}}& {\mathrm{TimeSeries 7.2795615410538}}& {\mathrm{TimeSeries 167.4}}& {"AOA"}\end{array}\right]$ (3)
 > ${\mathrm{ref}}_{"Canada",\left[1,3\right]}$
 $\left[\begin{array}{ccc}{\mathrm{Country \left(Name\right)}}& {\mathrm{CO2 Emissions}}& {\mathrm{Child Mortality}}\\ {\mathrm{Canada}}& {\mathrm{TimeSeries 485463.129}}& {\mathrm{TimeSeries 5.2}}\end{array}\right]$ (4)
 > ${\mathrm{ref}}_{\left["USA",5\right],\left[1,3..4\right]}$
 $\left[\begin{array}{cccc}{\mathrm{Country \left(Name\right)}}& {\mathrm{CO2 Emissions}}& {\mathrm{Child Mortality}}& {\mathrm{Currency Code}}\\ {\mathrm{USA}}& {\mathrm{TimeSeries 5305569.614}}& {\mathrm{TimeSeries 6.9}}& {\mathrm{undefined}}\\ {\mathrm{Antigua and Barbuda}}& {\mathrm{TimeSeries 513.38}}& {\mathrm{TimeSeries 9.3}}& {"XCD"}\end{array}\right]$ (5)

Select just one entry in the data set.

 > ${\mathrm{ref}}_{"Canada",\mathrm{CPI Change}}$
 $\left[\begin{array}{c}{\mathrm{Data set}}\\ {\mathrm{Canada: Inflation, consumer prices \left(annual \%\right)}}\\ {\mathrm{Quandl WORLDBANK/CAN\_FP\_CPI\_TOTL\_ZG}}\\ {\mathrm{up to 54 rows \left(annual\right), 1 column}}\\ {\mathrm{1961-12-31 - 2014-12-31}}\end{array}\right]$ (6)
 > ${\mathrm{ref}}_{2,4}$
 ${"ALL"}$ (7)
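
By the rule in the Indexing section, a single entry is returned only when both indices are single values, so wrapping the column index in a list should give a reference instead. A minimal sketch, assuming the same "Country" data set; no output is shown because it was not taken from a verified session, and the names entry and subref are chosen here for illustration.

```maple
with(DataSets):
ref := Reference("Builtin", "Country"):

# Row index is a string and column index is a string: a single entry is returned.
entry := ref["Canada", "CPI Change"];

# The column index is now a list, so a reference to one row and one column
# is returned rather than the entry itself.
subref := ref["Canada", ["CPI Change"]];
```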

Convert to a Matrix.

 > $\mathrm{convert}\left({\mathrm{ref}}_{"Canada",\left[1,3\right]},'\mathrm{Matrix}'\right)$
 $\left[\begin{array}{ccc}{"Canada"}& \left[\begin{array}{c}{\mathrm{Data set}}\\ {\mathrm{Canada: CO2 emissions \left(kt\right)}}\\ {\mathrm{Quandl WORLDBANK/CAN\_EN\_ATM\_CO2E\_KT}}\\ {\mathrm{up to 51 rows \left(annual\right), 1 column}}\\ {\mathrm{1960-12-31 - 2010-12-31}}\end{array}\right]& \left[\begin{array}{c}{\mathrm{Data set}}\\ {\mathrm{Canada: Mortality rate, under-5 \left(per 1,000 live births\right)}}\\ {\mathrm{Quandl WORLDBANK/CAN\_SH\_DYN\_MORT}}\\ {\mathrm{up to 54 rows \left(annual\right), 1 column}}\\ {\mathrm{1960-12-31 - 2013-12-31}}\end{array}\right]\end{array}\right]$ (8)

Filter rows with boolean expressions.

 > $\mathrm{ref2}≔{\mathrm{ref}}_{1..5,\left["CPI Change","CO2 Emissions"\right]}$
 ${\mathrm{ref2}}{≔}\left[\begin{array}{ccc}{\mathrm{Country \left(Name\right)}}& {\mathrm{CPI Change}}& {\mathrm{CO2 Emissions}}\\ {\mathrm{Afghanistan}}& {\mathrm{TimeSeries 7.6476076230378}}& {\mathrm{TimeSeries 12251.447}}\\ {\mathrm{Albania}}& {\mathrm{TimeSeries 1.6285401210224}}& {\mathrm{TimeSeries 4668.091}}\\ {\mathrm{Algeria}}& {\mathrm{TimeSeries 2.9164064126589}}& {\mathrm{TimeSeries 121755.401}}\\ {⋮}& {⋮}& {⋮}\\ {\mathrm{\left(2 more\right)}}& {}& {}\end{array}\right]$ (9)
 > ${\mathrm{ref2}}_{5000<\mathrm{CO2 Emissions}}$
 $\left[\begin{array}{ccc}{\mathrm{Country \left(Name\right)}}& {\mathrm{CPI Change}}& {\mathrm{CO2 Emissions}}\\ {\mathrm{Afghanistan}}& {\mathrm{TimeSeries 7.6476076230378}}& {\mathrm{TimeSeries 12251.447}}\\ {\mathrm{Algeria}}& {\mathrm{TimeSeries 2.9164064126589}}& {\mathrm{TimeSeries 121755.401}}\\ {\mathrm{Angola}}& {\mathrm{TimeSeries 7.2795615410538}}& {\mathrm{TimeSeries 29710.034}}\end{array}\right]$ (10)
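
A list of boolean expressions keeps only the rows that satisfy every expression in the list. The following is a hedged sketch, not output from a verified session: the thresholds 5000 and 5 are arbitrary illustration values, and column names that contain spaces are written as back-quoted Maple names.

```maple
with(DataSets):
ref := Reference("Builtin", "Country"):
ref2 := ref[1 .. 5, ["CPI Change", "CO2 Emissions"]]:

# Keep rows whose newest CO2 Emissions value exceeds 5000 and whose newest
# CPI Change value is below 5 (the list is treated as a conjunction).
ref2[[5000 < `CO2 Emissions`, `CPI Change` < 5]];

# A filter can be combined with a column selection, using the
# ref[boolexps, columns] calling sequence.
ref2[5000 < `CO2 Emissions`, "CPI Change"];
```

For time series columns such as these, the comparison uses the newest value of each series, as described in the Indexing section.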

Compatibility

 • The DataSets[Builtin][Reference] command was introduced in Maple 2015.