TimeSeriesAnalysis - Maple Programming Help

TimeSeriesAnalysis

 Join
 Join time series together

Calling Sequence

 Join(ts1, ts2, ..., opts)
 $\mathrm{ts1}\cup \mathrm{ts2}\cup \mathrm{...}$
 ts1 union ts2 union ...

Parameters

 ts1, ts2, ... - TimeSeries data structures
 opts - (optional) equations of the form keyword = value

Description

 • The Join command takes multiple TimeSeries data structures and stores them in a single, new, TimeSeries data structure.
 • The Join command can also be invoked in a few other ways:

 Command (2D input)    Command (1D input)    Is equivalent to
 $\mathrm{ts1}\cup \mathrm{ts2}\cup \mathrm{ts3}$    ts1 union ts2 union ts3    Join(ts1, ts2, ts3)
 $⟨\mathrm{ts1},\mathrm{ts2},\mathrm{ts3}⟩$    <ts1, ts2, ts3>    Join(ts1, ts2, ts3)
 $⟨\mathrm{ts1}|\mathrm{ts2}|\mathrm{ts3}⟩$    <ts1 | ts2 | ts3>    Join(ts1, ts2, ts3, mergedatasets = false)

 These alternative calling sequences provide no new functionality, just notational convenience.
 • All calling sequences described above accept any number of time series to be joined together. The command first determines what the data sets in the new time series object are going to be (see the discussion of the mergedatasets option, below). It then determines what times will be present in the new time series (see the discussion of the regulardates option, below). Finally, a Matrix encompassing all data is constructed, which is turned into a TimeSeries data structure.

Options

 • mergedatasets = false, true, or force
 This option determines which data sets occur in the new time series object. If mergedatasets = false is passed, then every data set in the input time series will be a separate data set in the resulting time series. If mergedatasets = true (the default), then all data sets that have equal names (given by the headers option; see GetHeaders) will be merged. If mergedatasets = force is given, then the first data sets of all time series will be merged together, as will all second data sets (if present), and so on.
 If some data sets are merged, then the resulting data set gets the name of the data set occurring first in the calling sequence. If there are multiple values for one time stamp for the same data set, then the valuetolerance option determines what happens.
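 The three settings can be pictured with a small model. The following Python sketch is not Maple's implementation; the function name group_data_sets and its data layout are hypothetical and only mirror the rules stated above (separate data sets, merge by header name, or merge by position, with the first occurrence supplying the name).

```python
def group_data_sets(series_headers, mergedatasets=True):
    """Group input data sets the way the mergedatasets option is described.

    series_headers: one list of header names per input time series.
    mergedatasets: False, True, or "force".
    Returns a list of (name, [(series_index, column_index), ...]) in output order.
    """
    groups = []   # output data sets, in order of first occurrence
    index = {}    # grouping key -> position in groups
    for s, headers in enumerate(series_headers):
        for c, name in enumerate(headers):
            if mergedatasets is False:
                key = (s, c)      # every data set stays separate
            elif mergedatasets == "force":
                key = c           # merge by position within each series
            else:
                key = name        # merge data sets with equal names
            if key not in index:
                index[key] = len(groups)
                groups.append((name, []))   # first occurrence supplies the name
            groups[index[key]][1].append((s, c))
    return groups


# Headers of ts1 and ts2 from the Examples section.
headers = [["A", "B"], ["B"]]
by_name = group_data_sets(headers, True)
by_position = group_data_sets(headers, "force")
separate = group_data_sets(headers, False)
```

 In this model, merging by name attaches the single data set of the second series to B, while force attaches it to A. The sketch does not model how Maple renames duplicate headers when data sets are kept separate.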
 • valuetolerance = nonnegative real number or infinity
 Whenever data sets are merged, there can be a range of dates where more than one data set has a value. If this is the case and the absolute difference between the values exceeds valuetolerance, then an error is issued. Otherwise, values from data sets in earlier arguments (further to the left in the calling sequence) are overwritten by values from data sets in later arguments (further to the right in the calling sequence). The default value is 0, meaning that values differing in any way yield an error. If infinity is specified for this option, values are always overwritten without raising an error.
 When one of the sets has missing data for a given date, it is not considered for this process.
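 A minimal Python sketch of this rule follows. It is a simplified model and not Maple's code: the function merge_values is hypothetical, the two lists are assumed to be already aligned on the same time stamps, and None stands for a missing value.

```python
import math


def merge_values(earlier, later, valuetolerance=0.0):
    """Merge two aligned value lists as described for valuetolerance.

    earlier: values from the data set further left in the calling sequence.
    later:   values from the data set further right.
    None marks a missing value, which is skipped in the comparison.
    """
    merged = []
    for a, b in zip(earlier, later):
        if a is None:
            merged.append(b)
        elif b is None:
            merged.append(a)
        elif abs(a - b) > valuetolerance:
            raise ValueError(f"conflicting values {a} and {b}")
        else:
            merged.append(b)   # the later argument overwrites the earlier one
    return merged


# Overlap of data set B in ts1 and ts2 (September and October 2011),
# followed by a date where only ts2 has a value.
b_ts1 = [17.7, 19.6, None]
b_ts2 = [17.8, 19.9, 21.2]

merged_half = merge_values(b_ts1, b_ts2, valuetolerance=0.5)
merged_any = merge_values(b_ts1, b_ts2, valuetolerance=math.inf)
```

 With the default tolerance of 0, the same call raises an error in this model, because 17.7 and 17.8 differ.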
 • regulardates = true or false
 If the dates of two time series do not match up exactly, then Maple needs to make a choice: either include multiple slightly different dates, or move values from one or more time series to a slightly different date. If regulardates = true (the default), then Maple tries to determine a set of time stamps that are mostly regular, that is, they have intervals of similar length between them. In particular, it finds the time series with the shortest intervals on average and tries to make all time series line up around multiples of that interval. This means, for example, that if monthly data is joined to quarterly data, monthly data points will always be inserted in between the quarterly data, even if the monthly data has a more limited range. If the time stamps do not exactly line up, the datetolerance option, explained below, determines what difference between them is accepted.
 If regulardates = false, then Maple simply tries to find relatively few time stamps that occur near the time stamps in the given time series. (How near is determined by the datetolerance option below.) This can be appropriate if you need to join weekly and monthly data, for example. Note, however, that most commands in the TimeSeriesAnalysis package assume that dates are regular.
 • datetolerance = nonnegative real number
 This option, together with the regulardates option, determines the time stamps of the time series to be constructed. For example, if one time series has time stamps on the last day of every month, and another has time stamps on the 28th of every month, one could include both dates and either insert alternating missing values in between or interpolate. It is more likely, however, that, say, January 28th and January 31st can be considered the same time point. The datetolerance option determines when dates are merged. It works as follows: given a sequence ${t}_{i}$ of dates, Maple determines a time interval ${t}_{i}-\mathrm{\delta }..{t}_{i}+\mathrm{\delta }$ for every date ${t}_{i}$, where $\mathrm{\delta }$ is typically equal to $\frac{\left({t}_{i+1}-{t}_{i-1}\right)\mathrm{datetolerance}}{2}$. It then ensures that for every ${t}_{i}$, there is a corresponding time stamp ${T}_{k}$ in the new time series such that ${T}_{k}$ is in the range ${t}_{i}-\mathrm{\delta }..{t}_{i}+\mathrm{\delta }$. If regulardates is true, then an error is raised if this is not possible within the constraints given. If regulardates is false, then extra time stamps are simply inserted.
 Setting datetolerance=0 means different time stamps are never merged. If regulardates is false, then setting datetolerance equal to 1 or higher means that multiple dates from the same time series will have overlapping dates and is not recommended. (With time intervals that are not equally long, such as months, this is even possible for values of datetolerance that are a little less than 1.) If regulardates is true, this cannot happen. The default value is $0.05$.
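 To get a feel for the size of these intervals, the following Python sketch evaluates the typical formula for the tolerance given above on month-end dates. It is only an illustration under stated assumptions: the helpers tolerance_window and within are hypothetical, dates are measured in whole days, and the treatment of the first and last date of a series (which lack a neighbor) is not covered.

```python
from datetime import date


def tolerance_window(prev, cur, nxt, datetolerance=0.05):
    """Return (low, high), in day ordinals, of the interval around cur.

    Uses delta = (t[i+1] - t[i-1]) * datetolerance / 2, the typical value
    quoted in the description of the datetolerance option.
    """
    delta = (nxt.toordinal() - prev.toordinal()) * datetolerance / 2
    t = cur.toordinal()
    return t - delta, t + delta


def within(window, d):
    """True if date d falls inside the (low, high) window."""
    low, high = window
    return low <= d.toordinal() <= high


# Month-end time stamps around October 2011; the neighbors are 61 days apart,
# so the default tolerance gives a window of about 1.5 days on either side.
default_window = tolerance_window(date(2011, 9, 30), date(2011, 10, 31), date(2011, 11, 30))
wider_window = tolerance_window(date(2011, 9, 30), date(2011, 10, 31), date(2011, 11, 30),
                                datetolerance=0.1)

oct30_default = within(default_window, date(2011, 10, 30))
oct28_default = within(default_window, date(2011, 10, 28))
oct28_wider = within(wider_window, date(2011, 10, 28))
```

 Under this reading of the formula, a date one day away from a month-end time stamp falls inside the default window, while a date three days away would need a datetolerance of roughly 0.1 or more.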
 • interpolate = none, nearest, or linear
 If dates are inserted in between values for a time series, then by default the result will not have values coming from that time series. This option offers the opportunity to insert values obtained by interpolation.
 More precisely, suppose a data set has values ${x}_{k}$ and ${x}_{k+1}$ for dates ${t}_{k}$ and ${t}_{k+1}$. These are mapped to times ${T}_{m}$ and ${T}_{m+n}$, with $1<n$. For times ${T}_{m+i}$ with $0<i$ and $i<n$, the default option, interpolate = none, will not supply any values for the resulting data set. If the option interpolate = nearest is supplied, then the resulting data set's value for ${T}_{m+i}$ will get the value for ${t}_{k}$ if $i<\frac{n}{2}$ and the value for ${t}_{k+1}$ otherwise. If the option interpolate = linear is supplied, then the value will be determined by linear interpolation: the value for time ${T}_{m+i}$ is ${x}_{k}+\frac{i\left({x}_{k+1}-{x}_{k}\right)}{n}$.
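 The three interpolation rules can be written out directly. The Python sketch below is a hypothetical helper (fill_between is not part of Maple) that returns the values assigned to the inserted time stamps between two original values, using None for a missing value.

```python
def fill_between(x_k, x_k1, n, interpolate="none"):
    """Values for the n - 1 time stamps inserted between x_k and x_k1.

    x_k, x_k1: the original values at the two surrounding dates.
    n:         number of steps between them in the new time series (n > 1).
    """
    filled = []
    for i in range(1, n):
        if interpolate == "none":
            filled.append(None)                        # leave the value missing
        elif interpolate == "nearest":
            filled.append(x_k if i < n / 2 else x_k1)  # the midpoint goes to x_k1
        elif interpolate == "linear":
            filled.append(x_k + i * (x_k1 - x_k) / n)
        else:
            raise ValueError(f"unknown interpolate value: {interpolate!r}")
    return filled


# Two original values 4.0 and 8.0 with three inserted time stamps (n = 4).
none_values = fill_between(4.0, 8.0, 4)
nearest_values = fill_between(4.0, 8.0, 4, "nearest")
linear_values = fill_between(4.0, 8.0, 4, "linear")
```

 Note that with nearest, the exact midpoint (i equal to n/2) takes the later value, since the rule uses a strict inequality.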

Examples

 > $\mathrm{with}\left(\mathrm{TimeSeriesAnalysis}\right):$

Consider the following time series.

 > $\mathrm{ts1}≔\mathrm{TimeSeries}\left(\left[\left[5.2,4.8,2.9,3.9,4.3,4.3,3.1\right],\left[16.3,19.1,15.6,18.2,\mathrm{undefined},17.7,19.6\right]\right],\mathrm{headers}=\left["A","B"\right],\mathrm{frequency}="monthly",\mathrm{enddate}="2011-10-30"\right)$
 ${\mathrm{ts1}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {\mathrm{A, B}}\\ {\mathrm{7 rows of data:}}\\ {\mathrm{2011-04-30 - 2011-10-30}}\end{array}\right]$ (1)
 > $\mathrm{ts2}≔\mathrm{TimeSeries}\left(\left[17.8,19.9,21.2,22.5,23.9\right],\mathrm{header}="B",\mathrm{enddate}="2012-01-31",\mathrm{frequency}="monthly"\right)$
 ${\mathrm{ts2}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {B}\\ {\mathrm{5 rows of data:}}\\ {\mathrm{2011-09-30 - 2012-01-31}}\end{array}\right]$ (2)

This is what the data from the time series looks like.

 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts1},\mathrm{ts2}\right)$

Trying to merge the data will not work without specifying some options, because there are different values for the same time.

 > $\mathrm{ts3}≔\mathrm{Join}\left(\mathrm{ts1},\mathrm{ts2}\right)$

By specifying the value tolerance, we can complete the join successfully.

 > $\mathrm{ts3}≔\mathrm{Join}\left(\mathrm{ts1},\mathrm{ts2},\mathrm{valuetolerance}=0.5\right)$
 ${\mathrm{ts3}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {\mathrm{A, B}}\\ {\mathrm{10 rows of data:}}\\ {\mathrm{2011-04-30 - 2012-01-31}}\end{array}\right]$ (3)
 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts3}\right)$

The dates of the two time series are not exactly the same: $\mathrm{ts2}$ always has a data point on the last day of the month, and $\mathrm{ts1}$ on the 30th. (In this case, there is no data for February, but if there were, it would be on the last day of February.) We get a single data point per month only because the date tolerance is nonzero. If we set the date tolerance to 0 without specifying $\mathrm{regulardates}=\mathrm{false}$, we get an error:

 > $\mathrm{ts4}≔\mathrm{Join}\left(\mathrm{ts1},\mathrm{ts2},\mathrm{datetolerance}=0,\mathrm{valuetolerance}=0.5\right)$

If we do specify $\mathrm{regulardates}=\mathrm{false}$, we will get separate points for both input time series at the end of October, but the date in September will be the 30th for both. We specify the time series in the other order, so that the value from $\mathrm{ts1}$ overwrites the one from $\mathrm{ts2}$ for September 30th.

 > $\mathrm{ts4}≔\mathrm{Join}\left(\mathrm{ts2},\mathrm{ts1},\mathrm{valuetolerance}=0.5,\mathrm{datetolerance}=0,\mathrm{regulardates}=\mathrm{false}\right)$
 ${\mathrm{ts4}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {\mathrm{B, A}}\\ {\mathrm{11 rows of data:}}\\ {\mathrm{2011-04-30 - 2012-01-31}}\end{array}\right]$ (4)
 > $\mathrm{GetDates}\left(\mathrm{ts4}\right)$
 $\left[\begin{array}{c}{"2011-04-30"}\\ {"2011-05-30"}\\ {"2011-06-30"}\\ {"2011-07-30"}\\ {"2011-08-30"}\\ {"2011-09-30"}\\ {"2011-10-30"}\\ {"2011-10-31"}\\ {"2011-11-30"}\\ {"2011-12-31"}\\ {\vdots}\\ {\mathrm{11 element Vector[column]}}\end{array}\right]$ (5)
 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts4}\right)$

We can force merging the first data set in each of $\mathrm{ts1}$ and $\mathrm{ts2}$.

 > $\mathrm{ts5}≔\mathrm{Join}\left(\mathrm{ts1},\mathrm{ts2},\mathrm{mergedatasets}=\mathrm{force},\mathrm{valuetolerance}=\mathrm{∞}\right)$
 ${\mathrm{ts5}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {\mathrm{A, B}}\\ {\mathrm{10 rows of data:}}\\ {\mathrm{2011-04-30 - 2012-01-31}}\end{array}\right]$ (6)
 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts5}\right)$

We can also force viewing the data sets as separate. This can be done by including the mergedatasets = false option, or using the following calling sequence:

 > $\mathrm{ts6}≔⟨\mathrm{ts1}|\mathrm{ts2}⟩$
 ${\mathrm{ts6}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {\mathrm{A, B, B 1}}\\ {\mathrm{10 rows of data:}}\\ {\mathrm{2011-04-30 - 2012-01-31}}\end{array}\right]$ (7)
 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts6}\right)$

Using the calling sequences $⟨\mathrm{ts1},\mathrm{ts2}⟩$ or (equivalently) $\mathrm{ts1}\cup \mathrm{ts2}$ will not work, for the same reason that the first call to Join above did not work: there are conflicting values that cannot be merged. But the following time series can be merged with $\mathrm{ts1}$.

 > $\mathrm{ts7}≔{\mathrm{ts2}}_{3..\left(\right)}$
 ${\mathrm{ts7}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {B}\\ {\mathrm{3 rows of data:}}\\ {\mathrm{2011-11-30 - 2012-01-31}}\end{array}\right]$ (8)
 > $\mathrm{ts8}≔\mathrm{TimeSeries}\left(\left[4.9,4.8,2.4,4.0,\mathrm{undefined},3.1,4.0\right],\mathrm{frequency}="monthly",\mathrm{startdate}="2011-11-30",\mathrm{header}="A"\right)$
 ${\mathrm{ts8}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {A}\\ {\mathrm{7 rows of data:}}\\ {\mathrm{2011-11-30 - 2012-05-30}}\end{array}\right]$ (9)
 > $\mathrm{ts9}≔⟨\mathrm{ts1},\mathrm{ts7},\mathrm{ts8}⟩$
 ${\mathrm{ts9}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {\mathrm{A, B}}\\ {\mathrm{14 rows of data:}}\\ {\mathrm{2011-04-30 - 2012-05-30}}\end{array}\right]$ (10)
 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts9}\right)$

 > $\mathrm{ts10}≔\left(\mathrm{ts1}\cup \mathrm{ts7}\right)\cup \mathrm{ts8}$
 ${\mathrm{ts10}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {\mathrm{B, A}}\\ {\mathrm{14 rows of data:}}\\ {\mathrm{2011-04-30 - 2012-05-30}}\end{array}\right]$ (11)
 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts10}\right)$

The time series $\mathrm{ts9}$ and $\mathrm{ts10}$ contain the same data, although the data sets are listed in a different order in the two outputs above. (They could also be obtained as $\mathrm{Join}\left(\mathrm{ts1},\mathrm{ts7},\mathrm{ts8}\right)$.)

If we have a time series with weekly data, joining it with a monthly time series is not straightforward, even if the data sets are not merged, because the dates do not line up well. In particular, we need to use the regulardates = false option. This still leaves the question of what happens in the monthly data set at the time stamps inserted for the weekly data set. There are a few options; the default is to give each data set missing values at the time stamps where only the other data sets have values.

 > $\mathrm{ts11}≔\mathrm{TimeSeries}\left(\left[8.7,10.1,10.9,9.4,10.7,9.5,9.6,12.2,9.1,9.8,11.7,11.7,10.9,10.6,10.8,11.2,\mathrm{undefined},10.2,10.6,\mathrm{undefined},9.8,10.8,10.5,10.1,9.5,9.4,10.4,11.5,9.4,12.1\right],\mathrm{startdate}="2011-11-01",\mathrm{frequency}="weekly",\mathrm{header}="C"\right)$
 ${\mathrm{ts11}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {C}\\ {\mathrm{30 rows of data:}}\\ {\mathrm{2011-11-01 - 2012-05-22}}\end{array}\right]$ (12)
 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts11}\right)$

 > $\mathrm{ts12}≔\mathrm{Join}\left(\mathrm{ts8},\mathrm{ts11},\mathrm{regulardates}=\mathrm{false}\right)$
 ${\mathrm{ts12}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {\mathrm{A, C}}\\ {\mathrm{33 rows of data:}}\\ {\mathrm{2011-11-01 - 2012-05-30}}\end{array}\right]$ (13)
 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts12}\right)$

The other option is to insert values at the new dates whenever there are two adjacent values in the original data set. This can be done by copying the nearest original data value, or by linear interpolation.

 > $\mathrm{ts13}≔\mathrm{Join}\left(\mathrm{ts8},\mathrm{ts11},\mathrm{interpolate}=\mathrm{nearest},\mathrm{regulardates}=\mathrm{false}\right)$
 ${\mathrm{ts13}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {\mathrm{A, C}}\\ {\mathrm{33 rows of data:}}\\ {\mathrm{2011-11-01 - 2012-05-30}}\end{array}\right]$ (14)
 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts13}\right)$

 > $\mathrm{ts14}≔\mathrm{Join}\left(\mathrm{ts8},\mathrm{ts11},\mathrm{interpolate}=\mathrm{linear},\mathrm{regulardates}=\mathrm{false}\right)$
 ${\mathrm{ts14}}{≔}\left[\begin{array}{c}{\mathrm{Time series}}\\ {\mathrm{A, C}}\\ {\mathrm{33 rows of data:}}\\ {\mathrm{2011-11-01 - 2012-05-30}}\end{array}\right]$ (15)
 > $\mathrm{TimeSeriesPlot}\left(\mathrm{ts14}\right)$

Compatibility

 • The TimeSeriesAnalysis[Join] command was introduced in Maple 18.