TimeSeriesAnalysis[Join] - join time series together

Calling Sequence


Join(ts1, ts2, ..., opts)
ts1 union ts2 union ...
<ts1, ts2, ...>
<ts1 | ts2 | ...>


Parameters


ts1, ts2, ... - TimeSeries data structures

opts - (optional) equations of the form keyword = value





Description


•

The Join command takes multiple TimeSeries data structures and stores them in a single, new, TimeSeries data structure.

•

The Join command can also be invoked in a few other ways:

Command (1D input)          Is equivalent to

ts1 union ts2 union ts3     Join(ts1, ts2, ts3)
<ts1, ts2, ts3>             Join(ts1, ts2, ts3)
<ts1 | ts2 | ts3>           Join(ts1, ts2, ts3, mergedatasets = false)




These alternative calling sequences provide no new functionality, just notational convenience.

•

All calling sequences described above accept any number of time series to be joined together. The command first determines what the data sets in the new time series object are going to be (see the discussion of the mergedatasets option, below). It then determines what times will be present in the new time series (see the discussion of the regulardates option, below). Finally, a Matrix encompassing all data is constructed, which is turned into a TimeSeries data structure.



Options


•

mergedatasets = false, true, or force


This option determines how the data sets of the input time series are arranged in the new time series object. If mergedatasets = false is passed, then every data set in the input time series becomes a separate data set in the resulting time series. If mergedatasets = true (the default), then all data sets that have equal names (given by the headers option; see GetHeaders) are merged. If mergedatasets = force is given, then the first data sets of all time series are merged together, the second data sets (if present) are merged, and so on.


If some data sets are merged, then the resulting data set gets the name of the data set occurring first in the calling sequence. If there are multiple values for one time stamp for the same data set, then the valuetolerance option determines what happens.
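
The three modes of mergedatasets can be pictured with a small sketch. It is written in Python rather than Maple and is only an illustration of the grouping rule described above, not Maple's implementation; the function name group_datasets and the representation of each time series as a list of header names are assumptions made for this example.

```python
def group_datasets(headers_per_series, mergedatasets=True):
    """Group data sets the way the mergedatasets option is described.

    headers_per_series: one list of header names per input time series
    (hypothetical representation, not a Maple data structure).
    Returns a list of (name, members) pairs, where members is a list of
    (series_index, dataset_index) tuples. A merged group takes the name
    of the data set that occurs first in the calling sequence.
    """
    groups = []   # groups in order of first occurrence
    position = {}  # grouping key -> index into groups
    for s, headers in enumerate(headers_per_series):
        for d, name in enumerate(headers):
            if mergedatasets == "force":
                key = d          # merge by position within each series
            elif mergedatasets is False:
                key = (s, d)     # every data set stays separate
            else:
                key = name       # merge data sets with equal names
            if key not in position:
                position[key] = len(groups)
                groups.append((name, []))
            groups[position[key]][1].append((s, d))
    return groups


headers = [["Sales", "Profit"], ["Profit", "Costs"]]
by_name = group_datasets(headers, True)
by_position = group_datasets(headers, "force")
separate = group_datasets(headers, False)
```

With these hypothetical headers, merging by name yields three data sets (Sales, Profit, Costs), forcing yields two (named Sales and Profit, after the first series), and mergedatasets = false keeps all four apart.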

•

valuetolerance = nonnegative real number or infinity


Whenever data sets are merged, there can be a range of dates where more than one data set has a value. If this is the case and the absolute difference between the values exceeds valuetolerance, then an error is issued. Otherwise, values from data sets in earlier arguments (further to the left in the calling sequence) are overwritten by values from data sets in later arguments (further to the right in the calling sequence). The default value is 0, meaning that values differing in any way yield an error. If infinity is specified as the value for this option, values are always overwritten without raising an error.


When one of the sets has missing data for a given date, it is not considered for this process.
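
The overwrite rule for valuetolerance can be sketched as follows. This is an illustrative Python model, not Maple code: the function name merge_values, the use of dictionaries mapping time stamps to values, and the use of None for missing data are all assumptions. The sketch also assumes that each new value is compared against the value currently held in the merged result, which the description above does not spell out for three or more data sets.

```python
import math


def merge_values(datasets, valuetolerance=0.0):
    """Merge data sets left to right, as described for valuetolerance.

    datasets: list of dicts mapping a time stamp to a value, with None
    marking missing data (hypothetical representation).
    """
    merged = {}
    for data in datasets:
        for stamp, value in data.items():
            if value is None:
                continue  # missing data takes no part in the comparison
            old = merged.get(stamp)
            if old is not None and abs(value - old) > valuetolerance:
                raise ValueError(
                    f"values at {stamp} differ by more than {valuetolerance}"
                )
            merged[stamp] = value  # later arguments overwrite earlier ones
    return merged


a = {"2013-09-30": 1.0, "2013-10-31": 2.0}
b = {"2013-10-31": 2.5, "2013-11-30": None}

tolerant = merge_values([a, b], valuetolerance=1.0)
unbounded = merge_values([a, b], valuetolerance=math.inf)
```

Calling merge_values([a, b]) with the default tolerance of 0 raises an error, because the two values for 2013-10-31 differ. With a tolerance of 1.0 or infinity the value from b replaces the one from a.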

•

regulardates = true or false


If the dates of two time series do not match up exactly, then Maple needs to make a choice: either include multiple slightly different dates, or move values from one or more time series to a slightly different date. If regulardates = true (the default), then Maple tries to determine a set of time stamps that are mostly regular; that is, they have intervals of similar length between them. In particular, it finds the time series with the shortest intervals on average and tries to make all time series line up around multiples of that interval. This means, for example, that if monthly data is joined to quarterly data, monthly data points will always be inserted in between the quarterly data, even if the monthly data has a more limited range. If the time stamps do not exactly line up, the datetolerance option, explained below, determines what difference between them is accepted.


If regulardates = false, then Maple simply tries to find relatively few time stamps that occur near the time stamps in the given time series. (How near is determined by the datetolerance option below.) This can be appropriate if you need to join weekly and monthly data, for example. Note, however, that most commands in the TimeSeriesAnalysis package assume that dates are regular.

•

datetolerance = nonnegative real number


This option, together with the regulardates option, determines the time stamps of the time series to be constructed. In particular, if one time series has time stamps on the last day of every month, and another has time stamps on the 28th of every month, one could include both dates and either insert alternating missing values in between, or interpolate. It seems more likely, however, that, say, January 28th and January 31st can be considered the same time point. The datetolerance option determines when dates are merged. It works as follows: given a sequence of dates d_1, d_2, ..., d_n, Maple determines a time interval t_i for every date d_i, where t_i is typically equal to the spacing between d_i and its neighboring date. It then ensures that for every d_i, there is a corresponding time stamp in the new time series that lies in a range around d_i whose size is determined by datetolerance and t_i. If regulardates is true, then an error is raised if this is not possible within the constraints given. If regulardates is false, then extra time stamps are simply inserted.


Setting datetolerance=0 means different time stamps are never merged. If regulardates is false, then setting datetolerance equal to 1 or higher means that multiple dates from the same time series will have overlapping dates and is not recommended. (With time intervals that are not equally long, such as months, this is even possible for values of datetolerance that are a little less than 1.) If regulardates is true, this cannot happen. The default value is .

•

interpolate = none, nearest, or linear


If dates are inserted in between values for a time series, then by default the result will not have values coming from that time series. This option offers the opportunity to insert values obtained by interpolation.
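
The effect of the three interpolate settings on inserted time stamps can be illustrated with a short Python sketch. It is not Maple's implementation: the function name fill_inserted, the use of plain numbers (for example, day counts) as time stamps, and the tie-breaking rule for nearest are assumptions. Following the description in the Examples section, values are only filled in between two adjacent known values, never outside the original range.

```python
from bisect import bisect_left


def fill_inserted(known, new_times, interpolate="none"):
    """Assign values to the time stamps of the joined series.

    known: dict mapping numeric time stamps to values for one data set.
    new_times: all time stamps of the joined series.
    Returns a dict with None where the data set has no value.
    """
    if interpolate not in ("none", "nearest", "linear"):
        raise ValueError("interpolate must be none, nearest, or linear")
    times = sorted(known)
    result = {}
    for t in new_times:
        if t in known:
            result[t] = known[t]  # original values are kept as they are
            continue
        i = bisect_left(times, t)
        # No value without interpolation, or outside the original range.
        if interpolate == "none" or i == 0 or i == len(times):
            result[t] = None
            continue
        t0, t1 = times[i - 1], times[i]
        if interpolate == "nearest":
            # Assumption: ties go to the earlier neighbor.
            result[t] = known[t0] if t - t0 <= t1 - t else known[t1]
        else:
            weight = (t - t0) / (t1 - t0)
            result[t] = known[t0] + weight * (known[t1] - known[t0])
    return result


monthly = {0: 10.0, 30: 40.0}          # hypothetical monthly values
joined_times = [0, 7, 15, 21, 30, 37]  # hypothetical joined time stamps

plain = fill_inserted(monthly, joined_times)
nearest = fill_inserted(monthly, joined_times, "nearest")
linear = fill_inserted(monthly, joined_times, "linear")
```

In this sketch the time stamp 37 lies after the last monthly value, so it stays missing under every setting.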



Compatibility


•

The TimeSeriesAnalysis[Join] command was introduced in Maple 18.



Examples


>


Consider the following time series.
>


 (1) 
>


 (2) 
This is what the data from the time series looks like.
>


Trying to merge the data will not work without specifying some options, because there are different values for the same time.
>


By specifying the value tolerance, we can complete the join successfully.
>


 (3) 
>


The dates of the two time series are not exactly the same: one always has a data point on the last day of the month, and the other on the 30th. (In this case, there is no data for February, but if there were, it would be on the last day of February.) We get single data points only because the date tolerance is nonzero. If we set the date tolerance to 0, we get an error unless we also specify regulardates = false:
>


If we do specify regulardates = false, we get separate points for the two input time series at the end of October, but the date in September is the 30th for both. We specify the time series in the opposite order, so that for September 30th the value from the time series that was previously listed first overwrites the value from the other one.
>


 (4) 
>


 (5) 
>


We can force merging the first data set in each of the two time series.
>


 (6) 
>


We can also force viewing the data sets as separate. This can be done by including the mergedatasets = false option, or using the following calling sequence:
>


 (7) 
>


Using the union calling sequence or (equivalently) the angle-bracket calling sequence with commas will not work, for the same reason that the first call to Join above did not work: there are conflicting values that cannot be merged. But the following time series can be merged in this way.
>


 (8) 
>


 (9) 
>


 (10) 
>


>


 (11) 
>


The two resulting time series are the same. (They could also be obtained by calling Join directly.)
If we have a time series with weekly data, it is not straightforward to join it with one with monthly data, even if the time series are not merged. This is because the dates do not line up well. In particular, we will need to use the regulardates = false option. This still leaves the issue of what happens in the monthly data set at the time stamps inserted for the weekly data set. There are a few options: the default one is to make each data set have missing values where only the other data sets have values.
>


 (12) 
>


>


 (13) 
>


The other option is to insert values at the new dates whenever there are two adjacent values in the original data set. This can be done by copying the nearest original data value, or by linear interpolation.
>


 (14) 
>


>


 (15) 
