Function to change time support of TimeIntervalDataFrame

Description

Methods to aggregate and disaggregate homogeneous and heterogeneous time data.

Usage

changeSupport(from, to, min.coverage, FUN = NULL,
              weights.arg = NULL, split.from = FALSE,
              merge.from = TRUE, ...)

## S4 method for signature 'TimeIntervalDataFrame,POSIXctp,numeric'
changeSupport(from, to, min.coverage, FUN = NULL,
              weights.arg = NULL, split.from = FALSE,
              merge.from = TRUE, ...)

## S4 method for signature
## 'TimeIntervalDataFrame,TimeIntervalDataFrame,numeric'
changeSupport(from, to, min.coverage, FUN = NULL,
              weights.arg = NULL, split.from = FALSE,
              merge.from = TRUE, ...)

## S4 method for signature 'TimeIntervalDataFrame,character,numeric'
changeSupport(from, to, min.coverage, FUN = NULL,
              weights.arg = NULL, split.from = FALSE,
              merge.from = TRUE, ...)

Arguments

from

TimeIntervalDataFrame whose time support is to be changed

to

an object indicating the new support, see specific sections

min.coverage

a numeric between 0 and 1 giving the minimum fraction of valid values required over each interval for an aggregation to be performed. NA is returned if this fraction is not reached. In changeSupport, when values are aggregated, intervals are not allowed to overlap. When the function (FUN) has an na.rm argument, the na.rm=TRUE behaviour is obtained by setting na.rm to TRUE and min.coverage to 0 (zero); the na.rm=FALSE behaviour is obtained by setting na.rm to FALSE, whatever the value of min.coverage.

FUN

function used to aggregate the data of from. The default is mean if ‘from’ is homogeneous, and weighted.mean otherwise.

weights.arg

if FUN has a ‘weight’ argument, this parameter must be a character naming the weight argument. For instance, if FUN is weighted.mean, then weights.arg is 'w'.
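
To see why the name of the weight argument has to be supplied, here is a minimal base R sketch of how a generic caller could forward weights to an arbitrary FUN. It is an illustration only, not the package's internal code; the helper call_with_weights and the sample values are hypothetical.

```r
# Hypothetical helper, for illustration only: forwards the weights to FUN
# under the argument name given by weights.arg (e.g. 'w' for weighted.mean).
call_with_weights <- function(x, w, FUN, weights.arg = NULL, ...) {
  args <- list(x, ...)
  if (!is.null(weights.arg)) {
    # Name the weights after the argument that FUN expects.
    args[[weights.arg]] <- w
  }
  do.call(FUN, args)
}

x <- c(6, 1.5)   # arbitrary sample values
w <- c(1, 2)     # arbitrary sample weights

# weighted.mean receives the weights through its 'w' argument.
res_weighted <- call_with_weights(x, w, weighted.mean, weights.arg = "w")

# mean has no weight argument, so weights.arg stays NULL and w is ignored.
res_plain <- call_with_weights(x, w, mean)
```

Without the name, the caller would have no way of knowing whether FUN expects its weights as w, weights, or something else.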

...

arguments for FUN or for other methods

split.from

logical indicating whether data in ‘from’ can be used for several intervals of the new time support (see ‘details’).

merge.from

logical indicating whether data in ‘from’ can be merged over the intervals of the new time support.

Details

Aggregating homogeneous data means, for example, calculating daily means from an hourly time series.

Aggregating heterogeneous data means, for example, calculating annual means from a monthly time series (heterogeneous because the months do not all have the same weight).

In the above cases, min.coverage controls whether means are calculated or not: in the monthly case, if there are NA values and the time coverage of the non-NA values is lower than min.coverage, the result will be NA; if the time coverage is higher than min.coverage, the annual mean will be ‘estimated’ by the mean of the available data.
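
The monthly case can be sketched in base R as follows. This is a minimal illustration of the rule described above, not the package's code: the helper covered_mean and the monthly values are hypothetical, and the treatment of a coverage exactly equal to min.coverage (kept here) is an assumption.

```r
# Days in each month of a non-leap year, used as weights.
days <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)

# Hypothetical helper: weighted mean of the non-NA values, returned only
# if their share of the total weight reaches min.coverage.
covered_mean <- function(values, weights, min.coverage) {
  valid <- !is.na(values)
  coverage <- sum(weights[valid]) / sum(weights)
  if (coverage < min.coverage) return(NA_real_)
  weighted.mean(values[valid], weights[valid])
}

# Arbitrary monthly means with March to June missing (122 of 365 days).
monthly <- c(10, 12, NA, NA, NA, NA, 20, 22, 18, 15, 11, 9)

# Time coverage of the valid months: 243 / 365, about 0.67.
coverage <- sum(days[!is.na(monthly)]) / sum(days)

annual_strict <- covered_mean(monthly, days, 0.75)  # coverage too low: NA
annual_loose  <- covered_mean(monthly, days, 0.50)  # coverage sufficient
```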

Disaggregating data is more ‘artificial’ and is disabled by default (through the split.from argument). This argument is also used to specify whether one value can be used for aggregation in more than one interval of the resulting TimeIntervalDataFrame (for sliding intervals, for instance). Here are some examples of time disaggregation:

  • A weekly mean can be dispatched over the days of the week. By default, the value attributed to each day is the value of the week, but this can be changed by using a special function (FUN argument).

  • The value of a variable is known from Monday at 15:00 to Tuesday at 15:00 and from Tuesday at 15:00 to Wednesday at 15:00. The value of the variable for Tuesday can be estimated by a weighted mean of the two values. Weights are determined by the intersection between each interval and Tuesday. Here the weights are 0.625 (15/24) and 0.375 (9/24). (In this case, disaggregation is combined with a ‘reaggregation’.)
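
The weights of the second example can be checked with a short base R computation. This is a sketch of the arithmetic only; the dates (2010-01-04 is a Monday) and the two values are arbitrary choices made for illustration.

```r
# Two source intervals: Monday 15:00 to Tuesday 15:00, and
# Tuesday 15:00 to Wednesday 15:00 (UTC).
tz <- "UTC"
starts <- as.numeric(as.POSIXct(c("2010-01-04 15:00", "2010-01-05 15:00"), tz = tz))
ends   <- as.numeric(as.POSIXct(c("2010-01-05 15:00", "2010-01-06 15:00"), tz = tz))

# Target interval: the whole of Tuesday.
t0 <- as.numeric(as.POSIXct("2010-01-05 00:00", tz = tz))
t1 <- as.numeric(as.POSIXct("2010-01-06 00:00", tz = tz))

# Overlap, in hours, between each source interval and Tuesday.
overlap <- pmax(0, pmin(ends, t1) - pmax(starts, t0)) / 3600

# Normalised weights: 15/24 and 9/24.
weights <- overlap / sum(overlap)

values  <- c(6, 1.5)                       # arbitrary illustrative values
tuesday <- weighted.mean(values, weights)  # estimate for Tuesday
```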

These are ‘trivial’ examples, but many other uses can be found for these methods. Functions other than weighted.mean or mean can be used. The Qair package (in its legislative part) gives several examples of usage (this package is not available on CRAN, but see ‘references’ to find out where to get it).

Value

TimeIntervalDataFrame

from=TimeIntervalDataFrame, to=TimeIntervalDataFrame

to is a TimeIntervalDataFrame. The method will try to adapt the data of from to the intervals of to. The returned object is the to TimeIntervalDataFrame with new columns corresponding to those of from.

If merge.from is TRUE, the value assigned to each interval of to is calculated from all the data inside the interval. If split.from is TRUE, values that are only partially inside the interval are also used in the calculation.

If merge.from is FALSE, the value assigned to each interval of to is the single value inside this interval. If several values are inside the interval, NA is assigned. If split.from is TRUE, a value partially inside the interval is considered to be inside it. So if there are no other values in the interval, this value is assigned; otherwise NA is assigned.
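
The combinations of merge.from and split.from can be sketched for a single target interval in base R. This is one reading of the rules above under simplifying assumptions (numeric time axis, overlap duration used as weight, min.coverage ignored); the helper project_value is hypothetical and is not the package's implementation.

```r
# Hypothetical helper: value assigned to the target interval [t0, t1]
# from source intervals [start, end] carrying 'values'.
project_value <- function(start, end, values, t0, t1,
                          merge.from = TRUE, split.from = FALSE) {
  inside  <- start >= t0 & end <= t1   # fully inside the target
  partial <- start < t1 & end > t0     # any overlap with the target
  used <- if (split.from) partial else inside
  if (!any(used)) return(NA_real_)
  if (merge.from) {
    # Weight each value by the duration of its overlap with the target.
    w <- pmin(end[used], t1) - pmax(start[used], t0)
    return(weighted.mean(values[used], w))
  }
  # Without merging, exactly one candidate value is required.
  if (sum(used) == 1) values[used] else NA_real_
}

# Source intervals in hours: [0, 12], [12, 36], [36, 48]; target: [0, 24].
start  <- c(0, 12, 36)
end    <- c(12, 36, 48)
values <- c(2, 4, 8)

r_merge       <- project_value(start, end, values, 0, 24)
r_merge_split <- project_value(start, end, values, 0, 24, split.from = TRUE)
r_single      <- project_value(start, end, values, 0, 24, merge.from = FALSE)
r_ambiguous   <- project_value(start, end, values, 0, 24,
                               merge.from = FALSE, split.from = TRUE)
```

In this sketch only the first source interval lies fully inside the target, so it is the only one used unless split.from is TRUE; with split.from = TRUE and merge.from = FALSE two candidates compete and the result is NA.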

from=TimeIntervalDataFrame, to=character

to is one of 'year', 'month', 'day', 'hour', 'minute' or 'second'. It defines the period (POSIXctp) used to build the new TimeIntervalDataFrame on which from will be aggregated (or disaggregated).

First, an ‘empty’ (no data) TimeIntervalDataFrame is created; the aggregation is then done as described in the ‘from=TimeIntervalDataFrame, to=TimeIntervalDataFrame’ section.

from=TimeIntervalDataFrame, to=POSIXctp

to is a period (see POSIXctp). It defines the base of the new TimeIntervalDataFrame on which from will be aggregated (or disaggregated).

First, an ‘empty’ (no data) TimeIntervalDataFrame is created; the aggregation is then done as described in the ‘from=TimeIntervalDataFrame, to=TimeIntervalDataFrame’ section.

References

Qair-package : http://sourceforge.net/projects/packagerqair/

See Also

TimeIntervalDataFrame, POSIXcti

Examples

ti3 <- TimeIntervalDataFrame (
       c('2010-01-01', '2010-01-02', '2010-01-04'), NULL,
       'UTC', data.frame(ex3=c(6, 1.5)))

# weighted mean over a period of 3 days with at least 75% of
# coverage (NA is returned if not)
ti3
d <- POSIXctp(unit='day')
changeSupport (ti3, 3L*d, 0.75)

ti4 <- TimeIntervalDataFrame (
	c('2010-01-01', '2010-01-02', '2010-01-04',
	  '2010-01-07', '2010-01-09', '2010-01-10'), NULL,
         'UTC', data.frame(ex4=c(6, 1.5, 5, 3, NA)))

# weighted mean over a period of 3 days with at least 75% of
# coverage, then at least 50% (NA is returned if not)
ti4
changeSupport (ti4, 3L*d, 0.75)
changeSupport (ti4, 3L*d, 0.5)

# use of split.from
ti1 <- RegularTimeIntervalDataFrame('2011-01-01', '2011-02-01', 'hour')
ti1$value <- 1:nrow(ti1)
# we can calculate a sliding mean over periods of 24 hours.
# first, let's build the corresponding TimeIntervalDataFrame
ti2 <- RegularTimeIntervalDataFrame('2011-01-01', '2011-02-01', 'hour', 'day')
# if we try to 'project' ti1 over ti2, it won't work:
summary (changeSupport (ti1, ti2, 0))
# all data are NA because 'splitting' is not enabled. Let's enable it:
summary (changeSupport (ti1, ti2, 0, split.from=TRUE))
