thin_ts | R Documentation |
This function thins a time series by selecting every nth observation. The function provides flexibility to (a) distinguish among independent time series (e.g. time series for different individuals) via a user-supplied factor, which can be determined with flag_ts, and (b) return multiple thinned datasets for which thinning is started at different positions. The latter is useful when modelling thinned time series because it is important to check that model inferences are not sensitive to the particular subset of data chosen.
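As a rough illustration of the idea (a base R sketch, not the package's own implementation), selecting every nth observation from a given starting position amounts to indexing with seq(). The vector x and the values chosen for nth and the starting positions are made up for this example.

```r
# Toy series of 10 observations (hypothetical data)
x <- 1:10
nth <- 3

# Positions retained when thinning starts at position 1 versus position 2
keep_from_1 <- seq(from = 1, to = length(x), by = nth)
keep_from_2 <- seq(from = 2, to = length(x), by = nth)

x[keep_from_1]  # 1 4 7 10
x[keep_from_2]  # 2 5 8
```

The two subsets share the same degree of thinning but contain different observations, which is why fitting a model to each and comparing the results is a sensible sensitivity check.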
thin_ts(dat, ind = NULL, flag1, first = 1, nth)
dat |
A dataframe to be thinned. |
ind |
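A character input which defines the column name in dat that distinguishes independent time series (e.g. different individuals). If NULL, the data are treated as a single time series. |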
flag1 |
A character input which defines the column name in dat that holds the first flag returned by flag_ts ("flag1" in the examples below). |
first |
A number or numeric vector which defines the position(s) at which thinning is initiated for each independent time series. If a single number is supplied, the function returns a thinned dataframe. If a vector of numbers is supplied, the function returns a list of the same length, in which each element is a dataframe thinned to the same degree but with thinning initiated at a different position. The order of elements in the resultant list is the same as the order of elements in first. |
nth |
A number which defines the degree of thinning (i.e. the selection of every nth observation). |
The function returns a list or dataframe, depending on the input to first
(see above).
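The relationship between first and the return type can be sketched in base R as follows. This is only an assumption-laden illustration: thin_sketch is a hypothetical helper written for this example, not the code behind thin_ts, and it assumes rows are already in time order within each series.

```r
# Hypothetical helper: thin each independent series in 'dat' separately.
# 'ind' names the column identifying the series; rows are assumed to be
# in time order within each series.
thin_sketch <- function(dat, ind, first = 1, nth) {
  thin_once <- function(start) {
    pieces <- lapply(split(dat, dat[[ind]]), function(d) {
      # A series shorter than 'start' contributes no rows
      if (start > nrow(d)) return(d[0, , drop = FALSE])
      d[seq(from = start, to = nrow(d), by = nth), , drop = FALSE]
    })
    out <- do.call(rbind, pieces)
    rownames(out) <- NULL
    out
  }
  res <- lapply(first, thin_once)
  # A single 'first' gives a dataframe; several give a list of dataframes
  if (length(first) == 1) res[[1]] else res
}

# Two toy series ("a" and "b") of four observations each
toy <- data.frame(id = rep(c("a", "b"), each = 4), value = 1:8)
single <- thin_sketch(toy, ind = "id", first = 1, nth = 2)
multi  <- thin_sketch(toy, ind = "id", first = c(1, 2), nth = 2)

is.data.frame(single)  # TRUE
length(multi)          # 2
```

Here the starting position is applied within each series rather than to the dataframe as a whole, so every series contributes its own first retained observation.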
Edward Lavender
#### Simulate a dataframe to be thinned
# Define time stamps
t <- c(seq.POSIXt(as.POSIXct("2016-01-01"), as.POSIXct("2016-01-02"), by = "6 hours"),
       seq.POSIXt(as.POSIXct("2016-01-02 18:00:00"), as.POSIXct("2016-01-04"), by = "6 hours")
       )
# Apply flag_ts() function to flag independent time series
dat <- cbind(t, flag_ts(t, duration_threshold = 6*60, flag = 1:3))
nrow(dat)

#### Example (1): Thin a single time series by selecting every nth position
# Thin the time series
dat_thin <- thin_ts(dat = dat,
                    nth = 2,
                    flag1 = "flag1"
                    )
# Examine the rows retained:
dat$row_retained <- dat$t %in% dat_thin$t
dat

#### Example (2): Thin multiple independent time series via the 'ind' argument
# Here, we now account for the fact that the data consists of multiple (two) independent time series
# ... as identified by flag_ts(), and the selection of positions is identical for both time series
# ... (i.e. the first observation in each time series)
dat$row_retained <- NULL
dat_thin <- thin_ts(dat = dat,
                    nth = 2,
                    flag1 = "flag1",
                    ind = "flag3")
dat$row_retained <- dat$t %in% dat_thin$t
dat

#### Example (3): Multiple thinned datasets can be produced by supplying multiple values to 'first'
# This is useful because it is important to check that the exact thinned sample does not affect
# ... results (e.g. if thinned data are used in modelling)
dat$row_retained <- NULL
dat_thin_ls <- thin_ts(dat = dat,
                       nth = 2,
                       flag1 = "flag1",
                       ind = "flag3",
                       first = c(1, 2))
# With multiple 'first' values, the function returns a list, with a thinned dataframe for each
# ... first value:
utils::str(dat_thin_ls)
# Examine the difference:
dat$row_retained1 <- dat$t %in% dat_thin_ls[[1]]$t
dat$row_retained2 <- dat$t %in% dat_thin_ls[[2]]$t
dat