View source: R/preprocessing.R
fill_missing | R Documentation |
This function ensures that all days between the first and last date in the
data are present. It adds an accumulate column that indicates whether
modelled observations should be accumulated onto a later data point.
This is useful for modelling data that is reported less frequently
than daily, e.g. weekly incidence data, as well as other reporting
artifacts such as delayed weekend reporting. The function can also be used
to fill in missing observations with zeros.
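The idea of completing the date range and flagging added days can be sketched in base R. This is a minimal illustration and not the package's implementation: the `weekly`, `all_days` and `daily` objects and their values are hypothetical, and the flag simply marks days that were absent from the original data.

```r
# Hypothetical weekly data: one report every 7 days (illustration only)
weekly <- data.frame(
  date = as.Date("2020-03-07") + c(0, 7, 14),
  confirm = c(70, 84, 91)
)

# Every day between the first and last date in the data
all_days <- data.frame(
  date = seq(min(weekly$date), max(weekly$date), by = "day")
)

# Left join so that days without a report get confirm = NA
daily <- merge(all_days, weekly, by = "date", all.x = TRUE)

# Flag the days that were absent from the original data. Under the
# "accumulate" interpretation, modelled values on these days would be
# summed onto the next reported data point.
daily$accumulate <- !(daily$date %in% weekly$date)
```

In this sketch the three reported days keep `accumulate = FALSE` and the twelve added days are flagged `TRUE`.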
fill_missing(
data,
missing_dates = c("ignore", "accumulate", "zero"),
missing_obs = c("ignore", "accumulate", "zero"),
initial_accumulate,
obs_column = "confirm",
by = NULL
)
data |
Data frame with a |
missing_dates |
Character. Options are "ignore" (the default),
"accumulate" and "zero". This determines how missing dates in the data are
interpreted. If set to "ignore", any missing dates in the observation
data will be interpreted as missing and skipped in the likelihood. If set
to "accumulate", modelled observations on dates that are missing in the
data will be accumulated and added to the next non-missing data point.
This can be used to model incidence data that is reported less frequently
than daily. In that case, the first data point is not included in the
likelihood (unless |
missing_obs |
Character. How to process dates that exist in the data
but have observations with NA values. The options available are the same
ones as for the |
initial_accumulate |
Integer. The number of initial dates to accumulate
if |
obs_column |
Character (default: "confirm"). If given, only the column specified here will be used for checking missingness. This is useful if using a data set that has multiple columns, one of which corresponds to the observations that are to be processed here. |
by |
Character vector. Name(s) of any additional column(s) where data processing should be done separately for each value in the column. This is useful when using data representing e.g. multiple geographies. If NULL (default) no such grouping is done. |
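The distinction between a missing date (a day absent from the data) and a missing observation (a day present with an NA value) can be made concrete with a small base R sketch. It is an illustration under assumptions, not the package's code: the `obs` data and the helper column `present` are hypothetical, and only the "zero" treatment of missing observations is shown.

```r
# Hypothetical data: 2020-03-03 is absent (missing date),
# 2020-03-02 is present but NA (missing observation)
obs <- data.frame(
  date = as.Date(c("2020-03-01", "2020-03-02", "2020-03-04")),
  confirm = c(5, NA, 8)
)

# Record which rows exist before completing the dates
obs$present <- TRUE
all_days <- data.frame(date = seq(min(obs$date), max(obs$date), by = "day"))
daily <- merge(all_days, obs, by = "date", all.x = TRUE)
daily$present[is.na(daily$present)] <- FALSE

# "zero" applied to missing observations only: an NA on a date that was
# present becomes 0, while the added date keeps its NA
missing_obs <- daily$present & is.na(daily$confirm)
daily$confirm[missing_obs] <- 0
```

Keeping the two cases separate is what lets them be treated differently, for example zero for one and accumulation for the other.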
A data.table with an accumulate column that indicates whether values are
accumulated (see the documentation of the data argument in
estimate_infections()).
cases <- data.table::copy(example_confirmed)
## calculate weekly sum
cases[, confirm := data.table::frollsum(confirm, 7)]
## limit to dates once a week
cases <- cases[seq(7, nrow(cases), 7)]
## set the second observation to missing
cases[2, confirm := NA]
## fill missing data
fill_missing(cases, missing_dates = "accumulate", initial_accumulate = 7)