enw_complete_dates | R Documentation |
Ensures that all reference and report dates are present for
all groups, based on the minimum and maximum dates found in the data.
This function may be useful when preprocessing your data. In
general, every feature that you may want to use as a grouping variable
or as a covariate needs to be included in the by
variable.
enw_complete_dates(
obs,
by = NULL,
max_delay,
min_date = min(obs$reference_date, na.rm = TRUE),
max_date = max(obs$report_date, na.rm = TRUE),
timestep = "day",
missing_reference = TRUE,
completion_beyond_max_report = FALSE,
flag_observation = FALSE
)
obs |
A |
by |
A character vector describing the stratification of observations. This defaults to no grouping. This should be used when modelling multiple time series so that each series can be identified in downstream steps. |
max_delay |
The maximum number of days to model in the delay
distribution. Must be an integer greater than or equal to 1. Observations
with delays larger than the maximum delay will be dropped. If the specified
maximum delay is too short, nowcasts can be biased as important parts of the
true delay distribution are cut off. At the same time, computational cost
scales non-linearly with this setting, so you want the maximum delay to be as
long as necessary, but not much longer. Consider what delays are realistic
for your application, and when in doubt, check if increasing the maximum
delay noticeably changes the delay distribution or nowcasts as estimated by
epinowcast. If it does, your maximum delay may still be too short.
Note that delays are zero indexed and so include the reference date and
|
min_date |
The minimum date to include in the data. Defaults to the minimum reference date found in the data. |
max_date |
The maximum date to include in the data. Defaults to the maximum report date found in the data. |
timestep |
The timestep to use. This can be a string ("day", "week", "month") or a numeric whole number representing the number of days. |
missing_reference |
Logical, should entries for cases with a missing reference date be completed as well? Default: TRUE |
completion_beyond_max_report |
Logical, should entries be completed beyond the maximum date found in the data? Default: FALSE |
flag_observation |
Logical, should observations that have been
imputed as missing be flagged as not observed? Makes use of
|
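One rough way to inform the choice of max_delay discussed above is to look at the delays actually present in the data. The following is a minimal base R sketch under stated assumptions: the toy data and the variable names delay, delay_summary and longest_delay are illustrative and not part of the package. Because recent reference dates have not yet had time to accumulate long delays, the observed delays are right-truncated, so treat this as a lower-bound guide rather than a rule.

```r
# Toy data (hypothetical): one row per reference date / report date pair
obs <- data.frame(
  reference_date = as.Date(
    c("2021-10-01", "2021-10-01", "2021-10-02", "2021-10-02")
  ),
  report_date = as.Date(
    c("2021-10-01", "2021-10-04", "2021-10-03", "2021-10-12")
  )
)

# Delay in whole days between the reference date and the report date
obs$delay <- as.integer(obs$report_date - obs$reference_date)

# Summarise the observed delays to see what a candidate max_delay would cut off
delay_summary <- quantile(obs$delay, probs = c(0.5, 0.95, 1))
longest_delay <- max(obs$delay)

print(delay_summary)
print(longest_delay)
```

If a candidate max_delay sits well below the longest delays seen here, that is a hint to rerun with a larger value and compare the resulting nowcasts, as the max_delay description suggests.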
A data.table
with completed entries for all combinations of
reference dates, groups and possible report dates.
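To make "all combinations of reference dates, groups and possible report dates" concrete, here is a minimal base R sketch of the idea. It is not the package's implementation: the location column, the reading of max_delay as covering delays 0 to max_delay - 1, and leaving unobserved combinations as NA are all assumptions made for illustration.

```r
# Toy data (hypothetical) with a grouping column called `location`
obs <- data.frame(
  location = c("A", "A", "B"),
  reference_date = as.Date(c("2021-10-01", "2021-10-01", "2021-10-02")),
  report_date = as.Date(c("2021-10-01", "2021-10-03", "2021-10-02")),
  confirm = c(1, 2, 1)
)
max_delay <- 3 # assumed here to cover delays 0, 1 and 2

# Every combination of group, reference date and delay
grid <- expand.grid(
  location = unique(obs$location),
  reference_date = seq(
    min(obs$reference_date), max(obs$report_date), by = "day"
  ),
  delay = seq_len(max_delay) - 1,
  stringsAsFactors = FALSE
)
grid$report_date <- grid$reference_date + grid$delay

# Drop report dates later than the latest date found in the data
grid <- grid[grid$report_date <= max(obs$report_date), ]

# Left join the observations onto the grid; unobserved rows get NA in
# `confirm` (this sketch does not attempt to fill them)
completed <- merge(grid, obs, all.x = TRUE)
print(completed)
```

The grid is built first and the data joined onto it so that combinations with no observation still appear as rows, which is the property the completed output is described as having.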
Preprocessing functions:
enw_add_delay(), enw_add_max_reported(), enw_add_metaobs_features(),
enw_assign_group(), enw_construct_data(), enw_extend_date(),
enw_filter_delay(), enw_filter_reference_dates(),
enw_filter_report_dates(), enw_flag_observed_observations(),
enw_impute_na_observations(), enw_latest_data(), enw_metadata(),
enw_metadata_delay(), enw_missing_reference(), enw_preprocess_data(),
enw_reporting_triangle(), enw_reporting_triangle_to_long()
obs <- data.frame(
report_date = c("2021-10-01", "2021-10-03"), reference_date = "2021-10-01",
confirm = 1
)
enw_complete_dates(obs)
# Allow completion beyond the maximum date found in the data
enw_complete_dates(obs, completion_beyond_max_report = TRUE, max_delay = 10)
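
The timestep argument accepts either a string or a whole number of days. As a rough illustration of how the two forms can correspond, the base R sketch below builds date sequences with seq(). This only shows the date arithmetic and is not necessarily how the package handles timesteps internally; the start and end dates are made up.

```r
# Hypothetical date range used only for illustration
start <- as.Date("2021-10-01")
end <- as.Date("2021-10-29")

# String timesteps
daily <- seq(start, end, by = "day")
weekly <- seq(start, end, by = "week")
monthly <- seq(start, as.Date("2021-12-01"), by = "month")

# A numeric timestep of 7 days gives the same dates as "week"
weekly_numeric <- seq(start, end, by = 7)

print(weekly)
print(all(weekly == weekly_numeric))
```

Note that "month" has no fixed number of days, so unlike "week" it has no single numeric equivalent.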