ens_read_and_verify: Read forecast and observations and verify.

View source: R/ens_read_and_verify.R

ens_read_and_verify R Documentation

Read forecast and observations and verify.

Description

This is a wrapper for the verification process. Forecasts and observations are read in, filtered down to common cases, checked for errors, and a full verification is done for all scores. To minimise memory usage, the verification can be done for one lead time at a time. It would also be possible to parallelise the process using, for example, mclapply or future_map.
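The parallelisation mentioned above could look something like the following sketch. It does not call harpPoint at all: the data are synthetic, and verify_one is a hypothetical stand-in for a per-lead-time verification step that only computes a mean bias.

```r
library(parallel)

# Synthetic stand-in for joined forecast / observation data (not harp output).
set.seed(1)
toy <- data.frame(
  leadtime = rep(seq(0, 12, 3), each = 4),
  fcst     = rnorm(20),
  obs      = rnorm(20)
)

# Hypothetical per-lead-time step: here it is just the mean bias.
verify_one <- function(lt, data) {
  d <- data[data$leadtime == lt, ]
  data.frame(leadtime = lt, bias = mean(d$fcst - d$obs))
}

lead_times <- unique(toy$leadtime)

# mclapply relies on forking, which is not available on Windows,
# so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

scores <- do.call(
  rbind,
  mclapply(lead_times, verify_one, data = toy, mc.cores = n_cores)
)
```

Each lead time is processed independently, so only one lead time's worth of data needs to be held by each worker at a time.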

Usage

ens_read_and_verify(
  start_date,
  end_date,
  parameter,
  fcst_model,
  fcst_path,
  obs_path,
  lead_time = seq(0, 48, 3),
  num_iterations = length(lead_time),
  verify_members = TRUE,
  thresholds = NULL,
  members = NULL,
  vertical_coordinate = c(NA_character_, "pressure", "model", "height"),
  fctable_file_template = "fctable_eps",
  obsfile_template = "obstable",
  groupings = "leadtime",
  by = "6h",
  lags = "0s",
  merge_lags_on_read = TRUE,
  lag_fcst_models = NULL,
  parent_cycles = NULL,
  lag_direction = 1,
  fcst_shifts = NULL,
  keep_unshifted = FALSE,
  drop_neg_leadtimes = TRUE,
  climatology = "sample",
  stations = NULL,
  scale_fcst = NULL,
  scale_obs = NULL,
  spread_drop_member = NULL,
  jitter_fcst = NULL,
  common_cases_only = TRUE,
  common_cases_xtra_cols = NULL,
  check_obs_fcst = TRUE,
  gross_error_check = TRUE,
  min_allowed = NULL,
  max_allowed = NULL,
  num_sd_allowed = NULL,
  show_progress = FALSE,
  verif_path = NULL
)
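A minimal call might be assembled as in the sketch below. The model name, the paths, the parameter name "T2m" and the thresholds are all placeholders chosen for illustration and should be replaced with values that match your own FCTABLE and OBSTABLE files. The call itself is guarded so that the snippet does nothing when harpPoint or the data are not available.

```r
# Placeholder arguments; every value here is an assumption for illustration.
args <- list(
  start_date = 2019021700,
  end_date   = 2019021718,
  parameter  = "T2m",                 # assumed parameter name
  fcst_model = "model_a",             # hypothetical model name
  fcst_path  = "/path/to/FCTABLE",    # hypothetical path
  obs_path   = "/path/to/OBSTABLE",   # hypothetical path
  lead_time  = seq(0, 48, 3),
  by         = "6h",
  thresholds = c(273.15, 283.15)      # assumed thresholds
)

# Only attempt the call when the package and the data directory exist.
if (requireNamespace("harpPoint", quietly = TRUE) && dir.exists(args$fcst_path)) {
  verif <- do.call(harpPoint::ens_read_and_verify, args)
}
```

Arguments that are not listed in args keep the defaults shown in the usage above.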

Arguments

start_date

Start date for the verification. Should be numeric or character, in the form YYYYMMDD(HH)(mm).

end_date

End date for the verification. Should be numeric or character.

parameter

The parameter to verify.

fcst_model

The forecast model(s) to verify. Can be a single string or a character vector of model names.

fcst_path

The path to the forecast FCTABLE files.

obs_path

The path to the observation OBSTABLE files.

lead_time

The lead times to verify.

num_iterations

The number of iterations per verification calculation. The default is to do the same number of iterations as there are lead times. If a small number of iterations is set, it may be useful to set show_progress = TRUE. The higher the number of iterations, the smaller the amount of data that is held in memory at any one time.
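To make the trade-off concrete, the sketch below shows one plausible way of dividing lead times into num_iterations groups. This is an assumption about the general idea, not a copy of how harpPoint splits the work internally.

```r
lead_time      <- seq(0, 48, 3)   # 17 lead times, as in the default
num_iterations <- 4

# Assign each lead time to one of num_iterations roughly equal chunks.
chunk_id <- ceiling(seq_along(lead_time) * num_iterations / length(lead_time))
lead_time_chunks <- split(lead_time, chunk_id)

# Each iteration would then read and verify only its own lead times.
for (lt in lead_time_chunks) {
  message("Iteration covers lead times: ", paste(lt, collapse = ", "))
}
```

With num_iterations = length(lead_time), every chunk holds a single lead time, which keeps the least data in memory; with num_iterations = 1, everything is processed in one pass.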

verify_members

Whether to verify the individual members of the ensemble. Even if thresholds are supplied, only summary scores are computed. If you wish to compute categorical scores, the separate det_verify function must be used.

thresholds

The thresholds to compute categorical scores for.

members

The members to retrieve if reading an EPS forecast. To select the same members for all forecast models, this should be a numeric vector. To select specific members from specific models, this should be a named list in which each element has the name of a forecast model and contains a numeric vector, e.g.
members = list(eps_model1 = seq(0, 3), eps_model2 = c(2, 3)).
For multi model ensembles, each element of this named list should contain another named list with sub model name followed by the desired members, e.g.
members = list(eps_model1 = list(sub_model1 = seq(0, 3), sub_model2 = c(2, 3)))

fctable_file_template

The template for the file names of the files to be read from. This would normally be one of the "fctable_*" templates that can be seen in show_file_templates. Can be a single string, a character vector or list of the same length as fcst_model. If not named, the order of templates is assumed to be the same as in fcst_model. If named, the names must match the entries in fcst_model.
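The naming rule can be illustrated with a small base R sketch. resolve_templates is a hypothetical helper written for this example and is not part of harpPoint; recycling a single string to all models is an assumption based on the description above. Template names other than "fctable_eps" are placeholders to check against show_file_templates.

```r
# Hypothetical helper: pair each forecast model with a file template.
resolve_templates <- function(fcst_model, templates) {
  # A single unnamed template is assumed to apply to every model.
  if (length(templates) == 1 && is.null(names(templates))) {
    templates <- rep(templates, length(fcst_model))
  }
  stopifnot(length(templates) == length(fcst_model))
  if (is.null(names(templates))) {
    # Unnamed: templates are taken in the same order as fcst_model.
    names(templates) <- fcst_model
  } else {
    # Named: names must match fcst_model; reorder to match.
    stopifnot(setequal(names(templates), fcst_model))
    templates <- templates[fcst_model]
  }
  templates
}

models <- c("model_a", "model_b")   # hypothetical model names

by_position <- resolve_templates(models, c("fctable_eps", "template_b"))
by_name     <- resolve_templates(
  models,
  c(model_b = "template_b", model_a = "fctable_eps")
)
```

Both calls pair model_a with "fctable_eps" and model_b with "template_b"; naming the templates removes the dependence on ordering.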

obsfile_template

The template for OBSTABLE files - the default is "obstable", which is OBSTABLE_{YYYY}.sqlite.

groupings

The groups to verify for. The default is "leadtime". Another common grouping might be groupings = c("leadtime", "fcst_cycle").

by

The frequency of forecast cycles to verify.
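As an illustration of how start_date, end_date and by combine, the sketch below lists the forecast cycles implied by a 6-hour frequency using base R only. Translating "6h" into the base R interval "6 hours" is an assumption made for this example; harpPoint does its own parsing of by.

```r
start_date <- "2019021700"
end_date   <- "2019021718"

# Parse the YYYYMMDDHH strings as UTC date-times.
start <- as.POSIXct(start_date, format = "%Y%m%d%H", tz = "UTC")
end   <- as.POSIXct(end_date,   format = "%Y%m%d%H", tz = "UTC")

# by = "6h" is taken here to mean one forecast cycle every 6 hours.
cycles <- format(seq(start, end, by = "6 hours"), "%Y%m%d%H")
print(cycles)
```

For this date range the sequence contains the 00, 06, 12 and 18 UTC cycles of 17 February 2019.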

merge_lags_on_read

climatology

The climatology to use for the Brier Skill Score. Can be "sample" for the sample climatology (the default), a named list with elements eps_model and member to use a member of an eps model in the harp_fcst object for the climatology, or a data frame with columns for threshold and climatology and also optionally leadtime.
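The three accepted forms could be constructed as in the sketch below. The model name, the member label and the climatological probabilities are invented placeholders; in particular, the exact form expected for member is an assumption to check against your own data.

```r
# 1. Sample climatology (the default).
clim_sample <- "sample"

# 2. A member of an EPS model; both values are hypothetical placeholders.
clim_member <- list(eps_model = "model_a", member = "mbr000")

# 3. A data frame of climatological probabilities per threshold.
#    The probabilities are made up for illustration.
clim_df <- data.frame(
  threshold   = c(273.15, 283.15),
  climatology = c(0.40, 0.10)
)

# Optionally vary the climatology with lead time.
clim_df_lt <- expand.grid(
  threshold = c(273.15, 283.15),
  leadtime  = c(0, 24, 48)
)
clim_df_lt$climatology <- ifelse(clim_df_lt$threshold == 273.15, 0.40, 0.10)
```

Any one of these objects would be passed as the climatology argument.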

stations

The stations to verify for. The default is to use all stations from station_list that are common to all fcst_model domains.

spread_drop_member

Which members to drop for the calculation of the ensemble variance and standard deviation. For harp_fcst objects, this can be a numeric scalar, in which case it is recycled for all forecast models; a list or numeric vector of the same length as the harp_fcst object; or a named list with the names corresponding to names in the harp_fcst object.

jitter_fcst

A function to perturb the forecast values by. This is used to account for observation error in the rank histogram. For other statistics it is likely to make little difference since it is expected that the observations will have a mean error of zero.
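A jitter function might look like the sketch below, which adds zero-mean Gaussian noise. It assumes that the function receives a numeric vector of forecast values and must return a vector of the same length; the standard deviation of 0.5 is an arbitrary placeholder for the observation error.

```r
set.seed(42)  # for a reproducible illustration only

# Perturb forecast values with zero-mean noise representing observation error.
jitter_normal <- function(x) {
  x + rnorm(length(x), mean = 0, sd = 0.5)
}

fcst     <- c(271.2, 272.8, 270.5)   # made-up forecast values
jittered <- jitter_normal(fcst)
```

Because the noise has zero mean, averages of the forecast are left almost unchanged while ties between forecast and observation in the rank histogram are broken.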

gross_error_check

Logical. Whether to perform a gross error check.

min_allowed

The minimum value of observation to allow in the gross error check. If set to NULL the default value for the parameter is used.

max_allowed

The maximum value of observation to allow in the gross error check. If set to NULL the default value for the parameter is used.

num_sd_allowed

The number of standard deviations of the forecast that the observations should be within. Set to NULL for an automatic value depending on the parameter.
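The idea behind min_allowed, max_allowed and num_sd_allowed can be sketched in base R as follows. This is an illustrative interpretation on made-up data, not harpPoint's implementation: the limits, the column names and the way the forecast standard deviation is computed are all assumptions.

```r
# Made-up observations and ensemble-mean forecasts (Kelvin).
toy <- data.frame(
  obs       = c(271.3, 274.8, 350.0, 269.9, 150.0),
  fcst_mean = c(272.0, 274.1, 273.5, 270.4, 271.0)
)

min_allowed    <- 223   # assumed lower limit
max_allowed    <- 333   # assumed upper limit
num_sd_allowed <- 6     # assumed number of standard deviations

# Range check on the observations.
in_range <- toy$obs >= min_allowed & toy$obs <= max_allowed

# Observations must also lie within num_sd_allowed forecast standard
# deviations of the forecast.
fcst_sd   <- sd(toy$fcst_mean)
near_fcst <- abs(toy$obs - toy$fcst_mean) <= num_sd_allowed * fcst_sd

checked <- toy[in_range & near_fcst, ]
```

In this toy data the observations of 350.0 and 150.0 fail both tests and are dropped, leaving three rows.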

show_progress

Logical - whether to show a progress bar. Defaults to FALSE.

verif_path

If set, verification files will be saved to this path.

Value

A list containing two data frames: ens_summary_scores and ens_threshold_scores.


andrew-MET/harpPoint documentation built on Feb. 23, 2023, 1:06 a.m.