predx_classes_v2.md
In cdcepi/predx: Tools for working with predictions

Continuous

Numerical predictions of continuous real numbers. For all continuous predictions, the minimum and maximum are defined by lower and upper (defaults: lower = -Inf, upper = Inf).

Point

A numeric point prediction.

CSV column name: point

Validity:
- Numeric
- Not NA
- lower <= point <= upper

Samples

Numeric samples for continuous outcomes between (and including) lower and upper.

CSV column name: sample

Validity:
- Numeric
- No NAs
- lower <= sample <= upper
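The validity rules for continuous point and sample values can be sketched in base R. The helper `check_continuous_values` is hypothetical and written only for illustration; it is not part of predx.

```r
# Hypothetical helper (not part of predx): checks the validity rules
# listed above for continuous point or sample values.
check_continuous_values <- function(x, lower = -Inf, upper = Inf) {
  is.numeric(x) &&      # values must be numeric
    !anyNA(x) &&        # no missing values
    all(x >= lower) &&  # every value at or above lower
    all(x <= upper)     # every value at or below upper
}

check_continuous_values(c(0.2, 0.5, 1.0), lower = 0, upper = 1)  # TRUE
check_continuous_values(c(0.2, 1.5), lower = 0, upper = 1)       # FALSE
```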

Binned

Predictions specified as a set of probabilities corresponding to a discrete set of bins across a range of possible numeric outcomes defined by lower and upper. The bins may be specified by a bin interval (which generates equally sized bins) or by a vector of the lower bounds of each bin. In both cases the lower bound of each bin is inclusive and the upper bound is exclusive, except for the final bin, which ends at upper and includes it. That is, for an observed value x, each bin holds the probability that bin_lwr <= x < bin_upr, except for the final bin, which holds the probability that bin_lwr <= x <= upper.
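The bin membership rule can be illustrated with base R's `findInterval()`, which uses the same convention when `rightmost.closed = TRUE`. The bin edges and observations below are made-up values for illustration only.

```r
# Bin edges for a range [0, 100] split into four equal bins.
breaks <- seq(0, 100, by = 25)

# findInterval() treats each bin as [lwr, upr); rightmost.closed = TRUE
# makes the final bin include its upper edge, i.e. [75, 100].
x <- c(0, 24.9, 25, 99, 100)
bin_index <- findInterval(x, breaks, rightmost.closed = TRUE)
bin_index  # 1 1 2 4 4
```

The value 25 falls in the second bin because lower bounds are inclusive, while 100 falls in the fourth bin because the final bin includes upper.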

Interval-defined binned predictions

Bins are defined by bounds and intervals. For example, Continous(prob = probs, type='Bin', lower = 0, upper = 100, interval = 1) requires 100 probabilities (probs), one for each bin: probs[1] covers 0 <= x < 1, probs[2] covers 1 <= x < 2, ..., probs[99] covers 98 <= x < 99, and probs[100] covers 99 <= x <= 100.

Interval-defined binned predictions are represented internally as a list of:
- lower: the lower bound of the range of possible predictions
- upper: the upper bound of the range of possible predictions
- interval: the span of each bin
- prob: the ordered probabilities assigned to each bin

Validity:
- All inputs are numeric
- No NAs
- lower != -Inf and upper != Inf
- A probability is specified for each bin defined by lower, upper, and interval
- The sum of prob is 1.0
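As a minimal sketch, the validity rules for an interval-defined binned prediction could be checked as follows. The helper `check_interval_bins` and the example values are hypothetical and not part of predx; the list layout follows the internal representation described above.

```r
# Hypothetical helper (not part of predx): validates an interval-defined
# binned prediction stored as list(lower, upper, interval, prob).
check_interval_bins <- function(pred) {
  vals <- unlist(pred)
  if (!is.numeric(vals) || anyNA(vals)) return(FALSE)
  # lower and upper must both be finite
  if (!is.finite(pred$lower) || !is.finite(pred$upper)) return(FALSE)
  # the interval must be positive and fit inside the range
  if (pred$interval <= 0 || pred$interval > pred$upper - pred$lower) {
    return(FALSE)
  }
  # Lower bounds of the bins implied by lower, upper, and interval
  lwr <- seq(pred$lower, pred$upper - pred$interval, by = pred$interval)
  length(pred$prob) == length(lwr) &&
    isTRUE(all.equal(sum(pred$prob), 1))
}

pred <- list(lower = 0, upper = 10, interval = 2,
             prob = c(0.1, 0.2, 0.4, 0.2, 0.1))
check_interval_bins(pred)  # TRUE: five bins, probabilities sum to 1
```

`all.equal()` is used instead of `==` so that floating-point rounding in the sum does not cause a spurious failure.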

Bin-defined binned predictions

Bins are defined explicitly by their lower bounds. For example, Continous(prob = probs, type='Bin', lwr = lwr_bounds) defines the bins by their lower bounds (lwr) and accepts an equal number of probabilities (probs), which are associated in order with those bins.

Bin-defined binned predictions are represented internally as a data.frame with two columns:
- lwr: inclusive numeric lower bounds for sequential bins (equal intervals)
- prob: probabilities assigned to each bin

Validity:
- All inputs are numeric
- No NAs
- lower != -Inf
- lwr[1] == lower
- max(lwr) < upper
- A probability is specified for each bin (length(prob) == length(lwr))
- The sum of prob is 1.0
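The same kind of check can be sketched for bin-defined predictions. The helper `check_lwr_bins` and the example values are hypothetical and not part of predx; storing the bins in a data.frame guarantees that lwr and prob have equal length.

```r
# Hypothetical helper (not part of predx): validates a bin-defined
# binned prediction stored as a data.frame with columns lwr and prob.
check_lwr_bins <- function(bins, lower, upper = Inf) {
  is.numeric(bins$lwr) && is.numeric(bins$prob) &&
    !anyNA(bins$lwr) && !anyNA(bins$prob) &&
    is.finite(lower) &&                   # lower != -Inf
    bins$lwr[1] == lower &&               # first bin starts at lower
    max(bins$lwr) < upper &&              # last bin starts below upper
    isTRUE(all.equal(sum(bins$prob), 1))  # probabilities sum to 1
}

bins <- data.frame(lwr = c(0, 1, 5, 10), prob = c(0.25, 0.25, 0.3, 0.2))
check_lwr_bins(bins, lower = 0, upper = 20)  # TRUE
check_lwr_bins(bins, lower = 1, upper = 20)  # FALSE: lwr[1] != lower
```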

Parametric

Predictions characterized by parametric distributions defined according to base R. Distribution truncation has not been configured, so lower and upper should not be specified; they default to the support of the respective distribution.

Parametric predictions are represented internally as a data.frame with two columns:
- parameter_name: the parameter name (from the set described below)
- parameter_value: the corresponding numeric parameter value

The following distributions and parameters are currently supported:
- Normal: mean, sd (Support: lower = -Inf, upper = Inf)
- Log-normal: meanlog, sdlog (Support: lower = 0, upper = Inf)
- Gamma: shape, rate (or shape, scale) (Support: lower = 0, upper = Inf)
- Beta: shape1, shape2 (Support: lower = 0, upper = 1)

Validity:
- The supplied parameter names (parameter_name) must exactly match those of the specified parametric distribution
- The parameter values (parameter_value) must be numeric and not include NA
- The parameter values (parameter_value) must be appropriate for the specified parametric distribution (e.g. shape > 0)
- lower and upper must be equivalent to those of the specified parametric distribution
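The parameter-name and NA rules can be sketched in base R as below. The lookup table `supported_params`, the helper `check_parametric`, and the lower-case distribution keys are all hypothetical and not part of predx; the sketch does not check distribution-specific value constraints such as shape > 0.

```r
# Parameter names taken from the list of supported distributions above.
# The structure and helper below are hypothetical, not part of predx.
supported_params <- list(
  normal    = list(c("mean", "sd")),
  lognormal = list(c("meanlog", "sdlog")),
  gamma     = list(c("shape", "rate"), c("shape", "scale")),
  beta      = list(c("shape1", "shape2"))
)

check_parametric <- function(dist, params) {
  allowed <- supported_params[[dist]]
  if (is.null(allowed)) return(FALSE)  # unsupported distribution
  # Names must exactly match one allowed parameter set
  names_ok <- any(vapply(
    allowed,
    function(p) {
      length(p) == nrow(params) && setequal(p, params$parameter_name)
    },
    logical(1)
  ))
  names_ok &&
    is.numeric(params$parameter_value) &&
    !anyNA(params$parameter_value)
}

gamma_params <- data.frame(
  parameter_name = c("shape", "rate"),
  parameter_value = c(2, 0.5),
  stringsAsFactors = FALSE
)
check_parametric("gamma", gamma_params)   # TRUE
check_parametric("normal", gamma_params)  # FALSE: names do not match
```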

Discrete

Quantitative discrete numeric forecasts.

Point

A numeric point prediction.

CSV column name: point

Validity:
- Numeric
- Not NA

Categorical

Point

A prediction of the most likely categorical outcome.

CSV column name: point

Validity:
- String
- Not NA

Binary

Point

A prediction of the most likely categorical outcome.

CSV column name: point

Validity:
- Numeric
- Not NA
- 0 <= value <= 1

Dates & Times

All dates are formatted in the ISO standard format: YYYY-MM-DD. Forecasts may be specific to a time period, such as a week, month, or year. These periods should be defined consistently in the context of the forecast, as they are not defined explicitly in the predx object.

Times are formatted in the ISO standard 24-hour format: YYYY-MM-DDTHH:MM:SS+HH:MM, where the final HH:MM is the time zone offset from Coordinated Universal Time (UTC). If the final :MM in the time zone is :00, it may be dropped. Examples:
- 2020-12-18T13:20:37+00:00 is 13:20:37 (1:20 PM and 37 seconds) on 18 December 2020 in UTC (Greenwich Mean Time)
- 2025-01-03T02:00:00-05:00 or 2025-01-03T02:00:00-05 is 2:00 (2:00 AM) on 3 January 2025 in UTC-05 (Eastern Standard Time)
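One way to read these time strings in base R is sketched below. The helper `parse_iso_time` is hypothetical and not part of predx. It assumes that `strptime()`'s `%z` specifier expects the offset as `+hhmm`, so the offset is normalized before parsing.

```r
# Hypothetical helper (not part of predx): parses the ISO time format
# described above into a POSIXct value.
parse_iso_time <- function(x) {
  # Expand an hours-only offset such as "-05" to "-05:00"
  x <- sub("([+-][0-9]{2})$", "\\1:00", x)
  # %z expects the offset as "+hhmm", so drop the colon
  x <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", x)
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")
}

t1 <- parse_iso_time("2025-01-03T02:00:00-05:00")
t2 <- parse_iso_time("2025-01-03T02:00:00-05")
format(t1, "%Y-%m-%d %H:%M:%S", tz = "UTC")  # "2025-01-03 07:00:00"
t1 == t2                                      # TRUE

# Plain ISO dates can be read directly
d <- as.Date("2020-12-18")
```

Both spellings of the offset resolve to the same instant, which is 07:00:00 in UTC.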

Point

A prediction of the most likely date.

CSV column name: point

Validity:
- ISO date or time (described above)
- Not NA

Time

All dates are formatted in

Point

A prediction of the most likely categorical outcome.

CSV column name: point

Validity:
- Numeric
- Not NA
- 0 <= value <= 1

Point predictions

PointCat

A character string point prediction, e.g. associated with SampleCat or BinCat.

CSV column name: point

Validity:
- Not NA

Binary: prob

A numeric probability.

CSV column name: prob

Validity:
- Not NA
- 0 <= prob <= 1

Empirical distributions

BinLwr

Binned distribution defined by inclusive lower bounds for each bin.

A data.frame object with two columns:
- lwr: inclusive numeric lower bounds for sequential bins (equal intervals)
- prob: probabilities assigned to each bin

CSV column names: lwr, prob

Validity:
- No NAs in lwr or prob
- Probabilities are positive
- Probabilities sum to 1.0
- Bins are in ascending order
- Bin sizes are uniform
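A minimal sketch of these checks in base R follows. The helper `check_binlwr` is hypothetical and not part of predx. It reads "positive" as non-negative so that zero-probability bins pass, which is an assumption rather than something the rules above state.

```r
# Hypothetical helper (not part of predx): checks the BinLwr validity rules.
check_binlwr <- function(bins) {
  steps <- diff(bins$lwr)  # width of each bin except the last
  !anyNA(bins$lwr) && !anyNA(bins$prob) &&
    all(bins$prob >= 0) &&  # assumption: zero-probability bins allowed
    isTRUE(all.equal(sum(bins$prob), 1)) &&
    all(steps > 0) &&       # bins in ascending order
    isTRUE(all.equal(steps, rep(steps[1], length(steps))))  # uniform size
}

even <- data.frame(lwr = c(0, 0.5, 1, 1.5), prob = c(0.1, 0.4, 0.4, 0.1))
uneven <- data.frame(lwr = c(0, 1, 5, 10), prob = c(0.1, 0.4, 0.4, 0.1))
check_binlwr(even)    # TRUE
check_binlwr(uneven)  # FALSE: bin sizes differ
```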

BinCat

Binned distribution with a category for each bin.

A data.frame object with two columns:
- cat: character strings representing each possible outcome category
- prob: probabilities assigned to each bin

CSV column names: cat, prob

Validity:
- No NAs in cat or prob
- Probabilities are positive
- Probabilities sum to 1.0

Sample

Numeric samples.

CSV column name: sample

Validity:
- No NAs

SampleCat

Character string samples.

CSV column name: sample

Validity:
- No NAs

cdcepi/predx documentation built on Dec. 29, 2019, 4:58 p.m.

Continuous

Point

Samples

Binned

Parametric

Discrete

Point

Categorical

Point

Binary

Point

Dates & Times

Point

Time

Point

Point predictions

PointCat

Continuous distributions

Normal: mean, sd

Log-normal: mean, sd

Gamma: shape, rate

Beta: a, b

Discrete distributions

Binary: prob

Binomial: p, n

Poisson: mean

Negative-Binomial: r, p

Negative-Binomial2: mean, dispersion

Empirical distributions

BinLwr

BinCat

Sample

SampleCat
