CalcStatCategorical: Calculate some categorical verification measures for...


View source: R/calcStatCategorical.R

Description

CalcStatCategorical takes a data.table or data.frame with two columns, observation and model/forecast, and computes categorical verification measures for either categorical or continuous variables.

Usage

CalcStatCategorical(DT, obsCol, modCol, obsMissing = NULL,
  modMissing = NULL, threshold = NULL, category = c("YES", "NO"),
  groupBy = NULL, obsCondRange = c(-Inf, Inf),
  modCondRange = c(-Inf, Inf), statList = c("H", "FAR", "CSI"))

Arguments

DT

A data.table or data.frame containing two columns: the observation (truth) and the model/forecast.

obsCol

Character: name of the observation column.

modCol

Character: name of the model/forecast column.

obsMissing

Numeric/Character vector: all the values that denote missing data in the observation.

modMissing

Numeric/Character vector: all the values that denote missing data in the model/forecast.

threshold

Numeric vector: define it if you have numeric variables and want to calculate the categorical statistics for different cutoff/threshold values.

category

Vector with two elements. At this time only a 2 by 2 contingency table is supported. It should be defined if the variable is actually categorical and threshold is NULL.

groupBy

Character vector: names of the columns in DT by which the statistics should be grouped.

obsCondRange

Numeric vector of two elements (DEFAULT = c(-Inf, Inf)): the lower and upper bounds applied to the observation when calculating conditional statistics. To condition on one tail only, leave the other bound as -Inf or Inf. For example, if interested only in values greater than 2, use obsCondRange = c(2, Inf).

modCondRange

Numeric vector of two elements (DEFAULT = c(-Inf, Inf)): the lower and upper bounds applied to the model/forecast when calculating conditional statistics. To condition on one tail only, leave the other bound as -Inf or Inf. For example, if interested only in values greater than 2, use modCondRange = c(2, Inf).

statList

Character vector: list of all the statistics you are interested in.
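
To make the threshold, obsCondRange and modCondRange arguments concrete, here is a minimal base R sketch of one way to condition on a range and then dichotomize continuous values at a threshold. It is not the package's implementation: the helper names inRange and toCategory, the inclusive range bounds, and the strict ">" used for exceedance are all assumptions made for illustration.

```r
# Hypothetical helpers for illustration; not part of rwrfhydro.

# TRUE where x lies inside [range[1], range[2]]
# (inclusive bounds are an assumption).
inRange <- function(x, range = c(-Inf, Inf)) {
  x >= range[1] & x <= range[2]
}

# Map numeric values to "YES"/"NO" by exceedance of a threshold
# (strict ">" is an assumption; the package may use ">=").
toCategory <- function(x, threshold) {
  ifelse(x > threshold, "YES", "NO")
}

df <- data.frame(obs = c(1, 3, 5, 7, 9), mod = c(2, 2, 6, 8, 4))

# Keep only pairs whose observation falls in c(2, Inf);
# the model/forecast is left unconditioned with c(-Inf, Inf).
keep <- inRange(df$obs, c(2, Inf)) & inRange(df$mod, c(-Inf, Inf))
sub <- df[keep, ]

# Dichotomize both columns at a threshold of 4.
obsCat <- toCategory(sub$obs, 4)
modCat <- toCategory(sub$mod, 4)
```

With several threshold values, the same dichotomization would simply be repeated once per threshold, giving one 2 by 2 table per cutoff.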

Details

The calculated statistics are the following:

For more information, refer to Forecast Verification: A Practitioner's Guide in Atmospheric Science, Jolliffe and Stephenson, 2012.
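
As a reference point for the default statList, the sketch below computes H, FAR and CSI from a 2 by 2 contingency table in base R using the standard textbook definitions. It assumes that H denotes the hit rate (probability of detection), FAR the false alarm ratio and CSI the critical success index, and that the first element of category is the event of interest. The function name contingencyStats is hypothetical, and this is not the package's code.

```r
# Hypothetical helper for illustration; not part of rwrfhydro.
contingencyStats <- function(obs, mod, category = c("YES", "NO")) {
  event <- category[1]  # assumed: the first element is the "event"
  hits        <- sum(mod == event & obs == event)
  falseAlarms <- sum(mod == event & obs != event)
  misses      <- sum(mod != event & obs == event)
  c(H   = hits / (hits + misses),               # hit rate
    FAR = falseAlarms / (hits + falseAlarms),   # false alarm ratio
    CSI = hits / (hits + falseAlarms + misses)) # critical success index
}

# Same toy data as the first categorical example on this page.
obs <- c(rep("YES", 25), rep("NO", 25))
mod <- rep(c("YES", "NO"), 25)
res <- contingencyStats(obs, mod)
# Here hits = 13, false alarms = 12, misses = 12,
# so H = 13/25, FAR = 12/25 and CSI = 13/37.
```

A denominator of zero (for example, no forecast events when computing FAR) yields NaN in this sketch; how the package handles that case is not stated here.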

Value

data.frame containing all the requested statistics in statList

See Also

Other modelEvaluation: CalcModPerfMulti, CalcModPerf, CalcNoahmpFluxes, CalcNoahmpWatBudg, CalcStatCont

Examples

## Not run: 

# for categorical data
ExampleDF <- data.frame(obs = c(rep("YES", 25), rep("NO", 25)),
                        mod = rep(c("YES", "NO"), 25))
stat <- CalcStatCategorical(DT = ExampleDF, obsCol = "obs",
                            modCol = "mod", category = c("YES", "NO"))

# for categorical data with more than one experiment
ExampleDF <- data.frame(obs = c(rep("YES", 25), rep("NO", 25)),
                        mod = rep(c("YES", "NO"), 25),
                        Experiment = c(rep(c("1", "2", "3"), 16), "1", "2"))
stat <- CalcStatCategorical(DT = ExampleDF, obsCol = "obs", modCol = "mod",
                            category = c("YES", "NO"), groupBy = "Experiment")

# for continuous data with different threshold values
ExampleDF <- data.frame(obs = rnorm(10000, 100, 10), mod = rnorm(10000, 100, 10))
stat <- CalcStatCategorical(DT = ExampleDF, obsCol = "obs", modCol = "mod",
                            threshold = c(60, 70, 80, 90, 100, 110, 120, 130, 140))

# for continuous data with different threshold values and more than one experiment
ExampleDF <- data.frame(obs = rnorm(10000, 100, 10), mod = rnorm(10000, 100, 10),
                        Experiment = rep(c("Model1", "Model2"), 5000))
stat <- CalcStatCategorical(DT = ExampleDF, obsCol = "obs", modCol = "mod",
                            threshold = c(60, 70, 80, 90, 100, 110, 120, 130, 140),
                            groupBy = "Experiment")

## End(Not run)

mccreigh/rwrfhydro documentation built on May 12, 2018, 3:08 a.m.