
View source: R/calcStatCategorical.R

`CalcStatCategorical`

takes a data.table or data.frame with two columns, observation and
model/forecast, and computes a set of categorical verification measures
for either categorical or continuous variables.


| Argument | Description |
|---|---|
| `DT` | A data.table or data.frame containing two columns: the observation (truth) and the model/forecast. |
| `obsCol` | Character: name of the observation column. |
| `modCol` | Character: name of the model/forecast column. |
| `obsMissing` | Numeric/Character vector: all values that denote missing data in the observation. |
| `modMissing` | Numeric/Character vector: all values that denote missing data in the model/forecast. |
| `threshold` | Numeric vector: define it if the variables are numeric and you want the categorical statistics calculated for different cutoff/threshold values. |
| `category` | Vector with two elements. At this time only a 2 by 2 contingency table is supported. Define it if the variable is actually categorical and `threshold` is NULL. |
| `groupBy` | Character vector: Name of all the columns in |
| `obsCondRange` | Numeric vector with two elements (DEFAULT = c(-Inf, Inf)). The values are the lower and upper bounds applied to the observation when calculating conditional statistics. To condition on one tail only, leave the other bound as -Inf or Inf. For example, to use only values greater than 2, set obsCondRange = c(2, Inf). |
| `modCondRange` | Numeric vector with two elements (DEFAULT = c(-Inf, Inf)). The values are the lower and upper bounds applied to the model/forecast when calculating conditional statistics. To condition on one tail only, leave the other bound as -Inf or Inf. For example, to use only values greater than 2, set modCondRange = c(2, Inf). |
| `statList` | Character vector: all the statistics you are interested in. |
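The roles of `threshold`, `obsCondRange`, and `modCondRange` can be pictured with a short base R sketch. It is not the package's implementation. It assumes that an event means a value strictly greater than the cutoff and that the conditioning ranges are inclusive. The function's own conventions may differ, and the helper name `count_by_threshold` is hypothetical.

```r
# Hypothetical helper, for illustration only (not part of the package).
# Assumptions: an event is a value strictly greater than the cutoff, and
# the conditioning ranges keep pairs inside the closed interval.
count_by_threshold <- function(obs, mod, threshold,
                               obsCondRange = c(-Inf, Inf),
                               modCondRange = c(-Inf, Inf)) {
  # Keep only the pairs that satisfy both conditioning ranges.
  keep <- obs >= obsCondRange[1] & obs <= obsCondRange[2] &
          mod >= modCondRange[1] & mod <= modCondRange[2]
  obs <- obs[keep]
  mod <- mod[keep]

  # Build one row of contingency counts per cutoff value.
  rows <- lapply(threshold, function(th) {
    obsYes <- obs > th
    modYes <- mod > th
    data.frame(threshold = th,
               a = sum(obsYes & modYes),    # hits
               b = sum(!obsYes & modYes),   # false alarms
               c = sum(obsYes & !modYes),   # misses
               d = sum(!obsYes & !modYes))  # correct rejections
  })
  do.call(rbind, rows)
}

obs <- c(1, 3, 5, 7, 9)
mod <- c(2, 2, 6, 6, 10)

# Counts for two cutoffs, using all pairs.
counts <- count_by_threshold(obs, mod, threshold = c(2, 6))

# Same cutoffs, conditioned on observations of at least 2.
countsCond <- count_by_threshold(obs, mod, threshold = c(2, 6),
                                 obsCondRange = c(2, Inf))
```

Each row of the result is one 2 by 2 contingency table, so adding cutoffs to `threshold` adds rows rather than changing the shape of a single table.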

The calculated statistics are the following:

a : Hits in contingency table (both observation and forecast say YES)

b : False alarm in contingency table (observation says NO while forecast says YES)

c : Misses in contingency table (observation says YES while forecast says NO)

d : Correct rejection in contingency table (both observation and forecast say NO)

n : Total number of pairs = a+b+c+d

s : Base rate = (a+c)/n

r : Forecast rate = (a+b)/n,

B : Frequency bias = (a+b)/(a+c)

H : Hit rate = a/(a+c),

F : False alarm rate = b/(b+d),

FAR : False alarm ratio = b/(a+b),

PC : Proportion Correct = (a+d)/n,

CSI : Critical Success Index = a/(a+b+c),

GSS : Gilbert Skill Score = (a-ar)/(a+b+c-ar), where ar = (a+b)(a+c) /n is the expected a for a random forecast with the same r and s

HSS : Heidke Skill Score = (a+d-ar-dr)/(n-ar-dr), where dr = (b+d)(c+d)/n

PSS : Peirce Skill Score = (a*d-b*c)/((b+d)*(a+c)),

CSS : Clayton Skill Score = a/(a+b)-c/(c+d),

DSS : Doolittle Skill Score = (a*d-b*c)/sqrt((a+b)*(c+d)*(a+c)*(b+d)),

LOR : Log of Odds Ratio = log(a*d/(b*c)),

ORSS : Odds Ratio Skill Score = (a*d-b*c)/(a*d+b*c),

EDS : Extreme Dependency Score = 2*log((a+c)/n)/log(a/n),

SEDS : Symmetric Extreme Dependency Score = log(ar/a)/log(a/n),

SEDI : Symmetric Extremal Dependence Index = (log(b/(b+d))-log(a/(a+c))+log(1-a/(a+c))-log(1-b/(b+d)))/(log(b/(b+d))+log(a/(a+c))+log(1-a/(a+c))+log(1-b/(b+d)))
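As a rough check on the definitions above, the following base R sketch computes several of the listed scores directly from the four contingency counts. The function name `scores_from_counts` is hypothetical and this is not the package's code. It only transcribes formulas from this list. The counts come from the first categorical example on this page (`obs` is 25 YES followed by 25 NO, `mod` alternates YES and NO).

```r
# Hypothetical helper, for illustration only: transcribes some of the
# formulas listed above for a 2 by 2 contingency table.
scores_from_counts <- function(a, b, c, d) {
  n  <- a + b + c + d
  ar <- (a + b) * (a + c) / n   # expected hits for a random forecast
  dr <- (b + d) * (c + d) / n   # expected correct rejections, likewise
  Hr <- a / (a + c)             # hit rate
  Fr <- b / (b + d)             # false alarm rate
  list(
    n    = n,
    s    = (a + c) / n,
    r    = (a + b) / n,
    B    = (a + b) / (a + c),
    H    = Hr,
    F    = Fr,
    FAR  = b / (a + b),
    PC   = (a + d) / n,
    CSI  = a / (a + b + c),
    GSS  = (a - ar) / (a + b + c - ar),
    HSS  = (a + d - ar - dr) / (n - ar - dr),
    PSS  = (a * d - b * c) / ((b + d) * (a + c)),
    CSS  = a / (a + b) - c / (c + d),
    DSS  = (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d)),
    LOR  = log(a * d / (b * c)),
    ORSS = (a * d - b * c) / (a * d + b * c),
    SEDS = log(ar / a) / log(a / n),
    SEDI = (log(Fr) - log(Hr) + log(1 - Hr) - log(1 - Fr)) /
           (log(Fr) + log(Hr) + log(1 - Hr) + log(1 - Fr))
  )
}

obs <- c(rep("YES", 25), rep("NO", 25))
mod <- rep(c("YES", "NO"), 25)

hits        <- sum(obs == "YES" & mod == "YES")  # a = 13
falseAlarms <- sum(obs == "NO"  & mod == "YES")  # b = 12
misses      <- sum(obs == "YES" & mod == "NO")   # c = 12
correctRej  <- sum(obs == "NO"  & mod == "NO")   # d = 13

sc <- scores_from_counts(hits, falseAlarms, misses, correctRej)
sc$H    # 0.52
sc$HSS  # 0.04
```

LOR and SEDI are not finite when any cell of the table is zero, so real data may need a guard that this sketch omits.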

For more information refer to Forecast Verification: A Practitioner's Guide in Atmospheric Science, Jolliffe and Stephenson, 2012.

A data.frame containing all the statistics requested in `statList`.

Other modelEvaluation: `CalcModPerfMulti`, `CalcModPerf`, `CalcNoahmpFluxes`, `CalcNoahmpWatBudg`, `CalcStatCont`

```
## Not run:
# for categorical data
ExampleDF <- data.frame(obs = c(rep("YES", 25), rep("NO", 25)),
                        mod = rep(c("YES", "NO"), 25))
stat <- CalcStatCategorical(DT = ExampleDF, obsCol = "obs",
                            modCol = "mod", category = c("YES", "NO"))

# for categorical data with more than one experiment
ExampleDF <- data.frame(obs = c(rep("YES", 25), rep("NO", 25)),
                        mod = rep(c("YES", "NO"), 25),
                        Experiment = c(rep(c("1", "2", "3"), 16), "1", "2"))
stat <- CalcStatCategorical(DT = ExampleDF, obsCol = "obs", modCol = "mod",
                            category = c("YES", "NO"), groupBy = "Experiment")

# for continuous data with different threshold values
ExampleDF <- data.frame(obs = rnorm(10000, 100, 10), mod = rnorm(10000, 100, 10))
stat <- CalcStatCategorical(DT = ExampleDF, obsCol = "obs", modCol = "mod",
                            threshold = c(60, 70, 80, 90, 100, 110, 120, 130, 140))

# for continuous data with different threshold values and more than one experiment
ExampleDF <- data.frame(obs = rnorm(10000, 100, 10), mod = rnorm(10000, 100, 10),
                        Experiment = rep(c("Model1", "Model2"), 5000))
stat <- CalcStatCategorical(DT = ExampleDF, obsCol = "obs", modCol = "mod",
                            threshold = c(60, 70, 80, 90, 100, 110, 120, 130, 140),
                            groupBy = "Experiment")
## End(Not run)
```
