# gof: GOF measures in berryFunctions: Function Collection Related to Plotting and Hydrology

## Description

Goodness of Fit measures (GOF) for two vectors.

- `gofNA`: not exported; checks the input for each of the functions below
- `rsquare`: Coefficient of determination (R2)
- `rmse`: Root Mean Square Error (for minimizing in `optim`)
- `nse`: Nash-Sutcliffe efficiency, based on `RHydro::eval.NSeff`
- `kge`: Kling-Gupta efficiency (better than NSE), based on `hydroGOF::KGE`, where there are many more options
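To make the four measures concrete, here is a minimal base-R sketch using common textbook definitions. The `*_sketch` names are hypothetical and not part of berryFunctions, and the package's own implementations may differ in detail; in particular, R2 as the squared Pearson correlation and the 2009 formulation of KGE are assumptions here.

```r
# Hypothetical base-R sketches; the *_sketch names are NOT berryFunctions functions.
# a = observed values, b = simulated values (same argument order as the package).

# Coefficient of determination, assumed to be the squared Pearson correlation
rsquare_sketch <- function(a, b) cor(a, b)^2

# Root mean square error
rmse_sketch <- function(a, b) sqrt(mean((a - b)^2))

# Nash-Sutcliffe efficiency: 1 minus the squared error relative to
# the squared deviations of the observations from their mean
nse_sketch <- function(a, b) 1 - sum((a - b)^2) / sum((a - mean(a))^2)

# Kling-Gupta efficiency (2009 formulation, assumed)
kge_sketch <- function(a, b) {
  r     <- cor(a, b)         # linear correlation
  alpha <- sd(b) / sd(a)     # variability ratio
  beta  <- mean(b) / mean(a) # bias ratio
  1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
}

obs <- c(1, 2, 3, 4)
sim <- obs + 1  # perfectly correlated with obs, but biased by +1

gof <- c(R2   = rsquare_sketch(obs, sim),
         RMSE = rmse_sketch(obs, sim),
         NSE  = nse_sketch(obs, sim),
         KGE  = kge_sketch(obs, sim))
print(gof)
```

With this constant offset, R2 stays at 1 because it ignores bias, while RMSE is 1, NSE drops to 0.2 and KGE to 0.6.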

## Usage

```r
gofNA(a, b, quiet = FALSE, fun = "")

rsquare(a, b, quiet = FALSE)

rmse(a, b, quiet = FALSE)

nse(a, b, quiet = FALSE)

kge(a, b, quiet = FALSE)
```

## Arguments

| Argument | Description |
|----------|-------------|
| `a` | Numerical vector with observational data |
| `b` | Simulated data (to be compared to `a`) |
| `quiet` | Should NA-removal warnings be suppressed? This may be helpful within functions. DEFAULT: FALSE |
| `fun` | Character string with function name for error and warning messages |

## Value

Single numerical value
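Because the result is a single number, a GOF measure can serve directly as the objective function of an optimizer, which is the use the Description mentions for `rmse` and `optim`. The following is a minimal sketch: `rmse_sketch`, `objective`, and the synthetic data are illustrative assumptions, with a base-R stand-in for `rmse` so the block runs without the package.

```r
# Hypothetical stand-in for berryFunctions::rmse so the example is self-contained
rmse_sketch <- function(a, b) sqrt(mean((a - b)^2))

# Synthetic observations from a known linear model (illustrative data only)
set.seed(42)
x   <- seq(0, 10, length.out = 50)
obs <- 3 + 2 * x + rnorm(50, sd = 0.5)

# Objective: RMSE between observations and the simulation for parameters par
objective <- function(par) {
  sim <- par[1] + par[2] * x
  rmse_sketch(obs, sim)
}

# Minimize the RMSE with the default Nelder-Mead method of optim
fit <- optim(par = c(0, 1), fn = objective)
print(fit$par)    # fitted intercept and slope
print(fit$value)  # RMSE at the optimum
```

Minimizing RMSE is equivalent to minimizing the sum of squared errors, so the estimates should land close to the coefficients of `lm(obs ~ x)`.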

## Note

NAs are omitted with a warning (suppressed if `quiet = TRUE`).
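The input check could look roughly like the sketch below. It is not the source of `gofNA`: the helper name `drop_na_pairs`, the pairwise removal (dropping a position if either vector is NA there), the length check, and the message wording are all assumptions made for illustration.

```r
# Hypothetical helper, NOT the package's gofNA: shows one plausible input check
drop_na_pairs <- function(a, b, quiet = FALSE, fun = "") {
  # Assumed behavior: vectors of unequal length are an error
  if (length(a) != length(b))
    stop("in ", fun, ": vectors differ in length (",
         length(a), " vs ", length(b), ")", call. = FALSE)
  # Assumed behavior: keep only positions where both values are present
  keep      <- !is.na(a) & !is.na(b)
  n_dropped <- sum(!keep)
  if (n_dropped > 0 && !quiet)
    warning(n_dropped, " NA pair(s) omitted in ", fun, call. = FALSE)
  list(a = a[keep], b = b[keep])
}

cleaned <- drop_na_pairs(c(1, NA, 3, 4), c(1, 2, NA, 4),
                         quiet = TRUE, fun = "rmse")
print(cleaned$a)
print(cleaned$b)
```

Passing `quiet = TRUE` is convenient when the check runs inside another function that already reports missing values itself.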

## Author(s)

Berry Boessenkool, Sept 2016

## See Also

`cor`, `lm`. http://en.wikipedia.org/wiki/R-squared, http://en.wikipedia.org/wiki/Mean_squared_error
## Examples

```r
library(berryFunctions) # provides rsquare, rmse, nse, kge, textField, is.error

# R squared and RMSE --------------------------------------------------------
set.seed(123)
x <- rnorm(20)
y <- 2*x + rnorm(20)
plot(x,y)
# Write all four GOF values of x and y into the plot at position (a,b)
legGOF <- function(a,b)
  {
  text(a,b, paste(c(" R2","RMSE"," NSE"," KGE"), collapse="\n"), adj=1.2)
  text(a,b, paste(round(c(rsquare(x,y), rmse(x,y), nse(x,y), kge(x,y)),5),
                  collapse="\n"), adj=0)
  }
legGOF(-1.5, 2) # R2 good, but does not check for bias (distance from 1:1 line)

abline(a=0,b=1) ; textField(-1.5,-1.5, "1:1")
abline(lm(y~x), col="red")
p <- predict(lm(y~x))
points(x, p, pch=3, col="red")
segments(x, y, x, p, col="red")
stopifnot(all.equal( nse(y,p) , rsquare(y,x) ))

# Input checks
is.error( rmse(1:6, 1:8) , tell=TRUE)
nse(replace(x,3,NA), y)
kge(rep(NA,20), y)
rmse(0,0, quiet=TRUE)
rsquare(1:6, tapply(chickwts$weight, chickwts$feed, mean) )

## Not run: # time consuming Simulation
# sample size bias
x <- 1:1000
y <- x+rnorm(1000)
rmse(x,y) # 0.983
ssize <- rep(5:1000, 3)
sgofs <- sapply(ssize, function(n){i <- sample(1:1000,n); c(rsquare(x[i],y[i]),rmse(x[i],y[i]))})
plot(ssize, sgofs[2,]) # RMSE: no bias, symmetric convergence
plot(ssize, sgofs[1,]) # R2: small underestimation in small samples

if(require(pbapply)) sapply <- pbsapply
r2 <- sapply(1:10000, function(i){
   x <- rnorm(20); y <- 2*x + rnorm(20); c(rsquare(x,y), rmse(x,y)) })
hist(r2[1,], breaks=70, col=5,
     main= "10'000 times x <- rnorm(20); y <- 2*x + rnorm(20); rsquare(x,y)")
# For small samples, R^2 can by chance be far off the 'real' value!
hist(r2[2,], breaks=70, col=5, main= "... rmse(x,y)") # second row of r2 holds the RMSE values
# RMSE is more symmetric and gaussian

## End(Not run)

# NSE and KGE ---------------------------------------------------------------
y <- dbeta(1:40/40, 3, 10) # simulated
x <- y + rnorm(40,0,sd=0.2) # observed
plot(x)
lines(y, col="blue")
legGOF(25, 2)
rmse(x,y) ; rmse(y,x)
nse(x,y) ; nse(y,x) # x=obs, y=sim (second command is wrong)
kge(x,y) ; kge(y,x)
```
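Two points from the examples can be checked without the package: the argument order matters for NSE but not for RMSE, and for fitted values of a least-squares line the NSE equals the squared correlation. The sketch below assumes the textbook definitions; `nse_sketch` and `rmse_sketch` are hypothetical base-R stand-ins, not the package functions.

```r
# Hypothetical base-R stand-ins (textbook definitions, assumed)
rmse_sketch <- function(a, b) sqrt(mean((a - b)^2))
nse_sketch  <- function(obs, sim) 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)

set.seed(1)
sim <- dbeta(1:40 / 40, 3, 10)       # simulated
obs <- sim + rnorm(40, 0, sd = 0.2)  # observed

# RMSE is symmetric: swapping the arguments gives the same value
rmse_same <- rmse_sketch(obs, sim) == rmse_sketch(sim, obs)

# NSE is not symmetric: the denominator uses the spread of the FIRST argument,
# so the observations must be passed first
nse_right <- nse_sketch(obs, sim)
nse_wrong <- nse_sketch(sim, obs)
print(c(rmse_same = rmse_same, nse_differs = nse_right != nse_wrong))

# For fitted values of an ordinary least-squares line, NSE equals cor^2
x <- rnorm(20)
y <- 2 * x + rnorm(20)
p <- predict(lm(y ~ x))
ols_identity <- all.equal(nse_sketch(y, p), cor(x, y)^2)
print(ols_identity)
```

The last check mirrors the `stopifnot(all.equal(nse(y,p), rsquare(y,x)))` line in the examples, under the assumption that `rsquare` is the squared Pearson correlation.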