Corr: Computes the correlation coefficient between an array of...

Corr    R Documentation

Computes the correlation coefficient between an array of forecasts and their corresponding observations

Description

Calculates the correlation coefficient (Pearson, Kendall or Spearman) for an array of forecasts and observations. The input should be an array with dimensions c(no. of datasets, no. of start dates, no. of forecast times, no. of lons, no. of lats), where the longitude and latitude dimensions are optional. The correlations are computed along the poscor dimension, which should correspond to the start date dimension.
If compROW is given, the correlations are computed only if rows along the compROW dimension are complete between limits[1] and limits[2], i.e. there are no NAs between limits[1] and limits[2]. This option is useful when only the forecasts for which observations are available at all lead times should be taken into account.
Default: limits[1] = 1 and limits[2] = length(compROW dimension).
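
The completeness rule can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy matrix x, its layout (start dates in rows, lead times in columns, so that the lead-time dimension plays the role of compROW) and the variable names are assumptions made for illustration only.

```r
# Toy anomalies: 5 start dates (rows) x 6 lead times (columns).
set.seed(1)
x <- matrix(rnorm(30), nrow = 5, ncol = 6)
x[2, 3] <- NA   # gap inside the window: start date 2 is discarded
x[4, 6] <- NA   # gap outside the window: start date 4 is kept

limits <- c(2, 5)   # only lead times 2 to 5 have to be complete

# TRUE for start dates with no NA between limits[1] and limits[2]
window <- x[, limits[1]:limits[2], drop = FALSE]
complete <- rowSums(is.na(window)) == 0

# Start dates that would enter the correlation
x_used <- x[complete, , drop = FALSE]
print(which(complete))
```

Only the lead times inside the window decide whether a start date is kept, which is why the NA at lead time 6 does not remove start date 4.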
The confidence interval is computed by a Fisher transformation.
The significance level relies on a one-sided Student's t-distribution.
The threshold of the test can be changed through siglev (default value = 0.95).
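
As a rough illustration of these two statistics, the following base R sketch computes a Fisher-transformed confidence interval and a one-sided t-test for a single pair of series. It uses the textbook formulas with the plain sample size n; the synthetic data and variable names are assumptions, and the package's internal handling (for example of the sample size) may differ.

```r
set.seed(42)
n <- 30
obs <- rnorm(n)
fcst <- 0.6 * obs + rnorm(n, sd = 0.8)   # synthetic forecasts
siglev <- 0.95

r <- cor(fcst, obs, method = "pearson")

# Fisher transformation: atanh(r) is approximately normal
# with standard deviation 1 / sqrt(n - 3)
z <- atanh(r)
half_width <- qnorm(1 - (1 - siglev) / 2) / sqrt(n - 3)
conf_low <- tanh(z - half_width)
conf_high <- tanh(z + half_width)

# One-sided Student's t-test of H0: correlation <= 0
t_stat <- r * sqrt((n - 2) / (1 - r^2))
p_val <- pt(t_stat, df = n - 2, lower.tail = FALSE)

# Smallest correlation that is significant at siglev for this n
t_crit <- qt(siglev, df = n - 2)
r_crit <- t_crit / sqrt(t_crit^2 + n - 2)
```

Raising siglev widens the confidence interval and raises r_crit, so fewer correlations are flagged as significant.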

.Corr calculates the correlation between the ensemble mean and the observations, using an N by M matrix (exp) of forecasts and a vector of observations (obs) as input.
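
The ensemble-mean correlation described above can be sketched in base R as follows. This is an illustration, not the .Corr source: the synthetic data and the names exp_mat, obs_vec and ens_mean are assumptions.

```r
set.seed(7)
n_fcst <- 20   # N forecasts (e.g. start dates)
n_memb <- 5    # M ensemble members

obs_vec <- rnorm(n_fcst)
# Each member is the observed signal plus independent noise;
# obs_vec is recycled down each column of the N by M matrix
exp_mat <- matrix(obs_vec + rnorm(n_fcst * n_memb),
                  nrow = n_fcst, ncol = n_memb)

# Average over members (columns), then correlate with the observations
ens_mean <- rowMeans(exp_mat)
corr_ens <- cor(ens_mean, obs_vec, method = "pearson")

# For comparison: correlation of each individual member
corr_members <- apply(exp_mat, 2, cor, y = obs_vec)
```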

Usage

Corr(
  var_exp,
  var_obs,
  posloop = 1,
  poscor = 2,
  compROW = NULL,
  limits = NULL,
  siglev = 0.95,
  method = "pearson",
  conf = TRUE,
  pval = TRUE
)

.Corr(exp, obs, siglev = 0.95, method = "pearson", conf = TRUE, pval = TRUE)

Arguments

var_exp

Array of experimental data.

var_obs

Array of observational data, same dimensions as var_exp except along posloop dimension, where the length can be nobs instead of nexp.

posloop

Dimension along which the nexp experimental datasets and the nobs observational datasets are stored. Default = 1.

poscor

Dimension along which correlations are to be computed (the dimension of the start dates). Default = 2.

compROW

Data are taken into account only if the rows along the compROW dimension are complete. Default = NULL.

limits

The rows must be complete between limits[1] and limits[2]. Default = NULL.

siglev

Significance level. Default = 0.95.

method

Type of correlation: 'pearson', 'spearman' or 'kendall'. Default = 'pearson'.

conf

Whether to compute confidence intervals (TRUE, the default) or not (FALSE).

pval

Whether to compute the statistical significance p-value (TRUE, the default) or not (FALSE).

exp

N by M matrix of N forecasts from M ensemble members.

obs

Vector of the corresponding observations of length N.
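
The three values accepted by method are the same as those of base R's cor(). A small sketch with made-up data shows how they can differ on a monotonic but non-linear relationship:

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- x^3   # monotonic but non-linear in x

methods <- c("pearson", "spearman", "kendall")
res <- sapply(methods, function(m) cor(x, y, method = m))
print(res)
```

The rank-based Spearman and Kendall coefficients equal 1 here because the ordering of x and y is identical, while the Pearson coefficient is below 1 because the relationship is not linear.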

Value

Corr: Array with dimensions:
c(# of datasets along posloop in var_exp, # of datasets along posloop in var_obs, 4, all other dimensions of var_exp & var_obs except poscor).
The third dimension, of maximum length 4, contains the lower limit of the 95% confidence interval, the correlation, the upper limit of the 95% confidence interval and the 95% significance level given by a one-sided t-test. If the p-value is disabled via pval = FALSE, this dimension will be of length 3. If the confidence intervals are disabled via conf = FALSE, this dimension will be of length 2. If both are disabled, this will be of length 1.
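
To show how such an array can be indexed, the sketch below builds a mock array with the layout described above for the default conf = TRUE and pval = TRUE. The array is filled by hand, not produced by Corr, and its sizes (2 experiments, 1 observational dataset, 3 lead times) and values are assumptions.

```r
# Mock result: c(nexp = 2, nobs = 1, 4 statistics, 3 lead times)
mock <- array(NA_real_, dim = c(2, 1, 4, 3))
mock[, , 1, ] <- -0.1   # lower limit of the confidence interval
mock[, , 2, ] <- 0.5    # correlation
mock[, , 3, ] <- 0.8    # upper limit of the confidence interval
mock[, , 4, ] <- 0.3    # significance level from the one-sided t-test

# Extract the correlation only, keeping all four dimensions
corr_only <- mock[, , 2, , drop = FALSE]
print(dim(corr_only))

# Flag entries whose correlation exceeds the significance level
significant <- mock[, , 2, ] > mock[, , 4, ]
```

Using drop = FALSE keeps dimensions of length 1, so downstream code that expects a four-dimensional array still works.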

.Corr:

  • $corr: The correlation statistic.

  • $p_val: The p-value of the correlation at the siglev level (only present if pval = TRUE).

  • $conf_low: The lower limit of the siglev confidence interval for the correlation (only present if conf = TRUE).

  • $conf_high: The upper limit of the siglev confidence interval for the correlation (only present if conf = TRUE).

Author(s)

History:
0.1 - 2011-04 (V. Guemas) - Original code
1.0 - 2013-09 (N. Manubens) - Formatting to R CRAN
1.1 - 2014-10 (M. Menegoz) - Adding siglev argument
1.2 - 2015-03 (L.P. Caron) - Adding method argument
1.3 - 2017-02 (A. Hunter) - Adapted to veriApply()

Examples

# Load sample data as in Load() example: 
example(Load) 
clim <- Clim(sampleData$mod, sampleData$obs) 
ano_exp <- Ano(sampleData$mod, clim$clim_exp) 
ano_obs <- Ano(sampleData$obs, clim$clim_obs) 
runmean_months <- 12 
dim_to_smooth <- 4  
# Smooth along lead-times   
smooth_ano_exp <- Smoothing(ano_exp, runmean_months, dim_to_smooth) 
smooth_ano_obs <- Smoothing(ano_obs, runmean_months, dim_to_smooth) 
dim_to_mean <- 2  # Mean along members 
required_complete_row <- 3  # Discard start dates which contain any NA lead-times 
leadtimes_per_startdate <- 60 
corr <- Corr(Mean1Dim(smooth_ano_exp, dim_to_mean),
             Mean1Dim(smooth_ano_obs, dim_to_mean),
             compROW = required_complete_row,
             limits = c(ceiling((runmean_months + 1) / 2),
                        leadtimes_per_startdate - floor(runmean_months / 2)))

PlotVsLTime(corr, toptitle = "correlations", ytitle = "correlation",
            monini = 11, limits = c(-1, 2), listexp = c('CMIP5 IC3'),
            listobs = c('ERSST'), biglab = FALSE, hlines = c(-1, 0, 1),
            fileout = 'tos_cor.eps')
 

# The following example uses veriApply combined with .Corr instead of Corr
 ## Not run: 
require(easyVerification)  
Corr2 <- s2dverification:::.Corr
corr2 <- veriApply("Corr2", 
                  smooth_ano_exp, 
                  # see ?veriApply for how to use the 'parallel' option
                  Mean1Dim(smooth_ano_obs, dim_to_mean), 
                  tdim = 3, ensdim = 2)
 
## End(Not run)

s2dverification documentation built on April 20, 2022, 9:06 a.m.