corFunctions | R Documentation |
Estimate the correlation of two vectors via fast C++ implementations, with a focus on robust and nonparametric methods.
corPearson(x, y)
corSpearman(x, y, consistent = FALSE)
corKendall(x, y, consistent = FALSE)
corQuadrant(x, y, consistent = FALSE)
corM(
  x,
  y,
  prob = 0.9,
  initial = c("quadrant", "spearman", "kendall", "pearson"),
  tol = 1e-06
)
x, y |
numeric vectors. |
consistent |
a logical indicating whether a consistent estimate at the
bivariate normal distribution should be returned (defaults to FALSE). |
prob |
numeric; probability for the quantile of the
|
initial |
a character string specifying the starting values for the
Huber M-estimator. For |
tol |
a small positive numeric value to be used for determining convergence. |
corPearson estimates the classical Pearson correlation. corSpearman, corKendall and corQuadrant estimate the Spearman, Kendall and quadrant correlation, respectively, which are nonparametric correlation measures that are somewhat more robust. corM estimates the correlation based on a bivariate M-estimator of location and scatter with a Huber loss function, which is sufficiently robust in the bivariate case, but loses robustness with increasing dimension.
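To make the nonparametric measures more concrete, the following base R sketch computes the Spearman correlation as the Pearson correlation of the ranks, and the quadrant correlation as the average sign agreement around the medians. These are the textbook definitions; the function names spearmanSketch and quadrantSketch are hypothetical, and the package's C++ implementations may differ in details such as the handling of ties.

```r
# Simulated data for illustration only
set.seed(42)
x <- rnorm(50)
y <- 0.6 * x + rnorm(50, sd = 0.8)

# Spearman correlation: Pearson correlation of the ranks
# (hypothetical helper, not part of the package)
spearmanSketch <- function(x, y) cor(rank(x), rank(y))

# Quadrant correlation: mean product of the signs of the deviations
# from the coordinatewise medians (hypothetical helper)
quadrantSketch <- function(x, y) {
  mean(sign(x - median(x)) * sign(y - median(y)))
}

spearmanSketch(x, y)
quadrantSketch(x, y)
```

Both helpers return values in [-1, 1]. The quadrant correlation only uses which side of the median each observation falls on, so extreme values have limited influence on it.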
The nonparametric correlation measures do not estimate the same population quantities as the Pearson correlation, the latter of which is consistent at the bivariate normal model. Let \rho denote the population correlation at the normal model. Then the Spearman correlation estimates (6/\pi) \arcsin(\rho/2), while the Kendall and quadrant correlation estimate (2/\pi) \arcsin(\rho). Consistent estimates are thus easily obtained by taking the corresponding inverse expressions.
The Huber M-estimator, on the other hand, is consistent at the bivariate normal model.
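The inverse expressions follow directly from the formulas above: solving r = (6/\pi) \arcsin(\rho/2) for \rho gives \rho = 2 \sin(\pi r / 6), and solving r = (2/\pi) \arcsin(\rho) gives \rho = \sin(\pi r / 2). A minimal base R sketch of these transformations is shown below. The helper names are hypothetical, and it is only an assumption that consistent = TRUE applies exactly these expressions internally.

```r
# Inverse transformations derived from the population formulas
# (hypothetical helpers, not part of the package)
toConsistentSpearman <- function(r) 2 * sin(pi * r / 6)
toConsistentKendall <- function(r) sin(pi * r / 2)  # also applies to quadrant

# Population-level round trip for an illustrative value of rho
rho <- 0.6
spearmanTarget <- (6 / pi) * asin(rho / 2)  # what Spearman estimates
kendallTarget <- (2 / pi) * asin(rho)       # what Kendall and quadrant estimate

toConsistentSpearman(spearmanTarget)  # recovers rho
toConsistentKendall(kendallTarget)    # recovers rho
```

Applied to sample estimates, the same transformations map the raw Spearman, Kendall or quadrant correlation onto the scale of \rho under the bivariate normal model.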
The respective correlation estimate.
The Kendall correlation uses a naive n^2 implementation if n < 30 and a fast O(n \log(n)) implementation for larger values, where n denotes the number of observations.
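For illustration, a naive pairwise computation of the Kendall correlation can be sketched in base R as follows. This is not the package's C++ code: the function name kendallNaive is hypothetical, and the sketch assumes data without ties, in which case it agrees with cor(x, y, method = "kendall").

```r
# Naive quadratic-time Kendall correlation (hypothetical helper, assumes no ties)
kendallNaive <- function(x, y) {
  n <- length(x)
  stopifnot(length(y) == n, n >= 2)
  s <- 0
  for (i in seq_len(n - 1)) {
    j <- (i + 1):n
    # +1 for each concordant pair, -1 for each discordant pair
    s <- s + sum(sign(x[j] - x[i]) * sign(y[j] - y[i]))
  }
  # divide by the number of pairs
  s / (n * (n - 1) / 2)
}

# Simulated data for illustration only
set.seed(1)
x <- rnorm(20)
y <- x + rnorm(20)
kendallNaive(x, y)
cor(x, y, method = "kendall")  # base R result for comparison
```

Every pair of observations is visited once, so the cost grows quadratically in n. For small samples this simple loop is cheap, which is consistent with the switch at n < 30 described above.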
Functionality for removing observations with missing values is currently not implemented.
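Until such functionality exists, incomplete observations can be dropped before calling the functions. The sketch below uses complete.cases() from base R on toy data. Base R's cor() serves as a stand-in so that the example runs without the package; the filtered vectors could be passed to corPearson(), corSpearman() and the other functions in the same way.

```r
# Toy data with missing values (illustrative only)
x <- c(1.2, NA, 0.7, 2.5, 3.1, NA, 1.9)
y <- c(0.8, 1.1, NA, 2.2, 2.9, 3.3, 1.5)

# Keep only the observations where both x and y are observed
keep <- complete.cases(x, y)
xComplete <- x[keep]
yComplete <- y[keep]

# Stand-in for the package functions, which expect complete data
cor(xComplete, yComplete)
```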
Andreas Alfons, O(n \log(n)) implementation of the Kendall correlation by David Simcha
ccaGrid, ccaProj, cor
## generate data
library("mvtnorm")
set.seed(1234) # for reproducibility
sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
xy <- rmvnorm(100, sigma=sigma)
x <- xy[, 1]
y <- xy[, 2]
## compute correlations
# Pearson correlation
corPearson(x, y)
# Spearman correlation
corSpearman(x, y)
corSpearman(x, y, consistent=TRUE)
# Kendall correlation
corKendall(x, y)
corKendall(x, y, consistent=TRUE)
# quadrant correlation
corQuadrant(x, y)
corQuadrant(x, y, consistent=TRUE)
# Huber M-estimator
corM(x, y)