cor.shrink: Estimation of correlation matrix

View source: R/BioTIP_update_04202022.R

cor.shrinkR Documentation

Estimation of correlation matrix

Description

This function takes in one (or two) matrix X (rows are genes, columns are samples) (or Y). It then calculates the average pairwise correlation between genes or samples. This method uses the method outlined by Schafer and Strimmer (2005).

Usage

cor.shrink(
  X,
  Y = NULL,
  MARGIN = c(1, 2),
  shrink = TRUE,
  target = c("zero", "average", "half")
)

Arguments

X

A G1 x S matrix of counts. Rows correspond to genes, columns correspond to samples.

Y

A G2 x S matrix of counts. Rows correspond to genes, columns correspond to samples. By default is NULL.

MARGIN

An integer indicate whether the rows (1, genes) or the columns (2, samples) to be calculated for pairwise correlation.

shrink

A flag specifying whether to shrink the correlation or not.

target

A character choose among ('zero', 'average', 'half'), indicating whether to shrink towards zero (for gene-gene correlations), shrink towards their empirical common average, one or 0.5 (for sample-sample correlations).

Value

The pairwise correlation between genes or samples. If Y==NULL, a G1 x G1 matrix is returned; otherwise, a G1 x G2 matrix is returned.

Author(s)

Andrew Goldstein andrewgoldstein@uchicago.edu

References

Schafer and Strimmer (2005) "A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics"

Examples

require(MASS)
## Generating a data X as coming from a multivariate normal distribution 
## with 10 highly correlated variables, roughly simulating correlated genes.
M = matrix(.9, nrow = 10, ncol = 10)
diag(M) = 1
mu = rnorm(10)
X = MASS::mvrnorm(500, mu, M)
dim(X)  #500 10  

## Calculating pairwise correlation among 500 correlated genes
cor_tX = cor(t(X))
mean(abs(cor_tX[upper.tri(cor_tX, diag = FALSE)])) # 0.9468098

## Calculating estimated pairwise correlation among 500 correlated genes
cor.matrix <- cor.shrink(X, MARGIN=1, shrink = TRUE, targe='zero') # 0.8287838
dim(cor.matrix)   #[1] 500 500
mean(upper.tri(cor.matrix, diag=FALSE))  # 0.499

## Calculating stimated pairwise correlation among 500 correlated genes 
## and additional 100 random genes
Y = rbind(X, matrix(rnorm(300*10), nrow = 300))
dim(Y)  #800 10 
cor.matrix <- cor.shrink(X, Y, MARGIN=1, shrink = TRUE, targe='zero') 
dim(cor.matrix)   #[1] 500 800
mean(upper.tri(cor.matrix, diag=FALSE))  # 0.6868


xyang2uchicago/NPS documentation built on June 30, 2024, 10:15 p.m.