eigenR2: Eigen-R2 for Dissecting Variation in High-dimensional Studies
In StoreyLab/eigenR2: Eigen-R2 for Dissecting Variation in High-dimensional Studies

Description Usage Arguments Details Value Author(s) References See Also Examples

Eigen-R2 is a high-dimensional version of the classic R-square statistic. It can be applied when one wants to determine the aggregate R-square value for many related response variables according to a common set of independent variables.

1	eigenR2(dat, model, null.model=NULL, adjust=FALSE, eigen.sig=NULL, mod.fit.func=NULL)

`dat`	The matrix of related response variables, where each row represents a distinct response variable.
`model`	A model matrix describing the fit of each response variable to the independent variables (see `model.matrix`).
`null.model`	If different than the simple mean, an optional null model to be used in calculating the denominator of the R-square statistic.
`adjust`	If TRUE, eigen-R2 makes the analogous small sample adjustment avaiable for the R-square statistic.
`eigen.sig`	An optional significance threshold to select the right eigen-vectors explaining more variation than expected by chance for use in the eigen-R2 calculation. If not NULL, the value of the p-value threshold to apply.
`mod.fit.func`	An optional function to estimate R-square. By default, the sum of squares will be minimized with respect to the model.

This function can be used for determining the aggregate R-square value for many related response variables according to a common set of independent variables.

This function can also be used to compute conditional eigen-R2, by specifying the null.model. See examples.

By specifying the significance threshold eigen.sig, only the statistically significant right eigenvectors will be used to compute eigen-R2.

By default, mod.fit.func is NULL and the function will compute eigen-R2s with simple linear regression. If mod.fit.func is not NULL, an R-square estimation function should be provided. The input of the function should be the response variables and the function returns the estimated R2 for the response variables.

The function returns a list containing:

`eigenR2`	The eigen-R2 estimate for the model.
`weights`	The proportion of total variation of the response variables that each eigenvector captures.
`eg.R2s`	The R-square for each significant eigenvectors.
`p.eg`	The p-value for each eigenvector, testing whether it explains more variation among the response variables than expected by chance.

Lin S. Chen lschen@princeton.edu and John D. Storey jstorey@princeton.edu

Chen LS and Storey JD (2008) Eigen-R2 for dissecting variation in high-dimensional studies.

plot.eigenR2

  ## Not run: 
  ## Load the simulated data.
  ## The data set contains age, genotype and IDs for 50 arrays.
  ## The expression matrix is a 200 genes by 50 array matrix
  data(eigSdat)
  exp <- eigSdat$exp
  exp <- t(apply(exp, 1, function(x) x-mean(x)))
  varList <- eigSdat$varList
  age <- varList$age
  geno <- varList$geno
  ID <- varList$ID

  ## the eigen-R2 for variable "age"
  mod1 <- model.matrix(~1+age)
  eigenR2.age <- eigenR2(dat = exp, model = mod1) 

  ## the eigen-R2 for variable "age"
  ## adjust for small sample size
  eigenR2.age.adj <- eigenR2(dat=exp, model=mod1, adjust=TRUE)
  ## Or specifying the significance threshold of eigen-genes
  eigenR2.age.adj2 <- eigenR2(dat=exp, model=mod1, eigen.sig=0.01)

  ## the conditional eigen-R2 for "genotype" given "age"
  mod2 <- model.matrix(~1+age+as.factor(geno))
  eigenR2.g <- eigenR2(dat=exp, model=mod2, null.model=mod1)

  ## use "mod.fit.func" to specify the estimation function
  func1 <- function(y) {summary(lm(y~mod1))$r.squared}
  eigenR2.age2 <- eigenR2(dat=exp, mod.fit.func=func1)

  ## use linear mixed effect model to estimate R2s
  library(nlme)
  func2 <- function(y) {
      m1 <- lme(y~1+age+as.factor(geno), random=~1|ID)
      r2 <- 1-sum(resid(m1)^2)/sum((y-mean(y))^2)
      return(r2)
  }
  eigenR2.ag <- eigenR2(dat=exp, mod.fit.func=func2)

  ## Plot the information on the eigen-genes
  plot.eigenR2(eigenR2.ag)
  
## End(Not run)