CFinder: Identify potential confounding factors or batch effects

Description Usage Arguments Details Value Examples

View source: R/CFinder.R

Description

CFinder first does a principal component analysis via singular value decomposition, and compute correlation of top PCs with sample class information.

Usage

1
CFinder(data, pheno, type)

Arguments

data

A matrix, the normalized gene/microRNA expression dataset, should be a numeric matrix, with rows referring to genes/microRNAs and columns to samples.

pheno

A vector of sample phenotypes. Sample phenotype in a scientific research could be categorical (treatment/control), or continuous (age). If you have multiple phenotypes, use a list with names denoting the corresponding phenotype.

type

A string, 'categorical' if sample phenotype if categorical, 'continuous' if sample phenotype is continuous.

Details

This function computes the moderated t-statistic for users using empirical Bayes method, it is especially useful when the sample size is too small to perform parametric tests.

Given a normalized gene/microRNA expression data matrix and a vector indicating sample phenotype/class, CFinder first centers the input data matrix, then does a singular value decomposition to the data matrix. Then it computes the correlation of top PCs with potential confounding factors by using linear correlation (for continueous phenotype data like age) or Kruskal-Wallis rank sum test (for categorical phenotype data like tumor subtype).

Value

A list containing a vector of p-values for top k significant PCs, and a vector of the variation in the data explained by each PC.

Examples

1
2
3
4
5
6
7
# prepare your normalized data matrix
data.m <- matrix(rnorm(120), nrow = 20, ncol = 6)
# prepare the phenotype info ("C"-control; "T"-treatment)
class.v <- c('C', 'C', 'C', 'T', 'T', 'T')
# run function
p.v <- CFinder(data = data.m, pheno = class.v, type = 'categorical')$p
prop.v <- CFinder(data = data.m, pheno = class.v, type = 'categorical')$prop

YC3/mirNet documentation built on Sept. 3, 2020, 3:25 a.m.