select.least.assoc: Select subset of rows least associated with a categorical...

View source: R/bigpca.R

Description

Runs a quick association analysis of the dataset against a phenotype/categorical variable stored in a dataframe, and uses the results to select a subset of the original matrix. You can select either the 'N' least associated variables or the 'N' most associated.
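The selection idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the bigpca implementation: the helper name `sketch.select.assoc`, the use of a one-way ANOVA p-value as the association measure, and the plain `matrix` input are all hypothetical choices made for the example.

```r
# Hypothetical base-R sketch; NOT the bigpca implementation.
# Rows are variables, columns are samples.
sketch.select.assoc <- function(mat, phenotype, n, least = TRUE) {
  stopifnot(length(phenotype) == ncol(mat))
  groups <- factor(phenotype)
  # Assumed association measure: one-way ANOVA p-value per row
  # (a small p-value means a strong association with the phenotype)
  p.values <- apply(mat, 1, function(x) {
    anova(lm(x ~ groups))[["Pr(>F)"]][1]
  })
  # Least associated rows have the largest p-values, so sort
  # decreasing when least = TRUE and increasing otherwise
  ranked <- order(p.values, decreasing = least)
  ranked[seq_len(n)]
}

set.seed(1)
pheno <- rep(c(1, 2), each = 10)
mat <- matrix(rnorm(50 * 20), nrow = 50)
mat[1, ] <- mat[1, ] + 5 * pheno  # make row 1 strongly associated with pheno
most.idx <- sketch.select.assoc(mat, pheno, n = 5, least = FALSE)
least.idx <- sketch.select.assoc(mat, pheno, n = 5, least = TRUE)
```

Here `most.idx` should contain the artificially shifted row 1, while `least.idx` should not. The real function works on big.matrix objects and may use a different test statistic.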

Usage

select.least.assoc(bigMat, keep = 0.05, phenotype = NULL, least = TRUE,
  dir = "", n.cores = 1, verbose = TRUE)

Arguments

bigMat

a big.matrix object, or any argument accepted by get.big.matrix(), which includes paths to description files or even a standard matrix object.

keep

numeric, by default a proportion (decimal) of the original number of rows/columns to choose for the subset. Otherwise, if an integer > 2, it is taken as the size of the desired subset; e.g., for a dataset with 10,000 rows where you want a subset of 1,000, you could set 'keep' to either 0.1 or 1000.
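As a rough illustration of the two ways 'keep' can be given, the sketch below converts either form into a subset size. The helper `resolve.keep` is hypothetical and is not part of bigpca. Treating values in (0, 1] as proportions and rejecting values between 1 and 2 are assumptions made for the example, since the description above only covers proportions and integers greater than 2.

```r
# Hypothetical helper, not part of bigpca: turn 'keep' into a subset size.
resolve.keep <- function(keep, n.total) {
  if (keep > 0 && keep <= 1) {
    # Assumed: values in (0, 1] are proportions of the total
    size <- round(keep * n.total)
  } else if (keep > 2) {
    # Values greater than 2 are taken as the subset size itself
    size <- round(keep)
  } else {
    stop("'keep' should be a proportion in (0, 1] or a count greater than 2")
  }
  # Never ask for more variables than exist
  min(size, n.total)
}

size.from.prop <- resolve.keep(0.1, 10000)
size.from.count <- resolve.keep(1000, 10000)
```

Both calls give a subset size of 1000, matching the 10,000-row example in the description of 'keep'.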

phenotype

a vector containing the categorical variable (phenotype, etc.) to test for association. This should be the same length as the number of columns (e.g., samples) in bigMat.

least

logical, whether to select the least associated variables (TRUE) or the most associated (FALSE).

dir

directory containing the filebacked.big.matrix, same as dir for get.big.matrix.

n.cores

integer, the number of cores to use if you want to run the analysis on multiple cores.

verbose

logical, whether to display additional output.

Value

A set of row or column indexes (depends on 'rows' parameter) of the most dependent (or independent) variables, as measured by association with a [continuous/categorical] phenotype.

Author(s)

Nicholas Cooper

See Also

quick.pheno.assocs

Examples

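library(bigpca)
# simulate a small test big.matrix and a random two-level phenotype
bmat <- generate.test.matrix(5, big.matrix = TRUE)
pheno <- rep(1, ncol(bmat))
pheno[which(runif(ncol(bmat)) < .5)] <- 2
# indexes of the most and least associated variables
most.correl <- select.least.assoc(bmat, phenotype = pheno, least = FALSE)
least.correl <- select.least.assoc(bmat, phenotype = pheno, least = TRUE)
# correlation of the first selected row with the phenotype
cor(bmat[least.correl, ][1, ], pheno)  # least correlated
cor(bmat[most.correl, ][1, ], pheno)  # most correlated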

bigpca documentation built on Nov. 22, 2017, 1:02 a.m.