quick.pheno.assocs: Quick association tests for phenotype


View source: R/bigpca.R

Description

Simplistic association tests, meant only for preliminary variable selection, creation of priors, and similar purposes. Quickly obtains association p-values for each row of a big.matrix against the phenotype values of the samples. Columns of the big.matrix are samples, and the column labels correspond to the rownames of the sample.info dataframe, which holds the phenotype information in the column named by 'use.col'.

Usage

quick.pheno.assocs(bigMat, sample.info = NULL, use.col = "phenotype",
  dir = "", p.values = TRUE, F.values = TRUE, n.cores = 1,
  verbose = FALSE)

Arguments

bigMat

a big.matrix object, or any argument accepted by get.big.matrix(), which includes paths to description files or even a standard matrix object.

sample.info

a data.frame with rownames corresponding to the colnames of bigMat. Must also contain the column named by 'use.col' (default 'phenotype'), which holds the categorical variable to test for association. The data.frame may contain extra ids not in colnames(bigMat); however, if any column names of bigMat are missing from sample.info, a warning will be given and the call is likely to give incorrect results.

use.col

the name of the phenotype column in the data.frame 'sample.info'

dir

directory containing the filebacked.big.matrix, same as dir for get.big.matrix.

p.values

logical, whether to return p.values from the associations

F.values

logical, whether to return F.values from the associations

n.cores

integer, the number of cores to use when processing the analysis on multiple cores

verbose

logical, whether to display additional output on progress
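
The requirement that every column name of bigMat appears among the rownames of sample.info can be checked before calling the function. The following is a minimal base R sketch of such a check, not code from bigpca: the objects 'mat', 'sample.info' and 'aligned' are hypothetical stand-ins, and a standard matrix is used in place of a big.matrix. Whether quick.pheno.assocs reorders sample.info internally is not stated here, so the sketch also aligns the rows explicitly as a precaution.

```r
# Hypothetical stand-ins: a plain matrix (columns are samples) and a
# sample.info data.frame with one extra id ("s6") in a different order.
set.seed(1)
mat <- matrix(rnorm(20), nrow = 4,
              dimnames = list(paste0("var", 1:4), paste0("s", 1:5)))
sample.info <- data.frame(phenotype = c(1, 2, 1, 2, 2, 1),
                          row.names = paste0("s", 6:1))

# Every column name of the matrix should have a row in sample.info
missing.ids <- setdiff(colnames(mat), rownames(sample.info))
if (length(missing.ids) > 0) {
  stop("samples missing from sample.info: ",
       paste(missing.ids, collapse = ", "))
}

# Subset and reorder sample.info so its rows match the matrix columns
aligned <- sample.info[colnames(mat), , drop = FALSE]
```

Extra ids in sample.info are dropped by the final indexing step, so only missing ids need to be treated as an error.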

Value

Depending on the options selected, returns either a list of F values and p values, or just F, or just p-values for association with each variable in the big.matrix.

If both F.values and p.values are TRUE, returns a dataframe of both statistics for each variable; otherwise a vector. If the phenotype has 20 or more unique categories, it is assumed to be continuous and the association test applied is correlation. If there are two categories a t-test is used, and for 3 to 19 categories an ANOVA is used. Regardless of the analysis function, output is converted to an F statistic and/or associated p-values. The exception is when p.values and F.values are both set to FALSE and the phenotype is continuous, in which case Pearson's correlation values are returned.
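
The choice of test by number of unique phenotype values, and the conversion to an F statistic, can be illustrated for a single variable in base R. This is a sketch under stated assumptions, not the package's implementation: 'assoc.one' is a hypothetical helper, the t-test is assumed to use pooled variance (so that F equals t squared), and the correlation is converted with F = r^2 (n - 2) / (1 - r^2) on (1, n - 2) degrees of freedom. It also assumes the phenotype has at least two distinct values.

```r
# Hypothetical helper, not part of bigpca: picks a test by the number of
# unique phenotype values and reports an F statistic and p-value.
assoc.one <- function(x, pheno) {
  k <- length(unique(pheno))
  n <- length(x)
  if (k >= 20) {
    # treated as continuous: Pearson correlation, F = t^2 on (1, n - 2) df
    r <- cor(x, pheno)
    Fv <- r^2 * (n - 2) / (1 - r^2)
    p <- pf(Fv, 1, n - 2, lower.tail = FALSE)
  } else if (k == 2) {
    # two groups: pooled-variance t-test (an assumption), F = t^2
    tt <- t.test(x ~ factor(pheno), var.equal = TRUE)
    Fv <- unname(tt$statistic)^2
    p <- tt$p.value
  } else {
    # 3 to 19 groups: one-way ANOVA
    tab <- anova(lm(x ~ factor(pheno)))
    Fv <- tab[["F value"]][1]
    p <- tab[["Pr(>F)"]][1]
  }
  c(F = Fv, p = p)
}

set.seed(42)
x <- rnorm(60)
res2 <- assoc.one(x, rep(1:2, each = 30))  # t-test branch
res3 <- assoc.one(x, rep(1:3, each = 20))  # ANOVA branch
resC <- assoc.one(x, rnorm(60))            # correlation branch
```

Under these assumptions the three branches give results on a common scale, which is what allows the statistics for different phenotype types to be returned in the same form.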

Author(s)

Nicholas Cooper

See Also

get.big.matrix

Examples

# small test big.matrix; columns are samples
bmat <- generate.test.matrix(5, big.matrix = TRUE)
# random two-level phenotype, one value per sample
pheno <- rep(1, ncol(bmat))
pheno[which(runif(ncol(bmat)) < .5)] <- 2
# sample.info rownames must match the column names of bmat
samp.inf <- data.frame(phenotype = pheno)
rownames(samp.inf) <- colnames(bmat)
both <- quick.pheno.assocs(bmat, samp.inf); prv(both)
Fs <- quick.pheno.assocs(bmat, samp.inf, verbose = TRUE, p.values = FALSE); prv(Fs)
Ps <- quick.pheno.assocs(bmat, samp.inf, F.values = FALSE); prv(Ps)

bigpca documentation built on Nov. 22, 2017, 1:02 a.m.