util: Helpful utility functions


Description

csubset creates a subset of a count matrix, keeping the samples (rows) whose phenotype matches a specified value.

Usage

csubset(val, x, pheno, cidx = TRUE)

Arguments

val

character(1) specifying the subset of phenotype to select.

x

A matrix of counts, with rows corresponding to samples and columns to taxonomic groups.

pheno

A character() vector of length equal to the number of rows in x, indicating the phenotype of the corresponding sample.

cidx

A logical(1) indicating whether all columns (taxa) should be retained. After removal of samples not satisfying pheno %in% val, some taxa may have zero counts in the remaining rows. cidx=TRUE (the default) keeps these columns; cidx=FALSE removes them.

Value

A matrix of counts, with rows satisfying pheno %in% val and with the number of columns equal either to ncol(x) (when cidx=TRUE) or to the number of columns with non-zero counts after row subsetting (when cidx=FALSE).
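
The effect of cidx on the returned dimensions can be illustrated with a toy matrix. The sketch below uses base R only; csubset_sketch, toy_x and toy_pheno are hypothetical names introduced for illustration, and the function is written from the behaviour documented on this page, not copied from the package source.

```r
## Hypothetical stand-in for csubset(), written from the documented
## behaviour only; the real implementation may differ in detail.
csubset_sketch <- function(val, x, pheno, cidx = TRUE) {
    ridx <- pheno %in% val                  # samples (rows) to keep
    sub <- x[ridx, , drop = FALSE]
    if (!cidx)                              # drop taxa with no remaining counts
        sub <- sub[, colSums(sub) != 0, drop = FALSE]
    sub
}

## toy data: 4 samples (rows) by 3 taxa (columns)
toy_x <- matrix(c(5, 0, 2,
                  1, 0, 0,
                  0, 3, 4,
                  2, 0, 1),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("s", 1:4),
                                c("taxA", "taxB", "taxC")))
toy_pheno <- c("Lean", "Lean", "Obese", "Lean")

## cidx = TRUE keeps every column, including all-zero ones
keep_all <- csubset_sketch("Lean", toy_x, toy_pheno)
## cidx = FALSE drops columns whose counts are all zero among "Lean" rows
drop_zero <- csubset_sketch("Lean", toy_x, toy_pheno, cidx = FALSE)

dim(keep_all)
dim(drop_zero)
colnames(drop_zero)
```

In this toy case taxB is observed only in the "Obese" sample, so it is the column that disappears when cidx=FALSE.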

Author(s)

Martin Morgan <mtmorgan@fhcrc.org>

Examples

## count matrix
fl <- system.file(package="DirichletMultinomial", "extdata",
                  "Twins.csv")
count <- t(as.matrix(read.csv(fl, row.names=1)))

## phenotype
fl <- system.file(package="DirichletMultinomial", "extdata",
                  "TwinStudy.t")
pheno0 <- scan(fl)
lvls <- c("Lean", "Obese", "Overwt")
pheno <- factor(lvls[pheno0 + 1], levels=lvls)
names(pheno) <- rownames(count)

## subset
dim(count)
sum("Lean" == pheno)
dim(csubset("Lean", count, pheno))
dim(csubset("Lean", count, pheno, cidx=FALSE))

DirichletMultinomial documentation built on Nov. 8, 2020, 7 p.m.