dot-doData: Simulate a count table


Description

.doData simulates a count table for all nodes of a tree under two groups (conditions), such that the abundance pattern on the tree differs between the conditions.

Usage

.doData(tree = NULL, data = NULL, scenario = "S1", from.A = NULL,
  from.B = NULL, minTip.A = 0, maxTip.A = Inf, minTip.B = 0,
  maxTip.B = Inf, minPr.A = 0, maxPr.A = 1, ratio = 2,
  adjB = NULL, pct = 0.6, nSam = c(50, 50), mu = 50,
  size = 10000, n = 1, fun = sum)

Arguments

tree

A phylo object

data

A matrix, representing a count table from real data. It has the entities corresponding to tree leaves in the row and samples in the column.

scenario

“S1”, “S2”, or “S3” (see Details). Default is “S1”.

from.A, from.B

The branch node labels of branches A and B, for which the signal is swapped. Both default to NULL. The simulation selects two branches (A and B) to have differential abundance under different conditions; these can be specified here or left for .doData to choose. (Note: if from.A is NULL, from.B is set to NULL.)

minTip.A

The minimum number of leaves in branch A

maxTip.A

The maximum number of leaves in branch A

minTip.B

The minimum number of leaves in branch B

maxTip.B

The maximum number of leaves in branch B

minPr.A

A numeric value between 0 and 1. The minimum abundance proportion of leaves in branch A.

maxPr.A

A numeric value between 0 and 1. The maximum abundance proportion of leaves in branch A.

ratio

The proportion ratio of branch B to branch A. This value is used to select branches (see Details). If no pair of branches has exactly this ratio, the pair whose ratio is closest to ratio is selected.

adjB

A numeric value between 0 and 1 (only used for scenario “S3”). Default is NULL. If NULL, branch A and branch B swap their proportions. If a numeric value, e.g. 0.1, branch B decreases to that fraction (here one tenth) of its original proportion, and the amount lost by branch B is added to branch A. For example, assume there are two experimental conditions (C1 and C2), and in C1 branch A has 10 and branch B has 40. If adjB is set to 0.1, then in C2 branch B becomes 4 and branch A becomes 46, so that the total stays the same.

pct

A numeric value between 0 and 1. The fraction of leaves in branch B that have differential abundance under different conditions (only used for scenario “S3”).

nSam

A numeric vector of length 2, giving the sample size of each of the two conditions.

mu, size

The parameters of the Negative Binomial distribution (see mu and size in rnbinom), used to generate the library size of each simulated sample.

n

A numeric value specifying how many count tables to generate with the same settings. Default is 1, which yields a single count table. If greater than 1, the output of .doData is a list of matrices (count tables). This is useful when multiple simulations are needed.

fun

A function to derive the count at each internal node based on its descendant leaves, e.g. sum, mean. The argument of the function is a numeric vector with the counts of an internal node's descendant leaves.
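Three of the arguments above (adjB, ratio and fun) can be illustrated with a small base R sketch. This is not the code used inside .doData; the helper names (adj_branches, aggregate_nodes), the candidate table and the toy tree are hypothetical and only mirror the behaviour described in the argument text.

```r
## adjB: branch B keeps the fraction `adjB` of its proportion and the
## amount it loses is added to branch A (hypothetical helper).
adj_branches <- function(propA, propB, adjB) {
  newB <- propB * adjB
  newA <- propA + (propB - newB)
  c(A = newA, B = newB)
}

# Example from the adjB description: A = 10 and B = 40 in condition C1
c2 <- adj_branches(propA = 10, propB = 40, adjB = 0.1)

## ratio: when no branch pair has exactly the requested ratio, take the
## pair whose B/A ratio is closest to it (candidate table is made up).
cand <- data.frame(A = c("n1", "n2", "n3"),
                   B = c("n4", "n5", "n6"),
                   ratio = c(1.4, 2.3, 3.1))
best <- cand[which.min(abs(cand$ratio - 2)), ]

## fun: derive the count of each internal node from its descendant leaves.
## `descendants` maps each internal node to its leaf labels (toy tree).
aggregate_nodes <- function(leaf_counts, descendants, fun = sum) {
  vapply(descendants, function(tips) fun(leaf_counts[tips]), numeric(1))
}

leaf_counts <- c(t1 = 5, t2 = 3, t3 = 12, t4 = 0)
descendants <- list(node5 = c("t1", "t2"),
                    node6 = c("t3", "t4"),
                    node7 = c("t1", "t2", "t3", "t4"))
node_counts <- aggregate_nodes(leaf_counts, descendants, fun = sum)
```

Passing fun = mean to aggregate_nodes would give the average leaf count per internal node instead of the total, which is the kind of choice the fun argument exposes.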

Details

.doData simulates a count table for entities that correspond to the nodes of a tree. The entities are in rows and the samples from different groups or conditions are in columns. The library size of each sample is drawn from a Negative Binomial distribution with the mean and size given by the arguments mu and size. Within a sample, the counts of the entities located on the tree leaves are assumed to follow a Dirichlet-Multinomial distribution. The parameters of the Dirichlet-Multinomial distribution are estimated from the real data set given by the argument data, using the function dirmult (see dirmult). To generate different abundance patterns under different conditions, three scenarios are provided: “S1”, “S2”, and “S3” (specified via scenario).
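The sampling scheme described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the implementation in .doData: the Dirichlet parameters alpha are made up (the function estimates them from data via dirmult), the values of mu_lib and size_lib are arbitrary, and r_dirichlet is a hypothetical helper that draws Dirichlet proportions by normalising gamma variates.

```r
set.seed(1)

# Assumed Dirichlet parameters for 6 leaves (hypothetical values)
alpha <- c(5, 3, 2, 1, 1, 0.5)
n_leaves <- length(alpha)

# Library size of each sample from a Negative Binomial distribution
n_samples <- 4
mu_lib <- 1000    # illustrative mean, not the package default
size_lib <- 10    # illustrative dispersion, not the package default
lib_size <- rnbinom(n = n_samples, mu = mu_lib, size = size_lib)

# Hypothetical helper: one draw from Dirichlet(alpha)
r_dirichlet <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha, rate = 1)
  g / sum(g)
}

# Dirichlet-Multinomial counts: draw proportions per sample, then
# distribute that sample's library size over the leaves.
counts <- vapply(lib_size, function(N) {
  p <- r_dirichlet(alpha)
  as.numeric(rmultinom(1, size = N, prob = p))
}, numeric(n_leaves))

rownames(counts) <- paste0("leaf", seq_len(n_leaves))
colnames(counts) <- paste0("sample", seq_len(n_samples))
```

Each column of counts sums to the library size drawn for that sample, and the leaves are in rows, matching the layout of the count table described above.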

Value

A list with the following elements:

FC

the fold change of the entities corresponding to the tree leaves.

Count

a count table, or a list of count tables when n is greater than 1. Entities are in rows and samples in columns. Each count table includes the entities corresponding to all nodes of the tree.

Branch

information about the two selected branches, with the following elements:

A

the branch node label of branch A

B

the branch node label of branch B

ratio

the count proportion ratio of branch B to branch A

A_tips

the number of leaves on branch A

B_tips

the number of leaves on branch B

A_prop

the count proportion of branch A (not above 1)

B_prop

the count proportion of branch B (not above 1)

Author(s)

Ruizhu Huang

Examples

## Not run: 
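if (require(GUniFrac)) {
  # Example throat microbiome data shipped with GUniFrac
  data("throat.otu.tab")
  data("throat.tree")

  # Transpose so that entities (OTUs) are in rows and samples in columns
  dat <- .doData(tree = throat.tree,
                 data = as.matrix(t(throat.otu.tab)),
                 ratio = 2)
}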

## End(Not run)

markrobinsonuzh/treeAGG documentation built on May 26, 2019, 9:32 a.m.