dacomp.generate_example_dataset.two_sample: Generate a simulated two sample dataset, based on data from...
In barakbri/dacomp: Non parametric differential abundance testing for microbiome counts data.


View source: R/dacomp_generate_example_data.R

This function generates a two-sample dataset based on the Kostic dataset (Kostic et al. 2012) from the phyloseq package (McMurdie and Holmes 2013). Simulated data are generated by a procedure similar to the one presented in Brill et al. 2019, Subsection 4.1. See additional details below.

dacomp.generate_example_dataset.two_sample(
  n_X = 30,
  n_Y = 30,
  m1 = 30,
  signal_strength_as_change_in_microbial_load = 0.1
)

`n_X`	Number of samples from the first group
`n_Y`	Number of samples from the second group
`m1`	Number of differentially abundant taxa
`signal_strength_as_change_in_microbial_load`	A number in the range 0-0.75, indicating the fraction of the microbial load of group Y that is added due to the simulated condition. The complement of this fraction is the fraction of the microbial load of group Y that is distributed across taxa as in group X.
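As a worked illustration of the signal strength argument, the sketch below mixes a baseline frequency vector with an added component. It is one reading of the description above, not the package's internal code; the names `p_x`, `added`, and `p_y` and the three-taxon example are hypothetical.

```r
# Hypothetical three-taxon example; not taken from dacomp's source.
signal <- 0.1                 # signal_strength_as_change_in_microbial_load

p_x   <- c(0.5, 0.3, 0.2)     # baseline relative frequencies, as in group X
added <- c(0, 0, 1)           # assumed: all added load goes to taxon 3

# A fraction (1 - signal) of group Y's load is distributed as in group X,
# and a fraction `signal` is the load added by the simulated condition.
p_y <- (1 - signal) * p_x + signal * added

print(p_y)
print(sum(p_y))
```

Under this reading, the frequencies of group Y still sum to one, and only the differentially abundant taxon gains relative frequency.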

Data are generated in three steps.

In the first step, we generate a list of vectors of relative frequencies to sample from: only healthy subjects from the Kostic colorectal dataset are selected. Samples with fewer than 500 reads are dropped. Only OTUs that appear in 2 or more subjects are retained.

In the second step, samples for group X are generated. For each sample, a vector of frequencies is chosen at random from the list generated in the first step. The observed samples are multinomial random variables with a probability vector matching the selected frequencies, and a total number of reads realized from a Poisson distribution whose mean equals the median number of reads across the samples retained in the first step.

In the third step, samples for group Y are generated. For each sample, a vector of frequencies is chosen at random, as for group X. The frequencies of the differentially abundant taxa are increased, with the increase realized from a Poisson random variable, such that the total increase in microbial load across all differentially abundant taxa is equivalent to the signal strength specified by the user. Observed counts are sampled based on the updated frequencies.

This function requires the phyloseq package from Bioconductor.
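The steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the reference frequencies are synthetic rather than taken from the Kostic dataset, `median_reads` is a made-up depth, and the way the signal is split among the differentially abundant taxa (normalized Poisson weights) is an assumption about one way to realize the description.

```r
# Illustrative sketch only: synthetic reference data, hypothetical names.
set.seed(1)

n_taxa <- 20; n_ref <- 15          # toy dimensions (assumed)
n_X <- 5; n_Y <- 5; m1 <- 3
signal <- 0.1

# Step 1 (stand-in): one relative-frequency vector per reference sample.
ref_raw  <- matrix(rgamma(n_ref * n_taxa, shape = 0.5), nrow = n_ref)
ref_freq <- ref_raw / rowSums(ref_raw)
median_reads <- 2000               # hypothetical median number of reads

# Draw a Poisson read depth, then multinomial counts with probabilities p.
sample_counts <- function(p) {
  n_reads <- rpois(1, median_reads)
  as.vector(rmultinom(1, size = n_reads, prob = p))
}

# Step 2: group X samples use a randomly chosen reference vector as is.
counts_X <- t(sapply(seq_len(n_X), function(i) {
  sample_counts(ref_freq[sample(n_ref, 1), ])
}))

# Step 3: group Y samples have the differentially abundant taxa increased.
select_diff_abundant <- sort(sample(n_taxa, m1))
counts_Y <- t(sapply(seq_len(n_Y), function(i) {
  p <- ref_freq[sample(n_ref, 1), ]
  w <- rpois(m1, lambda = 10)      # assumed Poisson weights for the increase
  if (sum(w) == 0) w <- rep(1, m1) # guard against an all-zero draw
  p_new <- (1 - signal) * p
  p_new[select_diff_abundant] <-
    p_new[select_diff_abundant] + signal * w / sum(w)
  sample_counts(p_new)
}))

counts <- rbind(counts_X, counts_Y)
group_labels <- c(rep(0, n_X), rep(1, n_Y))
```

The resulting `counts` matrix has one row per sample and one column per taxon, matching the layout the function is documented to return.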

A list with the following entries:

`counts`	A counts matrix with (n_X + n_Y) rows and 1384 columns; rows represent samples, columns represent taxa.
`group_labels`	A vector of group labels, with values 0 and 1.
`select_diff_abundant`	A vector containing the indices of taxa that are differentially abundant.
`taxonomy`	A table of the taxonomic affiliation of OTUs in the simulated dataset.
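A returned list with these entries could be sanity-checked along the following lines. The helper `check_two_sample_data` is hypothetical and not part of dacomp, and the checks assume that `select_diff_abundant` holds 1-based column indices into `counts`. The helper is demonstrated on a small hand-built stand-in so that it runs without dacomp or phyloseq installed.

```r
# Hypothetical helper, not part of dacomp: checks that the entries of the
# list are mutually consistent, based on the descriptions above.
check_two_sample_data <- function(d) {
  stopifnot(
    is.list(d),
    all(c("counts", "group_labels", "select_diff_abundant", "taxonomy")
        %in% names(d)),
    nrow(d$counts) == length(d$group_labels),       # one label per sample
    all(d$group_labels %in% c(0, 1)),
    all(d$select_diff_abundant >= 1),               # assumed 1-based indices
    all(d$select_diff_abundant <= ncol(d$counts))
  )
  invisible(TRUE)
}

# Hand-built stand-in with the same entry names (toy sizes, made-up values).
toy <- list(
  counts = matrix(rpois(4 * 6, lambda = 5), nrow = 4, ncol = 6),
  group_labels = c(0, 0, 1, 1),
  select_diff_abundant = c(2, 5),
  taxonomy = data.frame(OTU = paste0("OTU", 1:6))
)

ok <- check_two_sample_data(toy)
```

The same call could be applied to the output of `dacomp.generate_example_dataset.two_sample` in a session where the package is available.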

Brill, Barak, Amnon Amir, and Ruth Heller. 2019. Testing for Differential Abundance in Compositional Counts Data, with Application to Microbiome Studies. arXiv Preprint arXiv:1904.08937.

Kostic, Aleksandar D, Dirk Gevers, Chandra Sekhar Pedamallu, Monia Michaud, Fujiko Duke, Ashlee M Earl, Akinyemi I Ojesina, et al. 2012. Genomic Analysis Identifies Association of Fusobacterium with Colorectal Carcinoma. Genome Research 22 (2). Cold Spring Harbor Lab: 292–98.

McMurdie, Paul J, and Susan Holmes. 2013. Phyloseq: An R Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data. PloS One 8 (4). Public Library of Science: e61217.

## Not run: 
library(dacomp)

set.seed(1)
data = dacomp.generate_example_dataset.two_sample(
  m1 = 100,
  n_X = 50,
  n_Y = 50,
  signal_strength_as_change_in_microbial_load = 0.1
)

## End(Not run)

barakbri/dacomp documentation built on June 17, 2021, 11:20 p.m.
