M3D_Simulations: Make Simulated Data

bg__MakeSimData	R Documentation

Make Simulated Data

Description

Simulates count data from a negative binomial distribution, with zero inflation governed by the Michaelis-Menten equation.

Usage

	bg__MakeSimData(dispersion_fun=bg__default_mean2disp, n_cells=300, dispersion_factor=1, base_means=10^rnorm(25000, 1, 1), K=10.3)
	bg__MakeSimDE(dispersion_fun=bg__default_mean2disp, fold_change=10, frac_change=0.1, n_cells=300, sub_pop=0.5, dispersion_factor=1, base_means=10^rnorm(25000,1,1), K=10.3)
	bg__MakeSimDVar(dispersion_fun=bg__default_mean2disp, fold_change=10, frac_change=0.1, n_cells=300, sub_pop=0.5, dispersion_factor=1, base_means=10^rnorm(25000,1,1), K=10.3)
	bg__MakeSimHVar(dispersion_fun=bg__default_mean2disp, fold_change=10, frac_change=0.1, n_cells=300, dispersion_factor=1, base_means=10^rnorm(25000,1,1), K=10.3)

Arguments

dispersion_fun

a function that takes mean expression and returns the dispersion parameter of the negative binomial distribution.

n_cells

total number of cells (columns) in the simulated dataset.

sub_pop

proportion of cells with changed expression.

frac_change

proportion of genes with changed expression.

fold_change

fold change in dispersion or mean expression.

dispersion_factor

a factor that multiplies the calculated mean-specific dispersion for all genes.

base_means

a vector of background mean expression values.

K

the K parameter of the Michaelis-Menten function used to add zeros.
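
How dispersion_fun and dispersion_factor combine into the size argument of rnbinom can be sketched as follows. This is a minimal illustration, not package code: my_mean2disp is a hypothetical stand-in for a user-supplied function (it is not bg__default_mean2disp), and the sketch assumes the function accepts a vector of means.

```r
# Hypothetical user-supplied dispersion function (illustration only).
my_mean2disp <- function(mu) {
  1 / sqrt(mu)
}

mu <- c(1, 10, 100)
dispersion_factor <- 2

# dispersion_factor scales the mean-specific dispersion of every gene.
disp <- dispersion_factor * my_mean2disp(mu)

# R's rnbinom() takes size = 1 / dispersion.
size <- 1 / disp

# Variance of the negative binomial in this parameterization:
# mu + mu^2 / size, which equals mu + disp * mu^2.
nb_var <- mu + mu^2 / size
```

A larger dispersion_factor therefore lowers size and raises the variance of every gene at a fixed mean.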

Details

Generates simulated single-cell gene expression data using a zero-inflated negative binomial distribution. A user-supplied function relates the dispersion parameter (1/size in the R parameterization of the negative binomial distribution) to the mean expression of each gene. Zeros are added based on a Michaelis-Menten function.
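
The generative model described above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the M3Drop implementation: sim_zinb_sketch and toy_mean2disp are hypothetical names, and the dropout rule P(zero) = 1 - mu / (K + mu) is an assumed form of the Michaelis-Menten function.

```r
# Illustrative sketch only: this is NOT the M3Drop implementation.
sim_zinb_sketch <- function(base_means, n_cells, mean2disp, K,
                            dispersion_factor = 1) {
  n_genes <- length(base_means)
  # Dispersion is 1/size in R's negative binomial parameterization.
  disp <- dispersion_factor * mean2disp(base_means)
  # matrix() fills column by column, so repeating the per-gene vectors
  # once per cell gives every row (gene) its own mean and size.
  counts <- matrix(
    rnbinom(n_genes * n_cells,
            size = rep(1 / disp, times = n_cells),
            mu = rep(base_means, times = n_cells)),
    nrow = n_genes, ncol = n_cells
  )
  # Assumed dropout rule: P(zero) = 1 - mu / (K + mu).
  p_drop <- 1 - base_means / (K + base_means)
  dropped <- rbinom(n_genes * n_cells, size = 1,
                    prob = rep(p_drop, times = n_cells)) == 1
  counts[dropped] <- 0
  counts
}

# Placeholder dispersion function, not fitted to any data.
toy_mean2disp <- function(mu) 0.5 + 1 / mu

set.seed(42)
means <- c(1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000)
sim <- sim_zinb_sketch(means, n_cells = 200,
                       mean2disp = toy_mean2disp, K = 10.3)
zero_frac <- rowMeans(sim == 0)  # fraction of zeros per gene
```

In this sketch, genes with low mean expression end up with many more zeros than highly expressed genes. Zeroing entries also lowers each gene's observed mean below base_means; whether and how the package compensates for that is not shown here.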

Default values of base_means, K, and dispersion_fun were fit to the Buettner et al. 2015 data [1].

bg__MakeSimData generates simulated single-cell data for a single homogeneous population.

bg__MakeSimDE generates simulated single-cell data for two different populations where a proportion of genes have a fold_change difference in the mean for population "2".

bg__MakeSimDVar generates simulated single-cell data for two different populations where a proportion of genes have a fold_change difference in the dispersion for population "2".

bg__MakeSimHVar generates simulated single-cell data for a single homogeneous population where a proportion of genes have a fold_change increase in dispersion over the expectation given the mean expression of the gene.
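
How sub_pop, frac_change and fold_change could define a two-population design is sketched below for the mean-expression case. The documentation does not state how changed genes are chosen or how cells are ordered, so the random selection and the cell ordering here are assumptions made for illustration; all variable names are local to the sketch.

```r
set.seed(1)
base_means <- c(1, 2, 5, 10, 20, 50, 100, 200, 500, 1000)
n_cells <- 20
sub_pop <- 0.25      # proportion of cells in the changed population
frac_change <- 0.2   # proportion of genes that change
fold_change <- 10

# Label cells: "1" = unchanged population, "2" = changed population.
n_changed_cells <- round(sub_pop * n_cells)
cell_labels <- c(rep(1, n_cells - n_changed_cells),
                 rep(2, n_changed_cells))

# Pick the genes that change (assumed: chosen at random).
n_changed_genes <- round(frac_change * length(base_means))
TP <- sort(sample(length(base_means), n_changed_genes))

# Population "2" uses shifted means for the changed genes only.
means_pop2 <- base_means
means_pop2[TP] <- base_means[TP] * fold_change
```

For a change in dispersion rather than mean, the same pattern would apply with fold_change multiplying the dispersion of the selected genes instead of their means.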

Value

bg__MakeSimData : a gene expression matrix where rows are genes and columns are cells.

bg__MakeSimDE, bg__MakeSimDVar, bg__MakeSimHVar : a list of three named items:

	data : the gene expression matrix where rows are genes and columns are cells.
	cell_labels : a vector of 1 or 2 indicating which cells belong to the unchanged ("1") or changed ("2") population.
	TP : a vector of row IDs of the genes that change (true positives).
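
A short sketch of working with the returned list follows. It uses a hand-made stand-in with the documented structure rather than real package output, and it assumes TP holds integer row indices; if TP holds row names instead, the matching line would need adjusting.

```r
# Toy stand-in with the documented structure; values are made up and
# are not output from bg__MakeSimDE.
sim <- list(
  data = matrix(c(5, 4,  6,  5,
                  2, 3, 20, 30,
                  9, 8,  9, 10,
                  1, 1, 10, 12),
                nrow = 4, byrow = TRUE),
  cell_labels = c(1, 1, 2, 2),
  TP = c(2, 4)  # assumed here to be integer row indices
)

# Split the matrix by population label.
pop1 <- sim$data[, sim$cell_labels == 1, drop = FALSE]
pop2 <- sim$data[, sim$cell_labels == 2, drop = FALSE]

# Per-gene log2 fold change; the pseudocount of 1 avoids log2(0).
log2_fc <- log2((rowMeans(pop2) + 1) / (rowMeans(pop1) + 1))

# Flag the true-positive genes for comparison against a method's calls.
is_tp <- seq_len(nrow(sim$data)) %in% sim$TP
```

The is_tp vector can then be compared against the genes a differential expression method reports, for example to count true and false positives.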

References

[1] Buettner et al. (2015) Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nature Biotechnology 33: 155-160.

Examples

#  means = c(1,2,5,10,20,50,100,200,500,1000,2000,5000)
#  population1 <- bg__MakeSimData(n_cells=10, base_means=means)
#  population2 <- bg__MakeSimData(n_cells=10, base_means=means*2, dispersion_factor=0.5)
#  sim_DE <- bg__MakeSimDE(n_cells=100, base_means=means)
#  sim_DVar <- bg__MakeSimDVar(n_cells=100, sub_pop=0.25, base_means=means)
#  sim_HVar <- bg__MakeSimHVar(base_means=means, fold_change=3)

tallulandrews/M3Drop documentation built on March 6, 2024, 1:49 a.m.