simulate_hierarchicell_continuous: Simulate Expression Data for a Continuous Measure
In kdzimm/hierarchicell: Simulating Cell-Type Specific and Hierarchical Single-Cell RNA-Seq Data

Description Usage Arguments Details Value Note Examples

View source: R/03_simulate_count_matrix.R

This function simulates data by borrowing information from the input data (or the package default data) under a variety of pre-determined conditions. These conditions include the correlation between fold change and the continuous measure of interest, the number of genes, the number of samples (i.e., independent experimental units), and the mean number of cells per individual. The simulation incorporates cell-wise dropout rates and library sizes from the unnormalized data, and gene-wise grand means, gene-wise dropout rates, inter-individual variance, and intra-individual variance from the normalized data.

simulate_hierarchicell_continuous(
  data_summaries,
  n_genes = 1000,
  n_individuals = 3,
  cells_per_individual = 100,
  ncells_variation_type = "Poisson",
  rho = 1,
  continuous_mean = 0,
  continuous_sd = 1,
  decrease_dropout = 0,
  tSNE_plot = FALSE
)

`data_summaries`	an R object that has been output by the package's compute_data_summaries function. No default
`n_genes`	an integer. The number of genes you would like to simulate for your dataset. Too large a number may cause memory failure and may slow the simulation down tremendously. We recommend an integer less than 40,000. Defaults to 1,000.
`n_individuals`	an integer. The number of independent samples for simulation. If not specifying a foldchange, the number of cases and controls does not matter. Defaults to 3.
`cells_per_individual`	an integer. The mean number of cells per individual you would like to simulate. Too large a number may cause memory failure and may slow the simulation down tremendously. We recommend an integer less than 300, but more is possible. We note that anything greater than 100 brings only marginal improvements in power. Defaults to 100.
`ncells_variation_type`	either "Poisson", "NB", or "Fixed". Allows the number of cells per individual to be fixed at exactly the specified number of cells per individual ("Fixed"), to vary slightly following a Poisson distribution with lambda equal to the specified number of cells per individual ("Poisson"), or to follow a negative binomial distribution with mean equal to the specified number of cells and dispersion size equal to one ("NB"). Defaults to "Poisson".
`rho`	a number between -1 and 1. The amount of correlation between fold change and the continuous measure of interest. Defaults to 1.
`continuous_mean`	a number. The mean for your continuous measure of interest. Assumes a normal distribution. Defaults to 0.
`continuous_sd`	a number. The standard deviation for your continuous measure of interest. Assumes a normal distribution. Defaults to 1.
`decrease_dropout`	a numeric proportion between 0 and 1. The proportion by which you would like to simulate decreasing the amount of dropout in your data. For example, if you would like to simulate a decrease in the amount of dropout in your data by twenty percent, then 0.2 would be appropriate. This component of the simulation allows the user to adjust the proportion of dropout if they believe future experiments or runs will have improved calling rates (due to improved methods or improved cell viability) and thereby lower dropout rates. Defaults to 0.
`tSNE_plot`	a logical (TRUE/FALSE). Whether to output a tSNE plot for observing the global behavior of your simulated data. Seurat must be installed for this option to work. Defaults to FALSE.
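
The three options described for ncells_variation_type can be illustrated with base R alone. The following is a minimal sketch, not the package's internal code: the helper name draw_cells_per_individual is hypothetical, and it only assumes the distributions as the argument description states them (Poisson with lambda equal to the mean, negative binomial with that mean and size equal to one, or a fixed count).

```r
# Hypothetical helper, not part of hierarchicell: draws a number of cells
# for each individual under the three documented variation types.
draw_cells_per_individual <- function(n_individuals, cells_per_individual,
                                      ncells_variation_type = "Poisson") {
  switch(
    ncells_variation_type,
    # Every individual gets exactly the requested number of cells
    Fixed = rep(as.integer(cells_per_individual), n_individuals),
    # Counts vary slightly around the requested mean
    Poisson = rpois(n_individuals, lambda = cells_per_individual),
    # Same mean, but much larger spread (dispersion size of one)
    NB = rnbinom(n_individuals, size = 1, mu = cells_per_individual),
    stop("ncells_variation_type must be \"Poisson\", \"NB\", or \"Fixed\"")
  )
}

set.seed(1)  # for reproducibility of the random draws
fixed_counts <- draw_cells_per_individual(3, 100, "Fixed")
poisson_counts <- draw_cells_per_individual(3, 100, "Poisson")
nb_counts <- draw_cells_per_individual(3, 100, "NB")
```

Comparing poisson_counts with nb_counts for the same mean shows why "NB" is the more variable choice: with size equal to one, the negative binomial variance is far larger than the Poisson variance.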

Prior to running the simulate_hierarchicell_continuous function, it is important to run the filter_counts function followed by the compute_data_summaries function to build an R object in the format that this simulation function expects.

A data.frame of the simulated data or potentially a pdf of a tSNE plot (if tSNE_plot=TRUE).

Data should contain only cells of the specific cell type you are interested in simulating or computing power for. Data should also contain as many unique sample identifiers as possible. If you are inputting data that has fewer than 5 unique values for sample identifier (i.e., independent experimental units), then the empirical estimation of the inter-individual heterogeneity is going to be very unstable. Finding a dataset with enough unique samples may be difficult at this time, but over time (as experiments grow in sample size and the number of publicly available single-cell RNA-seq datasets increases), this should improve dramatically.
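
Before building data summaries, it can help to count the unique sample identifiers in your input. This is a small sketch in base R; the vector sample_id is a hypothetical stand-in for the per-cell sample identifiers in your own data, whose actual column name and layout are not specified here.

```r
# Hypothetical per-cell sample identifiers (one entry per cell);
# replace with the identifier column from your own data.
sample_id <- c("ind1", "ind1", "ind2", "ind2", "ind3", "ind3", "ind4")

# Number of independent experimental units and cells contributed by each
n_samples <- length(unique(sample_id))
cells_per_sample <- table(sample_id)

# Flag inputs that fall below the threshold mentioned in the note above
if (n_samples < 5) {
  message(sprintf(
    "Only %d unique sample identifiers: inter-individual estimates may be unstable.",
    n_samples
  ))
}
```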

clean_expr_data <- filter_counts()
data_summaries <- compute_data_summaries(clean_expr_data)
simulated_counts <- simulate_hierarchicell_continuous(data_summaries)

kdzimm/hierarchicell documentation built on Dec. 21, 2021, 5:23 a.m.
