hierarchicell_simulation: Simulate Datasets by hierarchicell

View source: R/26-hierarchicell.R

hierarchicell_simulation    R Documentation

Simulate Datasets by hierarchicell

Description

This function simulates datasets from learned parameters using the simulate_hierarchicell function in the hierarchicell package.

Usage

hierarchicell_simulation(
  parameters,
  other_prior = NULL,
  return_format,
  verbose = FALSE,
  seed
)

Arguments

parameters

An object generated by hierarchicell::compute_data_summaries(). In the Examples below, this is the estimate_result element returned by hierarchicell_estimation.

other_prior

A named list of additional parameters. Some methods require extra parameters for the estimation step, and these must be supplied here. For the simulation step, settings such as the number of cells, genes, groups, and batches, and the percentage of DEGs, are usually customized, so specify them here before simulating a dataset. See Details below for more information.

return_format

A character string. One of: list, SingleCellExperiment, Seurat, h5ad. If you select h5ad, the function returns the path where the .h5ad file is saved.

verbose

Logical. Whether to print messages during execution.

seed

A random seed.
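
As a minimal sketch of how other_prior might be assembled before calling the function, the snippet below builds the list from the three settings named in Details (nCells, nGenes, fc.group) and checks it for unexpected names. The helper check_other_prior is hypothetical and is not part of simmethods or hierarchicell; its set of allowed names is an assumption drawn from this page only, and the real function may accept other entries.

```r
# Hypothetical helper (NOT part of simmethods or hierarchicell): check that
# every entry in `other_prior` uses a name mentioned on this help page.
# The `allowed` vector is an assumption, not the package's own validation.
check_other_prior <- function(other_prior,
                              allowed = c("nCells", "nGenes", "fc.group")) {
  if (is.null(other_prior)) {
    return(invisible(NULL))  # NULL means "use the defaults"
  }
  stopifnot(is.list(other_prior))
  unknown <- setdiff(names(other_prior), allowed)
  if (length(unknown) > 0) {
    stop("Unrecognized other_prior entries: ", paste(unknown, collapse = ", "))
  }
  invisible(other_prior)
}

# The same settings used in the second example under Examples
other_prior <- list(nCells = 2000, nGenes = 4000, fc.group = 4)
check_other_prior(other_prior)
```

A list that passes this check can then be supplied as the other_prior argument in the same way as in the Examples section.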

Details

In hierarchicell, users can set nCells, nGenes and fc.group directly through other_prior. Note the following:

  1. hierarchicell can only simulate two groups.

  2. Some cells in the result may contain NA values across all genes because the GLM fitting failed.

  3. hierarchicell does not return DEG information, so it is not possible to know which genes are DEGs.

For more information, see Examples and hierarchicell::simulate_hierarchicell.
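
The NA cells described in note 2 can also be removed with base R alone. The sketch below uses a small toy matrix in place of a real simulation result and assumes, as in the Examples, that genes are rows and cells are columns. It drops only those cells whose values are NA for every gene.

```r
# Toy count matrix standing in for a simulated result (assumed layout:
# genes in rows, cells in columns). cell2 is NA across all genes.
counts <- matrix(
  c(1, 0, 3,
    NA, NA, NA,
    2, 5, 0,
    4, 1, 1),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# A cell is flagged when the number of NA entries equals the number of genes
na_cells <- colSums(is.na(counts)) == nrow(counts)

# Keep the remaining cells; drop = FALSE preserves the matrix structure
filter_counts <- counts[, !na_cells, drop = FALSE]
dim(filter_counts)
```

Unlike tidyr::drop_na on the transposed data, which removes a cell if any gene is NA, this version removes a cell only when all genes are NA. The two approaches agree when NA values occur only as whole-cell failures.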

References

Zimmerman K D, Langefeld C D. Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data. BMC Genomics, 2021, 22(1): 1-8. https://doi.org/10.1186/s12864-021-07635-w

GitHub URL: https://github.com/kdzimm/hierarchicell

Examples

## Not run: 
ref_data <- SingleCellExperiment::counts(scater::mockSCE())
## estimation
estimate_result <- simmethods::hierarchicell_estimation(
  ref_data = ref_data,
  other_prior = NULL,
  verbose = TRUE,
  seed = 111
)

# 1) Simulate with default parameters
simulate_result <- simmethods::hierarchicell_simulation(
  parameters = estimate_result[["estimate_result"]],
  other_prior = NULL,
  return_format = "list",
  verbose = TRUE,
  seed = 111
)
## counts
counts <- simulate_result[["simulate_result"]][["count_data"]]
dim(counts)

# 2) Customized parameters: number of cells and genes, fold change of DEGs.
# (hierarchicell does not report which genes are DEGs.) Note that some cells may
# contain NA values across all genes because the GLM fitting failed.
simulate_result <- simmethods::hierarchicell_simulation(
  parameters = estimate_result[["estimate_result"]],
  other_prior = list(nCells = 2000,
                     nGenes = 4000,
                     fc.group = 4),
  return_format = "list",
  verbose = TRUE,
  seed = 111
)

## counts
counts <- simulate_result[["simulate_result"]][["count_data"]]
dim(counts)
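## Remove NA cells (requires tidyr)
if (!requireNamespace("tidyr", quietly = TRUE)) {
  utils::install.packages("tidyr")
}
# Cells are columns: transpose so cells become rows, drop rows containing NA,
# then transpose back to genes x cells
filter_counts <- as.matrix(t(tidyr::drop_na(as.data.frame(t(counts)))))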
dim(filter_counts)

## End(Not run)


duohongrui/simmethods documentation built on June 17, 2024, 10:49 a.m.