SparseDC_simulation: Simulate Datasets by SparseDC

View source: R/22-SparseDC.R

Description

This function simulates datasets from learned parameters using the sparseDCSimulate function in the Splatter package.

Usage

SparseDC_simulation(
  parameters,
  other_prior = NULL,
  return_format,
  verbose = FALSE,
  seed
)

Arguments

parameters

An object generated by splatter::sparseDCEstimate().

other_prior

A named list of additional parameters. Some methods need extra parameters to run the estimation step, so you must supply them. In the simulation step, the number of cells, genes, groups, and batches and the percentage of DEGs are usually customized, so you must specify them before simulating a dataset. See Details below for more information.

return_format

A character string. Available choices: list, SingleCellExperiment, Seurat, h5ad. If you select h5ad, the function returns the path where the .h5ad file is saved.

verbose

Logical. Whether to print messages during execution.

seed

A random seed.

Details

In SparseDC, users can only set nCells and nGenes to specify the number of cells and genes. Note that the total number of cells equals nCells multiplied by the nclusters value that users defined in the estimation step (nclusters is 2 by default).
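
The relationship above can be checked with a little arithmetic before running a simulation. The following is a minimal sketch that takes the stated rule (total cells = nCells × nclusters) at face value; the helper names expected_total_cells and nCells_for_total are hypothetical and are not part of simmethods or splatter.

```r
# Hypothetical helpers for illustration only; not part of simmethods or splatter.
# Assumption: total cells = nCells * nclusters, as stated in Details.

# Total number of cells expected for a given nCells setting.
expected_total_cells <- function(nCells, nclusters = 2) {
  nCells * nclusters
}

# nCells value to request in order to reach a target total.
nCells_for_total <- function(total_cells, nclusters = 2) {
  if (total_cells %% nclusters != 0) {
    stop("total_cells must be divisible by nclusters")
  }
  total_cells / nclusters
}

expected_total_cells(500)                 # 1000
expected_total_cells(500, nclusters = 3)  # 1500
nCells_for_total(1000)                    # 500
```

If the dimensions returned by a real simulation do not match this arithmetic, trust the simulated object and consult splatter::SparseDCParams().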

For less commonly used parameters and further instructions, see Examples and splatter::SparseDCParams().

References

Barron M, Zhang S, Li J. A sparse differential clustering algorithm for tracing cell type changes via single-cell RNA-sequencing data. Nucleic acids research, 2018, 46(3): e14-e14. https://doi.org/10.1093/nar/gkx1113

CRAN URL: https://cran.rstudio.com/web/packages/SparseDC/index.html

Examples

## Not run: 
ref_data <- SingleCellExperiment::counts(scater::mockSCE())
## cell groups
set.seed(111)
group_condition <- sample(1:2, ncol(ref_data), replace = TRUE)
## estimation
estimate_result <- simmethods::SparseDC_estimation(
  ref_data = ref_data,
  other_prior = list(group.condition = group_condition),
  verbose = TRUE,
  seed = 111
)
## Note that SparseDC assumes 2 clusters are present in the dataset by default.
## Users can supply another number if the estimation step fails.
estimate_result <- simmethods::SparseDC_estimation(
  ref_data = ref_data,
  other_prior = list(group.condition = group_condition,
                     nclusters = 3),
  verbose = TRUE,
  seed = 111
)

# 1) Simulate with default parameters
simulate_result <- simmethods::SparseDC_simulation(
  parameters = estimate_result[["estimate_result"]],
  other_prior = NULL,
  return_format = "list",
  verbose = TRUE,
  seed = 111
)
## counts
counts <- simulate_result[["simulate_result"]][["count_data"]]
dim(counts)

# 2) Simulate 1000 cells and 2000 genes
## Note that SparseDC assumes 2 clusters are present in the dataset by default,
## so we only need to set nCells = 500.
length(estimate_result[["estimate_result"]]@clusts.c1)
simulate_result <- simmethods::SparseDC_simulation(
  parameters = estimate_result[["estimate_result"]],
  other_prior = list(nCells = 500,
                     nGenes = 2000),
  return_format = "list",
  verbose = TRUE,
  seed = 111
)

## counts
counts <- simulate_result[["simulate_result"]][["count_data"]]
dim(counts)

## End(Not run)


duohongrui/simmethods documentation built on June 17, 2024, 10:49 a.m.