powsimR_simulation: Simulate Datasets by powsimR
In duohongrui/simmethods: A collection of 40+ simulation methods for singlce-cell RNA seqencing data

powsimR_simulation

R Documentation

Simulate Datasets by powsimR

Description

This function is used to simulate datasets from learned parameters by simulateDE function in powsimR package.

Usage

powsimR_simulation(
  parameters,
  other_prior = NULL,
  return_format,
  verbose = FALSE,
  seed
)

Arguments

`parameters`	A object generated by `powsimR::estimateParam()` or `powsimR::estimateSpike()`.
`other_prior`	A list with names of certain parameters. Some methods need extra parameters to execute the estimation step, so you must input them. In simulation step, the number of cells, genes, groups, batches, the percent of DEGs are usually customed, so before simulating a dataset you must point it out. See `Details` below for more information.
`return_format`	A character. Alternative choices: list, SingleCellExperiment, Seurat, h5ad. If you select `h5ad`, you will get a path where the .h5ad file saves to.
`verbose`	Logical. Whether to return messages or not.
`seed`	A random seed.

Details

powsimR provides some choices for users to select suitable parameters according to different normalization methods, and methods for differential expressed analysis.

Normalisation. "TMM" (default), "MR", "PosCounts", "UQ", "scran", "Linnorm", "sctransform", "SCnorm", "Census", "depth".
DEmethod. "T-Test", "edgeR-LRT", "edgeR-QL", "edgeR-zingeR", "edgeR-ZINB-WaVE", "limma-voom", "limma-trend" (default), "DESeq2", "DESeq2-zingeR", "DESeq2-ZINB-WaVE", "ROTS", "baySeq", "NOISeq", "EBSeq", "MAST", "BPSC", "scDD", "DECENT".

In addtion to simulate datasets with default parameters, users want to simulate other kinds of datasets, e.g. a counts matrix with 2 or more cell groups. In powsimR, you can set extra parameters to simulate datasets.

The customed parameters you can set are below:

nCells. In powsimR, You can directly set other_prior = list(nCells = 5000) to simulate 5000 cells.
nGenes. You can directly set other_prior = list(nGenes = 5000) to simulate 5000 genes.
de.prob. You can directly set other_prior = list(de.prob = 0.2) to simulate DEGs that account for 20 percent of all genes.
prob.group. You can directly set other_prior = list(prob.group = c(0.4, 0.6)) to assign two proportions of cell groups. Note that the the length of the vector must be 2.
fc.group. You can directly set other_prior = list(fc.group = 2) to specify the fold change of DEGs.
nBatches. You can not directly set other_prior = list(nBatches = 3) to simulate 3 batches. Instead, you should set other_prior = list(prob.batch = c(0.3, 0.4, 0.3)) to reach the goal.
fc.batch. You can directly set other_prior = list(fc.batch = 2) to specify the fold change of genes between batches.

For more customed parameters in powsimR, please check powsimR::simulateDE().

References

Vieth B, Ziegenhain C, Parekh S, et al. powsimR: power analysis for bulk and single cell RNA-seq experiments. Bioinformatics, 2017, 33(21): 3486-3488. https://doi.org/10.1093/bioinformatics/btx435

Github URL: https://github.com/bvieth/powsimR

Examples

## Not run: 
ref_data <- simmethods::data

## Estimate parameters with ERCC spike-in
### Make sure there are ERCC names in reference data
rownames(ref_data)[grep(rownames(ref_data), pattern = "^ERCC")]
### Users must input the dilution.factor and volume (nanoliter) to determine the ERCC molecules
estimate_result <- powsimR_estimation(
  ref_data = ref_data,
  other_prior = list(RNAseq = "singlecell",
                     Protocol = "UMI",
                     Normalisation = "scran",
                     dilution.factor = 50000,
                     volume = 1),
  verbose = TRUE,
  seed = 111)

# (1) Simulate 500 cells and 2000 genes
simulate_result <- simmethods::powsimR_simulation(
  parameters = estimate_result[["estimate_result"]],
  other_prior = list(nCells = 500,
                     nGenes = 2000),
  return_format = "list",
  verbose = TRUE,
  seed = 111)
count_data <- simulate_result[["simulate_result"]][["count_data"]]
dim(count_data)
## In powsimR, users always get two groups of cells, and the numbers of cells in
## different groups are the same (default, prob.group = c(0.5, 0.5)).
col_data <- simulate_result[["simulate_result"]][["col_meta"]]
table(col_data$group)
## row_data
row_data <- simulate_result[["simulate_result"]][["row_meta"]]
table(row_data$de_gene)[2]/2000

# (2) Simulate two groups with 20% of DEGs and 4 fold change.
simulate_result <- simmethods::powsimR_simulation(
  parameters = estimate_result[["estimate_result"]],
  other_prior = list(prob.group = c(0.3, 0.7),
                     de.prob = 0.2,
                     fc.group = 4),
  return_format = "list",
  verbose = TRUE,
  seed = 111)
count_data <- simulate_result[["simulate_result"]][["count_data"]]
dim(count_data)
## cell information
col_data <- simulate_result[["simulate_result"]][["col_meta"]]
table(col_data$group)
## gene information
row_data <- simulate_result[["simulate_result"]][["row_meta"]]
table(row_data$de_gene)[2]/4000


# (3) Simulate two batches (2 fold change)
simulate_result <- simmethods::powsimR_simulation(
  parameters = estimate_result[["estimate_result"]],
  other_prior = list(prob.batch = c(0.4, 0.6),
                     fc.batch = 2),
  return_format = "list",
  verbose = TRUE,
  seed = 111)
count_data <- simulate_result[["simulate_result"]][["count_data"]]
dim(count_data)
## cell information
col_data <- simulate_result[["simulate_result"]][["col_meta"]]
table(col_data$group)
table(col_data$batch)
## gene information
row_data <- simulate_result[["simulate_result"]][["row_meta"]]
head(row_data)
table(row_data$de_fc)
table(row_data$batch_genes)

## End(Not run)

duohongrui/simmethods documentation built on June 17, 2024, 10:49 a.m.

duohongrui/simmethods index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

duohongrui/simmethods
A collection of 40+ simulation methods for singlce-cell RNA seqencing data

powsimR_simulation: Simulate Datasets by powsimR
In duohongrui/simmethods: A collection of 40+ simulation methods for singlce-cell RNA seqencing data

Simulate Datasets by powsimR

Description

Usage

Arguments

Details

References

Examples

Related to powsimR_simulation in duohongrui/simmethods...

R Package Documentation

Browse R Packages

We want your feedback!

duohongrui/simmethods A collection of 40+ simulation methods for singlce-cell RNA seqencing data

powsimR_simulation: Simulate Datasets by powsimR In duohongrui/simmethods: A collection of 40+ simulation methods for singlce-cell RNA seqencing data

Simulate Datasets by powsimR

Description

Usage

Arguments

Details

References

Examples

Related to powsimR_simulation in duohongrui/simmethods...

R Package Documentation

Browse R Packages

We want your feedback!

duohongrui/simmethods
A collection of 40+ simulation methods for singlce-cell RNA seqencing data

powsimR_simulation: Simulate Datasets by powsimR
In duohongrui/simmethods: A collection of 40+ simulation methods for singlce-cell RNA seqencing data