scGAN_estimation: Estimate Parameters From Real Datasets by scGAN

View source: R/38-scGAN.R

scGAN_estimation    R Documentation

Estimate Parameters From Real Datasets by scGAN

Description

This function estimates parameters from a real dataset by running scGAN in a Docker container.

Usage

scGAN_estimation(ref_data, other_prior = NULL, verbose = FALSE, seed)

Arguments

ref_data

A count matrix. Each row represents a gene and each column represents a cell.
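As a minimal sketch of the expected layout, the toy matrix below has genes in rows and cells in columns. The gene names, cell names, and Poisson counts are made up for illustration and are not part of the package.

```r
# Toy count matrix: 200 genes (rows) x 50 cells (columns).
set.seed(1)
n_genes <- 200
n_cells <- 50
ref_data <- matrix(
  rpois(n_genes * n_cells, lambda = 2),
  nrow = n_genes,
  ncol = n_cells,
  dimnames = list(
    paste0("Gene", seq_len(n_genes)),  # hypothetical gene names
    paste0("Cell", seq_len(n_cells))   # hypothetical cell names
  )
)

# If the data are stored as cells x genes, transpose them first
# so that rows are genes and columns are cells.
cells_by_genes <- t(ref_data)
ref_data_fixed <- t(cells_by_genes)
```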

other_prior

A list of named parameters. Some methods need extra parameters for the estimation step, and these must be supplied here. In the simulation step, the number of cells, genes, groups, and batches, the percentage of DEGs, and other variables are usually customized, so they must be specified before simulating a dataset.

verbose

Logical.

seed

An integer used as the random seed.

Details

scGAN simulates single-cell RNA-seq datasets using generative adversarial neural networks and can only be run via Docker images. The scGAN_estimation and scGAN_simulation functions wrap the required steps so that users can run scGAN from within R. Users should note the following:

  1. Install Docker on your device or remote server.

  2. The estimation step may take a long time because scGAN trains neural networks on the reference data.

  3. The estimation result is returned as a file path. This path is the mount point on the host that is connected to a path inside the Docker container, and users can open it to inspect the training results.
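Because the function depends on Docker, it can help to confirm that Docker is reachable before starting a long training run. The check below is only a sketch using base R; it is an assumption about a reasonable pre-flight check and is not something scGAN_estimation is documented to do itself.

```r
# Check that the docker client is on the PATH.
docker_path <- Sys.which("docker")
docker_found <- nzchar(docker_path)[[1]]

# If the client exists, query the daemon; exit status 0 means it responded.
docker_running <- FALSE
if (docker_found) {
  status <- suppressWarnings(
    system2("docker", "info", stdout = FALSE, stderr = FALSE)
  )
  docker_running <- identical(status, 0L)
}

if (!docker_running) {
  message("Docker is not available; scGAN_estimation() cannot run here.")
}
```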

There are some parameters that users may often set:

  1. group.condition. Cell group information, supplied as a numeric vector. If it is not supplied, clustering is performed before the estimation step.

  2. max_steps. The maximum number of training steps on the reference data. Default is 1000000.

  3. GPU. The number of GPU cores to use for training. This can be set to all. Default is 1.

  4. min_cells. Include features detected in at least this many cells when preprocessing.

  5. min_genes. Include cells where at least this many features are detected when preprocessing.

  6. res. The clustering resolution. Default is 0.15.
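The parameters above are passed through other_prior. The following is a sketch only: the toy data, the chosen values for max_steps, min_cells, and min_genes, and the seed are illustrative assumptions, not recommended or documented settings. The actual call is guarded by a hypothetical run_estimation flag so that nothing is launched unless Docker and simmethods are installed.

```r
# Toy reference data: 200 genes (rows) x 50 cells (columns).
set.seed(1)
ref_data <- matrix(rpois(200 * 50, lambda = 2), nrow = 200, ncol = 50)
rownames(ref_data) <- paste0("Gene", seq_len(nrow(ref_data)))
colnames(ref_data) <- paste0("Cell", seq_len(ncol(ref_data)))

# Optional settings, using the parameter names listed above.
other_prior <- list(
  group.condition = rep(c(1, 2), each = 25),  # one numeric label per cell
  max_steps = 1000,  # illustrative; far below the default of 1000000
  GPU = 1,           # the documented default
  min_cells = 3,     # illustrative threshold, not a documented default
  min_genes = 10,    # illustrative threshold, not a documented default
  res = 0.15         # documented default clustering resolution
)

# Hypothetical guard: set to TRUE only where Docker and simmethods exist.
run_estimation <- FALSE
if (run_estimation && requireNamespace("simmethods", quietly = TRUE)) {
  estimate_result <- simmethods::scGAN_estimation(
    ref_data = ref_data,
    other_prior = other_prior,
    verbose = TRUE,
    seed = 111
  )
  # Inspect the top level of the returned list.
  str(estimate_result, max.level = 1)
}
```

Since group.condition is supplied here, the clustering step described above would presumably be skipped, so res is included only to show where it goes.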

Value

A list containing the estimated parameters and the results of execution detection.

References

Marouf M, Machart P, Bansal V, et al. Realistic in silico generation and augmentation of single-cell RNA-seq data using generative adversarial networks. Nature Communications, 2020, 11(1): 1-12.


duohongrui/simmethods documentation built on June 17, 2024, 10:49 a.m.