simulateSet  R Documentation 
Simulation of a complete dataset, where the number of genes with each type of differential distribution and equivalent distribution is specified.
simulateSet(SCdat, numSamples = 100, nDE = 250, nDP = 250, nDM = 250, nDB = 250, nEE = 5000, nEP = 4000, sd.range = c(1, 3), modeFC = c(2, 3, 4), plots = TRUE, plot.file = NULL, random.seed = 284, varInflation = NULL, condition = "condition", param = bpparam())
SCdat 
An object of class 
numSamples 
numeric value for the number of samples in each condition to simulate 
nDE 
Number of DE genes to simulate 
nDP 
Number of DP genes to simulate 
nDM 
Number of DM genes to simulate 
nDB 
Number of DB genes to simulate 
nEE 
Number of EE genes to simulate 
nEP 
Number of EP genes to simulate 
sd.range 
Numeric vector of length two which describes the interval (lower, upper) of standard deviations of fold changes to randomly select. 
modeFC 
Vector of values to use for fold changes between modes for DP, DM, and DB. 
plots 
Logical indicating whether or not to generate fold change and validation plots 
plot.file 
Character containing the file string if the plots are to be sent to a pdf instead of to the standard output. 
random.seed 
Numeric value for a call to 
varInflation 
Optional numeric vector with one element for each condition that corresponds to the multiplicative variance inflation factor to use when simulating data. Useful for sensitivity studies to assess the impact of confounding effects on differential variance across conditions. Currently assumes all samples within a condition are subject to the same variance inflation factor. 
condition 
A character object that contains the name of the column in

param 
a 
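As a rough illustration of how sd.range behaves, the base R sketch below draws fold-change standard deviations from an interval. It assumes, purely for illustration, that values are drawn uniformly between the lower and upper bounds; the actual sampling scheme used inside simulateSet is not stated here and may differ. The names drawn and fixed are hypothetical.

```r
# Sketch only: uniform sampling is an ASSUMPTION, not confirmed scDD behaviour.
set.seed(284)  # arbitrary seed so the sketch is reproducible

sd.range <- c(1, 3)  # default interval (lower, upper) from the usage above

# Draw one value per gene from the interval (5 hypothetical genes)
drawn <- runif(5, min = sd.range[1], max = sd.range[2])

# A degenerate interval such as sd.range = c(2, 2) leaves no randomness:
# every gene gets exactly the same value
fixed <- runif(5, min = 2, max = 2)

print(drawn)
print(fixed)
```

Setting both bounds to the same number, as with sd.range = c(2, 2), is therefore a simple way to hold the fold-change standard deviation constant across genes.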
An object of class SingleCellExperiment that contains simulated single-cell expression and metadata. The assays slot contains a named list of matrices, where the simulated counts are housed in the one named normcounts. This matrix should have one row for each gene (nDE + nDP + nDM + nDB + nEE + nEP rows) and one column for each sample (numSamples samples in each condition).
The colData slot contains a data.frame with one row per sample and a column that represents biological condition, in the form of numeric values (either 1 or 2) that indicate which condition each sample belongs to (in the same order as the columns of normcounts).
The rowData slot contains information about the category of the gene (EE, EP, DE, DM, DP, or DB), as well as the simulated fold change value.
Korthauer KD, Chu LF, Newton MA, Li Y, Thomson J, Stewart R, Kendziorski C. A statistical approach for identifying differential distributions in single-cell RNA-seq experiments. Genome Biology. 2016 Oct 25;17(1):222. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1077-y
# Load toy example ExpressionSet to simulate from
data(scDatEx)

# check that this object is a member of the ExpressionSet class
# and that it contains 142 samples and 500 genes
class(scDatEx)
show(scDatEx)

# set arguments to pass to simulateSet function
# we will simulate 30 genes total; 5 genes of each type;
# and 100 samples in each of two conditions
nDE <- 5
nDP <- 5
nDM <- 5
nDB <- 5
nEE <- 5
nEP <- 5
numSamples <- 100
seed <- 816

# create simulated set with specified numbers of DE, DP, DM, DB, EE, and
# EP genes, specified number of samples, DE genes are 2 standard deviations
# apart, and multimodal genes have modal distance of 4 standard deviations
SD <- simulateSet(scDatEx, numSamples=numSamples, nDE=nDE, nDP=nDP, nDM=nDM,
                  nDB=nDB, nEE=nEE, nEP=nEP, sd.range=c(2,2), modeFC=4,
                  plots=FALSE, random.seed=seed)
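The returned object can be inspected with the standard SingleCellExperiment accessors. The sketch below repeats the example's settings and then looks at the normcounts matrix, colData and rowData. It assumes the scDD and SingleCellExperiment packages are installed and skips the simulation otherwise. The names n_genes, expected_rows and sim are hypothetical helpers introduced only for this illustration.

```r
# Sketch only: n_genes, expected_rows and sim are illustrative names.
# Number of genes requested per category (same values as the example)
n_genes <- c(DE = 5, DP = 5, DM = 5, DB = 5, EE = 5, EP = 5)
expected_rows <- sum(n_genes)  # one row per simulated gene

# ASSUMPTION: scDD and SingleCellExperiment are installed; skip otherwise
if (requireNamespace("scDD", quietly = TRUE) &&
    requireNamespace("SingleCellExperiment", quietly = TRUE)) {
  library(scDD)
  library(SingleCellExperiment)  # also attaches SummarizedExperiment

  data(scDatEx)
  SD <- simulateSet(scDatEx, numSamples = 100,
                    nDE = n_genes[["DE"]], nDP = n_genes[["DP"]],
                    nDM = n_genes[["DM"]], nDB = n_genes[["DB"]],
                    nEE = n_genes[["EE"]], nEP = n_genes[["EP"]],
                    sd.range = c(2, 2), modeFC = 4,
                    plots = FALSE, random.seed = 816)

  # Simulated expression matrix: genes in rows, samples in columns
  sim <- assay(SD, "normcounts")
  print(dim(sim))
  print(nrow(sim) == expected_rows)

  # Per-sample metadata (includes the condition column)
  print(head(colData(SD)))

  # Per-gene metadata (gene category and simulated fold change)
  print(head(rowData(SD)))
}
```

Printing the first rows of colData and rowData, rather than indexing specific columns, avoids relying on column names that are not spelled out in this page.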