simstudy: Perform a simulation study

Description Usage Arguments Value Examples

View source: R/simstudy.R

Description

This is a function that, given lists of parameters, performs a simulation study. It essentially condenses the simulation study by running dosim multiple times, once per parameter combination, and then saving the output of each dosim in a different element of a list.

Usage

1
2
3
4
simstudy(parameternames, nsims, seed, cellcounts, genecounts, xmeans,
  xsdss, ymeans, ysdss, propsbatch1, propsbatch2, mykeep = F,
  mycutoff = 5, mycore = 1, dgeneratedata = generatedata,
  ddocluster = docluster)

Arguments

parameternames

The names of the parameter combinations supplied. Used only for identification.

nsims

A list giving the number of simulations for each parameter setting.

seed

seed for reproducibility

cellcounts

A list giving the number of cells for each parameter setting.

genecounts

A list giving the number of genes for each parameter setting.

xmeans

A list giving the vectors of mean x-coordinates of the three cell types for each parameter setting.

xsdss

A list giving the vectors of the standard deviations of x-coordinates of the three cell types for each parameter setting

ymeans

A list giving the vectors of mean y-coordinates of the three cell types for each parameter setting.

ysdss

A list giving the vectors of the standard deviations of y-coordinates of the three cell types for each parameter setting

propsbatch1

A list giving the vectors of proportions of the three cell types in the first batch for each parameter setting.

propsbatch2

A list giving the vectors of proportions of the three cell types in the second batch for each parameter setting.

mykeep

Whether or not to keep "bad" simulation replicates where a cell type is represented by less than a certain number of cells in a batch: see mycutoff. By default =F.

mycutoff

A number: if the number of cells for any cell type is represented by fewer than cutoff cells in a simulated batch, then this simulation replicate is deemed to be of bad quality. By default=5.

mycore

The number of computing cores to use for parallelizing the simulation.

dgeneratedata

The function to use for generating data. By default this equals generatedata. Highly recommended not to modify this argument.

ddocluster

The function to use for clustering data. By default this equals docluster. Highly recommended not to modify this argument.

Value

A list of simulation results. Each element of the list contains simulations results (output of dosim) for one of the supplied combinations of parameters.

As mentioned in the documentation of dosim, the simulation results of a single simulation consist of the following list of simulation items:

Uncorrected Silhoutte Scores

A matrix of clustering results from simulation replicates for the uncorrected batch correction setting. Each row is a vector of four mean silhoutte scores from a single replicate. This vector consists of: mean silhoutte score of all cells, mean silhoutte score of cells from type 1, mean silhoutte score of cells from type 2, and mean silhoutte score of cells from type 3.

MNN Silhoutte Scores

A matrix of clustering results from simulation replicates for the MNN batch correction setting. Each row is a vector of four mean silhoutte scores from a single replicate. This vector consists of: mean silhoutte score of all cells, mean silhoutte score of cells from type 1, mean silhoutte score of cells from type 2, and mean silhoutte score of cells from type 3.

limma Silhoutte Scores

A matrix of clustering results from simulation replicates for the limma batch correction setting. Each row is a vector of four mean silhoutte scores from a single replicate. This vector consists of: mean silhoutte score of all cells, mean silhoutte score of cells from type 1, mean silhoutte score of cells from type 2, and mean silhoutte score of cells from type 3.

combat Silhoutte Scores

A matrix of clustering results from simulation replicates for the combat batch correction setting. Each row is a vector of four mean silhoutte scores from a single replicate. This vector consists of: mean silhoutte score of all cells, mean silhoutte score of cells from type 1, mean silhoutte score of cells from type 2, and mean silhoutte score of cells from type 3.

Replicate Runtimes

A vector of real-times (in seconds) necessary to complete each replicate.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
## Not run: 
parameternames=list('Original', 'Smaller Differences', 'More Genes')
nsims=list(50,50,50)
seed=0
cellcounts=list(500,500,500)
genecounts=list(100,100,5000)
xmeans=list(c(0,5,5),c(0,2,2),c(0,5,5))
xsdss=list(c(1,0.1,1),c(1,0.1,1),c(1,0.1,1))
ymeans=list(c(5,5,0),c(2,2,0),c(5,5,0))
ysdss=list(c(1,0.1,1),c(1,0.1,1),c(1,0.1,1))
propsbatch1=list(c(0.3,0.5,0.2),c(0.3,0.5,0.2),c(0.3,0.5,0.2))
propsbatch2=list(c(0.65,0.3,0.05),c(0.65,0.3,0.05),c(0.65,0.3,0.05))

simstudy(
   parameternames=parameternames,
   nsims=nsims,seed=seed,
   cellcounts=cellcounts,
   genecounts=genecounts,
   xmeans=xmeans,xsdss=xsdss,ymeans=ymeans,
   ysdss=ysdss,propsbatch1=propsbatch1,
   propsbatch2=propsbatch2,mycore=1
)

## End(Not run)

jlakkis/mnnsim documentation built on Jan. 21, 2020, 1:21 a.m.