Simulates data based on the input simulation parameters. Size factors are either supplied by the user or simulated from N(1, 0.25).
simulateData(
K = 2,
B = 1,
g = 10000,
n = 50,
pK = NULL,
pB = NULL,
LFCg = 1,
pDEg = 0.05,
sigma_g = 0.1,
LFCb = 1,
pDEb = 0.5,
sigma_b = 0,
beta0 = 12,
phi0 = 0.35,
SF = NULL,
nsims = 25,
disp = "gene",
n_pred = 25,
sim_batch_pred = FALSE,
LFCb_pred = NULL,
save_file = TRUE,
save_dir = NULL,
save_pref = NULL
)
K | integer, number of clusters
B | integer, number of batches
g | integer, number of genes
n | integer, number of samples
pK | vector of length K (optional): proportion of samples in each cluster
pB | vector of length B (optional): proportion of samples in each batch
LFCg | numeric, log fold change (LFC) for cluster-discriminatory genes
pDEg | numeric, proportion of genes that are cluster-discriminatory
sigma_g | numeric, Gaussian noise N(0, sigma_g) added to each gene/sample. Default is 0.1.
LFCb | numeric, LFC for genes that are differentially expressed across batches. Default is 1.
pDEb | numeric, proportion of genes that are differentially expressed across batches. Default is 0.5.
sigma_b | numeric, batch-specific Gaussian noise. Default is 0.
beta0 | numeric, baseline log2 expression for each gene before the LFC is applied
phi0 | numeric, baseline overdispersion for each gene
SF | vector of length n (optional): custom size factors from DESeq2. If NULL, size factors are simulated from N(1, 0.25).
nsims | integer, number of datasets to simulate under the input conditions. Default is 25.
disp | string, either 'gene' or 'cluster', to simulate gene-level or cluster-level dispersions. Default is 'gene'. The input dispersion (phi0) must be a g x K matrix if disp='cluster'.
n_pred | integer, number of samples in the simulated prediction dataset. Default is 25.
sim_batch_pred | boolean: FALSE (no batch effect for prediction samples) or TRUE (batch effect)
LFCb_pred | numeric (optional): LFCb for batch-affected genes in the prediction set. If NULL (default), set to max(batch_effects) + LFCb/2, i.e. a larger batch effect than in training.
save_file | boolean: TRUE (save each set of simulations to file) or FALSE (return the simulations as a list)
save_dir | string (optional): directory to save files to. Default: 'Simulations/<sigma_g>_<sigma_b>/B<B>'
save_pref | string (optional): prefix of the file name that simulated data are saved to. Default: '<K>_<n>_<LFCg>_<pDEg>_<beta0>_<phi0>'
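To make the roles of beta0, LFCg, pDEg, phi0, sigma_g and SF more concrete, the following is a minimal base-R sketch of a negative binomial simulation with the same kinds of inputs. It is not the FSCseq implementation. In particular, it assumes that 0.25 and sigma_g are standard deviations, that clusters are equally sized when pK is NULL, that the noise is added on the log2 scale, and that the discriminatory genes are simply the first pDEg * g rows. Batch effects are omitted.

```r
# Illustrative sketch only: NOT the FSCseq implementation.
set.seed(1)

K <- 2; g <- 200; n <- 50          # clusters, genes, samples (g reduced for speed)
LFCg <- 1; pDEg <- 0.05            # effect size and proportion of discriminatory genes
beta0 <- 12; phi0 <- 0.35          # baseline log2 mean and overdispersion
sigma_g <- 0.1                     # treated here as a standard deviation (assumption)

# Cluster labels with equal proportions (assumed behaviour when pK = NULL)
cls <- sample(rep_len(seq_len(K), n))

# Size factors from N(1, 0.25), reading 0.25 as the standard deviation (assumption),
# floored at a small positive value so that all means stay positive
SF <- pmax(rnorm(n, mean = 1, sd = 0.25), 0.05)

# g x K matrix of log2 means: the first n_de genes are shifted by LFCg in cluster 2
n_de <- round(pDEg * g)
beta <- matrix(beta0, nrow = g, ncol = K)
beta[seq_len(n_de), 2] <- beta0 + LFCg

# Add per-gene/sample Gaussian noise on the log2 scale, scale by size factor,
# then draw negative binomial counts with dispersion phi0 (size = 1 / phi0)
log2_mu <- beta[, cls] + matrix(rnorm(g * n, mean = 0, sd = sigma_g), g, n)
mu <- sweep(2^log2_mu, 2, SF, `*`)
y <- matrix(rnbinom(g * n, mu = mu, size = 1 / phi0), nrow = g, ncol = n)

dim(y)  # g x n matrix of simulated counts
```

With this parametrization the variance of each count is mu + phi0 * mu^2, so larger phi0 gives noisier counts at the same mean.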
If save_file=TRUE, each simulation is saved to '<save_dir>/<save_pref>_sim<i>_data.RData' for i in 1:nsims. Otherwise, a list of length nsims is returned, with one sim.dat list object per simulation.
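When files are saved with the default save_dir and save_pref, the paths can be rebuilt from the simulation parameters. The helper below is hypothetical and not part of FSCseq; it follows the templates documented above and assumes that numeric values are written with R's default character conversion (for example 0.1 as "0.1"), which the package may not do in every case.

```r
# Hypothetical helper (not part of FSCseq): rebuilds the documented default
# file paths from the simulation parameters.
default_sim_paths <- function(K = 2, B = 1, n = 50, LFCg = 1, pDEg = 0.05,
                              sigma_g = 0.1, sigma_b = 0, beta0 = 12,
                              phi0 = 0.35, nsims = 25) {
  # 'Simulations/<sigma_g>_<sigma_b>/B<B>'
  save_dir <- sprintf("Simulations/%s_%s/B%s", sigma_g, sigma_b, B)
  # '<K>_<n>_<LFCg>_<pDEg>_<beta0>_<phi0>'
  save_pref <- paste(K, n, LFCg, pDEg, beta0, phi0, sep = "_")
  # '<save_dir>/<save_pref>_sim<i>_data.RData' for i in 1:nsims
  file.path(save_dir, sprintf("%s_sim%d_data.RData", save_pref, seq_len(nsims)))
}

paths <- default_sim_paths(nsims = 2)
print(paths)

# Load the first simulation only if it exists on disk
if (file.exists(paths[1])) {
  loaded <- load(paths[1])  # names of the restored objects
}
```

Checking file.exists() before load() is a cheap way to find out whether the assumed number formatting matches the files actually written.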
David K. Lim, deelim@live.unc.edu
https://github.com/DavidKLim/FSCseq
sim.dat <- FSCseq::simulateData(B=1, g=10000, K=2, n=50, LFCg=1, pDEg=0.05, beta0=12, phi0=0.35, nsims=1, save_file=FALSE)[[1]]
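A further sketch shows how the optional pK and SF arguments might be supplied. The size factors here are random stand-ins, drawn under the assumption that 0.25 is a standard deviation; in practice they would come from DESeq2. The call itself is guarded so that the snippet still runs when FSCseq is not installed, and nothing is assumed about the contents of each returned sim.dat object.

```r
# Sketch: unequal cluster proportions and user-supplied size factors.
set.seed(42)
n  <- 50
pK <- c(0.7, 0.3)                                 # proportions for K = 2 clusters
SF <- pmax(rnorm(n, mean = 1, sd = 0.25), 0.05)   # stand-in for DESeq2 size factors

# Basic input checks matching the documented argument shapes
stopifnot(length(pK) == 2, isTRUE(all.equal(sum(pK), 1)), length(SF) == n)

# Only call the package if it is installed
if (requireNamespace("FSCseq", quietly = TRUE)) {
  sims <- FSCseq::simulateData(K = 2, B = 1, g = 10000, n = n, pK = pK, SF = SF,
                               nsims = 2, save_file = FALSE)
  length(sims)  # documented to be a list of length nsims
}
```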