View source: R/batchPulsarSelect.R
batch.pulsar | R Documentation |
Run pulsar using stability selection, or another criterion, to select an undirected graphical model over a lambda-path.
batch.pulsar(
  data,
  fun = huge::huge,
  fargs = list(),
  criterion = c("stars"),
  thresh = 0.1,
  subsample.ratio = NULL,
  lb.stars = FALSE,
  ub.stars = FALSE,
  rep.num = 20,
  seed = NULL,
  wkdir = getwd(),
  regdir = NA,
  init = "init",
  conffile = "",
  job.res = list(),
  cleanup = FALSE,
  refit = TRUE
)
data |
A |
fun |
pass in a function that returns a list representing |
fargs |
arguments to argument |
criterion |
A character vector of selection statistics. Multiple criteria can be supplied. Only StARS can be used to automatically select an optimal index for the lambda path. See details for additional statistics. |
thresh |
threshold (referred to as scalar |
subsample.ratio |
determine the size of the subsamples (referred to as |
lb.stars |
Should the lower bound be computed after the first |
ub.stars |
Should the upper bound be computed after the first |
rep.num |
number of random subsamples |
seed |
A numeric seed to force predictable subsampling. Default is NULL. Use for testing purposes only. |
wkdir |
set the working directory if different than |
regdir |
directory to store intermediate batch job files. Default will be a temporary directory.
init |
text string appended to basename of the regdir path to store the batch jobs for the initial StARS variability estimate (ignored if 'regdir' is NA) |
conffile |
path to or string that identifies a |
job.res |
named list of resources needed for each job (e.g. for PBS submission script). The format and members depend on configuration and template. See examples section for a Torque example.
cleanup |
Flag for removing batchtools registry files. Recommended FALSE unless you're sure intermediate data shouldn't be saved. |
refit |
Boolean flag to refit on the full dataset after pulsar is run. (see also |
an S3 object of class batch.pulsar with a named member for each stability criterion/metric. Within each of these are:
summary: the summary criterion over rep.num graphs at each value of lambda
criterion: the stability metric
merge: the raw criterion merged over the rep.num graphs (constructed from rep.num subsamples), prior to summarization
opt.ind: index (along the path) of optimal lambda selected by the criterion at the desired threshold. Will return 0 if no optimum is found or NULL if selection for the criterion is not implemented.
If stars is included as a criterion then additional arguments include
lb.index: the lambda index of the lower bound at N=2 samples if lb.stars flag is set to TRUE
ub.index: the lambda index of the upper bound at N=2 samples if ub.stars flag is set to TRUE
reg: Registry object. See batchtools::makeRegistry
id: Identifier for mapping graph estimation function. See batchtools::batchMap
call: the original function call
Müller, C. L., Bonneau, R., & Kurtz, Z. (2016). Generalized Stability Approach for Regularized Graphical Models. arXiv https://arxiv.org/abs/1605.07072
Liu, H., Roeder, K., & Wasserman, L. (2010). Stability approach to regularization selection (stars) for high dimensional graphical models. Proceedings of the Twenty-Third Annual Conference on Neural Information Processing Systems (NIPS).
Zhao, T., Liu, H., Roeder, K., Lafferty, J., & Wasserman, L. (2012). The huge Package for High-dimensional Undirected Graph Estimation in R. The Journal of Machine Learning Research, 13, 1059–1062.
Lang, M., Bischl, B., & Surmann, D. (2017). batchtools: Tools for R to work on batch systems. The Journal of Open Source Software, 2(10). https://doi.org/10.21105/joss.00135
pulsar
refit
## Not run:
## Generate the data with huge:
library(huge)
set.seed(10010)
p <- 400 ; n <- 1200
dat <- huge.generator(n, p, "hub", verbose=FALSE, v=.1, u=.3)
lams <- getLamPath(.2, .01, len=40)
hugeargs <- list(lambda=lams, verbose=FALSE)
## Run batch.pulsar using snow on 5 cores, and show progress.
options(mc.cores=5)
options(batchtools.progress=TRUE, batchtools.verbose=FALSE)
out <- batch.pulsar(dat$data, fun=huge::huge, fargs=hugeargs,
                    rep.num=20, criterion='stars', conffile='snow')
## Run batch.pulsar on a Torque cluster
## Give each job 1gb of memory and a limit of 30 minutes
resources <- list(mem="1GB", nodes="1", walltime="00:30:00")
out.p <- batch.pulsar(dat$data, fun=huge::huge, fargs=hugeargs,
                      rep.num=100, criterion=c('stars', 'gcd'), conffile='torque',
                      job.res=resources, regdir=file.path(getwd(), "testtorq"))
plot(out.p)
## take a look at the default torque config and template files we just used
file.show(findConfFile('torque'))
file.show(findTemplateFile('simpletorque'))
## End(Not run)
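The opt.ind conventions described in the return value (0 when no optimum is found, NULL when selection is not implemented for a criterion) can be handled along the following lines. This is a minimal sketch and not part of the package: selected_lambda is a hypothetical helper, and mock is a hand-built list that only mimics the documented opt.ind member so the snippet runs without a cluster or a real batch.pulsar result.

```r
## Hypothetical helper (not part of pulsar): look up the lambda chosen by a
## criterion, following the documented opt.ind conventions.
selected_lambda <- function(out, lams, criterion = "stars") {
  opt <- out[[criterion]]$opt.ind
  if (is.null(opt)) return(NULL)    # selection not implemented for this criterion
  if (opt == 0) return(NA_real_)    # no optimum found at the desired threshold
  lams[opt]                         # lambda at the selected index along the path
}

## Mock object standing in for a batch.pulsar result (assumed structure,
## containing only the opt.ind member of each criterion).
lams <- seq(0.2, 0.01, length.out = 40)
mock <- list(stars = list(opt.ind = 12L), gcd = list(opt.ind = NULL))

selected_lambda(mock, lams, "stars")  # lambda at index 12 of the path
selected_lambda(mock, lams, "gcd")    # NULL: no automatic selection
```

With a real result, the same helper would presumably be called with the object returned by batch.pulsar and the lambda vector that was passed in fargs.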