pulsar: serial or parallel mode

View source: R/mcPulsarSelect.R

Description

Run pulsar using StARS' edge stability (or other criteria) to select an undirected graphical model over a lambda path.

Usage

pulsar(data, fun = huge::huge, fargs = list(), criterion = c("stars"),
  thresh = 0.1, subsample.ratio = NULL, rep.num = 20, seed = NULL,
  lb.stars = FALSE, ub.stars = FALSE, ncores = 1)

Arguments

data

An n*p data matrix used to solve for the p*p graphical model.

fun

A function that returns a list representing p*p sparse, undirected graphical models along the desired regularization path. The function must accept a data matrix and a sequence of decreasing lambdas, and must return a list or S3 object with a member named path. This member should be a list of adjacency matrices, one for each value of lambda. See pulsar-function for more information.
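As a rough illustration of the contract described above, the following sketch defines a hypothetical estimator, corThreshold (not part of pulsar or huge), that thresholds absolute sample correlations at each lambda and returns a list with a member named path. It uses dense base R matrices for simplicity. That choice is an assumption, so check pulsar-function for the matrix classes pulsar actually expects.

```r
# Hypothetical estimator for illustration only (not part of pulsar or huge).
# Takes a data matrix and a decreasing lambda sequence, returns list(path = ...).
corThreshold <- function(data, lambda, ...) {
  S <- abs(stats::cor(data))   # absolute sample correlations, p*p
  diag(S) <- 0                 # no self-loops
  path <- lapply(lambda, function(lam) {
    (S > lam) * 1              # symmetric 0/1 adjacency matrix for this lambda
  })
  list(path = path, lambda = lambda)
}

set.seed(1)
x    <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
lams <- c(0.5, 0.2, 0.05)      # decreasing, so graphs go from sparse to dense
est  <- corThreshold(x, lams)
length(est$path)               # one adjacency matrix per lambda
```

A function of this shape would presumably be supplied as fun = corThreshold together with fargs = list(lambda = lams).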

fargs

Arguments passed to fun. Must be a named list with at least one member, lambda, a numeric vector of values for the penalty parameter.

criterion

A character vector of selection statistics. Multiple criteria can be supplied. Only StARS can be used to automatically select an optimal index for the lambda path. See details for additional statistics.

thresh

Threshold (referred to as the scalar β in the StARS publication) for the selection criterion. Only implemented for StARS. thresh=0.1 is recommended.
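To make the role of β concrete, here is a minimal base R sketch of StARS selection as described in Liu et al. (2010): edge selection frequencies across subsamples give an instability of 2θ(1-θ) per edge, which is averaged over all node pairs, monotonized along the path, and compared with the threshold. The names starsSelect, est, instability and index are hypothetical, and this is not claimed to be how pulsar computes the statistic internally.

```r
# Sketch of StARS selection following Liu et al. (2010); not pulsar's code.
# est[[k]][[j]]: p*p 0/1 adjacency matrix from subsample k at lambda index j,
# with lambdas assumed to be in decreasing order (sparse to dense).
starsSelect <- function(est, thresh = 0.1) {
  N    <- length(est)
  nlam <- length(est[[1]])
  p    <- nrow(est[[1]][[1]])
  instab <- vapply(seq_len(nlam), function(j) {
    theta <- Reduce(`+`, lapply(est, function(e) e[[j]])) / N  # edge frequencies
    xi    <- 2 * theta * (1 - theta)                           # edge instability
    sum(xi[upper.tri(xi)]) / choose(p, 2)                      # average over pairs
  }, numeric(1))
  mono <- cummax(instab)            # monotonize along the path
  ok   <- which(mono <= thresh)     # indices that satisfy the threshold
  list(instability = mono,
       index = if (length(ok) > 0) max(ok) else NA_integer_)
}

# Toy input: p = 3 nodes, N = 2 subsamples, 3 lambdas
edge <- function(...) {             # helper: adjacency matrix from node pairs
  A <- matrix(0, 3, 3)
  for (e in list(...)) A[e[1], e[2]] <- A[e[2], e[1]] <- 1
  A
}
est <- list(
  list(edge(), edge(c(1, 2)), edge(c(1, 2), c(1, 3))),
  list(edge(), edge(c(1, 2)), edge(c(1, 2), c(2, 3)))
)
sel <- starsSelect(est, thresh = 0.1)
sel$index   # densest graph whose monotonized instability stays within thresh
```

In this toy input the two subsamples agree at the first two lambdas and disagree on two of the three possible edges at the last one, so the selection stops at the second lambda.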

subsample.ratio

Determines the size of the subsamples (referred to as b(n)/n). The default is 10*sqrt(n)/n for n > 144 and 0.8 otherwise. Should be strictly less than 1.
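The default rule stated above can be written out as a short helper. The name defaultSubsampleRatio is hypothetical and only restates the rule from this page. It is not a pulsar function.

```r
# Default subsample ratio as described above (hypothetical helper name).
defaultSubsampleRatio <- function(n) {
  if (n > 144) 10 * sqrt(n) / n else 0.8
}

defaultSubsampleRatio(100)    # small n: fixed ratio of 0.8
defaultSubsampleRatio(1200)   # large n: 10*sqrt(1200)/1200, roughly 0.29
```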

rep.num

number of random subsamples N to take for graph re-estimation. Default is N=20, but more is recommended for non-StARS criteria or if using edge frequencies as confidence scores.
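The subsampling step itself can be sketched in base R as follows. This assumes that each subsample is a set of floor(subsample.ratio * n) row indices drawn without replacement, as in the StARS procedure. drawSubsamples is a hypothetical helper, and pulsar's internal sampling may differ in detail.

```r
# Sketch: draw rep.num subsamples of row indices without replacement.
# drawSubsamples is a hypothetical helper, not a pulsar function.
drawSubsamples <- function(n, subsample.ratio, rep.num = 20, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)   # fixed seed gives repeatable subsamples
  b <- floor(subsample.ratio * n)      # subsample size b(n)
  replicate(rep.num, sample.int(n, b, replace = FALSE), simplify = FALSE)
}

idx <- drawSubsamples(n = 1200, subsample.ratio = 0.8, rep.num = 20, seed = 10010)
length(idx)        # number of subsamples
lengths(idx)[1]    # rows in each subsample
```

Each element of idx could then be used to subset the rows of the data matrix before re-estimating the graph path on that subsample.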

seed

A numeric seed to force predictable subsampling. Default is NULL. Use for testing purposes only.

lb.stars

Whether the lower bound should be computed after the first N=2 subsamples. This should result in a considerable speedup and is only implemented if stars is selected. If this option is selected, other summary metrics will only be applied to the smaller lambda path.

ub.stars

Whether the upper bound should be computed after the first N=2 subsamples. This should result in a considerable speedup and is only implemented if stars is selected. If this option is selected, other summary metrics will only be applied to the smaller lambda path. This option is ignored if the lb.stars flag is FALSE.

ncores

Number of cores to use for subsampling. See batch.pulsar for more parallelization options.

Details

The options for criterion statistics are:

Value

an S3 object of class pulsar with a named member for each stability metric run. Within each of these are:

If stars is included as a criterion then additional arguments include

call: the original function call

References

Müller, C. L., Bonneau, R., & Kurtz, Z. (2016). Generalized Stability Approach for Regularized Graphical Models. arXiv. http://arxiv.org/abs/1605.07072

Liu, H., Roeder, K., & Wasserman, L. (2010). Stability approach to regularization selection (stars) for high dimensional graphical models. Proceedings of the Twenty-Third Annual Conference on Neural Information Processing Systems (NIPS).

Zhao, T., Liu, H., Roeder, K., Lafferty, J., & Wasserman, L. (2012). The huge Package for High-dimensional Undirected Graph Estimation in R. The Journal of Machine Learning Research, 13, 1059–1062.

See Also

batch.pulsar

Examples

## Not run: 
## Generate the data with huge:
library(huge)
library(pulsar)  # provides pulsar(), getLamPath() and getMaxCov()
p <- 40 ; n <- 1200
dat   <- huge.generator(n, p, "hub", verbose=FALSE, v=.1, u=.3)
lams  <- getLamPath(getMaxCov(dat$data), .01, len=20)

## Run pulsar with huge
hugeargs <- list(lambda=lams, verbose=FALSE)
out.p <- pulsar(dat$data, fun=huge::huge, fargs=hugeargs,
                rep.num=20, criterion='stars')

## Run pulsar in bounded stars mode and include gcd metric:
out.b <- pulsar(dat$data, fun=huge::huge, fargs=hugeargs,
                rep.num=20, criterion=c('stars', 'gcd'),
                lb.stars=TRUE, ub.stars=TRUE)
plot(out.b)

## End(Not run)

pulsar documentation built on May 29, 2017, 12:29 p.m.