ccdr.run: Main CCDr Algorithm
In ccdrAlgorithm: CCDr Algorithm for Learning Sparse Gaussian Bayesian Networks

Description

Estimate a Bayesian network (directed acyclic graph) from observational data using the CCDr algorithm as described in Aragam and Zhou (2015).

Usage

ccdr.run(
  data,
  lambdas = NULL,
  lambdas.length = NULL,
  whitelist = NULL,
  blacklist = NULL,
  gamma = 2,
  error.tol = 1e-04,
  max.iters = NULL,
  alpha = 10,
  betas,
  sigmas = NULL,
  verbose = FALSE
)

Arguments

`data`	Data as a `sparsebnData` object. Must be numeric and contain no missing values.
`lambdas`	Numeric vector containing a grid of lambda values (i.e. regularization parameters) to use in the solution path. If not specified, a default grid of values will be used based on a decreasing log-scale (see also `generate.lambdas`).
`lambdas.length`	Integer number of values to include in the solution path. If `lambdas` has also been specified, this value will be ignored. Note also that the final solution path may contain fewer estimates (see `alpha`).
`whitelist`	A two-column matrix of edges that are guaranteed to be in each estimate (a "white list"). Each row in this matrix corresponds to an edge that is to be whitelisted. These edges can be specified by node name (as a `character` matrix), or by index (as a `numeric` matrix).
`blacklist`	A two-column matrix of edges that are guaranteed to be absent from each estimate (a "black list"). See argument "`whitelist`" above for more details.
`gamma`	Value of concavity parameter. If `gamma > 0`, then the MCP will be used with `gamma` as the concavity parameter. If `gamma < 0`, then the L1 penalty will be used and this value is otherwise ignored.
`error.tol`	Error tolerance for the algorithm, used to test for convergence.
`max.iters`	Maximum number of iterations for each internal sweep.
`alpha`	Threshold parameter used to terminate the algorithm whenever the number of edges in the current DAG estimate is `> alpha * ncol(data)`.
`betas`	Initial guess for the algorithm. Represents the weighted adjacency matrix of a DAG where the algorithm will begin searching for an optimal structure.
`sigmas`	Numeric vector of known values of conditional variances for each node in the network. If this is set by the user, these parameters will not be computed and the input will be used as the "true" values of the variances in the algorithm. Note that setting this to be all ones (i.e. `sigmas[j] = 1` for all `j`) is equivalent to using the least-squares loss.
`verbose`	Logical (`TRUE` / `FALSE`): whether or not to print out progress and summary reports.
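
As a rough illustration of the shapes these arguments expect, the following base-R sketch builds a decreasing log-scale grid of lambda values, a two-column whitelist, and the edge threshold implied by `alpha`. The endpoints `lambda.max` and `lambda.min`, the node names, and the column order of the whitelist (parent first) are assumptions made for this example, not the package defaults.

```r
# Hypothetical problem size: 20 observations of 50 variables
n.obs <- 20
n.nodes <- 50

# Decreasing grid on a log scale; the endpoints are illustrative only
lambda.max <- sqrt(n.obs)
lambda.min <- 0.01 * lambda.max
lambdas <- exp(seq(log(lambda.max), log(lambda.min), length.out = 20))

# Two-column character matrix with one row per whitelisted edge.
# Node names are hypothetical; "parent in column 1" is an assumption.
whitelist <- rbind(c("V1", "V2"),
                   c("V1", "V3"))

# With alpha = 10, the path stops once an estimate has more edges than this
alpha <- 10
max.edges <- alpha * n.nodes
```

Objects like these could then be passed as `lambdas = lambdas` and `whitelist = whitelist`, provided the node names match those in `data`.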

Details

Instead of producing a single estimate, this algorithm computes a solution path of estimates based on the values supplied to `lambdas` or `lambdas.length`. The CCDr algorithm approximates the solution to a nonconvex optimization problem using coordinate descent. Instead of AIC or BIC, CCDr uses continuous regularization based on concave penalties such as the minimax concave penalty (MCP).

This implementation includes two options for the penalty: (1) MCP, and (2) L1 (or Lasso). This option is controlled by the `gamma` argument.
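
To make the difference between the two penalties concrete, here is a minimal base-R sketch of the standard textbook forms of the L1 and MCP penalties, together with the single-coordinate thresholding rules they induce for a standardized least-squares problem. The function names are hypothetical and this is not the package's internal code; the actual CCDr updates also involve the conditional variances and the acyclicity constraint, which are omitted here.

```r
# L1 (Lasso) penalty: grows linearly in |beta|
l1_penalty <- function(beta, lambda) {
  lambda * abs(beta)
}

# MCP: matches L1 near zero, then flattens out once |beta| > gamma * lambda
mcp_penalty <- function(beta, lambda, gamma) {
  a <- abs(beta)
  ifelse(a <= gamma * lambda,
         lambda * a - a^2 / (2 * gamma),
         gamma * lambda^2 / 2)
}

# Soft thresholding: the single-coordinate minimizer under the L1 penalty
soft_threshold <- function(z, lambda) {
  sign(z) * pmax(abs(z) - lambda, 0)
}

# Single-coordinate minimizer under MCP (requires gamma > 1):
# rescaled soft thresholding for small |z|, no shrinkage for large |z|
mcp_threshold <- function(z, lambda, gamma) {
  stopifnot(gamma > 1)
  ifelse(abs(z) <= gamma * lambda,
         soft_threshold(z, lambda) / (1 - 1 / gamma),
         z)
}
```

Unlike the L1 rule, which shrinks every nonzero coefficient by `lambda`, the MCP rule leaves large coefficients untouched, which is the usual motivation for preferring a concave penalty.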

Value

A `sparsebnPath` object.

Examples


library(ccdrAlgorithm)

### Generate some random data (20 observations of 50 variables)
dat <- matrix(rnorm(1000), nrow = 20)
dat <- sparsebnUtils::sparsebnData(dat, type = "continuous")

# Run with default settings
ccdr.run(data = dat, lambdas.length = 20)

### Optional: Adjust settings
pp <- ncol(dat$data)

# Initialize algorithm with a fixed, user-chosen initial value:
# a weighted adjacency matrix with three nonzero entries
init.betas <- matrix(0, nrow = pp, ncol = pp)
init.betas[1,2] <- init.betas[1,3] <- init.betas[4,2] <- 1

# Run with adjusted settings
ccdr.run(data = dat, betas = init.betas, lambdas.length = 20, alpha = 10, verbose = TRUE)

ccdrAlgorithm documentation built on April 12, 2022, 9:06 a.m.