# CD.run: cd.run In gujyjean/discretecdAlgorithm: Coordinate-Descent Algorithm for Learning Sparse Discrete Bayesian Networks

## Description

Structure learning of discrete Bayesian networks.

## Usage

```
cd.run(
  indata,
  weights = NULL,
  lambdas = NULL,
  lambdas.length = 30,
  whitelist = NULL,
  blacklist = NULL,
  error.tol = 1e-04,
  convLb = 0.01,
  weight.scale = 1,
  upperbound = 100,
  alpha = 3,
  permute = FALSE,
  adaptive = FALSE
)
```

## Arguments

| Argument | Description |
|---|---|
| `indata` | A `sparsebnData` object. |
| `weights` | Weight matrix. A weight can be the `l_2` norm of a consistent estimate of `beta_{j.i}`. See Gu et al. (2016), chapter 3.3, for more details. An improperly set weight matrix may cause convergence issues and lead to a suboptimal solution. |
| `lambdas` | Numeric vector containing a grid of lambda values (i.e. regularization parameters) to use in the solution path. If missing, a default grid of values based on a decreasing log-scale is used. To generate a sequence of lambdas, see `generate.lambdas`. For discrete networks, Gu et al. (2016), chapter 3.4, describe how to calculate a maximum lambda that penalizes all parameters to zero. See function `max_lambda` for details. |
| `lambdas.length` | Integer number of values to include in the solution path. |
| `whitelist` | A two-column matrix of edges that are guaranteed to be in each estimate (a "white list"). Each row in this matrix corresponds to an edge that is to be whitelisted. Edges can be specified by node name (as a `character` matrix) or by index (as a `numeric` matrix). |
| `blacklist` | A two-column matrix of edges that are guaranteed to be absent from each estimate (a "black list"). See argument `whitelist` above for more details. |
| `error.tol` | Error tolerance for the algorithm, used to test for convergence. |
| `convLb` | Small positive number used in the Hessian approximation. |
| `weight.scale` | A positive number used to scale the weight matrix. |
| `upperbound` | A large positive value used to truncate the adaptive weights. A value of -1 indicates that there is no truncation. |
| `alpha` | Threshold parameter used to terminate the algorithm whenever the number of edges in the current DAG estimate is `> alpha * ncol(data)`. |
| `permute` | Logical, default `FALSE`. If `TRUE`, the order in which blocks are visited is randomized. |
| `adaptive` | Logical, default `FALSE`. If `FALSE`, a regular lasso algorithm is run. If `TRUE`, an adaptive lasso algorithm is run. |
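The `lambdas`, `whitelist` and `blacklist` arguments take structured inputs, so a small base-R sketch of what such inputs might look like may help. The endpoint values, the ratio between them, and the node names `V1` to `V4` are hypothetical choices made for illustration. The grid below is built by hand and is not claimed to match the package's default grid or the output of `generate.lambdas`.

```r
# Hypothetical endpoints: in practice the maximum would be derived from the
# data (see `max_lambda`); the 0.01 ratio is an arbitrary illustrative choice.
lambda_max <- 10
lambda_min <- lambda_max * 0.01
n_lambdas  <- 30

# Values equally spaced on the log scale, ordered from largest to smallest.
lambdas <- exp(seq(log(lambda_max), log(lambda_min), length.out = n_lambdas))

# Edges specified by node name: a character matrix, one row per edge.
# The node names are hypothetical.
whitelist <- rbind(c("V1", "V2"),
                   c("V3", "V4"))

# Edges specified by index: a numeric matrix, one row per edge.
blacklist <- rbind(c(2, 5))
```

Objects like these could then be passed as `cd.run(indata, lambdas = lambdas, whitelist = whitelist, blacklist = blacklist)`.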

## Details

Estimates the structure of a discrete Bayesian network from observational or interventional data using the CD algorithm described in Gu et al. (2016).

Instead of producing a single estimate, this algorithm computes a solution path of estimates based on the values supplied to `lambdas` or `lambdas.length`. This version of the package does not provide a model selection method, so users can apply their own model selection criterion. An empirical model selection method is planned for a later version of the package.

This package can handle interventional data: supply a list of interventions when constructing the input data. See the Examples section for more detail.
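As a rough illustration of such an intervention list, the sketch below builds one by hand in base R. The encoding (one list element per observation, `0` for a row without intervention, node indices otherwise) is inferred from the comments in the Examples section. Using an integer vector to express several interventions on one observation is an assumption, and the toy sizes are hypothetical.

```r
# Toy setting (hypothetical): 6 observations of 4 variables.
n_obs <- 6

# One list element per observation (assumed encoding, mirroring the Examples):
#   0L       -> no intervention for this observation
#   k        -> node k was intervened on
#   c(j, k)  -> nodes j and k were both intervened on (assumed form)
ivn <- list(0L, 0L, 1L, 1L, c(2L, 3L), 4L)

# The list needs one entry per row of the data matrix.
stopifnot(length(ivn) == n_obs)

# Count how many observations carry at least one intervention.
n_intervened <- sum(vapply(ivn, function(v) any(v != 0L), logical(1)))
```

A list of this shape would be handed to `sparsebnUtils::sparsebnData(dat, ivn = ivn, type = "discrete")`, as in the Examples section, before calling `cd.run`.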

## Value

A `sparsebnPath` object. The CD algorithm stops if the number of edges exceeds `alpha` times the number of variables (3 times with the default `alpha = 3`).

## Examples

```
## Not run: 
### Generate some random data
dat <- matrix(rbinom(200, size = 3, prob = 0.4), nrow = 20)
# for observational data
dat_obs <- sparsebnUtils::sparsebnData(dat, type = "discrete")
# for interventional data
data_size <- nrow(dat)
ivn <- lapply(1:data_size, function(x){return(as.integer(x/10))})
# if there is no intervention for an observation, use 0.
# cd algorithm can handle multiple interventions for a single observation.
dat_int <- sparsebnUtils::sparsebnData(dat, ivn = ivn, type = "discrete")

# Run with default settings for observational data
cd.run(indata = dat_obs)
# Run with default settings for interventional data
cd.run(indata = dat_int)
# Run adaptive algorithm for observational data
cd.run(indata = dat_obs, adaptive = TRUE)

### Optional: Adjust settings
n_node <- ncol(dat)

# Run algorithm with a given weight
# Careful with this option.
weights <- matrix(1, nrow = n_node, ncol = n_node)
# Run with adjusted settings
cd.run(indata = dat_obs, weights = weights, lambdas.length = 10)

# Note: Normally, users do not need to change default settings.

## End(Not run)
```

gujyjean/discretecdAlgorithm documentation built on March 15, 2020, 7:32 p.m.