CD.run: cd.run


Description

Structure learning for discrete Bayesian networks.

Usage

cd.run(
  indata,
  weights = NULL,
  lambdas = NULL,
  lambdas.length = 30,
  whitelist = NULL,
  blacklist = NULL,
  error.tol = 1e-04,
  convLb = 0.01,
  weight.scale = 1,
  upperbound = 100,
  alpha = 3,
  permute = FALSE,
  adaptive = FALSE
)

Arguments

indata

A sparsebnData object.

weights

Weight matrix. Each weight can be the l_2 norm of a consistent estimate of beta_{j.i}; see Gu et al. (2016), chapter 3.3, for more details. An improperly set weight matrix may cause convergence issues and lead to a suboptimal solution.

lambdas

Numeric vector containing a grid of lambda values (i.e. regularization parameters) to use in the solution path. If missing, a default grid of values on a decreasing log-scale is used. To generate a sequence of lambdas, see generate.lambdas. For discrete networks, Gu et al. (2016), chapter 3.4, give a way to calculate a maximum lambda that penalizes all parameters to zero; see the function max_lambda for details.
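As a rough illustration of what a decreasing log-scale grid looks like, the base-R sketch below builds 30 values. The maximum value and the min/max ratio are hypothetical choices made for this example; this is not the package's generate.lambdas or max_lambda, whose exact behavior is not described on this page.

```r
# Hypothetical inputs: neither value comes from the package.
lambda_max <- 10      # assumed largest penalty
ratio      <- 0.01    # assumed smallest/largest ratio
n_lambdas  <- 30      # matches the default lambdas.length

# Equally spaced on the log scale, from largest to smallest.
lambdas <- exp(seq(log(lambda_max), log(lambda_max * ratio),
                   length.out = n_lambdas))

print(range(lambdas))
```

A vector like this could then be passed as the lambdas argument in place of the default grid.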

lambdas.length

Integer number of values to include in the solution path.

whitelist

A two-column matrix of edges that are guaranteed to be in each estimate (a "white list"). Each row in this matrix corresponds to an edge that is to be whitelisted. These edges can be specified by node name (as a character matrix), or by index (as a numeric matrix).

blacklist

A two-column matrix of edges that are guaranteed to be absent from each estimate (a "black list"). See argument "whitelist" above for more details.
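The two accepted forms of an edge list can be sketched in base R as follows. The node names "A", "B", "C" and the specific edges are made up for illustration, and reading each row as (from, to) is an assumption here, since the direction convention is not stated above.

```r
# Edges given by node name: a two-column character matrix, one edge per row.
# The names are hypothetical and would need to match the variables in the data.
whitelist <- matrix(c("A", "B",
                      "B", "C"),
                    ncol = 2, byrow = TRUE)

# Edges given by index: a two-column numeric matrix, one edge per row.
blacklist <- rbind(c(3, 1),
                   c(3, 2))

print(whitelist)
print(blacklist)
```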

error.tol

Error tolerance for the algorithm, used to test for convergence.

convLb

Small positive number used in Hessian approximation.

weight.scale

A positive number used to scale the weight matrix.

upperbound

A large positive value used to truncate the adaptive weights. A -1 value indicates that there is no truncation.
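One plausible reading of this truncation is a simple cap on each weight, sketched below in base R. The helper truncate_weights is hypothetical and written only for this example; how the package applies upperbound internally is not described on this page.

```r
# Hypothetical helper: caps every weight at `upperbound`;
# an upperbound of -1 means "do not truncate".
truncate_weights <- function(w, upperbound = 100) {
  if (upperbound == -1) {
    return(w)
  }
  w[w > upperbound] <- upperbound
  w
}

w <- matrix(c(0.5, 250, 12, 1e4), nrow = 2)
capped   <- truncate_weights(w, upperbound = 100)
uncapped <- truncate_weights(w, upperbound = -1)
print(capped)
```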

alpha

Threshold parameter used to terminate the algorithm whenever the number of edges in the current DAG estimate is > alpha * ncol(data).
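The stopping rule amounts to a comparison against alpha times the number of variables. A minimal base-R sketch, with a hypothetical helper name and a made-up number of variables:

```r
alpha  <- 3    # default value of the alpha argument
n_vars <- 10   # hypothetical number of columns in the data

# Hypothetical helper: TRUE when an estimate has too many edges
# and the algorithm would terminate.
exceeds_edge_limit <- function(n_edges, alpha, n_vars) {
  n_edges > alpha * n_vars
}

print(exceeds_edge_limit(30, alpha, n_vars))
print(exceeds_edge_limit(31, alpha, n_vars))
```

With these values the limit is 30 edges, so an estimate with exactly 30 edges is still allowed and one with 31 triggers termination.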

permute

Logical; default is FALSE. If TRUE, the blocks are visited in random order.

adaptive

Logical; default is FALSE. If FALSE, the regular lasso algorithm is run. If TRUE, the adaptive lasso algorithm is run.

Details

Estimate structure of a discrete Bayesian network from observational/interventional data using the CD algorithm described in Gu et al. (2016).

Instead of producing a single estimate, this algorithm computes a solution path of estimates based on the values supplied to lambdas or lambdas.length. This version of the package does not provide a model selection method; users can apply their own model selection criterion. A later version of the package will provide an empirical model selection method.

This package can handle interventional data when a list of interventions is supplied as input. See the examples for more detail.
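The intervention list has one entry per observation. The base-R sketch below mirrors the construction used in the Examples section and shows which values it produces; per the comments in that example, 0 marks an observation with no intervention. The last step, which assigns two entries to a single observation, is an assumption about how multiple interventions would be written and is not taken from the package documentation.

```r
n_obs <- 20  # same number of rows as the data in the Examples section

# One entry per observation: 0 for rows 1-9, 1 for rows 10-19, 2 for row 20.
ivn <- lapply(seq_len(n_obs), function(i) as.integer(i / 10))

# Count how many observations fall under each value.
ivn_counts <- table(unlist(ivn))
print(ivn_counts)

# Assumed form for an observation with interventions on two nodes.
ivn_multi <- ivn
ivn_multi[[20]] <- c(1L, 2L)
```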

Value

A sparsebnPath object. The CD algorithm stops if the number of edges exceeds alpha times the number of variables (3 times, with the default alpha = 3).

Examples

## Not run: 

# Load the package so that cd.run() is available
library(discretecdAlgorithm)

### Generate some random data
dat <- matrix(rbinom(200, size = 3, prob = 0.4), nrow = 20)
# for observational data
dat_obs <- sparsebnUtils::sparsebnData(dat, type = "discrete")
# for interventional data
data_size <- nrow(dat)
ivn <- lapply(1:data_size, function(x){return(as.integer(x/10))})
# if there is no intervention for an observation, use 0.
# cd algorithm can handle multiple interventions for a single observation.
dat_int <- sparsebnUtils::sparsebnData(dat, ivn = ivn, type = "discrete")

# Run with default settings for observational data
cd.run(indata = dat_obs)
# Run with default settings for interventional data
cd.run(indata = dat_int)
# Run adaptive algorithm for observational data
cd.run(indata = dat_obs, adaptive = TRUE)

### Optional: Adjust settings
n_node <- ncol(dat)

# Run algorithm with a given weight
# Careful with this option.
weights <- matrix(1, nrow = n_node, ncol = n_node)

# Run with adjusted settings
cd.run(indata = dat_obs, weights = weights, lambdas.length = 10)

# Note: Normally, users do not need to change default settings.

## End(Not run)

gujyjean/discretecdAlgorithm documentation built on March 15, 2020, 7:32 p.m.