MCEM.hcbn: Monte Carlo Expectation Maximization

View source: R/mcem_hcbn.R

MCEM.hcbn R Documentation

Monte Carlo Expectation Maximization

Description

Parameter estimation for the hidden conjunctive Bayesian network model (H-CBN) via importance sampling.

Usage

MCEM.hcbn(
  lambda,
  poset,
  obs,
  lambda.s = 1,
  L,
  eps = NULL,
  sampling = c("forward", "add-remove", "backward", "bernoulli", "pool"),
  times = NULL,
  weights = NULL,
  max.iter = 100L,
  update.step.size = 20L,
  tol = 0.001,
  max.lambda = 1e+06,
  neighborhood.dist = 1L,
  thrds = 1L,
  verbose = FALSE,
  seed = NULL
)

Arguments

lambda

a vector containing initial values for the rate parameters

poset

a matrix containing the cover relations
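As an illustration, a poset on four events with cover relations 1 → 2, 1 → 3, 2 → 4 and 3 → 4 could be encoded as a square 0/1 matrix. The encoding used below (entry [i, j] = 1 meaning that event i must occur before event j) is an assumption made for this sketch; the documentation only states that the matrix contains the cover relations.

```r
# Hypothetical encoding: poset[i, j] = 1 means event i precedes event j
p <- 4L
poset <- matrix(0L, nrow = p, ncol = p)
poset[1, 2] <- 1L
poset[1, 3] <- 1L
poset[2, 4] <- 1L
poset[3, 4] <- 1L

# Direct predecessors (parents) of each event under this encoding
parents <- lapply(seq_len(p), function(j) which(poset[, j] == 1L))
```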

obs

a matrix containing observations or genotypes, where each row corresponds to a genotype vector whose entries indicate whether an event has been observed (1) or not (0)
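The following sketch builds a small observation matrix and checks which rows respect a poset. The helper is_compatible and the poset encoding ([i, j] = 1 meaning event i precedes event j) are hypothetical and not part of the package. Because the model includes an error rate, observed genotypes presumably do not have to be compatible with the poset; the check only illustrates what compatibility means.

```r
# Hypothetical poset: [i, j] = 1 means event i precedes event j (assumed encoding)
poset <- matrix(0L, nrow = 4, ncol = 4)
poset[1, 2] <- 1L
poset[1, 3] <- 1L
poset[2, 4] <- 1L
poset[3, 4] <- 1L

# Each row is one genotype; 1 = event observed, 0 = event not observed
obs <- rbind(
  c(0L, 0L, 0L, 0L),
  c(1L, 1L, 0L, 0L),
  c(1L, 1L, 1L, 1L),
  c(0L, 1L, 0L, 0L)  # event 2 present without its predecessor, event 1
)

# A genotype is compatible if every present event has all its parents present
is_compatible <- function(genotype, poset) {
  for (j in which(genotype == 1L)) {
    parents <- which(poset[, j] == 1L)
    if (any(genotype[parents] == 0L)) return(FALSE)
  }
  TRUE
}

compatible <- apply(obs, 1, is_compatible, poset = poset)
```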

lambda.s

rate of the sampling process. Defaults to 1.0

L

number of samples to be drawn from the proposal in the E-step

eps

an optional initial value of the error rate parameter

sampling

sampling scheme used to generate the hidden genotypes, X. Options:

"forward": generate occurrence times according to the current rate parameters and, from them, generate the hidden genotypes, X.

"add-remove": generate genotypes, X, from the observed genotypes, Y, using a two-step proposal. First, pick a move uniformly at random: either add or remove an event. Events are chosen for removal with probability proportional to their rates, and for addition with probability inversely proportional to their rates. Second, make the genotype compatible with the poset by either adding or removing all events incompatible with the poset.

"backward": enumerate all genotypes with Hamming distance k from the observed genotype (see neighborhood.dist).

"bernoulli": generate genotypes from a Bernoulli distribution with success probability p = ε.

"pool": generate a pool of compatible genotypes according to the current rate parameters and sample K observations proportional to their Hamming distance.
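The idea behind "forward" sampling can be sketched in base R under the usual CBN assumptions: each event waits an exponentially distributed time (rate lambda[j]) after all of its predecessors have occurred, and an event is present in the genotype if it occurred before an exponentially distributed sampling time (rate lambda.s). The function forward_sample is a hypothetical helper written for this sketch, not the package's implementation, and it assumes the poset is acyclic and encoded with [i, j] = 1 meaning event i precedes event j.

```r
forward_sample <- function(lambda, poset, lambda.s = 1, L = 1L) {
  p <- length(lambda)
  X <- matrix(0L, nrow = L, ncol = p)
  for (l in seq_len(L)) {
    t_event <- rep(NA_real_, p)
    remaining <- seq_len(p)
    # Assign times in an order consistent with the poset (assumes no cycles)
    while (length(remaining) > 0L) {
      for (j in remaining) {
        parents <- which(poset[, j] == 1L)
        if (all(!is.na(t_event[parents]))) {
          start <- if (length(parents) > 0L) max(t_event[parents]) else 0
          t_event[j] <- start + rexp(1, rate = lambda[j])
          remaining <- setdiff(remaining, j)
        }
      }
    }
    # An event is part of the genotype if it occurred before the sampling time
    t_s <- rexp(1, rate = lambda.s)
    X[l, ] <- as.integer(t_event <= t_s)
  }
  X
}

# Hypothetical poset and rates, for illustration only
poset <- matrix(0L, nrow = 4, ncol = 4)
poset[1, 2] <- 1L
poset[1, 3] <- 1L
poset[2, 4] <- 1L
poset[3, 4] <- 1L

set.seed(1)
X <- forward_sample(lambda = c(1, 2, 0.5, 1), poset = poset, lambda.s = 1, L = 50L)
```

Genotypes drawn this way are compatible with the poset by construction, since an event can only occur after all of its predecessors.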

times

an optional vector containing times at which genotypes were observed

weights

an optional vector containing observation weights

max.iter

the maximum number of EM iterations. Defaults to 100 iterations

update.step.size

number of EM steps after which the number of samples, L, is doubled. L is doubled only if the difference in the parameter estimates between two consecutive batches of this size is greater than the tolerance level, tol

tol

convergence tolerance for the error rate and the rate parameters. The EM algorithm runs until the difference between the average estimates in the last two batches is smaller than tol, or until max.iter is reached.
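The interplay of update.step.size, tol and L can be sketched as follows. The helper batch_difference is hypothetical, and measuring the difference as the largest absolute change in the batch averages is an assumption; the documentation does not say which norm the package uses.

```r
# estimates: one row per EM iteration, one column per parameter
batch_difference <- function(estimates, update.step.size) {
  n <- nrow(estimates)
  stopifnot(n >= 2L * update.step.size)
  last <- estimates[(n - update.step.size + 1L):n, , drop = FALSE]
  prev <- estimates[(n - 2L * update.step.size + 1L):(n - update.step.size), ,
                    drop = FALSE]
  # Assumed criterion: largest absolute change between batch averages
  max(abs(colMeans(last) - colMeans(prev)))
}

# Illustrative trace: 40 iterations, 3 parameters, nearly constant estimates
estimates <- rbind(matrix(1.0, nrow = 20, ncol = 3),
                   matrix(1.0005, nrow = 20, ncol = 3))
tol <- 0.001
L <- 100L

d <- batch_difference(estimates, update.step.size = 20L)
converged <- d < tol
if (!converged) L <- 2L * L  # double the number of samples and keep iterating
```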

max.lambda

an optional upper bound on the value of the rate parameters. Defaults to 1e6

neighborhood.dist

an integer value indicating the Hamming distance between the observation and the samples generated by "backward" sampling. This option is used if sampling is set to "backward". Defaults to 1
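To make the role of the Hamming distance concrete, the sketch below enumerates every genotype that differs from an observation in exactly k positions. The helper hamming_neighbors is hypothetical. Whether the package uses distance exactly k or up to k, and whether it filters the result for compatibility with the poset, is not stated here, so both choices are assumptions of this sketch.

```r
# Enumerate all genotypes at Hamming distance exactly k from y (assumes length(y) > 1)
hamming_neighbors <- function(y, k = 1L) {
  p <- length(y)
  flips <- combn(p, k, simplify = FALSE)  # all sets of k positions to flip
  t(vapply(flips, function(idx) {
    x <- y
    x[idx] <- 1L - x[idx]
    x
  }, integer(p)))
}

y <- c(1L, 0L, 0L)
nb <- hamming_neighbors(y, k = 1L)
```

For p events there are choose(p, k) such genotypes, so the enumeration grows quickly with neighborhood.dist.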

thrds

number of threads for parallel execution

verbose

an optional argument indicating whether to output logging information

seed

seed for reproducibility
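A call might look like the sketch below, which uses only the arguments listed under Usage. The poset encoding, the randomly generated placeholder observations and the initial rates are assumptions made for illustration, and the structure of the returned object is not documented here, so the sketch only inspects it with str(). The call is guarded so that the script also runs when the function is not available in the session.

```r
set.seed(42)

# Hypothetical poset on four events ([i, j] = 1: event i precedes event j)
p <- 4L
poset <- matrix(0L, nrow = p, ncol = p)
poset[1, 2] <- 1L
poset[1, 3] <- 1L
poset[2, 4] <- 1L
poset[3, 4] <- 1L

# Placeholder data: 50 random binary genotypes, not real observations
N <- 50L
obs <- matrix(rbinom(N * p, size = 1, prob = 0.5), nrow = N, ncol = p)

# Initial values for the rate parameters, one per event
lambda0 <- rep(1, p)

# Only attempt the call if MCEM.hcbn has been loaded into the session
if (exists("MCEM.hcbn", mode = "function")) {
  fit <- MCEM.hcbn(lambda = lambda0, poset = poset, obs = obs, L = 100L,
                   sampling = "forward", max.iter = 100L, seed = 42L)
  str(fit)
}
```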


cbg-ethz/MC-CBN documentation built on Dec. 15, 2022, 5:42 p.m.