forward_greedy: A forward greedy search function

View source: R/forward_greedy.R

Description

This is a forward greedy search function that can be used together with an MML objective function, namely cpt, logit, or naive Bayes (adaptive code), to discover Markov blanket candidates. Because the search is greedy, it stops as soon as no remaining variable improves the current Markov blanket.
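
The stopping rule described above can be illustrated with a minimal, generic sketch of forward greedy selection. This is not the package's implementation: `greedy_forward_sketch`, `score_fn` and `toy_score` are hypothetical names, and the sketch assumes that a lower score is better, as is usual for a message length.

```r
# Generic forward greedy selection: repeatedly add the candidate that
# lowers the score the most, and stop when no candidate improves it.
# `score_fn` is a hypothetical stand-in for an MML objective
# (assumed: lower is better); it is not part of the package.
greedy_forward_sketch <- function(candidates, score_fn) {
  selected <- character(0)
  best <- score_fn(selected)
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # Score each one-variable extension of the current selection.
    scores <- vapply(remaining,
                     function(v) score_fn(c(selected, v)),
                     numeric(1))
    if (min(scores) >= best) break  # no better option to add: stop
    selected <- c(selected, remaining[which.min(scores)])
    best <- min(scores)
  }
  selected
}

# Toy score (illustrative only): a penalty of 10 for each of "B" and "C"
# that is missing, plus a cost of 1 per included variable.
toy_score <- function(mb) {
  10 * length(setdiff(c("B", "C"), mb)) + length(mb)
}

mb <- greedy_forward_sketch(c("A", "B", "C", "D"), toy_score)
print(mb)
```

In this toy run the search adds the two variables that reduce the score and then stops, because adding either remaining variable would only increase it.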

Usage

forward_greedy(data, arities, vars, sampleSize, target, model, sigma = 3,
  dataNumeric = NULL, varCnt = NULL, prior = "uniform", dag = NULL,
  alpha = 1, statingPara = FALSE, debug = FALSE)
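
As a rough sketch, the first few inputs in the signature above might be prepared from a data frame of categorical columns as follows. The toy data frame `raw` is made up for illustration, and treating `dataNumeric` as the integer-coded data shifted to start at 0 is an assumption rather than something the package documents in detail.

```r
# Toy data frame with categorical columns (illustrative only).
raw <- data.frame(
  A = c("yes", "no", "yes", "no"),
  B = c("low", "high", "mid", "low"),
  stringsAsFactors = TRUE
)

# Convert every factor column to its integer codes (1, 2, ...).
data <- as.data.frame(lapply(raw, as.integer))

# Variable names and arities, in the same order as the columns of data.
vars <- colnames(data)
arities <- unname(vapply(raw, nlevels, integer(1)))

# Sample size is the number of rows.
sampleSize <- nrow(data)

# Assumption: a zero-based copy of the data, as described for dataNumeric.
dataNumeric <- data - 1

str(list(vars = vars, arities = arities, sampleSize = sampleSize))
```

Computing `arities` from the factor levels counts every declared level, including any that do not occur in the sample; counting distinct observed values instead is an alternative if unused levels should be ignored.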

Arguments

data

A dataset whose variables are in numeric/integer format. Any categorical variables must first be converted to numeric/integer.

arities

A vector of variable arities in data, in the same order as the column names of data.

vars

A vector of all variables in data, in the same order as the column names of data.

sampleSize

The sample size. That is, the number of rows of data.

target

The target node, whose Markov blanket we are interested in.

model

The options are cpt, logit (binary only), naive bayes and random models.

sigma

The standard deviation of the assumed Gaussian distribution for the parameter prior. The default value is 3, as suggested by the original paper.

dataNumeric

This parameter is for mml_logit. It is the given data set in numeric format, with variable values starting from 0.

varCnt

This parameter is for mml_cpt. As the argument name suggests, it holds the variable counts, obtained by extracting the detailed information of the given data with the function count_occurance().

prior

A character parameter with options "uniform", "tom" and "bayes", indicating the uniform prior (default), the TOM (totally ordered model) prior and the Bayesian prior, used when averaging the message lengths for random structures. The Bayesian prior starts with the uniform prior, then calculates the posteriors and uses them as priors for the next step.

dag

The true DAG.

alpha

A vector of concentration parameters for a Dirichlet distribution. Each value ranges from zero to positive infinity, and the length is equal to the arity of the target variable.

statingPara

Default is FALSE. If TRUE, the MML estimates of the parameters are also stated, at an extra cost of 0.5log(pi*e/6) per parameter.

debug

A boolean argument to show the detailed Markov blanket inclusion steps based on each mml score.
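
The extra term mentioned under statingPara can be evaluated directly. The sketch below assumes the natural logarithm (message length in nits) and uses a made-up parameter count; neither is confirmed by the package documentation.

```r
# Extra message length per stated parameter: 0.5 * log(pi * e / 6).
# Assumption: natural logarithm, so the result is in nits.
per_param_cost <- 0.5 * log(pi * exp(1) / 6)

# Hypothetical parameter count: a binary target with two binary parents
# has 4 parent configurations, each with (2 - 1) free parameters.
n_params <- 4 * (2 - 1)
extra_cost <- n_params * per_param_cost

print(c(per_param = per_param_cost, total = extra_cost))
```

The per-parameter term is a small positive constant, so under these assumptions the total extra cost grows linearly with the number of parameters stated.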

Value

The function returns the learned Markov blanket candidates according to the assigned objective function.


kelvinyangli/mbmml documentation built on June 29, 2020, 3:12 a.m.