knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.align = 'center'
)

Consider the asia dataset available from bnlearn. Details of the variables can be found in the bnlearn documentation.

library(bnlearn)
data(asia)
summary(asia)

Consider two candidate BN models (Lauritzen and Spiegelhalter, 1988), which only differ in the edge from S to D. ``` {r out.width="50%"} asia_bn <- bnlearn::model2network("[A][S][T|A][L|S][B|S][D|B:E][E|T:L][X|E]") bnlearn::graphviz.plot(asia_bn) asia_bn_alt <- bnlearn::model2network("[A][S][T|A][L|S][B|S][E|T:L][X|E][D|B:E:S]") bnlearn::graphviz.plot(asia_bn_alt)

### Global monitor

<!-- The global monitor is equivalent to the Bayes Factor and offers a good assessment of the model as a whole.  -->
<!-- It is primarily useful for differentiating between candidate models as the global monitor pinpoints the nodes with different contributions for two candidate models. -->
<!-- We can assess the log likelihood contribution from each of the nodes and determine which nodes contribute the most to the global monitor. -->
<!-- global.monitor.graph provides a quick visual for the entire network.  -->
<!-- The darker the color, the more substantial the contribution of that node to the log likelihood. -->

<!-- The global monitor assesses the probability of observing the data given the model and as such does not particularly fit the prequential view.  -->
<!-- However, it is a quick and useful first diagnostic assessment. -->

A first useful diagnostic is the `global_monitor`, reporting the contribution of each vertex to the log-likelihood of the model.

``` {r}
library(bnmonitor)
asia <- asia[sample(1:nrow(asia),50),]
glob_asia <- node_monitor(dag = asia_bn, df = asia, alpha = 3)
glob_asia_alt <- node_monitor(dag = asia_bn_alt, df = asia, alpha = 3)
glob_asia
glob_asia_alt

In the alternative model, Dysnopea contributes more to the log-likelihood.

Global monitors can be plotted giving a quick view of the decomposition of the log-likelihood. The darker the color, the more substantial the contribution of a vertex.

plot(glob_asia)

Node monitor

There are two variants of node monitors.

As a quick survey of the nodes, the node_monitor command computes the marginal and conditional monitors for the final observation in the data set.

``` {r echo=TRUE, message = FALSE, warning = FALSE} node_asia <- final_node_monitor(dag = asia_bn, df = asia) node_asia

The scores indicate a poor fit of the probability distributions specified for the Smoking, Bronchitis, and Dysnopea nodes, since these are larger than 1.96 in absolute value. Plots can also be created to give a visual counterpart of the node monitors.

```r
plot(node_asia, which = "marginal")

As an illustration we investigate further the fit of the variable Dysnopea.

The sequential marginal monitor seq_marg_monitor gives us a closer look at which particular forecasts in the data set might cause this poor fit. We examine the sequential monitor for both candidate models.

``` {r warning=FALSE} seq_asia <- seq_marg_monitor(dag = asia_bn, df = asia, node.name = "D") seq_asia_alt <- seq_marg_monitor(dag = asia_bn_alt, df = asia, node.name = "D") seq_asia seq_asia_alt

```r
plot(seq_asia)
plot(seq_asia_alt)

Both monitors indicate that for some observations there is a poor fit (score above 1.96 in absolute value). In particular for the alternative models the marginal monitor has larger values in absolute value.

A similar analysis can be conducted with seq_marg_monitor, which would show that the model fits well (not reported here).

Parent Child monitor

Once a vertex has been identified as a poor fit, further investigation can be carried out to check for which values of the parents the model provides a bad representation. This can be achieved with the seq_pa_ch_monitor function.

As an illustration consider the asia_bn BN, the vertex D (Dysnopea), the parent variable B (Bronchitis) which can take values yes and no.

``` {r warning=FALSE, out.width="50%"} asia_pa_ch1 <- seq_pa_ch_monitor(dag = asia_bn, df = asia, node.name = "D", pa.names = "B", pa.val = "yes", alpha = 3) asia_pa_ch1 asia_pa_ch2 <- seq_pa_ch_monitor(dag = asia_bn, df = asia, node.name = "D", pa.names = "B", pa.val = "no", alpha = 3) asia_pa_ch2 plot(asia_pa_ch1) plot(asia_pa_ch2)

For this model, Dysnopea is adequately modeled for both values of Bronchitis, since most scores largely fall in the recommended range.

### Influential observations

The last robustness tool is the absolute value of the log-likelihood ratio between a model learnt without one observation and the one learnt with the full dataset. Larger values are associated to atomic events which influence the structural learning. 


```r
influence <- influential_obs(dag = asia_bn, data = asia, alpha = 3)
head(influence)
plot(influence)


manueleleonelli/bnmonitor documentation built on Oct. 10, 2024, 3:04 a.m.