automatic_fit: Automatic fitting models for Bayesian network.

View source: R/joint_PAF.R

automatic_fit    R Documentation

Automatic fitting models for Bayesian network.

Description

Main effects models are fit by default. For continuous variables, lm is used; for binary variables (numeric 0/1), glm is used; and for factor-valued variables, polr is used. For factors, ensure that the factor levels are ordered by increasing level of risk. If interactions are required in certain models, it is advisable to populate the elements of model_list separately.
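The choice of model by variable type can be pictured with a minimal sketch. The helper choose_model_type and the toy data frame below are hypothetical and are not part of graphPAF; the test used to detect a binary variable is an assumption about how such a rule might be written. The factor is created with explicit levels, so they run from lowest to highest risk as the description asks.

```r
# Hypothetical helper (not part of graphPAF): pick a model type from the
# type of the response column, mirroring the rule described above.
choose_model_type <- function(y) {
  if (is.factor(y)) {
    "polr"   # factor-valued response, levels ordered by increasing risk
  } else if (is.numeric(y) && all(y %in% c(0, 1))) {
    "glm"    # binary response coded as numeric 0/1
  } else if (is.numeric(y)) {
    "lm"     # continuous response
  } else {
    stop("unsupported variable type")
  }
}

# Toy data, invented purely for illustration
toy <- data.frame(
  bmi    = c(22.1, 27.4, 30.2, 25.0),
  smoker = c(0, 1, 1, 0),
  stress = factor(c("low", "high", "medium", "low"),
                  levels = c("low", "medium", "high"))
)

model_types <- sapply(toy, choose_model_type)
print(model_types)
```

Setting levels explicitly in factor() avoids the default alphabetical ordering, which here would have placed "high" before "low".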

Usage

automatic_fit(
  data,
  parent_list,
  node_vec,
  prev = 0.09,
  common = "",
  spline_nodes = c(),
  df_spline_nodes = 3
)

Arguments

data

Data frame. A data frame containing all variables used in fitting the models.

parent_list

A list. The ith element is the vector of names of the variables that are direct causes of the ith variable in node_vec.

node_vec

A vector of the nodes in the Bayesian network. It must be ordered from root to leaves, that is, the ancestors of any node in the causal graph must be positioned before that node. If this condition does not hold, the function returns an error.
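The ordering requirement on node_vec can be checked before calling the function. The sketch below is a hypothetical check, not graphPAF's own validation code; it assumes that parents which do not appear in node_vec (such as "education" here) are simply ignored for ordering purposes.

```r
# Hypothetical check (not graphPAF's own code): every parent that is itself
# a node must appear earlier in node_vec than its child.
is_root_to_leaf <- function(parent_list, node_vec) {
  for (i in seq_along(node_vec)) {
    parents_in_graph <- intersect(parent_list[[i]], node_vec)
    if (any(match(parents_in_graph, node_vec) >= i)) return(FALSE)
  }
  TRUE
}

node_vec <- c("exercise", "lipids", "case")
parent_list <- list(
  c("education"),                       # parents of exercise
  c("education", "exercise"),           # parents of lipids
  c("education", "exercise", "lipids")  # parents of case
)

ok_order  <- is_root_to_leaf(parent_list, node_vec)            # valid order
bad_order <- is_root_to_leaf(rev(parent_list), rev(node_vec))  # leaves first
```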

prev

Prevalence of disease. Set to NULL for cohort or cross-sectional studies.

common

Character string giving the part of the model formula that does not involve any variable in node_vec. Useful for automatically including confounders in all models.

spline_nodes

Vector of continuous variable names that are fit as splines (when involved as parents). Natural splines are used.

df_spline_nodes

Degrees of freedom for each spline (default 3). At present, this cannot be specified separately for different variables.
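To see how common, spline_nodes and df_spline_nodes might combine into a model formula for one node, consider the following sketch. The helper build_formula is hypothetical and only illustrates the idea; the way graphPAF assembles its formulas internally may differ.

```r
# Hypothetical helper (not graphPAF's own code): assemble a main-effects
# formula for one node. Parents listed in spline_nodes are wrapped in ns(),
# and the common text is appended unchanged.
build_formula <- function(node, parents, common = "",
                          spline_nodes = c(), df_spline_nodes = 3) {
  terms <- ifelse(parents %in% spline_nodes,
                  paste0("ns(", parents, ", df = ", df_spline_nodes, ")"),
                  parents)
  rhs <- paste(c(terms, if (nzchar(common)) common), collapse = " + ")
  as.formula(paste(node, "~", rhs))
}

# ns() comes from the splines package; it is only needed once the formula
# is actually used to fit a model, not to build the formula itself.
f <- build_formula("case", c("smoking", "lipids"),
                   common = "sex*ns(age, df = 5)",
                   spline_nodes = "lipids")
print(f)
```

Because a single df_spline_nodes value is used, every spline parent gets the same degrees of freedom, which matches the restriction stated above.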

Value

A list of fitted models corresponding to node_vec and parent_list.
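The shape of such a list can be illustrated without the package. The sketch below builds a two-element list by hand from simulated data, assuming only that the ith model corresponds to the ith entry of node_vec; the data, the coefficients and the use of a binomial glm are illustrative assumptions rather than a description of what automatic_fit does internally.

```r
set.seed(1)
n <- 200
# Simulated toy data, invented purely for illustration
toy <- data.frame(education = rbinom(n, 1, 0.5))
toy$exercise <- rbinom(n, 1, plogis(-0.5 + 0.8 * toy$education))
toy$lipids   <- 5 + 0.3 * toy$education - 0.4 * toy$exercise + rnorm(n)

node_vec <- c("exercise", "lipids")

# One fitted model per node, in the same order as node_vec
model_list <- list(
  glm(exercise ~ education, data = toy, family = binomial),  # binary node
  lm(lipids ~ education + exercise, data = toy)              # continuous node
)

# Look up the model for a given node by its position in node_vec
lipids_model <- model_list[[match("lipids", node_vec)]]
print(formula(lipids_model))
```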

Examples

# More complicated example (slower to run)
library(splines)  # provides ns() for the natural-spline terms in `common`

# Parents (direct causes) of each node
parent_exercise <- c("education")
parent_diet <- c("education")
parent_smoking <- c("education")
parent_alcohol <- c("education")
parent_stress <- c("education")
parent_high_blood_pressure <- c("education", "exercise", "diet",
  "smoking", "alcohol", "stress")
parent_lipids <- c("education", "exercise", "diet", "smoking",
  "alcohol", "stress")
parent_waist_hip_ratio <- c("education", "exercise", "diet", "smoking",
  "alcohol", "stress")
parent_early_stage_heart_disease <- c("education", "exercise", "diet",
  "smoking", "alcohol", "stress", "lipids", "waist_hip_ratio",
  "high_blood_pressure")
parent_diabetes <- c("education", "exercise", "diet", "smoking", "alcohol",
  "stress", "lipids", "waist_hip_ratio", "high_blood_pressure")
parent_case <- c("education", "exercise", "diet", "smoking", "alcohol",
  "stress", "lipids", "waist_hip_ratio", "high_blood_pressure",
  "early_stage_heart_disease", "diabetes")

# parent_list[[i]] holds the parents of node_vec[i]
parent_list <- list(parent_exercise, parent_diet, parent_smoking,
  parent_alcohol, parent_stress, parent_high_blood_pressure,
  parent_lipids, parent_waist_hip_ratio, parent_early_stage_heart_disease,
  parent_diabetes, parent_case)

# Nodes ordered from root to leaves
node_vec <- c("exercise", "diet", "smoking", "alcohol", "stress",
  "high_blood_pressure", "lipids", "waist_hip_ratio",
  "early_stage_heart_disease", "diabetes", "case")

model_list <- automatic_fit(
  data = stroke_reduced, parent_list = parent_list, node_vec = node_vec,
  prev = .0035, common = "region*ns(age,df=5)+sex*ns(age,df=5)",
  spline_nodes = c("waist_hip_ratio", "lipids", "diet")
)


graphPAF documentation built on Sept. 23, 2022, 1:06 a.m.