automatic_fit: Automatic fitting of probability models in a pre-specified...
In graphPAF: Estimating and Displaying Population Attributable Fractions

automatic_fit

R Documentation

Automatic fitting of probability models in a pre-specified Bayesian network.

Description

Main-effects models are fit by default. Continuous variables are fit with lm, binary variables (numeric 0/1) with glm, and factor-valued variables with polr. For factors, ensure that the factor levels are ordered by increasing risk. If interactions are required for certain models, it is advisable to fit those models separately and populate the corresponding elements of model_list manually.
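
The choice of model by variable type can be illustrated with a minimal sketch in base R. This is not the package's implementation: the helper fit_node_sketch and the simulated data frame toy are hypothetical names introduced here, and details such as the handling of prev are deliberately left out.

```r
library(MASS)  # provides polr(), the proportional odds model

# Hypothetical helper (NOT part of graphPAF): fit a main-effects model for
# one node, choosing the model family from the type of the response column.
fit_node_sketch <- function(data, node, parents) {
  stopifnot(length(parents) > 0)
  f <- as.formula(paste(node, "~", paste(parents, collapse = " + ")))
  y <- data[[node]]
  if (is.factor(y)) {
    polr(f, data = data, Hess = TRUE)         # factor: ordinal regression
  } else if (all(y %in% c(0, 1))) {
    glm(f, data = data, family = binomial())  # numeric 0/1: logistic regression
  } else {
    lm(f, data = data)                        # otherwise: linear regression
  }
}

# Simulated toy data (an assumption made for illustration only)
set.seed(1)
n <- 500
education <- rbinom(n, 1, 0.5)
toy <- data.frame(
  education = education,
  lipids    = rnorm(n, mean = 5 + 0.3 * education),
  smoking   = rbinom(n, 1, plogis(-0.5 + 0.8 * education)),
  # factor levels ordered by increasing risk
  stress    = factor(sample(c("none", "some", "severe"), n, replace = TRUE),
                     levels = c("none", "some", "severe"), ordered = TRUE)
)

m_lipids  <- fit_node_sketch(toy, "lipids",  "education")
m_smoking <- fit_node_sketch(toy, "smoking", "education")
m_stress  <- fit_node_sketch(toy, "stress",  "education")
```

In this sketch, m_lipids is an lm fit, m_smoking a binomial glm fit and m_stress a polr fit, which mirrors the three cases described above.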

Usage

automatic_fit(
  data,
  parent_list,
  node_vec,
  prev = NULL,
  common = "",
  spline_nodes = c(),
  df_spline_nodes = 3
)

Arguments

`data`	A data frame containing all variables used in fitting the models.
`parent_list`	A list. The ith element is the vector of names of the variables that are direct causes of the ith variable in node_vec.
`node_vec`	A vector corresponding to the nodes in the Bayesian network. This must be specified from root to leaves, that is, ancestors in the causal graph for a particular node are positioned before their descendants. If this condition is not met, the function will return an error.
`prev`	Prevalence of disease. Set to NULL for cohort or cross-sectional studies.
`common`	Character string giving the part of the model formula that does not involve any variable in node_vec. Useful for automatically including confounders in all models.
`spline_nodes`	Vector of names of continuous variables that are fit as splines (when involved as parents). Natural splines are used.
`df_spline_nodes`	Number of degrees of freedom for each spline (default 3). At the moment, this cannot be specified separately for different variables.
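
How the arguments above might interact can be sketched in base R. The two helpers below, is_root_to_leaf and build_formula_sketch, are hypothetical and are not functions from graphPAF. The first checks the ordering condition on node_vec, and the second shows one plausible way that parents, spline_nodes, df_spline_nodes and common could be combined into a model formula. The order of terms in the package's actual formulas may differ.

```r
library(splines)  # provides ns() for natural splines

# Hypothetical check: TRUE if every parent that is itself a node appears
# earlier in node_vec. Parents outside node_vec (root variables) are ignored.
is_root_to_leaf <- function(parent_list, node_vec) {
  all(vapply(seq_along(node_vec), function(i) {
    pos <- match(parent_list[[i]], node_vec)
    all(is.na(pos) | pos < i)
  }, logical(1)))
}

# Hypothetical formula builder: wrap spline parents in ns() and append the
# text in `common` unchanged.
build_formula_sketch <- function(node, parents, common = "",
                                 spline_nodes = c(), df_spline_nodes = 3) {
  terms <- ifelse(parents %in% spline_nodes,
                  paste0("ns(", parents, ", df = ", df_spline_nodes, ")"),
                  parents)
  rhs <- paste(c(terms, if (nzchar(common)) common), collapse = " + ")
  if (!nzchar(rhs)) rhs <- "1"  # intercept-only model if nothing is supplied
  as.formula(paste(node, "~", rhs))
}

parents_ok  <- list("education", c("education", "exercise"))
ordered_ok  <- is_root_to_leaf(parents_ok, c("exercise", "lipids"))

# Same graph, but the descendant is listed before its ancestor
parents_bad <- list(c("education", "exercise"), "education")
ordered_bad <- is_root_to_leaf(parents_bad, c("lipids", "exercise"))

f <- build_formula_sketch("lipids", c("education", "diet"),
                          common = "sex", spline_nodes = "diet")
```

Here ordered_ok is TRUE and ordered_bad is FALSE, and f models lipids on education, a natural spline in diet, and the common term sex.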

Value

A list of fitted models corresponding to node_vec and parent_list.

Examples

# More complicated example (slower to run)
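library(graphPAF)  # provides automatic_fit() and the stroke_reduced data
library(splines)   # provides ns(), used in the `common` formula text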
parent_exercise <- c("education")
parent_diet <- c("education")
parent_smoking <- c("education")
parent_alcohol <- c("education")
parent_stress <- c("education")
parent_high_blood_pressure <- c("education","exercise","diet",
"smoking","alcohol","stress")
parent_lipids <- c("education","exercise","diet","smoking",
"alcohol","stress")
parent_waist_hip_ratio <- c("education","exercise","diet","smoking",
"alcohol","stress")
parent_early_stage_heart_disease <- c("education","exercise","diet",
"smoking","alcohol","stress","lipids","waist_hip_ratio","high_blood_pressure")
parent_diabetes <- c("education","exercise","diet","smoking","alcohol",
"stress","lipids","waist_hip_ratio","high_blood_pressure")
parent_case <- c("education","exercise","diet","smoking","alcohol","stress",
"lipids","waist_hip_ratio","high_blood_pressure","early_stage_heart_disease","diabetes")
parent_list <- list(parent_exercise,parent_diet,parent_smoking,
parent_alcohol,parent_stress,parent_high_blood_pressure,
parent_lipids,parent_waist_hip_ratio,parent_early_stage_heart_disease,
parent_diabetes,parent_case)
node_vec <- c("exercise","diet","smoking","alcohol","stress","high_blood_pressure",
  "lipids","waist_hip_ratio","early_stage_heart_disease",
  "diabetes","case")

model_list <- automatic_fit(data=stroke_reduced, parent_list=parent_list,
  node_vec=node_vec, prev=.0035,
  common="region*ns(age,df=5)+sex*ns(age,df=5)",
  spline_nodes = c("waist_hip_ratio","lipids","diet"))

graphPAF documentation built on May 29, 2024, 10:21 a.m.