knitr::opts_chunk$set(echo = TRUE)
summary.factorlist()
is a simple wrapper used to summarise any number of variables by a single categorical variable.
This is usually "Table 1" of a study report.
library(summarizer) library(dplyr) library(stringr) # Load example dataset, modified version of survival::colon data(colon_s) # Table 1 - Patient demographics ---- explanatory = c("age", "age.factor", "sex.factor", "obstruct.factor") dependent = "perfor.factor" colon_s %>% summary.factorlist(dependent, explanatory, p=T)
summary.factorlist()
is also commonly used to summarise any number of variables by an outcome variable (say dead yes/no).
# Table 2 - 5 yr mortality ---- explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor") dependent = 'mort_5yr' colon_s %>% summary.factorlist(dependent, explanatory)
The second main feature is the ability to create final tables for logistic glm()
, hierarchical logistic lme4::glmer()
and
Cox proprotional hazard survival::coxph()
regression models.
The summarizer()
"all-in-one" function takes a single dependent variable with a vector of explanatory variable names
(continuous or categorical variables) to produce a final table for publication including summary statistics,
univariable and multivariable regression analyses. The first columns are those produced by
summary.factorist()
.
glm(depdendent ~ explanatory, family="binomial")
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor") dependent = 'mort_5yr' colon_s %>% summarizer(dependent, explanatory)
Where a multivariable model contains a subset of the variables specified in the full univariable set, this can be specified.
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor") explanatory.multi = c("age.factor", "obstruct.factor") dependent = 'mort_5yr' colon_s %>% summarizer(dependent, explanatory, explanatory.multi)
lme4::glmer(dependent ~ explanatory + (1 | random_effect), family="binomial")
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor") explanatory.multi = c("age.factor", "obstruct.factor") random.effect = "hospital" dependent = 'mort_5yr' colon_s %>% summarizer(dependent, explanatory, explanatory.multi, random.effect)
metrics=TRUE
provides common model metrics.
note - defaults to data.frame print out - kable doesn't handle list automatically
colon_s %>% summarizer(dependent, explanatory, explanatory.multi, metrics=TRUE)
survival::coxph(dependent ~ explanatory)
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor") dependent = "Surv(time, status)" colon_s %>% summarizer(dependent, explanatory)
Rather than going all-in-one, any number of subset models can be manually added on to a summary.factorlist()
table using summarizer.merge()
. This is particularly useful when models take a long-time to run or are complicated.
Note requirement for glm.id=TRUE
. fit2df
is a subfunction extracting most common models to a dataframe.
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor") explanatory.multi = c("age.factor", "obstruct.factor") random.effect = "hospital" dependent = 'mort_5yr' # Separate tables colon_s %>% summary.factorlist(dependent, explanatory, glm.id=TRUE) -> example.summary colon_s %>% glmuni(dependent, explanatory) %>% fit2df(estimate.suffix=" (univariable)") -> example.univariable colon_s %>% glmmulti(dependent, explanatory) %>% fit2df(estimate.suffix=" (multivariable)") -> example.multivariable colon_s %>% glmmixed(dependent, explanatory, random.effect) %>% fit2df(estimate.suffix=" (multilevel") -> example.multilevel # Pipe together example.summary %>% summarizer.merge(example.univariable) %>% summarizer.merge(example.multivariable) %>% summarizer.merge(example.multilevel) %>% select(-c(glm.id, index)) -> example.final example.final
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor") explanatory.multi = c("age.factor", "obstruct.factor") dependent = "Surv(time, status)" # Separate tables colon_s %>% summary.factorlist(dependent, explanatory, glm.id=TRUE) -> example2.summary colon_s %>% coxphuni(dependent, explanatory) %>% fit2df(estimate.suffix=" (univariable)") -> example2.univariable colon_s %>% coxphmulti(dependent, explanatory.multi) %>% fit2df(estimate.suffix=" (multivariable)") -> example2.multivariable # Pipe together example2.summary %>% summarizer.merge(example2.univariable) %>% summarizer.merge(example2.multivariable) %>% select(-c(glm.id, index)) -> example2.final example2.final
Models can be summarized with odds ratio/hazard ratio plots using or.plot
or hr.plot
(hr.plot not fully tested).
# OR plot explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor") dependent = 'mort_5yr' colon_s %>% or.plot(dependent, explanatory) # Previously fitted models (`glmmulti`) can be provided directly to `glmfit` # HR plot (not fully tested) explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor") dependent = "Surv(time, status)" colon_s %>% hr.plot(dependent, explanatory, dependent_label = "Survival") # Previously fitted models (`coxphmulti`) can be provided directly using `coxfit`
Our own particular Rstan
models are supported and will be documented in the future. Broadly, if you are running (hierarchical) logistic regression models in Stan with coefficients specified as a vector labelled beta
, then fit2df()
will work directly on the stanfit
object in a similar manner to if it was a glm
or glmerMod
object.
Use Hmisc::label()
to assign labels to variables for tables and plots.
label(colon_s$age.factor) = "Age (years)"
Export dataframe tables directly or to R Markdown using knitr::kable()
.
Note wrapper summary.missing()
can be useful. Wraps mice::md.pattern
.
colon_s %>% summary.missing(dependent, explanatory)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.