Summary of Regression Models as HTML Table

knitr::opts_chunk$set(collapse = TRUE, comment = "#>", warning = FALSE, message = FALSE)

if (!requireNamespace("sjlabelled", quietly = TRUE) ||
    !requireNamespace("sjmisc", quietly = TRUE) ||
    !requireNamespace("lme4", quietly = TRUE) ||
    !requireNamespace("pscl", quietly = TRUE) ||
    !requireNamespace("glmmTMB", quietly = TRUE)) {
  knitr::opts_chunk$set(eval = FALSE)
} else {
  knitr::opts_chunk$set(eval = TRUE)
  library(sjPlot)
}

tab_model() is the counterpart to plot_model(): instead of creating plots, tab_model() creates HTML tables that are displayed in your IDE's viewer pane, in a web browser, or in a knitr-markdown document (like this vignette).

HTML is the only output format; you can't (directly) create LaTeX or PDF output from tab_model() and related table functions. However, the tables can easily be exported to Microsoft Word or LibreOffice Writer.
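As a minimal sketch of the export route: tab_model() has a file argument, and an HTML file written this way can then be opened in a word processor. The model, the temporary file path, the explicit print() call and the requireNamespace() guard are illustrative assumptions, not part of the original example.

```r
# Sketch: write the table to an HTML file instead of showing it in the viewer.
# Assumes sjPlot is installed; the block is skipped otherwise.
fit <- lm(mpg ~ cyl + hp + wt, data = mtcars)
out_file <- tempfile(fileext = ".html")  # hypothetical output location

if (requireNamespace("sjPlot", quietly = TRUE)) {
  # print() is called explicitly because auto-printing does not happen
  # for expressions inside an if-block.
  print(sjPlot::tab_model(fit, file = out_file))
}
```

Replace the temporary path with a file name of your choice when you want to keep the exported table.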

This vignette shows how to create tables from regression models with tab_model(). There's a dedicated vignette that demonstrates how to change the table layout and appearance with CSS.

Note! Due to the custom CSS, the layout of the table inside a knitr-document differs from the output in the viewer-pane and web browser!

# load package
library(sjPlot)
library(sjmisc)
library(sjlabelled)

# sample data
data("efc")
efc <- as_factor(efc, c161sex, c172code)

A simple HTML table from regression results

First, we fit two linear models to demonstrate the tab_model()-function.

m1 <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data = efc)
m2 <- lm(neg_c_7 ~ c160age + c12hour + c161sex + e17age, data = efc)

The simplest way to produce the table output is to pass the fitted model as an argument. By default, estimates, confidence intervals (CI) and p-values (p) are reported. As a summary, the number of observations and the R-squared values are shown.

tab_model(m1)
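The quantities in the default table can be cross-checked with base R. The following sketch uses the built-in mtcars data rather than efc so that it runs without additional packages; the object names are illustrative only.

```r
# Sketch: estimates, confidence intervals and p-values from base R.
fit <- lm(mpg ~ cyl + hp + wt, data = mtcars)

coefs <- coef(summary(fit))        # estimates, standard errors, t- and p-values
ci <- confint(fit, level = 0.95)   # 95% confidence intervals

default_cols <- data.frame(
  estimate = coefs[, "Estimate"],
  ci_low = ci[, 1],
  ci_high = ci[, 2],
  p = coefs[, "Pr(>|t|)"]
)
print(round(default_cols, 3))

# Summary rows: number of observations and R-squared values
n_obs <- nobs(fit)
r2 <- summary(fit)$r.squared
r2_adj <- summary(fit)$adj.r.squared
```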

Automatic labelling

As the sjPlot package supports labelled data, the coefficients in the table are already labelled in this example. The name of the dependent variable(s) is used as the main column header for each model. For non-labelled data, the coefficient names are shown.

data(mtcars)
m.mtcars <- lm(mpg ~ cyl + hp + wt, data = mtcars)
tab_model(m.mtcars)

If factors are involved and auto.label = TRUE, "pretty" parameter names are used (see format_parameters()).

set.seed(2)
dat <- data.frame(
  y = runif(100, 0, 100),
  drug = as.factor(sample(c("nonsense", "useful", "placebo"), 100, TRUE)),
  group = as.factor(sample(c("control", "treatment"), 100, TRUE))
)

pretty_names <- lm(y ~ drug * group, data = dat)
tab_model(pretty_names)

Turn off automatic labelling

To turn off automatic labelling, use auto.label = FALSE, or provide an empty character vector for pred.labels and dv.labels.

tab_model(m1, auto.label = FALSE)

The same applies to models with non-labelled data and factors.

tab_model(pretty_names, auto.label = FALSE)

More than one model

tab_model() can print multiple models at once, which are then printed side-by-side. Identical coefficients are matched in a row.

tab_model(m1, m2)
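The row matching can be pictured with a small base R sketch: coefficients are aligned by name, and a model that lacks a term gets an empty cell. This only illustrates the idea and is not how tab_model() is implemented; the two mtcars models are assumptions made for the example.

```r
# Sketch: align the coefficients of two models by term name.
fit_a <- lm(mpg ~ cyl + hp + wt, data = mtcars)
fit_b <- lm(qsec ~ cyl + hp + drat, data = mtcars)

all_terms <- union(names(coef(fit_a)), names(coef(fit_b)))

side_by_side <- data.frame(
  term = all_terms,
  model_a = unname(coef(fit_a)[all_terms]),  # NA where the term is absent
  model_b = unname(coef(fit_b)[all_terms])
)
print(side_by_side)
```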

Generalized linear models

For generalized linear models, the output is slightly adapted. Instead of Estimates, the column is named Odds Ratios, Incidence Rate Ratios etc., depending on the model. In this case, the coefficients are automatically converted (exponentiated). Furthermore, pseudo R-squared statistics are shown in the summary.

m3 <- glm(
  tot_sc_e ~ c160age + c12hour + c161sex + c172code, 
  data = efc,
  family = poisson(link = "log")
)

efc$neg_c_7d <- ifelse(efc$neg_c_7 < median(efc$neg_c_7, na.rm = TRUE), 0, 1)
m4 <- glm(
  neg_c_7d ~ c161sex + barthtot + c172code,
  data = efc,
  family = binomial(link = "logit")
)

tab_model(m3, m4)

Untransformed estimates on the linear scale

To show the estimates on the linear scale, use transform = NULL.

tab_model(m3, m4, transform = NULL, auto.label = FALSE)
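The relationship between the two scales can be checked by hand. In this sketch a Poisson model is fitted to the built-in mtcars data (an assumption made to keep the example self-contained); the linear-scale coefficients are the raw model coefficients, and the ratio-scale values are their exponentials.

```r
# Sketch: linear-scale coefficients versus exponentiated coefficients.
fit_pois <- glm(carb ~ wt + hp, data = mtcars, family = poisson(link = "log"))

linear_scale <- coef(fit_pois)     # log scale, as with transform = NULL
ratio_scale <- exp(linear_scale)   # incidence rate ratios

print(round(cbind(linear_scale, ratio_scale), 3))
```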

More complex models

Other models, like hurdle- or zero-inflated models, also work with tab_model(). In this case, the zero inflation model is indicated in the table. Use show.zeroinf = FALSE to hide this part from the table.

library(pscl)
data("bioChemists")
m5 <- zeroinfl(art ~ fem + mar + kid5 + ment | kid5 + phd + ment, data = bioChemists)

tab_model(m5)

You can combine different types of models in one table.

tab_model(m1, m3, m5, auto.label = FALSE, show.ci = FALSE)

Show or hide further columns

tab_model() has some arguments that allow you to show or hide specific columns in the output:

Adding columns

In the following example, standard errors, standardized coefficients and test statistics are also shown.

tab_model(m1, show.se = TRUE, show.std = TRUE, show.stat = TRUE)
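For orientation, one common definition of a standardized coefficient is the coefficient obtained after z-standardizing the outcome and all predictors. The sketch below computes it in two equivalent ways with base R; whether this matches the standardization method tab_model() applies for show.std = TRUE is an assumption that should be checked against the package documentation.

```r
# Sketch: standardized coefficients by refitting on z-standardized data.
vars <- c("mpg", "cyl", "hp", "wt")
scaled <- as.data.frame(scale(mtcars[, vars]))

fit_raw <- lm(mpg ~ cyl + hp + wt, data = mtcars)
fit_std <- lm(mpg ~ cyl + hp + wt, data = scaled)

# Equivalent closed form: b * sd(x) / sd(y)
by_formula <- coef(fit_raw)[-1] *
  sapply(mtcars[, c("cyl", "hp", "wt")], sd) / sd(mtcars$mpg)

print(round(cbind(refit = coef(fit_std)[-1], formula = by_formula), 3))
```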

Removing columns

In the following example, default columns are removed.

tab_model(m3, m4, show.ci = FALSE, show.p = FALSE, auto.label = FALSE)

Removing and sorting columns

Another way to remove columns, which also allows you to reorder them, is the col.order-argument. This is a character vector, where each element indicates a column in the output. The value "est", for instance, indicates the estimates, while "std.est" is the column for standardized estimates and so on.

By default, col.order contains all possible columns. All columns that should be shown (see previous tables, for example using show.se = TRUE to show standard errors, or show.std = TRUE to show standardized estimates) are then printed by default. Columns that are excluded from col.order are not shown, regardless of whether the show*-arguments are TRUE or FALSE. So if show.se = TRUE, but col.order does not contain the element "se", standard errors are not shown. Conversely, if show.est = FALSE, but col.order does include the element "est", the columns with estimates are still not shown.

In summary, col.order can be used to exclude columns from the table and to change the order of columns.

tab_model(
  m1, show.se = TRUE, show.std = TRUE, show.stat = TRUE,
  col.order = c("p", "stat", "est", "std.se", "se", "std.est")
)
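The rule described above (a column appears only if it is both enabled and listed in col.order, in the order given there) can be written as a toy function. The function visible_columns() and its show_flags argument are hypothetical and only model the described behaviour; they are not sjPlot code.

```r
# Toy model of the col.order rule; not sjPlot's internal implementation.
visible_columns <- function(col_order, show_flags) {
  enabled <- names(show_flags)[show_flags]
  col_order[col_order %in% enabled]
}

# Standard errors enabled, but "se" is missing from the order: SE is dropped.
visible_columns(
  col_order = c("p", "est", "ci"),
  show_flags = c(est = TRUE, se = TRUE, ci = TRUE, p = TRUE)
)

# Estimates disabled, although "est" is in the order: estimates are dropped.
visible_columns(
  col_order = c("est", "se", "ci", "p"),
  show_flags = c(est = FALSE, se = TRUE, ci = TRUE, p = TRUE)
)
```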

Collapsing columns

With collapse.ci and collapse.se, the columns for confidence intervals and standard errors can be collapsed into one column together with the estimates. Sometimes this table layout is required.

tab_model(m1, collapse.ci = TRUE)

Defining own labels

There are different options to change the labels of the column headers or coefficients, e.g. with:

tab_model(
  m1, m2, 
  pred.labels = c("Intercept", "Age (Carer)", "Hours per Week", "Gender (Carer)",
                  "Education: middle (Carer)", "Education: high (Carer)", 
                  "Age (Older Person)"),
  dv.labels = c("First Model", "M2"),
  string.pred = "Coefficient",
  string.ci = "Conf. Int (95%)",
  string.p = "P-Value"
)

Including reference level of categorical predictors

By default, for categorical predictors, the variable names and the categories for regression coefficients are shown in the table output.

library(glmmTMB)
data("Salamanders")
model <- glm(
  count ~ spp + Wtemp + mined + cover,
  family = poisson(),
  data = Salamanders
)

tab_model(model)

You can include the reference level for categorical predictors by setting show.reflvl = TRUE.

tab_model(model, show.reflvl = TRUE)

To show variable names, categories and include the reference level, also set prefix.labels = "varname".

tab_model(model, show.reflvl = TRUE, prefix.labels = "varname")
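The reference level is the category that has no coefficient of its own. With R's default treatment contrasts this is the first factor level, as the following base R sketch shows; the small data frame is made up for illustration.

```r
# Sketch: the first factor level is the reference and gets no coefficient.
dat <- data.frame(
  y = c(3, 5, 4, 8, 9, 7, 12, 11, 13),
  grp = factor(
    rep(c("low", "mid", "high"), each = 3),
    levels = c("low", "mid", "high")
  )
)

fit <- lm(y ~ grp, data = dat)

ref_level <- levels(dat$grp)[1]   # "low"
coef_names <- names(coef(fit))    # no "grplow" among them
print(ref_level)
print(coef_names)
```

To change the reference category, reorder the factor levels (for example with relevel()) before fitting the model.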

Style of p-values

You can change how p-values are displayed with the argument p.style. With p.style = "stars", the p-values are indicated as asterisks (*) in the table.

tab_model(m1, m2, p.style = "stars")

Another option is scientific notation, using p.style = "scientific", which can also be combined with digits.p.

tab_model(m1, m2, p.style = "scientific", digits.p = 2)
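Both styles can be imitated with base R to see what they convey. The helper p_stars() is hypothetical, and the cut-offs 0.05, 0.01 and 0.001 are the conventional ones, assumed here rather than taken from the package.

```r
# Sketch: star and scientific formatting of p-values with base R.
p_values <- c(0.2, 0.03, 0.004, 0.0000123)

# Hypothetical helper using conventional thresholds (an assumption).
p_stars <- function(p) {
  ifelse(p < 0.001, "***", ifelse(p < 0.01, "**", ifelse(p < 0.05, "*", "")))
}

stars <- p_stars(p_values)
scientific <- formatC(p_values, format = "e", digits = 2)

print(data.frame(p = p_values, stars = stars, scientific = scientific))
```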

Automatic matching for named vectors

Another way to easily assign labels is to use named vectors. In this case, it doesn't matter if pred.labels has more labels than coefficients in the model(s), or in which order the labels are passed to tab_model(). The only requirement is that the labels' names equal the coefficient names as they appear in the summary()-output.

# example, coefficients are "c161sex2" or "c172code3"
summary(m1)

pl <- c(
  `(Intercept)` = "Intercept",
  e17age = "Age (Older Person)",
  c160age = "Age (Carer)", 
  c12hour = "Hours per Week", 
  barthtot = "Barthel-Index",
  c161sex2 = "Gender (Carer)",
  c172code2 = "Education: middle (Carer)", 
  c172code3 = "Education: high (Carer)",
  a_non_used_label = "We don't care"
)

tab_model(
  m1, m2, m3, m4, 
  pred.labels = pl, 
  dv.labels = c("Model1", "Model2", "Model3", "Model4"),
  show.ci = FALSE, 
  show.p = FALSE, 
  transform = NULL
)
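Why the order and any surplus entries do not matter can be seen from plain name-based indexing in base R. This sketch only illustrates the matching idea with the built-in mtcars data; the label texts are made up.

```r
# Sketch: name-based matching of labels to coefficients.
fit <- lm(mpg ~ cyl + hp + wt, data = mtcars)

labels <- c(
  wt = "Weight (1000 lbs)",
  `(Intercept)` = "Intercept",
  hp = "Gross horsepower",
  cyl = "Number of cylinders",
  unused = "Never matched"
)

# Indexing by coefficient name reorders the labels and drops surplus ones.
matched <- labels[names(coef(fit))]
print(matched)
```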

Keep or remove coefficients from the table

Using the terms- or rm.terms-argument allows us to explicitly show or remove specific coefficients from the table output.

tab_model(m1, terms = c("c160age", "c12hour"))

Note that the names of terms to keep or remove should match the coefficient names. For categorical predictors, one example would be:

tab_model(m1, rm.terms = c("c172code2", "c161sex2"))
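Before passing names to terms or rm.terms, it can help to compare them with the model's coefficient names, since a factor contributes one coefficient per non-reference level rather than a single term named after the variable. A minimal base R sketch, using mtcars and a made-up factor column cyl_f:

```r
# Sketch: check requested term names against the model's coefficient names.
dat <- transform(mtcars, cyl_f = factor(cyl))
fit <- lm(mpg ~ cyl_f + wt, data = dat)

available <- names(coef(fit))       # "(Intercept)" "cyl_f6" "cyl_f8" "wt"
requested <- c("cyl_f6", "cyl_f")   # "cyl_f" is a variable name, not a coefficient

valid <- intersect(requested, available)
unknown <- setdiff(requested, available)

print(valid)
print(unknown)
```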

