Investigating specific specifications
In specr: Conducting and Visualizing Specification Curve Analyses

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  #fig.path = "man/figures/README-",
  out.width = "100%",
  fig.retina = 2
)

Per default, specr() summarizes individual specifications by using broom::tidy() and broom::glance(). For most cases, this provides a sufficient and appropriate summary of the relationship of interest and model characteristics. Sometimes, however, it might be useful to investigate specific models in more detail or to investigate a specific parameter that is not provided by the two functions (e.g., r-square). This vignette shows how to access individual models and extract further information from them.

library(tidyverse)
library(specr)
library(performance)

Setup specifications with a specific extract function

If we want to investigate individual models and particularly all aspects of that model, we need to create a custom extract function that also stores the entire model object in the result data frame.

# Custom function
tidy_new <- function(x) {
  fit <- broom::tidy(x, conf.int = TRUE)
  fit$res <- list(x)  # Store model object
  return(fit)
}

# Run specification curve analysis
specs <- setup(data = example_data, 
               y = c("y1", "y2"), 
               x = c("x1", "x2"), 
               model = c("lm"),
               controls = c("c1", "c2"),
               subsets = list(group1 = unique(example_data$group1),  
                              group2 = unique(example_data$group2)),
               fun1 = tidy_new)

results <- specr(specs)

Identify model(s) of interest

For this example, we are going to look at two specific models (same independent variables, all controls, all participants, but different dependent variables).

(y_models <- results %>%
  as_tibble %>%
  filter(x == "x1", 
         controls == "c1 + c2",
         subsets == "all")) %>%
  select(x:group2, estimate:res)

As you can see, the resulting tibble includes an additional column called res. This column includes the entire "model object" and we can use it to further investigate each model.

Investigate models

For example, we can now easily get a full summary of the models and compare individual coefficients and statistics.

y_models %>%
  pull(res) %>%
  map(summary) %>%
  map(coef)

Or we could get r-squared values for both models (here using the function r2() from the performance package).

y_models %>%
  pull(res) %>% 
  map(r2)       # r2 is include in the package "performance"

Some more examples

This way, we can analyze or compare such statistics across several models.

r2_results <- results %>%
  as_tibble %>%
  filter(subsets == "all") %>%
  mutate(r2 = map(res, r2), 
         r2 = map_dbl(r2, 1)) %>%
  arrange(r2)

r2_results %>%
  select(x:controls, r2)

And we can plot comparisons...

r2_results %>%
  arrange(r2) %>%
  mutate(rank = 1:n()) %>%
  ggplot(aes(x = rank, 
             y = r2)) +
  geom_point() +
  geom_line() +
  theme_minimal() +
  theme(strip.text = element_blank(),
        axis.line = element_line("black", size = .5),
        axis.text = element_text(colour = "black"))