| fullfit | R Documentation |
Runs a complete regression analysis pipeline in a single function call: univariable screening, automatic or manual variable selection, and multivariable modeling. The function covers the workflow from initial exploration to final adjusted models and returns publication-ready formatted output that can show univariable and multivariable results side by side.
fullfit(
  data,
  outcome,
  predictors,
  method = "screen",
  multi_predictors = NULL,
  p_threshold = 0.05,
  columns = "both",
  model_type = "glm",
  family = "binomial",
  random = NULL,
  conf_level = 0.95,
  reference_rows = TRUE,
  show_n = TRUE,
  show_events = TRUE,
  digits = 2,
  p_digits = 3,
  labels = NULL,
  metrics = "both",
  return_type = "table",
  keep_models = FALSE,
  exponentiate = NULL,
  conf_method = NULL,
  parallel = TRUE,
  n_cores = NULL,
  number_format = NULL,
  verbose = NULL,
  ...
)
data |
Data frame or data.table containing the analysis dataset. The function automatically converts data frames to data.tables for efficient processing. |
outcome |
Character string specifying the outcome variable name. For
time-to-event analysis, use |
predictors |
Character vector of predictor variable names to analyze.
All predictors are tested in univariable models. The subset included in
the multivariable model depends on the |
method |
Character string specifying the variable selection strategy:
|
multi_predictors |
Character vector of predictors to include in the
multivariable model when |
p_threshold |
Numeric p-value threshold for automatic variable
selection when |
columns |
Character string specifying which result columns to display:
|
model_type |
Character string specifying the regression model type:
|
family |
For GLM and GLMER models, specifies the error distribution and link function. Can be a character string, a family function, or a family object. Ignored for non-GLM/GLMER models. Binary/Binomial outcomes:
Count outcomes:
Continuous outcomes:
Positive continuous outcomes:
For negative binomial regression (overdispersed counts), use
See |
random |
Character string specifying the random-effects formula for
mixed-effects models ( |
conf_level |
Numeric confidence level for confidence intervals. Must be between 0 and 1. Default is 0.95 (95% CI). |
reference_rows |
Logical. If |
show_n |
Logical. If |
show_events |
Logical. If |
digits |
Integer specifying decimal places for effect estimates. Default is 2. |
p_digits |
Integer specifying the number of decimal places for
p-values. Values smaller than |
labels |
Named character vector or list providing custom display
labels for variables. Names should match variable names, values are
display labels. Default is |
metrics |
Character specification for which statistics to display:
Can also be a character vector: |
return_type |
Character string specifying what to return:
|
keep_models |
Logical. If |
exponentiate |
Logical. Whether to exponentiate coefficients. Default
is |
conf_method |
Character string controlling the confidence interval method.
If
Cox and mixed-effects models use Wald intervals regardless of this setting.
Set globally with |
parallel |
Logical. If |
n_cores |
Integer specifying the number of CPU cores to use for
parallel processing. Default is |
number_format |
Character string or two-element character vector controlling thousand and decimal separators in formatted output. Named presets:
Or provide a custom two-element vector When
options(summata.number_format = "eu")
|
verbose |
Logical. If |
... |
Additional arguments passed to model fitting functions (e.g.,
|
Analysis Workflow:
The function implements a complete regression analysis pipeline:
Univariable screening: Fits separate models for each predictor (outcome ~ predictor). Each predictor is tested independently to understand crude associations.
Variable selection: Based on the method parameter:
"screen": Automatically selects predictors with univariable p ≤ p_threshold
"all": Includes all predictors (no selection)
"custom": Uses predictors specified in multi_predictors
Multivariable modeling: Fits a single model with selected predictors (outcome ~ predictor1 + predictor2 + ...). Estimates are adjusted for all other variables in the model.
Output formatting: Combines results into publication-ready table with appropriate effect measures and formatting.
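The four steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the data are simulated, the variable names (age, bmi, sex, event) are hypothetical, and the screening p-value is taken from a likelihood-ratio test via drop1(), which may differ from the test fullfit actually uses.

```r
# Conceptual sketch of screen-then-fit in base R (NOT the package internals).
# All data and variable names below are simulated / hypothetical.
set.seed(1)
n <- 200
dat <- data.frame(
  age = rnorm(n, mean = 60, sd = 10),
  bmi = rnorm(n, mean = 27, sd = 4),
  sex = factor(sample(c("F", "M"), n, replace = TRUE))
)
# Outcome depends on age only
dat$event <- rbinom(n, 1, plogis(-6 + 0.1 * dat$age))

predictors  <- c("age", "bmi", "sex")
p_threshold <- 0.05

# Step 1: univariable screening, one model per predictor (event ~ predictor)
uni_p <- sapply(predictors, function(v) {
  fit_uni <- glm(reformulate(v, response = "event"),
                 data = dat, family = binomial)
  # Assumption: one likelihood-ratio p-value per term (handles factors too)
  drop1(fit_uni, test = "Chisq")[v, "Pr(>Chi)"]
})

# Step 2: variable selection, keep predictors with p <= p_threshold
selected <- names(uni_p)[uni_p <= p_threshold]

# Step 3: multivariable model with the selected predictors
if (length(selected) > 0) {
  fit_multi <- glm(reformulate(selected, response = "event"),
                   data = dat, family = binomial)

  # Step 4: adjusted odds ratios with Wald 95% confidence intervals
  adj <- exp(cbind(aOR = coef(fit_multi), confint.default(fit_multi)))
  print(round(adj, 2))
}
```

Setting p_threshold to a more liberal value such as 0.20 in this sketch corresponds to the wider screening net described for method = "screen".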
Variable Selection Strategies:
"Screen" Method (method = "screen"):
Uses p-value threshold for automatic selection
Liberal thresholds (e.g., 0.20) cast a wide net to avoid missing important predictors
Stricter thresholds (e.g., 0.05) focus on strongly associated predictors
Helps reduce overfitting and multicollinearity
Common in exploratory analyses and when sample size is limited
"All" Method (method = "all"):
No variable selection - includes all predictors
Appropriate when all variables are theoretically important
Risk of overfitting with many predictors relative to sample size
Useful for confirmatory analyses with pre-specified models
"Custom" Method (method = "custom"):
Manual selection based on subject matter knowledge
Runs univariable analysis for all predictors (for comparison)
Includes only specified predictors in multivariable model
Ideal for theory-driven model building
Allows comparison of unadjusted vs adjusted effects for all variables
Interpreting Results:
When columns = "both" (default), tables show:
Univariable columns: Crude associations, unadjusted for other variables. Labeled as "OR/HR/RR/Coefficient (95% CI)" and "Uni p"
Multivariable columns: Adjusted associations, accounting for all other predictors in the model. Labeled as "aOR/aHR/aRR/Adj. Coefficient (95% CI)" and "Multi p" ("a" = adjusted)
Variables not meeting selection criteria show "-" in multivariable columns
Comparing univariable and multivariable results helps identify:
Confounding: Large changes in effect estimates
Independent effects: Similar univariable and multivariable estimates
Mediation: Attenuated effects in multivariable model
Suppression: Effects that emerge only after adjustment
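As a rough illustration of the confounding pattern listed above, the following base R sketch compares a crude and an adjusted estimate on simulated data. The variables x, z, and y are hypothetical, and a linear model is used only to keep the arithmetic simple; nothing here reflects how fullfit computes its output.

```r
# Simulated confounding: z drives both the exposure x and the outcome y,
# while x has no direct effect on y. All names are hypothetical.
set.seed(42)
n <- 500
z <- rnorm(n)             # confounder
x <- 0.7 * z + rnorm(n)   # exposure, correlated with z
y <- 1.0 * z + rnorm(n)   # outcome, depends on z only

crude    <- unname(coef(lm(y ~ x))["x"])      # unadjusted (univariable) estimate
adjusted <- unname(coef(lm(y ~ x + z))["x"])  # estimate adjusted for z

# Relative change between the crude and adjusted estimates, in percent
change_pct <- 100 * (adjusted - crude) / crude

print(round(c(crude = crude, adjusted = adjusted, change_pct = change_pct), 2))
```

A large relative change like the one produced here is the kind of shift between the univariable and multivariable columns that suggests confounding; similar estimates in both columns would instead point to an independent effect.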
Sample Size Considerations:
Rule of thumb for multivariable models:
Logistic regression: ≥ 10 events per predictor variable
Cox regression: ≥ 10 events per predictor variable
Linear regression: ≥ 10-20 observations per predictor
Use screening methods to reduce predictor count when these ratios are not met.
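The events-per-variable check can be done by hand before fitting. The helper below is a hypothetical convenience function written for this illustration, not part of the package, and it assumes a binary outcome coded 0/1 with 1 marking an event.

```r
# Hypothetical helper: events per predictor variable for a 0/1 outcome.
events_per_variable <- function(outcome, n_predictors) {
  n_events <- sum(outcome == 1, na.rm = TRUE)  # count events, ignoring NA
  n_events / n_predictors
}

# Toy outcome vector: 45 events among 200 observations
status <- c(rep(1, 45), rep(0, 155))

epv <- events_per_variable(status, n_predictors = 6)
print(epv)  # 7.5, below the rule-of-thumb value of 10

# Largest number of predictors that keeps at least 10 events per variable
max_predictors <- floor(sum(status == 1) / 10)
print(max_predictors)  # 4
```

Note that a factor with several levels contributes more than one coefficient to the model, so counting coefficients rather than variable names gives a more conservative ratio.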
The return value depends on the return_type parameter:
When return_type = "table" (default): A data.table with S3 class
"fullfit_result" containing:
Character. Predictor name or custom label
Character. Category level for factors, empty for continuous
Integer. Sample sizes (if show_n = TRUE). For
variables included in the multivariable model, reflects the
complete-case sample size from the fitted model (listwise deletion
across all included predictors). For variables not selected into the
multivariable model, reflects the per-variable sample size from the
univariable analysis. This follows STROBE guideline item 13,
which recommends reporting the number of participants included at
each stage of analysis.
Integer. Event counts (if show_events = TRUE).
Same complete-case convention as n: multivariable rows show
events from the fitted model, univariable-only rows show
per-variable counts.
Character. Unadjusted effect
(if columns includes "uni" and metrics includes "effect")
Character. Univariable p-value (if columns includes
"uni" and metrics includes "p")
Character. Adjusted effect
(if columns includes "multi" and metrics includes "effect")
Character. Multivariable p-value (if columns
includes "multi" and metrics includes "p")
When return_type = "model": The fitted multivariable model object
(glm, lm, coxph, etc.).
When return_type = "both": A list with two elements:
The formatted results data.table
The fitted multivariable model object
The table includes the following attributes:
Character. The outcome variable name
Character. The regression model type
Character. The variable selection method used
Character. Which columns were displayed
The multivariable model object (if fitted)
The complete univariable screening results
Integer. Number of predictors in multivariable model
Character vector. Names of predictors that passed univariable screening at the specified p-value threshold
Character vector. Names of variables with p < 0.05 in the multivariable model (or univariable if multivariable was not fitted)
uniscreen for univariable screening only,
fit for fitting a single multivariable model,
compfit for comparing multiple models,
desctable for descriptive statistics
Other regression functions:
compfit(),
fit(),
multifit(),
print.compfit_result(),
print.fit_result(),
print.fullfit_result(),
print.multifit_result(),
print.uniscreen_result(),
uniscreen()
# Load example data
data(clintrial)
data(clintrial_labels)
# Example 1: Basic screening with p < 0.05 threshold
result1 <- fullfit(
  data = clintrial,
  outcome = "os_status",
  predictors = c("age", "sex", "bmi", "smoking",
                 "hypertension", "diabetes",
                 "treatment", "stage"),
  method = "screen",
  p_threshold = 0.05,
  labels = clintrial_labels
)
print(result1)
# Shows both univariable and multivariable results
# Only significant univariable predictors in multivariable model
# Example 2: Include all predictors (no selection)
result2 <- fullfit(
  data = clintrial,
  outcome = "os_status",
  predictors = c("age", "sex", "treatment", "stage"),
  method = "all",
  labels = clintrial_labels
)
print(result2)
# Example 3: Custom variable selection
result3 <- fullfit(
  data = clintrial,
  outcome = "os_status",
  predictors = c("age", "sex", "bmi", "smoking", "treatment", "stage"),
  method = "custom",
  multi_predictors = c("age", "treatment", "stage"),
  labels = clintrial_labels
)
print(result3)
# Univariable for all, multivariable for selected only
# Example 4: Cox regression with screening
library(survival)
cox_result <- fullfit(
  data = clintrial,
  outcome = "Surv(os_months, os_status)",
  predictors = c("age", "sex", "treatment", "stage"),
  model_type = "coxph",
  method = "screen",
  p_threshold = 0.10,
  labels = clintrial_labels
)
print(cox_result)
# Example 5: Linear regression without screening
linear_result <- fullfit(
  data = clintrial,
  outcome = "bmi",
  predictors = c("age", "sex", "smoking", "creatinine"),
  model_type = "lm",
  method = "all",
  labels = clintrial_labels
)
print(linear_result)
# Example 6: Poisson regression for count outcomes
poisson_result <- fullfit(
  data = clintrial,
  outcome = "fu_count",
  predictors = c("age", "stage", "treatment", "surgery"),
  model_type = "glm",
  family = "poisson",
  method = "all",
  labels = clintrial_labels
)
print(poisson_result)
# Example 7: Show only multivariable results
multi_only <- fullfit(
  data = clintrial,
  outcome = "os_status",
  predictors = c("age", "sex", "treatment", "stage"),
  method = "all",
  columns = "multi",
  labels = clintrial_labels
)
print(multi_only)
# Example 8: Return both table and model object
both <- fullfit(
  data = clintrial,
  outcome = "os_status",
  predictors = c("age", "sex", "treatment", "stage"),
  method = "all",
  return_type = "both"
)
print(both$table)
summary(both$model)
# Example 9: Keep univariable models for diagnostics
with_models <- fullfit(
  data = clintrial,
  outcome = "os_status",
  predictors = c("age", "bmi", "creatinine"),
  keep_models = TRUE
)
uni_results <- attr(with_models, "uni_results")
uni_models <- attr(uni_results, "models")
summary(uni_models[["age"]])
# Example 10: Linear mixed effects with site clustering
if (requireNamespace("lme4", quietly = TRUE)) {
  lmer_result <- fullfit(
    data = clintrial,
    outcome = "los_days",
    predictors = c("age", "treatment", "surgery", "stage"),
    random = "(1|site)",
    model_type = "lmer",
    method = "all",
    labels = clintrial_labels
  )
  print(lmer_result)
}