linear_re: Main Function for fitting the random effect linear model
In pprof: Modeling, Standardization and Testing for Provider Profiling

linear_re

R Documentation

Description

Fit a random effect linear model via lmer from the lme4 package.

Usage

linear_re(
  formula = NULL,
  data = NULL,
  Y = NULL,
  Z = NULL,
  ID = NULL,
  Y.char = NULL,
  Z.char = NULL,
  ID.char = NULL,
  ...
)

Arguments

`formula`	a two-sided formula object describing the model to be fitted, with the response variable on the left of a `~` operator and the covariates on the right, separated by `+` operators. The random effect of the provider identifier is specified using `(1 | provider)`, where `provider` is the name of the provider identifier variable.
`data`	a data frame containing the variables named in the `formula`, or the columns specified by `Y.char`, `Z.char`, and `ID.char`.
`Y`	a numeric vector representing the response variable.
`Z`	a matrix or data frame representing the covariates, which can include both numeric and categorical variables.
`ID`	a numeric vector representing the provider identifier.
`Y.char`	a character string specifying the column name of the response variable in the `data`.
`Z.char`	a character vector specifying the column names of the covariates in the `data`.
`ID.char`	a character string specifying the column name of the provider identifier in the `data`.
`...`	additional arguments passed to `lmer` for further customization.

Details

This function fits a random effect linear model of the form:

Y_{ij} = \mu + \alpha_i + \mathbf{Z}_{ij}^\top\boldsymbol\beta + \epsilon_{ij}

where Y_{ij} is the continuous outcome for individual j in provider i, \mu is the overall intercept, \alpha_i is the random effect for provider i, \mathbf{Z}_{ij} is the vector of covariates, \boldsymbol\beta is the vector of coefficients for the covariates, and \epsilon_{ij} is the error term.
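To make the notation concrete, the following base R sketch simulates data from a model of this form. The sample sizes, parameter values, and the normal distributions for \alpha_i and \epsilon_{ij} are assumptions chosen for illustration; they are not taken from the package.

```r
set.seed(1)

m   <- 20                          # number of providers (illustrative)
n_i <- 30                          # individuals per provider (illustrative)
ID  <- rep(seq_len(m), each = n_i) # provider identifier for each individual

mu    <- 2                         # overall intercept
beta  <- c(0.5, -1)                # coefficients for the two covariates
alpha <- rnorm(m, sd = 0.8)        # provider random effects alpha_i (assumed normal)

# Covariate matrix Z with one row per individual
Z <- matrix(rnorm(m * n_i * 2), ncol = 2,
            dimnames = list(NULL, c("z1", "z2")))
eps <- rnorm(m * n_i, sd = 1)      # error term epsilon_ij (assumed normal)

# Y_ij = mu + alpha_i + Z_ij' beta + epsilon_ij
Y <- mu + alpha[ID] + drop(Z %*% beta) + eps

sim_data <- data.frame(Y = Y, ID = ID, Z)
head(sim_data)
```

A data set built this way has the same shape as the inputs described under Arguments: a numeric response, a covariate matrix, and a provider identifier.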

The model is fitted by calling the lmer function from the lme4 package. Three input formats are accepted: (1) a formula and dataset, where the formula is of the form response ~ covariates + (1 | provider), with provider representing the provider identifier; (2) a dataset along with the column names of the response, covariates, and provider identifier; or (3) the outcome vector \boldsymbol{Y}, the covariate matrix or data frame \mathbf{Z}, and the provider identifier vector.

In addition to these input formats, further arguments of the lmer function can be supplied via ..., allowing customization of model fitting options such as the optimization method or the convergence criteria. By default, the model is fitted using REML (restricted maximum likelihood).
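As a rough illustration of the options that can be forwarded, the sketch below calls lme4::lmer directly on the sleepstudy data shipped with lme4, once with the default REML fit and once with maximum likelihood and an explicit optimizer. It does not call linear_re itself; the commented line at the end shows the assumed equivalent, based only on the description of ... above.

```r
# Runs only when lme4 is installed
if (requireNamespace("lme4", quietly = TRUE)) {
  sleep <- lme4::sleepstudy

  # Default: restricted maximum likelihood (REML)
  fit_reml <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = sleep)

  # Maximum likelihood with an explicitly chosen optimizer
  fit_ml <- lme4::lmer(
    Reaction ~ Days + (1 | Subject),
    data    = sleep,
    REML    = FALSE,
    control = lme4::lmerControl(optimizer = "bobyqa")
  )

  reml_flags <- c(default = lme4::isREML(fit_reml),
                  ml      = lme4::isREML(fit_ml))
  print(reml_flags)
}

# Assumed equivalent with linear_re (not run here):
# fit_ml <- linear_re(formula, data, REML = FALSE)
```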

If issues arise during model fitting, consider using the data_check function to perform a data quality check, which can help identify missing values, low variation in covariates, high pairwise correlation, and multicollinearity. For datasets with missing values, linear_re automatically removes observations (rows) with any missing values before fitting the model.
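The row removal described above corresponds to a complete-case filter. The base R sketch below shows what such a filter does on a small made-up data frame; it is an illustration of the idea, not the package's internal code, and the data frame toy_na is hypothetical.

```r
# Hypothetical data with missing values in the response and in a covariate
toy_na <- data.frame(
  Y  = c(1.2, NA, 3.1, 2.4, 0.7),
  z1 = c(0.5, 1.1, NA, 0.3, 0.9),
  ID = c(1, 1, 2, 2, 3)
)

# TRUE for rows with no missing value in any column
keep <- complete.cases(toy_na)

# Keep only the complete rows
toy_complete <- toy_na[keep, ]

n_dropped <- sum(!keep)  # number of rows removed
toy_complete
```

Checking how many rows would be dropped before fitting can help judge whether the remaining sample is still adequate for each provider.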

Value

A list of objects with S3 class "random_re":

`coefficient`	a list containing the estimated coefficients: `FE`, the fixed effects for each predictor and the intercept, and `RE`, the random effects for each provider.
`variance`	a list containing the variance estimates: `FE`, the variance-covariance matrix of the fixed effect coefficients, and `RE`, the variance of the random effects.
`sigma`	the residual standard error.
`fitted`	the fitted value for each individual.
`observation`	the original response for each individual.
`residuals`	the residual for each individual, that is, the response minus the fitted value.
`linear_pred`	the linear predictor for each individual.
`data_include`	the data used to fit the model, sorted by the provider identifier. For categorical covariates, this includes the dummy variables created for all categories except the reference level.
`char_list`	a list of the character vectors representing the column names for the response variable, covariates, and provider identifier. For categorical variables, the names reflect the dummy variables created for each category.
`Loglkd`	the log-likelihood.
`AIC`	the Akaike information criterion.
`BIC`	the Bayesian information criterion.
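The `data_include` and `char_list` entries refer to dummy variables created for categorical covariates. The base R sketch below shows how treatment-coded dummy columns arise for a small hypothetical data frame, with the first factor level acting as the reference. The exact column names produced by linear_re may differ; this only illustrates the general encoding.

```r
# Hypothetical covariates: one numeric, two categorical
toy_cat <- data.frame(
  age   = c(61, 70, 55, 48),
  sex   = factor(c("F", "M", "M", "F")),
  stage = factor(c("I", "II", "III", "I"))
)

# model.matrix expands factors into dummy columns for every level
# except the reference level; drop the intercept column afterwards
dummies <- model.matrix(~ age + sex + stage, data = toy_cat)[, -1, drop = FALSE]

colnames(dummies)
```

Here `sex` contributes one column and `stage` contributes two, because each factor's first level ("F" and "I") is absorbed as the reference.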

References

Bates D, Maechler M, Bolker B, Walker S (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48.

Examples

library(pprof)

# Load the example data: response Y, covariates Z, provider identifier ID
data(ExampleDataLinear)
outcome <- ExampleDataLinear$Y
covar <- ExampleDataLinear$Z
ID <- ExampleDataLinear$ID

# Combine into one data frame and record the column names
data <- data.frame(outcome, ID, covar)
covar.char <- colnames(covar)
outcome.char <- colnames(data)[1]
ID.char <- colnames(data)[2]

# Build the formula: outcome ~ covariates + (1 | ID)
formula <- as.formula(paste("outcome ~", paste(covar.char, collapse = " + "), "+ (1|ID)"))

# Fit random effect linear model using three input formats
fit_re1 <- linear_re(Y = outcome, Z = covar, ID = ID)
fit_re2 <- linear_re(data = data, Y.char = outcome.char, Z.char = covar.char, ID.char = ID.char)
fit_re3 <- linear_re(formula, data)

pprof documentation built on April 12, 2025, 1:33 a.m.