#' @title Individual Parameter Contribution Regression
#' @description Explain and predict differences in model parameters with individual
#' parameter contribution (IPC) regression. IPC regression allows studying heterogeneity
#' in parameter estimates as a linear function of covariates. \code{ipcr} was
#' mainly written for investigating parameter heterogeneity in structural equation models
#' fitted with \pkg{lavaan} or \pkg{OpenMx} but also for models estimated with
#' \code{\link[stats]{lm}} and \code{\link[stats]{glm}}.
#' @param fit a fitted model object.
#' @param covariates a vector, matrix, or data.frame with one or more covariates used to
#' predict differences in the model parameters. Interaction and polynomial terms can be
#' included as new variables which may require centering.
#' @param iterate a logical value; if TRUE iterated IPC regression is performed.
#' Currently, iterated IPC regression is only available for models fitted with
#' \pkg{lavaan} or \pkg{OpenMx}.
#' @param iteration_info a logical value; if TRUE the parameter values for each iteration
#' with corresponding log-likelihood value are stored in a matrix. Requesting this matrix
#' increases the runtime.
#' @param conv an integer used as a stopping criterion for iterated IPC regression. The
#' criterion is the largest difference in any parameter estimate between iterations.
#' @param max_it the maximum number of iterations for iterated IPC regressions.
#' @param regularization a logical value; if TRUE regularized linear regression models are
#' fitted via penalized maximum likelihood using k-fold cross-validation.
#' @param s a character."lambda.min" (default) gives the minimum mean cross-validated
#' error. The other option is lambda.1se", which gives the most regularized model such
#' that the error is within one standard error of the minimum. For regularized IPC
#' regression only.
#' @param alpha The elastic net mixing parameter with \eqn{0 \le \alpha \le 1}. alpha = 1
#' is the lasso penalty (default) and alpha = 0 the rigde penalty. For regularized IPC
#' regression only.
#' @param weights observation weights for regularization. Can be total counts if responses are
#' proportion matrices. Default is 1 for each observation. For regularized IPC regression
#' only.
#' @param nlambda the number of penalty terms. The default is 100. For regularized IPC
#' regression only.
#' @param standardize a logical value; if TRUE variables are standardized prior to
#' regularization. This only affects regularization; standard/iterated IPC regression
#' coefficients are not standardized.
#' @param nfolds number of folds - default is 10. Although nfolds can be as large as the
#' sample size (leave-one-out cross-validation), it is not recommended for large datasets.
#' Smallest value allowable is 3. For regularized IPC regression only.
#' @param linear_MxModel a logical value indicating if the structural equation model
#' contains non-linear functions of model parameters (FALSE) or not (TRUE, default). TRUE
#' speeds up the runtime of the linear model. Only relevant for models fitted with
#' \pkg{OpenMx}.
#' @details
#' IPCs are rough approximations of individual-specific parameter values. The IPCs of
#' individual \eqn{i} are defined as
#' \deqn{IPC_i=\theta+A(\theta)^{-1}S(\theta,y_i),}
#' where \eqn{\theta} are the estimated model parameters, \eqn{S(\theta,y_i)} is the
#' estimating function (e.g, the first derivative of the log-likelihood), and \eqn{A(\theta)}
#' is the expectation of the negative derivative of the estimating function (often called
#' Hessian matrix). By regressing IPCs on covariates, parameter differences can be
#' predicted.
#'
#' IPCs are often slightly biased. This bias can be removed with a procedure termed
#' iterated IPC regression, which re-calculates the IPCs until the regression coefficients
#' of the IPC regression models converge. Iterated IPCs often show larger variability
#' than standard IPCs. \code{iterate=TRUE} performs iterated IPC regression.
#' \code{regularization=TRUE} adds another list with regularized linear models to the
#' \code{ipcr} object. All arguments related to regularization are passed to
#' \code{\link[glmnet]{cv.glmnet}}. If requested, iterated IPCs will be used for the
#' regularized IPC regression models.
#'
#' @return \code{ipcr} returns an object of \code{\link[base]{class}} "\code{ipcr}". An
#' \code{ipcr} function call returns a list which may consist of the following elements:
#' \tabular{ll}{
#' \code{info} \tab a list with information about the \code{ipcr} function call \cr
#' \code{IPCs} \tab a \code{data.frame} with IPCs \cr
#' \code{regression_list} \tab a list with an (iterated) IPC regression model for each
#' model parameter \cr
#' \code{regularized_regression_list} \tab a list with a regularized (iterated) IPC
#' regression model for each model parameter \cr
#' \code{output} \tab formatted output that can be examined with \code{print} and
#' \code{summary}
#' }
#'
#' The function \code{summary} prints a summary of the IPC regression equations.
#' \code{print} shows the arguments specified in the \code{ipcr} function call.
#' \code{plot} visualizes the correlation between IPCs and covariates in the form of a
#' heatmap.
#'
#' The generic functions \code{AIC}, \code{BIC}, \code{coef}, \code{confint},
#' \code{effects}, \code{fitted}, \code{logLik}, \code{nobs}, \code{predict},
#' \code{residuals}, \code{sigma}, and \code{vcvo} extract various information from the
#' value returned by \code{ipcr}. Heteroskedastic robust IPC regression can be performed
#' with the functions \code{{coeftest}} and \code{coefci} from \pkg{lmtest}. By default,
#' these functions extract information for all model parameters. By specifying an
#' additional \code{parameter} argument with the names of one or several of the model
#' parameters as in \code{coef}, the return values can be limited to the corresponding
#' model parameters.
#'
#' @examples
#' # Structural equation model example using the lavaan package
#'
#' ## Load Holzinger and Swineford (1939) data provided by the lavaan package
#' HS_data <- lavaan::HolzingerSwineford1939
#'
#' ## Remove observations with missing values
#' HS_data <- HS_data[stats::complete.cases(HS_data), ]
#'
#' ## lavaan model syntac for a single group model
#' m <- 'visual =~ x1 + x2 + x3
#' textual =~ x4 + x5 + x6
#' speed =~ x7 + x8 + x9'
#'
#' ## Fit the model
#' fit <- lavaan::cfa(model = m, data = HS_data)
#'
#' ## Prepare a data.frame with covariates
#' covariates <- HS_data[, c("sex", "ageyr", "agemo", "school", "grade")]
#'
#' ## Regress parameters on covariates with the ipcr function
#' res <- ipcr(fit = fit, covariates = covariates)
#'
#' ## Plot heatmap with the correlation between parameters and predictors
#' plot(res)
#'
#' ## Show results (standard IPC regression)
#' summary(res)
#'
#' ## IPC regression with LASSO regularization
#' res_reg <- ipcr(fit = fit, covariates = covariates, regularization = TRUE)
#'
#' ## Show results (regularized standard IPC regression)
#' summary(res_reg)
#' @references
#' Arnold, M., Oberski, D. L., Brandmaier, A. M., & Voelkle, M. C. (2019). Identifying
#' heterogeneity in dynamic panel models with individual parameter contribution
#' regression. \emph{Structural Equation Modeling, 27}, 613-628. doi:
#' \href{https://doi.org/10.1080/10705511.2019.1667240}{10.1080/10705511.2019.1667240}
#' @seealso \code{\link{get_ipcs}}, \code{\link{plot.ipcr}}, and
#' \code{\link{summary.ipcr}}
#' @export
ipcr <- function(fit, covariates = NULL, iterate = FALSE, iteration_info = FALSE,
conv = 0.0001, max_it = 50, regularization = FALSE, s = "lambda.min",
alpha = 1, weights = NULL, nlambda = 100, standardize = TRUE,
nfolds = 10, linear_MxModel = TRUE) {
# Checks --------
## Get model data
#model_data <- get_data(fit)
# This needs to be checked and enabled
#check_ipcr_arguments(fit = fit, iterate = iterate, conv = conv,
# max_it = max_it, linear = linear, model_data = model_data)
## Check covariates and transform into data.frame
### Covariate status:
### 0: everything is fine (required for iterated IPCR)
### 1: missing data in covariates (run standard IPCR with complete observations)
### 2: no covariates found (only compute IPCs)
if (is.null(covariates)) {
covariate_status <- 2
info_covariates <- "no covariates found"
} else {
covariates <- as.data.frame(covariates)
if (all(stats::complete.cases(covariates))) {
covariate_status <- 0
} else {
covariate_status <- 1
warning("Missing data in covariates detected. Standard IPC regression is performed using complete observations (rows).")
}
if (is.null(names(covariates)) | any(is.na(names(covariates)))) {
warning("Some covariate are not named. Renaming all covariates using the order of the data.frame.")
colnames(covariates) <- paste0("covariate", seq_len(NCOL(covariates)))
}
info_covariates <- names(covariates)
}
## Check for definition variables in a OpenMx model
### iterated IPCR is not yet implemented for OpenMx models with definition variables
### Gives a warning and switches iterated from TRUE to FALSE
if (class(fit) == "MxRAMModel") {
if (OpenMx::imxHasDefinitionVariable(fit) & iterate) {
warning("Iterated IPC regression is not available for OpenMx models with definition variables. Standard IPC regression is carried out instead of iterated IPC regression")
iterated <- FALSE
}
}
# Storing object for output --------
## Model parameters
param_estimates <- coef_ipcr(fit)
q <- length(param_estimates)
param_names <- names(param_estimates)
## ipcr object
IPC <- list("info" = list(name = deparse(substitute(fit)),
class = class(fit)[[1]],
parameters = param_names,
covariates = info_covariates,
iterate = iterate,
iterated_status = NULL,
iteration_info = iteration_info,
conv = conv,
max_it = max_it,
regularization = regularization,
s = s,
alpha = alpha,
weights = weights,
nlambda = nlambda,
standardize = standardize,
nfolds = nfolds,
linear_MxModel = linear_MxModel))
# Iterated IPC regression --------
if (iterate) {
if(covariate_status != 0) {
stop("Error: Iterated IPC regression requires complete covariates.")
}
IPC <- iterated_ipcr(fit, IPC = IPC, covariates = covariates,
iteration_info = iteration_info, conv = conv, max_it = max_it,
linear_MxModel = linear_MxModel)
} else { ### Begin: perform standard IPC regression ###
# Standard IPC regression --------
## Information from fitted object
n <- nobs(fit)
q <- length(param_estimates)
## Compute score
scores <- estfun_ipcr(fit)
bread_matrix <- bread_ipcr(fit)
IPCs <- data.frame(matrix(param_estimates, nrow = n, ncol = q, byrow = TRUE) +
scores %*% t(bread_matrix))
colnames(IPCs) <- param_names
IPC$IPCs <- IPCs
## Perform IPC regression
if (covariate_status %in% c(0, 1)) {
ipcr_data <- cbind(IPCs, covariates)
param_names_ipcr <- paste0("IPCs_", gsub("\\(|\\)|\\]|\\[", "", param_names))
param_names_ipcr <- gsub(pattern = "~~", replacement = ".WITH.", x = param_names_ipcr)
param_names_ipcr <- gsub(pattern = "=~", replacement = ".BY.", x = param_names_ipcr)
param_names_ipcr <- gsub(pattern = "~", replacement = ".ON.", x = param_names_ipcr)
param_names_ipcr <- gsub(pattern = ",", replacement = ".", x = param_names_ipcr)
colnames(ipcr_data)[seq_len(q)] <- param_names_ipcr
IV <- paste(colnames(covariates), collapse = " + ")
ipcr_list <- lapply(param_names_ipcr, FUN = function(x) {
do.call(what = "lm",
args = list(formula = paste(x, "~", IV), data = as.name("ipcr_data")))
})
names(ipcr_list) <- param_names
IPC$regression_list <- ipcr_list
}
} # End: perform standard IPC regression
# Regularization --------
if (regularization) {
# Re-code characters and factors into dummy variables
if (any(unlist(lapply(covariates, function(x) {is.character(x) | is.factor(x)})))) {
covariates_strings <- Filter(function(x) {is.character(x) | is.factor(x)}, covariates)
string_formula <- paste(colnames(covariates_strings), collapse = "+")
dummies <- stats::model.matrix(stats::formula(paste("~", string_formula)), data = covariates_strings)[, -1]
covariates <- covariates[, !unlist(lapply(covariates, function(x) {is.character(x) | is.factor(x)}))]
covariates <- cbind(covariates, dummies)
}
IPC$regularized_regression_list <- lapply(IPC$IPCs, FUN = function(y) {
glmnet::cv.glmnet(x = as.matrix(covariates), y = y, alpha = alpha,
weights = weights, nlambda = nlambda, standardize = standardize,
nfolds = nfolds)})
}
# Prepare output --------
IPC$output$info <- print_info(IPC$info)
if (is.list(IPC$regression_list)) {
IPC$output$coefficients_matrix <- coefficients_matrix(IPC)
}
if (is.list(IPC$regularized_regression_list)) {
IPC$output$regularized_coefficients_matrix <- regularized_coefficents_matrix(IPC)
}
class(IPC) <- "ipcr"
return(IPC)
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.