Nothing
#' Within-Group and Between-Group Correlation Matrix
#'
#' This function is a wrapper function for computing the within-group and
#' between-group correlation matrix by calling the \code{sem} function in the
#' R package \pkg{lavaan} and provides standard errors, z test statistics, and
#' significance values (\emph{p}-values) for testing the hypothesis
#' H0: \eqn{\rho} = 0 for all pairs of variables within and between groups.
#'
#' The specification of the within-group and between-group variables is in line
#' with the syntax in Mplus. That is, the \code{within} argument is used to
#' identify the variables in the matrix or data frame specified in \code{x} that
#' are measured on the individual level and modeled only on the within level.
#' They are specified to have no variance in the between part of the model. The
#' \code{between} argument is used to identify the variables in the matrix or
#' data frame specified in \code{x} that are measured on the cluster level and
#' modeled only on the between level. Variables not mentioned in the arguments
#' \code{within} or \code{between} are measured on the individual level and will
#' be modeled on both the within and between level.
#'
#' The function uses maximum likelihood estimation with conventional standard
#' errors (\code{estimator = "ML"}) which are not robust against non-normality
#' and full information maximum likelihood (FIML) method (\code{missing = "fiml"})
#' to deal with missing data by default. FIML method cannot be used when
#' within-group variables have no variance within some clusters. In this cases,
#' the function
#' will switch to listwise deletion. Note that the current lavaan version 0.6-11
#' supports FIML method only for maximum likelihood estimation with conventional
#' standard errors (\code{estimator = "ML"}) in multilevel models. Maximum
#' likelihood estimation with Huber-White robust standard errors
#' (\code{estimator = "MLR"}) uses listwise deletion to deal with missing data.
#' When using FIML method there might be issues in model convergence, which might
#' be resolved by switching to listwise deletion (\code{missing = "listwise"}).
#'
#' The lavaan package uses a quasi-Newton optimization method (\code{"nlminb"})
#' by default. If the optimizer does not converge, model estimation will switch
#' to the Expectation Maximization (EM) algorithm.
#'
#' Statistically significant correlation coefficients can be shown in boldface
#' on the console when specifying \code{sig = TRUE}. However, this option is not
#' supported when using R Markdown, i.e., the argument \code{sig} will switch to
#' \code{FALSE}.
#'
#' Adjustment method for multiple testing when specifying the argument \code{p.adj}
#' is applied to the within-group and between-group correlation matrix separately.
#'
#' @param ... a matrix or data frame. Alternatively, an expression
#' indicating the variable names in \code{data} e.g.,
#' \code{multilevel.cor(x1, x2, x3, data = dat)}. Note that
#' the operators \code{.}, \code{+}, \code{-}, \code{~},
#' \code{:}, \code{::}, and \code{!} can also be used to
#' select variables, see 'Details' in the \code{\link{df.subset}}
#' function.
#' @param data a data frame when specifying one or more variables in the
#' argument \code{...}. Note that the argument is \code{NULL}
#' when specifying a matrix or data frameme for the argument
#' \code{...}.
#' @param cluster either a character string indicating the variable name of
#' the cluster variable in \code{...} or \code{data}, or a
#' vector representing the nested grouping structure (i.e.,
#' group or cluster variable).
#' @param within a character vector representing variables that are measured
#' on the within level and modeled only on the within level.
#' Variables not mentioned in \code{within} or \code{between}
#' are measured on the within level and will be modeled on both
#' the within and between level.
#' @param between a character vector representing variables that are measured
#' on the between level and modeled only on the between level.
#' Variables not mentioned in \code{within} or \code{between}
#' are measured on the within level and will be modeled on
#' both the within and between level.
#' @param estimator a character string indicating the estimator to be used:
#' \code{"ML"} (default) for maximum likelihood with
#' conventional standard errors and \code{"MLR"} for maximum
#' likelihood with Huber-White robust standard errors. Note
#' that by default, full information maximum likelihood (FIML)
#' method is used to deal with missing data when using
#' \code{"ML"} (\code{missing = "fiml"}), whereas incomplete
#' cases are removed listwise (i.e., \code{missing = "listwise"})
#' when using \code{"MLR"}.
#' @param optim.method a character string indicating the optimizer, i.e., \code{nlminb}
#' (default) for the unconstrained and bounds-constrained
#' quasi-Newton method optimizer and \code{"em"} for the
#' Expectation Maximization (EM) algorithm.
#' @param missing a character string indicating how to deal with missing
#' data, i.e., \code{"listwise"} for listwise deletion or
#' \code{"fiml"} (default) for full information maximum
#' likelihood (FIML) method. Note that FIML method is only
#' available when \code{estimator = "ML"}. Note that it takes
#' longer to estimate the model when using FIML and using FIML
#' might cause issues in model convergence, these issues might
#' be resolved by switching to listwise deletion.
#' @param sig logical: if \code{TRUE}, statistically significant
#' correlation coefficients are shown in boldface on the
#' console.
#' @param alpha a numeric value between 0 and 1 indicating the significance
#' level at which correlation coefficients are printed
#' boldface when \code{sig = TRUE}.
#' @param print a character string or character vector indicating which
#' results to show on the console, i.e. \code{"all"} for all
#' results, \code{"cor"} for correlation coefficients,
#' \code{"se"} for standard errors, \code{"stat"} for z test
#' statistics, and \code{"p"} for \emph{p}-values.
#' @param split logical: if \code{TRUE}, output table is split in
#' within-group and between-group correlation matrix.
#' @param order logical: if \code{TRUE}, variables in the output table are
#' ordered, so that variables specified in the argument
#' \code{between} are shown first.
#' @param tri a character string indicating which triangular of the
#' matrix to show on the console when \code{split = TRUE},
#' i.e., \code{both} for upper and \code{upper} for the upper
#' triangular.
#' @param tri.lower logical: if \code{TRUE} (default) and \code{split = FALSE}
#' (default), within-group correlations are shown in the lower
#' triangular and between-group correlation are shown in the
#' upper triangular.
#' @param p.adj a character string indicating an adjustment method for
#' multiple testing based on \code{\link{p.adjust}}, i.e.,
#' \code{none} (default), \code{bonferroni}, \code{holm},
#' \code{hochberg}, \code{hommel}, \code{BH}, \code{BY}, or
#' \code{fdr}.
#' @param digits an integer value indicating the number of decimal places
#' to be used for displaying correlation coefficients.
#' @param p.digits an integer value indicating the number of decimal places
#' to be used for displaying \emph{p}-values.
#' @param as.na a numeric vector indicating user-defined missing values,
#' i.e. these values are converted to \code{NA} before
#' conducting the analysis. Note that \code{as.na()} function
#' is only applied to \code{x} but not to \code{cluster}.
#' @param write a character string naming a file for writing the output into
#' either a text file with file extension \code{".txt"} (e.g.,
#' \code{"Output.txt"}) or Excel file with file extension
#' \code{".xlsx"} (e.g., \code{"Output.xlsx"}). If the file
#' name does not contain any file extension, an Excel file will
#' be written.
#' @param append logical: if \code{TRUE} (default), output will be appended
#' to an existing text file with extension \code{.txt} specified
#' in \code{write}, if \code{FALSE} existing text file will be
#' overwritten.
#' @param check logical: if \code{TRUE} (default), argument specification,
#' convergence and model identification is checked.
#' @param output logical: if \code{TRUE} (default), output is shown on the
#' console.
#'
#' @author
#' Takuya Yanagida \email{takuya.yanagida@@univie.ac.at}
#'
#' @seealso
#' \code{\link{multilevel.descript}}, \code{\link{multilevel.icc}},
#' \code{\link{multilevel.cfa}}, \code{\link{cluster.scores}},
#' \code{\link{write.result}}
#'
#' @references
#' Hox, J., Moerbeek, M., & van de Schoot, R. (2018). \emph{Multilevel analysis:
#' Techniques and applications} (3rd. ed.). Routledge.
#'
#' Snijders, T. A. B., & Bosker, R. J. (2012). \emph{Multilevel analysis: An
#' introduction to basic and advanced multilevel modeling} (2nd ed.). Sage
#' Publishers.
#'
#' @return
#' Returns an object of class \code{misty.object}, which is a list with following
#' entries:
#' \tabular{ll}{
#' \code{call} \tab function call \cr
#' \code{type} \tab type of analysis \cr
#' \code{data} \tab data frame specified in \code{x} including the group variable
#' specified in \code{cluster} \cr
#' \code{args} \tab specification of function arguments \cr
#' \code{model} \tab specified model \cr
#' \code{model.fit} \tab fitted lavaan object (\code{model.fit}) \cr
#' \code{check} \tab results of the convergence and model identification check \cr
#' \code{result} \tab list with result tables \cr
#' }
#'
#' @note
#' The function uses the functions \code{sem}, \code{lavInspect},
#' \code{lavMatrixRepresentation}, \code{lavTech}, \code{parameterEstimates},
#' and \code{standardizedsolution} provided in the R package \pkg{lavaan} by
#' Yves Rosseel (2012).
#'
#' @export
#'
#' @examples
#' \dontrun{
#' # Load data set "Demo.twolevel" in the lavaan package
#' data("Demo.twolevel", package = "lavaan")
#'
#' #----------------------------------------------------------------------------
#' # Cluster variable specification
#'
#' # Example 1a: Cluster variable 'cluster' in 'x'
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3", "cluster")], cluster = "cluster")
#'
#' # Example 1b: Cluster variable 'cluster' not in 'x'
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")], cluster = Demo.twolevel$cluster)
#'
#' # Example 1c: Alternative specification using the 'data' argument
#' multilevel.cor(x1:x3, data = Demo.twolevel, cluster = "cluster")
#'
#' #----------------------------------------------------------------------------
#' # Example 2: All variables modeled on both the within and between level
#' # Highlight statistically significant result at alpha = 0.05
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")], sig = TRUE,
#' cluster = Demo.twolevel$cluster)
#'
#' # Example 3: Split output table in within-group and between-group correlation matrix.
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
#' cluster = Demo.twolevel$cluster, split = TRUE)
#'
#' # Example 4: Print correlation coefficients, standard errors, z test statistics,
#' # and p-values
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
#' cluster = Demo.twolevel$cluster, print = "all")
#'
#' # Example 5: Print correlation coefficients and p-values
#' # significance values with Bonferroni correction
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
#' cluster = Demo.twolevel$cluster, print = c("cor", "p"),
#' p.adj = "bonferroni")
#'
#' #----------------------------------------------------------------------------
#' # Example 6: Variables "y1", "y2", and "y2" modeled on both the within and between level
#' # Variables "w1" and "w2" modeled on the cluster level
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3", "w1", "w2")],
#' cluster = Demo.twolevel$cluster,
#' between = c("w1", "w2"))
#'
#' # Example 7: Show variables specified in the argument 'between' first
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3", "w1", "w2")],
#' cluster = Demo.twolevel$cluster,
#' between = c("w1", "w2"), order = TRUE)
#'
#' #----------------------------------------------------------------------------
#' # Example 8: Variables "y1", "y2", and "y2" modeled only on the within level
#' # Variables "w1" and "w2" modeled on the cluster level
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3", "w1", "w2")],
#' cluster = Demo.twolevel$cluster,
#' within = c("y1", "y2", "y3"), between = c("w1", "w2"))
#'
#' #----------------------------------------------------------------------------
#' # Example 9: lavaan model and summary of the multilevel model used to compute the
#' # within-group and between-group correlation matrix
#'
#' mod <- multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
#' cluster = Demo.twolevel$cluster, output = FALSE)
#'
#' # lavaan model syntax
#' mod$model
#'
#' # Fitted lavaan object
#' lavaan::summary(mod$model.fit, standardized = TRUE)
#'
#' #----------------------------------------------------------------------------
#' # Write Results
#'
#' # Example 10a: Write Results into a text file
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
#' cluster = Demo.twolevel$cluster,
#' write = "Multilevel_Correlation.txt")
#'
#' # Example 10b: Write Results into a Excel file
#' multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
#' cluster = Demo.twolevel$cluster,
#' write = "Multilevel_Correlation.xlsx")
#'
#' result <- multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
#' cluster = Demo.twolevel$cluster, output = FALSE)
#' write.result(result, "Multilevel_Correlation.xlsx")
#' }
multilevel.cor <- function(..., data = NULL, cluster, within = NULL, between = NULL,
estimator = c("ML", "MLR"), optim.method = c("nlminb", "em"),
missing = c("listwise", "fiml"), sig = FALSE, alpha = 0.05,
print = c("all", "cor", "se", "stat", "p"), split = FALSE,
order = FALSE, tri = c("both", "lower", "upper"), tri.lower = TRUE,
p.adj = c("none", "bonferroni", "holm", "hochberg", "hommel", "BH", "BY", "fdr"),
digits = 2, p.digits = 3, as.na = NULL, write = NULL,
append = TRUE, check = TRUE, output = TRUE) {
#_____________________________________________________________________________
#
# Initial Check --------------------------------------------------------------
# Check if input '...' is missing
if (isTRUE(missing(...))) { stop("Please specify the argument '...'.", call. = FALSE) }
# Check if input '...' is NULL
if (isTRUE(is.null(substitute(...)))) { stop("Input specified for the argument '...' is NULL.", call. = FALSE) }
# Check if input 'data' is data frame
if (isTRUE(!is.null(data) && !is.data.frame(data))) { stop("Please specify a data frame for the argument 'data'.", call. = FALSE) }
# Check input 'cluster'
if (isTRUE(missing(cluster))) { stop("Please specify a variable name or vector representing the grouping structure for the argument 'cluster'.", call. = FALSE) }
# Check if input 'cluster' is NULL
if (isTRUE(is.null(cluster))) { stop("Input specified for the argument 'cluster' is NULL.", call. = FALSE) }
#_____________________________________________________________________________
#
# Data -----------------------------------------------------------------------
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Data using the argument 'data' ####
if (isTRUE(!is.null(data))) {
# Variable names
var.names <- .var.names(..., data = data, cluster = cluster, check.chr = "a matrix or data frame")
# Extract data
x <- data[, var.names]
# Cluster variable
cluster <- data[, cluster]
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Data without using the argument 'data' ####
} else {
# Extract data
x <- eval(..., enclos = parent.frame())
# Data and cluster
var.group <- .var.group(data = x, cluster = cluster)
# Data
if (isTRUE(!is.null(var.group$data))) { x <- var.group$data }
# Cluster variable
if (isTRUE(!is.null(var.group$cluster))) { cluster <- var.group$cluster }
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Within- and Between-Group Variables ####
# Within variables in 'x'
within.miss <- !within %in% colnames(x)
if (isTRUE(any(within.miss))) { stop(paste0(ifelse(length(which(within.miss)) == 1L, "Variable ", "Variables "), "specified in the argument 'within' ", ifelse(length(which(within.miss)) == 1L, "was ", "were "), "not found in 'x': ", within[which(within.miss)], collapse = ", "), call. = FALSE) } # Within variables
if (isTRUE(is.null(between))) { var.with <- colnames(x) } else { var.with <- colnames(x)[!colnames(x) %in% between] }
# At least two within variables
if (isTRUE(length(var.with) <= 1L)) { stop("Please specify at least two within-group variables.", call. = FALSE) }
# Between variables in 'x'
between.miss <- !between %in% colnames(x)
if (isTRUE(any(between.miss))) { stop(paste0(ifelse(length(which(between.miss)) == 1L, "Variable ", "Variables "), "specified in the argument 'between' ", ifelse(length(which(between.miss)) == 1L, "was ", "were "), "not found in 'x': ", within[which(between.miss)], collapse = ", "), call. = FALSE) }
# Variance within clusters
x.check <- vapply(x[, between, drop = FALSE], function(y) any(tapply(y, cluster, var, na.rm = TRUE) != 0L), FUN.VALUE = logical(1L))
if (isTRUE(any(x.check))) { warning(paste0("Following between-group ", ifelse(length(which(x.check)) == 1L, "variable has ", "variables have "), "variance within clusters: ", paste(names(which(x.check)), collapse = ", ")), call. = FALSE) }
# Between variables
if (isTRUE(is.null(within))) { var.betw <- colnames(x) } else { var.betw <- colnames(x)[!colnames(x) %in% within] }
# At least two between variables
if (isTRUE(length(var.betw) <= 1L)) { stop("Please specify at least two between-group variables.", call. = FALSE) }
# Variables in 'within' or 'between'
wb.inter <- intersect(within, between)
if (isTRUE(length(wb.inter) > 0L)) { warning(paste0("Following ", ifelse(length(wb.inter) == 1L, "variable is ", "variables are "), "specified in both arguments 'within' and 'between': ", paste(wb.inter, collapse = ", ")), call. = FALSE) }
# Unique variables at Within and Between
var <- unique(c(var.with, var.betw))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Data frame with Cluster Variable ####
x <- data.frame(x[, var], .cluster = cluster, stringsAsFactors = FALSE)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Convert user-missing values into NA ####
if (isTRUE(!is.null(as.na))) { x[, var] <- .as.na(x[, var], na = as.na) }
#_____________________________________________________________________________
#
# Input Check ----------------------------------------------------------------
# Check input 'check'
if (isTRUE(!is.logical(check))) { stop("Please specify TRUE or FALSE for the argument 'check'.", call. = FALSE) }
if (isTRUE(check)) {
if (isTRUE(!nzchar(system.file(package = "lavaan")))) { stop("Package \"lavaan\" is needed for this function, please install the package.", call. = FALSE) }
# Check input 'x': Zero variance?
x.check <- vapply(x, function(y) length(na.omit(unique(y))) == 1L, FUN.VALUE = logical(1L))
if (isTRUE(any(x.check))) { warning(paste0("Following ", ifelse(length(which(x.check)) == 1L, "variable ", "variables "), "in the matrix or data frame specified in 'x' ", ifelse(length(which(x.check)) == 1L, "has ", "have "), "zero variance: ", paste(names(which(x.check)), collapse = ", ")), call. = FALSE) }
# Check input 'within'
x.check <- vapply(x[, var.with, drop = FALSE], function(y) all(tapply(y, x$.cluster, var, na.rm = TRUE) == 0L), FUN.VALUE = logical(1L))
if (isTRUE(any(x.check))) { warning(paste0("Following within-group ", ifelse(length(which(x.check)) == 1L, "variable has ", "variables have "), "zero variance within all clusters: ", paste(names(which(x.check)), collapse = ", ")), call. = FALSE) }
# Check input 'estimator'
if (isTRUE(any(!estimator %in% c("ML", "MLR")))) { stop("Character string in the argument 'estimator' does not match with \"ML\" or \"MLR\".", call. = FALSE) }
# Check input 'optim.method'
if (isTRUE(!all(optim.method %in% c("nlminb", "em")))) { stop("Character string in the argument 'optim.method' does not match with \"nlminb\" or \"em\".", call. = FALSE) }
# Check input 'missing'
if (isTRUE(any(!missing %in% c("listwise", "fiml")))) { stop("Character string in the argument 'estimator' does not match with \"listwise\" or \"fiml\".", call. = FALSE) }
# Check input 'sig'
if (isTRUE(!is.logical(sig))) { stop("Please specify TRUE or FALSE for the argument 'sig'.", call. = FALSE) }
# Check input 'alpha'
if (isTRUE(alpha >= 1L || alpha <= 0L)) { stop("Please specify a number between 0 and 1 for the argument 'alpha'.", call. = FALSE) }
# Check input 'print'
if (isTRUE(any(!print %in% c("all", "cor", "se", "stat", "p")))) { stop("Character string(s) in the argument 'print' does not match with \"all\", \"cor\", \"se\", \"stat\", or \"p\".", call. = FALSE) }
# Check input 'split'
if (isTRUE(!is.logical(split))) { stop("Please specify TRUE or FALSE for the argument 'split'.", call. = FALSE) }
# Check input 'order'
if (isTRUE(!is.logical(order))) { stop("Please specify TRUE or FALSE for the argument 'order'.", call. = FALSE) }
# Check input 'tri'
if (isTRUE(any(!tri %in% c("both", "lower", "upper")))) { stop("Character string in the argument 'tri' does not match with \"both\", \"lower\", or \"upper\".", call. = FALSE) }
# Check input 'tri.lower'
if (isTRUE(!is.logical(tri.lower))) { stop("Please specify TRUE or FALSE for the argument 'tri.lower'.", call. = FALSE) }
# Check input 'p.adj'
if (isTRUE(any(!p.adj %in% c("none", "holm", "bonferroni", "hochberg", "hommel", "BH", "BY", "fdr")))) { stop("Character string in the argument 'p.adj' does not match with \"none\", \"bonferroni\", \"holm\", \"hochberg\", \"hommel\", \"BH\", \"BY\", or \"fdr\".", call. = FALSE) }
# Check input 'digits'
if (isTRUE(digits %% 1L != 0L || digits < 0L)) { stop("Please specify a positive integer number for the argument 'digits'.", call. = FALSE) }
# Check input 'p.digits'
if (isTRUE(p.digits %% 1L != 0L || p.digits < 0L)) { stop("Please specify a positive integer number for the argument 'p.digits'.", call. = FALSE) }
# Check input 'append'
if (isTRUE(!is.logical(append))) { stop("Please specify TRUE or FALSE for the argument 'append'.", call. = FALSE) }
# Check input 'output'
if (isTRUE(!is.logical(output))) { stop("Please specify TRUE or FALSE for the argument 'output'.", call. = FALSE) }
}
#_____________________________________________________________________________
#
# Arguments ------------------------------------------------------------------
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Estimator ####
estimator <- ifelse(all(c("ML", "MLR") %in% estimator), "ML", estimator)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Optimizer ####
if (isTRUE(all(c("nlminb", "em") %in% optim.method))) { optim.method <- "nlminb" }
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Missing data ####
#...................
### Missing values present ####
if (isTRUE(any(is.na(x)))) {
complete <- FALSE
# ML estimation with FIML
if (isTRUE(estimator == "ML")) {
missing <- ifelse(all(c("listwise", "fiml") %in% missing), "fiml", missing)
# MLR estimation
} else if (isTRUE(estimator == "MLR")) {
if (isTRUE(missing == "fiml")) {
warning("FIML method is not available with estimator = \"MLR\", argument missing switched to \"listwise\".", call. = FALSE)
}
# Listwise deletion
missing <- "listwise"
}
#...................
### No missing values ####
} else {
complete <- TRUE
missing <- "listwise"
}
# Cases with missing on all variables
if (isTRUE(missing == "fiml")) {
x.na.prop <- misty::na.prop(x[, -which(colnames(x) %in% c(".cluster", between)), drop = FALSE])
if (any(x.na.prop == 1L)) {
warning(paste("Data set contains", sum(x.na.prop == 1L), "cases with missing on all variables that are measured on the within level which were not included in the analysis."), call. = FALSE)
# Remove cases with missing on all variables
x <- x[which(x.na.prop < 1L), ]
}
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Print correlation, sample size or significance values ####
if (isTRUE(all(c("all", "cor", "se", "stat", "p") %in% print))) { print <- "cor" }
if (isTRUE(length(print) == 1L && "all" %in% print)) { print <- c("cor", "se", "stat", "p") }
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Print triangular ####
tri <- ifelse(all(c("both", "lower", "upper") %in% tri), "lower", tri)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Adjustment method for multiple testing ####
p.adj <- ifelse(all(c("none", "bonferroni", "holm", "hochberg", "hommel", "BH", "BY", "fdr") %in% p.adj), "none", p.adj)
#_____________________________________________________________________________
#
# Main Function --------------------------------------------------------------
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Model specification ####
mod <- paste(" level: 1\n ",
# Within model
paste(apply(combn(length(var.with), 2L), 2L, function(y) paste(var.with[y[1L]], var.with[y[2L]], sep = " ~~ " )), collapse = "\n "),
"\n level: 2\n ",
# Between model
paste(apply(combn(length(var.betw), 2L), 2L, function(y) paste(var.betw[y[1L]], var.betw[y[2L]], sep = " ~~ " )), collapse = "\n "))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Model estimation ####
model.fit <- tryCatch(suppressWarnings(lavaan::sem(mod, data = x, cluster = ".cluster", estimator = estimator,
missing = missing, optim.method = optim.method,
se = ifelse(estimator == "MLR", "robust.huber.white", "standard"),
check.gradient = FALSE, check.post = FALSE, check.vcov = FALSE)),
error = function(y) {
if (isTRUE(missing == "fiml")) {
stop("There was an estimation problem in lavaan, switching to missing = \"listwise\" might solve the problem.", call. = FALSE)
} else if (isTRUE(estimator == "MLR")){
stop("There was an estimation problem in lavaan, switching to estimator = \"ML\" might solve the problem.", call. = FALSE)
} else {
stop("There was an estimation problem in lavaan, correlation matrix could not be computed.", call. = FALSE)
}})
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Convergence check ####
if (isTRUE(!lavaan::lavInspect(model.fit, what = "converged"))) {
#...................
### Quasi-Newton method optimizer ####
if (isTRUE(optim.method == "nlminb")) {
message("Quasi-Newton optimizer did not converge, switched to the EM algorithm.")
# Model estimation with EM algorithm
model.fit <- suppressWarnings(lavaan::sem(mod, data = x, cluster = ".cluster", estimator = estimator,
missing = missing, optim.method = "em",
se = ifelse(estimator == "MLR", "robust.huber.white", "standard"),
check.gradient = FALSE, check.post = FALSE, check.vcov = FALSE))
# Model not converged
if (isTRUE(!lavaan::lavInspect(model.fit, what = "converged"))) {
if (isTRUE(missing == "listwise")) {
stop("EM algorithm did not converge.", call. = FALSE)
} else {
stop("EM algorithm did not converge. Switching to missing = \"listwise\" might solve the estimation problem.", call. = FALSE)
}
}
#...................
### Expectation Maximization (EM) algorithm ####
} else if(isTRUE(optim.method == "em")) {
if (isTRUE(missing == "listwise")) {
stop("EM algorithm did not converge.", call. = FALSE)
} else {
stop("EM algorithm did not converge. Switching to missing = \"listwise\" might solve the estimation problem.", call. = FALSE)
}
}
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Convergence and model identification checks ####
if (isTRUE(check)) {
check.vcov <- check.theta.w <- check.theta.b <- check.cov.lv.w <- check.cov.lv.b <- TRUE
#...................
### Standard error ####
if (isTRUE(any(is.na(unlist(lavaan::lavInspect(model.fit, what = "se")))))) { stop("Standard errors could not be computed.", call. = FALSE) }
#...................
### Variance-covariance matrix of the estimated parameters ####
eigvals <- eigen(lavaan::lavInspect(model.fit, what = "vcov"), symmetric = TRUE, only.values = TRUE)$values
# Model contains equality constraints
model.fit.par <- lavaan::parameterTable(model.fit)$op == "=="
if (isTRUE(any(model.fit.par))) { eigvals <- rev(eigvals)[-seq_len(sum(model.fit.par))] }
if (isTRUE(min(eigvals) < .Machine$double.eps^(3L/4L))) {
warning("The variance-covariance matrix of the estimated parameters is not positive definite. This may be a symptom that the model is not identified.", call. = FALSE)
check.vcov <- FALSE
}
#...................
### Negative variance of observed variables ####
#### Within Level
if (isTRUE(any(diag(lavaan::lavInspect(model.fit, what = "theta")$within) < 0L))) {
warning("Some estimated variances of the observed variables at the Within level are negative.", call. = FALSE)
check.theta.w <- FALSE
} else if (isTRUE(any(eigen(lavaan::lavTech(model.fit, what = "theta")$within, symmetric = TRUE, only.values = TRUE)$values < (-1L * .Machine$double.eps^(3/4))))) {
warning("The model-implied variance-covariance matrix of the residuals of the observed variables is not positive definite indicating an absolute residual correlations greater one.", call. = FALSE)
check.theta.w <- FALSE
}
#### Between Level
if (isTRUE(any(diag(lavaan::lavInspect(model.fit, what = "theta")$.cluster) < 0L))) {
warning("Some estimated variances of the observed variables at the Between level are negative.", call. = FALSE)
check.theta.b <- FALSE
} else if (isTRUE(any(eigen(lavaan::lavTech(model.fit, what = "theta")$.cluster, symmetric = TRUE, only.values = TRUE)$values < (-1L * .Machine$double.eps^(3/4))))) {
warning("The model-implied variance-covariance matrix of the residuals of the observed variables at the Between level is not positive definite indicating an absolute residual correlations greater one.", call. = FALSE)
check.theta.b <- FALSE
}
#...................
### Negative variance of latent variables ####
#### Within Level
if (isTRUE(!is.null(lavaan::lavTech(model.fit, what = "cov.lv")$within))) {
if (isTRUE(any(diag(lavaan::lavTech(model.fit, what = "cov.lv")$within) < 0L))) {
warning("Some estimated variances of the latent variables at the Within level are negative.", call. = FALSE)
check.cov.lv.w <- FALSE
}
# Model-implied variance-covariance matrix of the latent variables
} else if (any(dim(lavaan::lavTech(model.fit, what = "cov.lv")$within) != 0L)) {
if (isTRUE(any(eigen(lavaan::lavTech(model.fit, what = "cov.lv")$within, symmetric = TRUE, only.values = TRUE)$values < (-1L * .Machine$double.eps^(3/4))))) {
warning("The model-implied variance-covariance matrix of the latent variables at the Within level is not positive definite indicating an absolute correlation greater one.", call. = FALSE)
check.cov.lv.w <- FALSE
}
}
#### Between Level
if (isTRUE(!is.null(lavaan::lavTech(model.fit, what = "cov.lv")$cluster))) {
if (isTRUE(any(diag(lavaan::lavTech(model.fit, what = "cov.lv")$.cluster) < 0L))) {
warning("Some estimated variances of the latent variables at the Between level are negative.", call. = FALSE)
check.cov.lv.b <- FALSE
}
# Model-implied variance-covariance matrix of the latent variables
} else if (any(dim(lavaan::lavTech(model.fit, what = "cov.lv")$.cluster) != 0L)) {
if (isTRUE(any(eigen(lavaan::lavTech(model.fit, what = "cov.lv")$.cluster, symmetric = TRUE, only.values = TRUE)$values < (-1L * .Machine$double.eps^(3/4))))) {
warning("The model-implied variance-covariance matrix of the latent variables at the Between level is not positive definite indicating an absolute correlation greater one.", call. = FALSE)
check.cov.lv.b <- FALSE
}
}
} else {
check.vcov <- check.theta.w <- check.theta.b <- check.cov.lv.w <- check.cov.lv.b <- NULL
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Extract Results ####
# Standardized solution
stand <- lavaan::lavMatrixRepresentation(lavaan::standardizedSolution(model.fit))
# Visible binding for global variable
mat <- level <- NULL
#...................
### Within-Group Results ####
# Theta
with.stand.theta <- subset(stand[ unlist(subset(lavaan::lavMatrixRepresentation(lavaan::parameterestimates(model.fit)), level == 1L, select = "id")), ], mat == "theta")
# Parameter estimate, z and significance value matrix
with.p <- with.stat <- with.se <- with.cor <- matrix(NA, ncol = max(stand[, "col"]), nrow = max(stand[, "row"]))
for (i in seq_len(nrow(with.stand.theta))) {
with.cor[with.stand.theta[i, "row"], with.stand.theta[i, "col"]] <- with.stand.theta[i, "est.std"]
with.se[with.stand.theta[i, "row"], with.stand.theta[i, "col"]] <- with.stand.theta[i, "se"]
with.stat[with.stand.theta[i, "row"], with.stand.theta[i, "col"]] <- with.stand.theta[i, "z"]
with.p[with.stand.theta[i, "row"], with.stand.theta[i, "col"]] <- with.stand.theta[i, "pvalue"]
}
with.cor[lower.tri(with.cor)] <- t(with.cor)[lower.tri(with.cor)]
with.se[lower.tri(with.se)] <- t(with.se)[lower.tri(with.se)]
with.stat[lower.tri(with.stat)] <- t(with.stat)[lower.tri(with.stat)]
with.p[lower.tri(with.p)] <- t(with.p)[lower.tri(with.p)]
colnames(with.cor) <- colnames(with.se) <- colnames(with.stat) <- colnames(with.p) <-
rownames(with.cor) <- rownames(with.se) <- rownames(with.stat) <- rownames(with.p) <- sapply(seq_len(max(stand[, "row"])), function(y) unique(stand[which(y == stand$row), "lhs"]))
#...................
### Between-Group Results ####
# Standardized solution
betw.stand.theta <- subset(stand[ unlist(subset(lavaan::lavMatrixRepresentation(lavaan::parameterestimates(model.fit)), level == 2L, select = "id")), ], mat == "theta")
# Parameter estimate, z and significance value matrix
betw.p <- betw.stat <- betw.se <- betw.cor <- matrix(NA, ncol = max(stand[, "col"]), nrow = max(stand[, "row"]))
for (i in seq_len(nrow(betw.stand.theta))) {
betw.cor[betw.stand.theta[i, "row"], betw.stand.theta[i, "col"]] <- betw.stand.theta[i, "est.std"]
betw.se[betw.stand.theta[i, "row"], betw.stand.theta[i, "col"]] <- betw.stand.theta[i, "se"]
betw.stat[betw.stand.theta[i, "row"], betw.stand.theta[i, "col"]] <- betw.stand.theta[i, "z"]
betw.p[betw.stand.theta[i, "row"], betw.stand.theta[i, "col"]] <- betw.stand.theta[i, "pvalue"]
}
betw.cor[lower.tri(betw.cor)] <- t(betw.cor)[lower.tri(betw.cor)]
betw.se[lower.tri(betw.se)] <- t(betw.se)[lower.tri(betw.se)]
betw.stat[lower.tri(betw.stat)] <- t(betw.stat)[lower.tri(betw.stat)]
betw.p[lower.tri(betw.p)] <- t(betw.p)[lower.tri(betw.p)]
colnames(betw.cor) <- colnames(betw.se) <- colnames(betw.stat) <- colnames(betw.p) <-
rownames(betw.cor) <- rownames(betw.se) <- rownames(betw.stat) <- rownames(betw.p) <- sapply(seq_len(max(stand[, "row"])), function(y) unique(stand[which(y == stand$row), "lhs"]))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Order Between Variables ####
if (isTRUE(order && !is.null(between))) {
pos.betw <- which(colnames(betw.cor) %in% between)
pos.with <- which(!colnames(betw.cor) %in% between)
with.cor <- with.cor[c(pos.betw, pos.with), c(pos.betw, pos.with)]
with.se <- with.se[c(pos.betw, pos.with), c(pos.betw, pos.with)]
with.stat <- with.stat[c(pos.betw, pos.with), c(pos.betw, pos.with)]
with.p <- with.p[c(pos.betw, pos.with), c(pos.betw, pos.with)]
betw.cor <- betw.cor[c(pos.betw, pos.with), c(pos.betw, pos.with)]
betw.se <- betw.se[c(pos.betw, pos.with), c(pos.betw, pos.with)]
betw.stat <- betw.stat[c(pos.betw, pos.with), c(pos.betw, pos.with)]
betw.p <- betw.p[c(pos.betw, pos.with), c(pos.betw, pos.with)]
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Combine Within-Group and Between-Group Results ####
#...................
### Within-group correlations in the lower triangular ####
if (isTRUE(tri.lower)) {
# Within-group results
wb.cor <- with.cor
wb.se <- with.se
wb.stat <- with.stat
wb.p <- with.p
# Between-group results
wb.cor[upper.tri(wb.cor)] <- betw.cor[upper.tri(wb.cor)]
wb.se[upper.tri(wb.se)] <- betw.se[upper.tri(wb.se)]
wb.stat[upper.tri(wb.stat)] <- betw.stat[upper.tri(wb.stat)]
wb.p[upper.tri(wb.p)] <- betw.p[upper.tri(wb.p)]
#...................
### Within-group correlations in the upper triangular ####
} else {
# Within-group results
wb.cor <- betw.cor
wb.se <- betw.se
wb.stat <- betw.stat
wb.p <- betw.p
# Between-group results
wb.cor[upper.tri(wb.cor)] <- with.cor[upper.tri(wb.cor)]
wb.se[upper.tri(wb.se)] <- with.se[upper.tri(wb.se)]
wb.stat[upper.tri(wb.stat)] <- with.stat[upper.tri(wb.stat)]
wb.p[upper.tri(wb.p)] <- with.p[upper.tri(wb.p)]
}
#...................
### Adjust p-values for multiple comparison ####
if (isTRUE(p.adj != "none")) {
wb.p[lower.tri(wb.p)] <- p.adjust(wb.p[lower.tri(wb.p)], method = p.adj)
wb.p[upper.tri(wb.p)] <- p.adjust(wb.p[upper.tri(wb.p)], method = p.adj)
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Split Within-Group and Between-Group Results ####
# Within-group results
with.cor <- with.cor[which(apply(with.cor, 1L, function(y) !all(is.na(y)))), which(apply(with.cor, 2L, function(y) !all(is.na(y))))]
with.se <- with.se[which(apply(with.se, 1L, function(y) !all(is.na(y)))), which(apply(with.se, 2L, function(y) !all(is.na(y))))]
with.stat <- with.stat[which(apply(with.stat, 1L, function(y) !all(is.na(y)))), which(apply(with.stat, 2L, function(y) !all(is.na(y))))]
with.p <- with.p[which(apply(with.p, 1L, function(y) !all(is.na(y)))), which(apply(with.p, 2L, function(y) !all(is.na(y))))]
# Between-group results
betw.cor <- betw.cor[which(apply(betw.cor, 1L, function(y) !all(is.na(y)))), which(apply(betw.cor, 2L, function(y) !all(is.na(y))))]
betw.se <- betw.se[which(apply(betw.se, 1L, function(y) !all(is.na(y)))), which(apply(betw.se, 2L, function(y) !all(is.na(y))))]
betw.stat <- betw.stat[which(apply(betw.stat, 1L, function(y) !all(is.na(y)))), which(apply(betw.stat, 2L, function(y) !all(is.na(y))))]
betw.p <- betw.p[which(apply(betw.p, 1L, function(y) !all(is.na(y)))), which(apply(betw.p, 2L, function(y) !all(is.na(y))))]
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Adjust p-values for multiple comparison ####
if (isTRUE(p.adj != "none")) {
with.p[lower.tri(with.p)] <- p.adjust(with.p[lower.tri(with.p)], method = p.adj)
with.p[upper.tri(with.p)] <- p.adjust(with.p[upper.tri(with.p)], method = p.adj)
betw.p[lower.tri(betw.p)] <- p.adjust(betw.p[lower.tri(betw.p)], method = p.adj)
betw.p[upper.tri(betw.p)] <- p.adjust(betw.p[upper.tri(betw.p)], method = p.adj)
}
#_____________________________________________________________________________
#
# Return object --------------------------------------------------------------
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## lavaan summary ####
lavaan.summary <- data.frame(c(paste("lavaan", lavaan::lavInspect(model.fit, what = "version")), "", "Estimator", "Standard Errors", "Missing Data", "", "",
"Number of Observations", "Number of Clusters"),
c("", "",
# Estimator
estimator,
# Standard errors
switch(lavaan::lavTech(model.fit, what = "options")$se,
"standard" = "Conventional",
"robust.huber.white" = "Huber-White"),
# Missing data
ifelse(isTRUE(complete), "None",
switch(missing,
"listwise" = "Listwise Deletion",
"fiml" = "FIML")), "", "Used",
# Number of observations
lavaan::lavInspect(model.fit, what = "nobs"),
# Number of clusters
lavaan::lavInspect(model.fit, what = "nclusters")),
c(rep("", times = 6L), "Total", lavaan::lavInspect(model.fit, what = "norig"), ""),
fix.empty.names = FALSE)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Object ####
object <- list(call = match.call(),
type = "multilevel.cor",
data = x,
args = list(within = within, between = between,
estimator = estimator, missing = missing,
sig = sig, alpha = alpha, print = print,
split = split, order = order, tri = tri, tri.lower = tri.lower,
p.adj = p.adj, digits = digits, p.digits = p.digits,
as.na = as.na, write = write, append = append, check = check, output = output),
model = mod,
model.fit = model.fit,
check = list(vcov = check.vcov, theta.w = check.theta.w, theta.b = check.theta.b,
cov.lv.w = check.cov.lv.w, cov.lv.b = check.cov.lv.b),
result = list(summary = lavaan.summary,
wb.cor = wb.cor, wb.se = wb.se,
wb.stat = wb.stat, wb.p = wb.p,
with.cor = with.cor, with.se = with.se,
with.stat = with.stat, with.p = with.p,
betw.cor = betw.cor, betw.se = betw.se,
betw.stat = betw.stat, betw.p = betw.p))
class(object) <- "misty.object"
#_____________________________________________________________________________
#
# Write Result ---------------------------------------------------------------
if (isTRUE(!is.null(write))) {
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Text file ####
if (isTRUE(grepl("\\.txt", write))) {
# Send R output to textfile
sink(file = write, append = ifelse(isTRUE(file.exists(write)), append, FALSE), type = "output", split = FALSE)
if (isTRUE(append && file.exists(write))) { write("", file = write, append = TRUE) }
# Print object
print(object, check = FALSE)
# Close file connection
sink()
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Excel file ####
} else {
misty::write.result(object, file = write)
}
}
#_____________________________________________________________________________
#
# Output ---------------------------------------------------------------------
if (isTRUE(output)) { print(object, check = FALSE) }
return(invisible(object))
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.