multilevel.cor: Within-Group and Between-Group Correlation Matrix

View source: R/multilevel.cor.R

multilevel.corR Documentation

Within-Group and Between-Group Correlation Matrix

Description

This function is a wrapper function for computing the within-group and between-group correlation matrix by calling the sem function in the R package lavaan and provides standard errors, z test statistics, and significance values (p-values) for testing the hypothesis H0: \rho = 0 for all pairs of variables within and between groups.

Usage

multilevel.cor(..., data = NULL, cluster, within = NULL, between = NULL,
               estimator = c("ML", "MLR"), optim.method = c("nlminb", "em"),
               missing = c("listwise", "fiml"), sig = FALSE, alpha = 0.05,
               print = c("all", "cor", "se", "stat", "p"), split = FALSE,
               order = FALSE, tri = c("both", "lower", "upper"), tri.lower = TRUE,
               p.adj = c("none", "bonferroni", "holm", "hochberg", "hommel",
                         "BH", "BY", "fdr"), digits = 2, p.digits = 3,
               as.na = NULL, write = NULL, append = TRUE, check = TRUE,
               output = TRUE)

Arguments

...

a matrix or data frame. Alternatively, an expression indicating the variable names in data e.g., multilevel.cor(x1, x2, x3, data = dat). Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.

data

a data frame when specifying one or more variables in the argument .... Note that the argument is NULL when specifying a matrix or data frame for the argument ....

cluster

either a character string indicating the variable name of the cluster variable in ... or data, or a vector representing the nested grouping structure (i.e., group or cluster variable).

within

a character vector representing variables that are measured on the within level and modeled only on the within level. Variables not mentioned in within or between are measured on the within level and will be modeled on both the within and between level.

between

a character vector representing variables that are measured on the between level and modeled only on the between level. Variables not mentioned in within or between are measured on the within level and will be modeled on both the within and between level.

estimator

a character string indicating the estimator to be used: "ML" (default) for maximum likelihood with conventional standard errors and "MLR" for maximum likelihood with Huber-White robust standard errors. Note that by default, full information maximum likelihood (FIML) method is used to deal with missing data when using "ML" (missing = "fiml"), whereas incomplete cases are removed listwise (i.e., missing = "listwise") when using "MLR".

optim.method

a character string indicating the optimizer, i.e., nlminb (default) for the unconstrained and bounds-constrained quasi-Newton method optimizer and "em" for the Expectation Maximization (EM) algorithm.

missing

a character string indicating how to deal with missing data, i.e., "listwise" for listwise deletion or "fiml" (default) for full information maximum likelihood (FIML) method. Note that FIML method is only available when estimator = "ML". Note that it takes longer to estimate the model when using FIML and using FIML might cause issues in model convergence, these issues might be resolved by switching to listwise deletion.

sig

logical: if TRUE, statistically significant correlation coefficients are shown in boldface on the console.

alpha

a numeric value between 0 and 1 indicating the significance level at which correlation coefficients are printed boldface when sig = TRUE.

print

a character string or character vector indicating which results to show on the console, i.e. "all" for all results, "cor" for correlation coefficients, "se" for standard errors, "stat" for z test statistics, and "p" for p-values.

split

logical: if TRUE, output table is split in within-group and between-group correlation matrix.

order

logical: if TRUE, variables in the output table are ordered, so that variables specified in the argument between are shown first.

tri

a character string indicating which triangular of the matrix to show on the console when split = TRUE, i.e., both for upper and upper for the upper triangular.

tri.lower

logical: if TRUE (default) and split = FALSE (default), within-group correlations are shown in the lower triangular and between-group correlation are shown in the upper triangular.

p.adj

a character string indicating an adjustment method for multiple testing based on p.adjust, i.e., none (default), bonferroni, holm, hochberg, hommel, BH, BY, or fdr.

digits

an integer value indicating the number of decimal places to be used for displaying correlation coefficients.

p.digits

an integer value indicating the number of decimal places to be used for displaying p-values.

as.na

a numeric vector indicating user-defined missing values, i.e. these values are converted to NA before conducting the analysis. Note that as.na() function is only applied to x but not to cluster.

write

a character string naming a file for writing the output into either a text file with file extension ".txt" (e.g., "Output.txt") or Excel file with file extension ".xlsx" (e.g., "Output.xlsx"). If the file name does not contain any file extension, an Excel file will be written.

append

logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write, if FALSE existing text file will be overwritten.

check

logical: if TRUE (default), argument specification is checked.

output

logical: if TRUE (default), output is shown on the console.

Details

The specification of the within-group and between-group variables is in line with the syntax in Mplus. That is, the within argument is used to identify the variables in the matrix or data frame specified in x that are measured on the individual level and modeled only on the within level. They are specified to have no variance in the between part of the model. The between argument is used to identify the variables in the matrix or data frame specified in x that are measured on the cluster level and modeled only on the between level. Variables not mentioned in the arguments within or between are measured on the individual level and will be modeled on both the within and between level.

The function uses maximum likelihood estimation with conventional standard errors (estimator = "ML") which are not robust against non-normality and full information maximum likelihood (FIML) method (missing = "fiml") to deal with missing data by default. FIML method cannot be used when within-group variables have no variance within some clusters. In this cases, the function will switch to listwise deletion. Note that the current lavaan version 0.6-11 supports FIML method only for maximum likelihood estimation with conventional standard errors (estimator = "ML") in multilevel models. Maximum likelihood estimation with Huber-White robust standard errors (estimator = "MLR") uses listwise deletion to deal with missing data. When using FIML method there might be issues in model convergence, which might be resolved by switching to listwise deletion (missing = "listwise").

The lavaan package uses a quasi-Newton optimization method ("nlminb") by default. If the optimizer does not converge, model estimation will switch to the Expectation Maximization (EM) algorithm.

Statistically significant correlation coefficients can be shown in boldface on the console when specifying sig = TRUE. However, this option is not supported when using R Markdown, i.e., the argument sig will switch to FALSE.

Adjustment method for multiple testing when specifying the argument p.adj is applied to the within-group and between-group correlation matrix separately.

Value

Returns an object of class misty.object, which is a list with following entries:

call

function call

type

type of analysis

data

data frame specified in x including the group variable specified in cluster

args

specification of function arguments

model.fit

fitted lavaan object (mod.fit)

result

list with result tables, i.e., summary for the specification of the estimation method and missing data handling in lavaan, wb.cor for the within- and between-group correlations, wb.se for the standard error of the within- and between-group correlations, wb.stat for the test statistic of within- and between-group correlations, wb.p for the significance value of the within- and between-group correlations, with.cor for the within-group correlations, with.se for the standard error of the within-group correlations, with.stat for the test statistic of within-group correlations, with.p for the significance value of the within-group correlations, betw.cor for the between-group correlations, betw.se for the standard error of the between-group correlations, betw.stat for the test statistic of between-group correlations, betw.p for the significance value of the between-group correlations

Note

The function uses the functions sem, lavInspect, lavMatrixRepresentation, lavTech, parameterEstimates, and standardizedsolution provided in the R package lavaan by Yves Rosseel (2012).

Author(s)

Takuya Yanagida takuya.yanagida@univie.ac.at

References

Hox, J., Moerbeek, M., & van de Schoot, R. (2018). Multilevel analysis: Techniques and applications (3rd. ed.). Routledge.

Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling (2nd ed.). Sage Publishers.

See Also

write.result, multilevel.descript, multilevel.icc, cluster.scores

Examples

## Not run: 
# Load data set "Demo.twolevel" in the lavaan package
data("Demo.twolevel", package = "lavaan")

#-------------------------------------------------------------------------------
# Cluster variable specification

# Example 1a: Cluster variable 'cluster' in 'x'
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3", "cluster")], cluster = "cluster")

# Example 1b: Cluster variable 'cluster' not in 'x'
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")], cluster = Demo.twolevel$cluster)

# Example 1c: Alternative specification using the 'data' argument
multilevel.cor(x1:x3, data = Demo.twolevel, cluster = "cluster")

#-------------------------------------------------------------------------------
# Example 2: All variables modeled on both the within and between level
# Highlight statistically significant result at alpha = 0.05
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")], sig = TRUE,
              cluster = Demo.twolevel$cluster)

# Example 3: Split output table in within-group and between-group correlation matrix.
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
               cluster = Demo.twolevel$cluster, split = TRUE)

# Example 4: Print correlation coefficients, standard errors, z test statistics,
# and p-values
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
               cluster = Demo.twolevel$cluster, print = "all")

# Example 5: Print correlation coefficients and p-values
# significance values with Bonferroni correction
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
               cluster = Demo.twolevel$cluster, print = c("cor", "p"),
               p.adj = "bonferroni")

#-------------------------------------------------------------------------------
# Example 6: Variables "y1", "y2", and "y2" modeled on both the within and between level
# Variables "w1" and "w2" modeled on the cluster level
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3", "w1", "w2")],
               cluster = Demo.twolevel$cluster,
               between = c("w1", "w2"))

# Example 7: Show variables specified in the argument 'between' first
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3", "w1", "w2")],
               cluster = Demo.twolevel$cluster,
               between = c("w1", "w2"), order = TRUE)

#-------------------------------------------------------------------------------
# Example 8: Variables "y1", "y2", and "y2" modeled only on the within level
# Variables "w1" and "w2" modeled on the cluster level
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3", "w1", "w2")],
               cluster = Demo.twolevel$cluster,
               within = c("y1", "y2", "y3"), between = c("w1", "w2"))

#-------------------------------------------------------------------------------
# Example 9: lavaan model and summary of the multilevel model used to compute the
# within-group and between-group correlation matrix

mod <- multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
                      cluster = Demo.twolevel$cluster, output = FALSE)

# lavaan model syntax
mod$model

# Fitted lavaan object
lavaan::summary(mod$model.fit, standardized = TRUE)

#----------------------------------------------------------------------------
# Write Results

# Example 10a: Write results into a text file
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
               cluster = Demo.twolevel$cluster,
               write = "Multilevel_Correlation.txt")

# Example 10b: Write results into an Excel file
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
               cluster = Demo.twolevel$cluster,
               write = "Multilevel_Correlation.xlsx")

result <- multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")],
                         cluster = Demo.twolevel$cluster, output = FALSE)
write.result(result, "Multilevel_Correlation.xlsx")

## End(Not run)

misty documentation built on Oct. 24, 2024, 5:10 p.m.

Related to multilevel.cor in misty...