corr_mat | R Documentation |
Returns the item correlation matrix (or, when K dimensions have been specified, K matrices) that the function mdepriv internally generates when the argument method = "bv" and the argument bv_corr_type = "mixed" or "pearson", or when the argument wa is specified and the argument wb = "mixed" or "pearson".
Permits inspection of the correlation structure and can be used for complementary analytic purposes such as factor analysis or biplots.
corr_mat(
  data,
  items,
  sampling_weights = NA,
  corr_type = c("mixed", "pearson"),
  output = c("numeric", "type", "both")
)
data |
a |
items |
a character string vector or list of such vectors specifying the indicators / items within the argument data |
sampling_weights |
a character string corresponding to the column heading of a numeric variable within the argument data |
corr_type |
a character string selecting the correlation type. Available choices are "mixed" (default) and "pearson" |
output |
a character string vector selecting the output. Available choices are "numeric" (default), "type" and "both" |
The calculation of the correlation coefficient for a pair of items is based on the function weightedCorr.
When the argument corr_type is set to "mixed", the appropriate correlation type ("pearson", "polyserial" or "polychoric") is automatically detected for each pair of items by the following rules:
"pearson": both items have > 10 distinct values.
"polyserial": one item has ≤ 10, the other > 10 distinct values.
"polychoric": both items have ≤ 10 distinct values.
When the argument corr_type is set to "pearson", this correlation type is forced on all item pairs.
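The detection rules above can be sketched in a few lines of base R. This is only an illustration of the stated thresholds, not the package's internal code: the helper name pair_corr_type, its threshold argument and the way missing values are dropped before counting are all assumptions made for this example.

```r
# Hypothetical helper (not part of mdepriv): classify an item pair by the
# number of distinct values per item, following the rules described above.
pair_corr_type <- function(x, y, threshold = 10) {
  # Assumption: missing values are ignored when counting distinct values
  n_x <- length(unique(x[!is.na(x)]))
  n_y <- length(unique(y[!is.na(y)]))
  if (n_x > threshold && n_y > threshold) {
    "pearson"      # both items have > 10 distinct values
  } else if (n_x <= threshold && n_y <= threshold) {
    "polychoric"   # both items have <= 10 distinct values
  } else {
    "polyserial"   # one item on each side of the threshold
  }
}

# Made-up demonstration vectors
many_values <- seq(0, 1, length.out = 50)  # 50 distinct values
ordinal     <- rep(1:5, times = 10)        # 5 distinct values
binary      <- rep(0:1, times = 25)        # 2 distinct values

pair_corr_type(many_values, many_values)
pair_corr_type(many_values, ordinal)
pair_corr_type(ordinal, binary)
```

Running corr_mat with output = "type" is the authoritative way to see which type was actually applied to each pair; the sketch only mirrors the documented rule.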
Depending on the correlation type(s) used, the matrix may not be positive semidefinite and therefore not immediately suitable for purposes such as factor analysis.
This is more likely to happen when some of the items are binary.
The function nearPD will produce the nearest positive definite matrix.
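As a rough sketch of that repair step, the snippet below checks a matrix for negative eigenvalues and then passes it to nearPD from the Matrix package. The 3 x 3 matrix is made up for illustration and is not produced by corr_mat; the choice of corr = TRUE (which keeps a unit diagonal) is an assumption about what is wanted for a correlation matrix.

```r
library(Matrix)  # provides nearPD()

# Made-up symmetric matrix with unit diagonal that is not positive semidefinite
r <- matrix(c( 1.0, 0.9, -0.9,
               0.9, 1.0,  0.9,
              -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

# A negative smallest eigenvalue signals that r is not positive semidefinite
min_eig_before <- min(eigen(r, symmetric = TRUE)$values)
min_eig_before < 0

# corr = TRUE asks nearPD() for a result with ones on the diagonal
r_pd <- as.matrix(nearPD(r, corr = TRUE)$mat)

# After the repair the smallest eigenvalue should no longer be negative
# (up to numerical tolerance)
min_eig_after <- min(eigen(r_pd, symmetric = TRUE)$values)
min_eig_after
```

The same pattern could be applied to a numeric matrix returned by corr_mat before handing it to a factor analysis routine, keeping in mind that the repaired matrix is an approximation of the original correlations.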
Either a single matrix or a list composed of several matrices (see argument output).
head(simul_data, 3) # data used for demonstration
corr_mat(simul_data, c("y1", "y4", "y5", "y6")) # default output: numeric
corr_mat(simul_data, c("y1", "y4", "y5", "y6"), output = "type")
corr_mat(simul_data, c("y1", "y4", "y5", "y6"), output = "both")
# with sampling weights (3rd argument)
corr_mat(simul_data, c("y1", "y4", "y5", "y6"), "sampl_weights")
# choose correlation type
corr_mat_default <- corr_mat(simul_data, c("y1", "y4", "y5", "y6"))
corr_mat_mixed <- corr_mat(simul_data, c("y1", "y4", "y5", "y6"), corr_type = "mixed")
all.equal(corr_mat_default, corr_mat_mixed) # "mixed" is corr_type's default
# force a correlation type on all pairs of items
corr_mat(simul_data, c("y1", "y4", "y5", "y6"), corr_type = "pearson")
# grouping items in dimensions
corr_mat(simul_data, list(c("y1", "y4", "y5", "y6"), c("y2", "y3", "y7")))
# customized group / dimension labels
corr_mat(simul_data, list("Group A" = c("y1", "y4", "y5", "y6"),
                          "Group B" = c("y2", "y3", "y7")))
# mdepriv output / returns as template for corr_mat arguments
# items grouped as dimensions
dim <- list("Group X" = c("y1", "y4", "y5", "y6"), "Group Z" = c("y2", "y3", "y7"))
# model: Betti-Verma ("bv"): correlation type = pearson, rhoH = NA (data driven)
bv_pearson <- mdepriv(simul_data, dim, "sampl_weights",
  method = "bv", bv_corr_type = "pearson", output = "all")
# use model output as arguments
corr_mat(bv_pearson$data, bv_pearson$items, bv_pearson$sampling_weights,
  corr_type = bv_pearson$wb, output = "both")
# model: user defined double weighting
# 1st factor = wa = "equal", 2nd factor = wb = "mixed" (correlation type),
# rhoH = NA (data driven)
eq_mixed <- mdepriv(simul_data, dim, "sampl_weights",
  wa = "equal", wb = "mixed", output = "all")
# use model output as arguments
corr_mat(eq_mixed$data, eq_mixed$items, eq_mixed$sampling_weights,
  corr_type = eq_mixed$wb, output = "both")
# model: user defined double weighting
# 1st factor = wa = "bv", 2nd factor = wb = "diagonal"
# (all off-diagonal correlations = 0), rhoH = NA (irrelevant)
bv_diagonal <- mdepriv(simul_data, dim, "sampl_weights",
  wa = "bv", wb = "diagonal", output = "all")
# use model output as arguments
try(
  corr_mat(bv_diagonal$data, bv_diagonal$items, bv_diagonal$sampling_weights,
    corr_type = bv_diagonal$wb, output = "both")
)
# triggers an error because:
bv_diagonal$wb
# if corr_type is left as the default or set to a valid option, then ...
corr_mat(bv_diagonal$data, bv_diagonal$items, bv_diagonal$sampling_weights)
# ... it works
# for the arguments data, items and sampling_weights the ...
# ... corresponding mdepriv outputs are always valid
# plot unique correlation values and their relation to rhoH
items_sel <- c("y1", "y4", "y5", "y6", "y2", "y3", "y7") # a selection of items
corr_mat <- corr_mat(simul_data, items_sel) # corr_type default: "mixed"
rhoH <- mdepriv(simul_data, items_sel, method = "bv", output = "rhoH")
# bv_corr_type default: "mixed"
corr_val <- unique(sort(corr_mat)) # sorted unique correlation values
dist_rhoH_corr <- abs(corr_val - rhoH) # distance of corr. values to rhoH
bounding_rhoH <- abs(dist_rhoH_corr - min(dist_rhoH_corr)) < 10^-10
# TRUE if one of the two corr. values bounding rhoH else FALSE
corr_val_col <- ifelse(bounding_rhoH, "black", "gray") # colors for corr. values
barplot(corr_val,
  col = corr_val_col,
  border = NA,
  ylim = c(-0.2, 1),
  ylab = "correlation value [-1,+1]",
  main = "sorted unique correlation values and rhoH"
)
abline(h = rhoH, col = "red", lwd = 1.5)
text(0, rhoH + 0.05, paste0("rhoH = ", round(rhoH, 4)), adj = 0, col = "red")
legend("left",
  "correlation values\nbounding largest gap",
  col = "black", pch = 15, pt.cex = 2, bty = "n"
)