View source: R/build_run_modify.R
umxThresholdMatrix | R Documentation |
High-level helper for ordinal modeling. Creates, labels, and sets smart-starts for this complex set set of an algebra and matrices. Big time saver!
umxThresholdMatrix(
df,
fullVarNames = NULL,
sep = NULL,
method = c("Mehta", "allFree"),
threshMatName = "threshMat",
l_u_bound = c(NA, NA),
droplevels = FALSE,
verbose = FALSE,
selDVs = "deprecated"
)
df |
The data being modeled (to allow access to the factor levels and quantiles within these for each variable) |
fullVarNames |
The variable names. Note for twin data, just the base names, which sep will be used to fill out. |
sep |
(e.g. "_T") Required for wide (twin) data. It is used to break the base names our from their numeric suffixes. |
method |
How to implement the thresholds: Mehta, (1 free thresh for binary, first two fixed for ordinal) or "allFree" |
threshMatName |
name of the matrix which is returned. Defaults to "threshMat" - best not to change it. |
l_u_bound |
c(NA, NA) by default, you can use this to bound the first (base) threshold. |
droplevels |
Whether to drop levels with no observed data (defaults to FALSE) |
verbose |
How much to say about what was done. (defaults to FALSE) |
selDVs |
deprecated. Use "fullVarNames" |
We often need to model ordinal data: sex, low-med-hi, depressed/normal, etc., A useful conceptual strategy to handle these data is to build a standard model for normally-varying data and then to threshold this normal distribution to generate the observed data. Thus an observation of "depressed" is modeled as a high score on the latent normally distributed trait, with thresholds set so that only scores above this threshold (1-minus the number of categories) reach the criteria for the diagnosis.
Making this work can require fixing the first 2 thresholds of ordinal data, or fixing both the mean and variance of a latent variable driving binary data, in order to estimate its one-free parameter: where to place the single threshold separating low from high cases.
The function returns a 3-item list consisting of:
A thresholdsAlgebra (named threshMatName
)
A matrix of deviations for the thresholds (deviations_for_thresh
)
A lower matrix of ones (lowerOnes_for_thresh
)
Twin Data
With twin data, make sure to provide the full names for twin data... this is not standard I know...
For twins (the function currently handles only pairs), the thresholds are equated for both twins using labels:
$labels
obese_T1 obese_T2
dev_1 "obese_dev1" "obese_dev1"
list of thresholds matrix, deviations, lowerOnes
OpenMx::mxThreshold()
Other Advanced Model Building Functions:
umx
,
umxAlgebra()
,
umxFixAll()
,
umxJiggle()
,
umxRun()
,
umxUnexplainedCausalNexus()
,
xmuLabel()
,
xmuValues()
# ============================
# = Simple non-twin examples =
# ============================
# data: 1 2-level ordered factor
x = data.frame(ordered(rbinom(100,1,.5))); names(x) = c("x")
tmp = umxThresholdMatrix(x, fullVarNames = "x")
# The lower ones matrix (all fixed)
tmp[[1]]$values
tmp[[1]]$free
# The deviations matrix
tmp[[2]]$values
tmp[[2]]$labels # note: for twins, labels will be equated across twins
# The algebra that adds the deviations to create thresholds:
tmp[[3]]$formula
# Example of a warning to not omit the variable names
# tmp = umxThresholdMatrix(x)
# Polite message: For coding safety, when calling umxThresholdMatrix, set fullVarNames...
# One ordered factor with 5-levels
x = cut(rnorm(100), breaks = c(-Inf,.2,.5, .7, Inf)); levels(x) = 1:5
x = data.frame(ordered(x)); names(x) <- c("x")
tmp = umxThresholdMatrix(x, fullVarNames = "x")
tmp[[2]]$name
tmp[[2]]$free # last one is free.. (method = Mehta)
tmp = umxThresholdMatrix(x, fullVarNames = "x", l_u_bound= c(-1,1))
tmp[[2]]$lbound # bounds applied to base threshold
# =================================
# = Binary example with twin data =
# =================================
# ===============================================================
# = Create a series of binary and ordinal columns to work with =
# ===============================================================
data(twinData)
# Make "obese" variable with ~20% subjects categorised as obese
obesityLevels = c('normal', 'obese')
cutPoints = quantile(twinData[, "bmi1"], probs = .2, na.rm = TRUE)
twinData$obese1 = cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels)
twinData$obese2 = cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels)
# Step 2: Make the ordinal variables into umxFactors (ordered, with the levels found in the data)
selVars = c("obese1", "obese2")
twinData[, selVars] = umxFactor(twinData[, selVars])
# Example 1
# use verbose = TRUE to see informative messages
tmp = umxThresholdMatrix(twinData, fullVarNames = selVars, sep = "", verbose = TRUE)
# ======================================
# = Ordinal (n categories > 2) example =
# ======================================
# Repeat for three-level weight variable
obesityLevels = c('normal', 'overweight', 'obese')
cutPoints = quantile(twinData[, "bmi1"], probs = c(.4, .7), na.rm = TRUE)
twinData$obeseTri1 = cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels)
twinData$obeseTri2 = cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels)
selDVs = "obeseTri"; selVars = tvars(selDVs, sep = "", suffixes = 1:2)
twinData[, selVars] = umxFactor(twinData[, selVars])
tmp = umxThresholdMatrix(twinData, fullVarNames = selVars, sep = "", verbose = TRUE)
# ========================================================
# = Mix of all three kinds example (and a 4-level trait) =
# ========================================================
obesityLevels = c('underWeight', 'normal', 'overweight', 'obese')
cutPoints = quantile(twinData[, "bmi1"], probs = c(.25, .4, .7), na.rm = TRUE)
twinData$obeseQuad1 = cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels)
twinData$obeseQuad2 = cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels)
selVars = c("obeseQuad1", "obeseQuad2")
twinData[, selVars] = umxFactor(twinData[, selVars])
selDVs =c("bmi", "obese", "obeseTri", "obeseQuad")
tmp = umxThresholdMatrix(twinData, fullVarNames = tvars(selDVs, sep= ""), sep = "", verbose = TRUE)
# The lower ones matrix (all fixed)
tmp[[1]]$values
# The deviations matrix
tmp[[2]]$values
tmp[[2]]$labels # note labels are equated across twins
# Check to be sure twin-1 column labels same as twin-2
tmp[[2]]$labels[,2]==tmp[[2]]$labels[,4]
# The algebra that assembles these into thresholds:
tmp[[3]]$formula
# =================================
# = Example with method = allFree =
# =================================
tmp = umxThresholdMatrix(twinData, fullVarNames = tvars(selDVs, sep= ""), sep = "",
method = "allFree")
all(tmp[[2]]$free)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.