umxACE: Build and run a 2-group Cholesky twin model (uni-variate or...

Description Usage Arguments Details Value References See Also Examples

View source: R/build_run_modify.R

Description

Implementing a core task in twin modeling, umxACE models the genetic and environmental structure of one or more phenotypes (measured variables) using the Cholesky ACE model (Neale and Cardon, 1996).

Classical twin modeling uses the genetic and environmental differences among pairs of mono-zygotic (MZ) and di-zygotic (DZ) twins reared together.

umxACE implements a 2-group model to capture these data and represent the phenotypic variance as a sum of Additive genetic, unique environmental (E) and, optionally, either common or shared-environment (C) or non-additive genetic effects (D).

The following figure shows how the ACE model appears as a path diagram (for one variable):

Figure: ACE univariate.png

umxACE allows multivariate analyses, and this brings us to the Cholesky part of the model.

This model creates as many latent A C and E variables as there are phenotypes, and, moving from left to right, decomposes the variance in each manifest into successively restricted factors. The following figure shows how the ACE model appears as a path diagram:

Figure: ACE matrix.png

In this model, the variance-covariance matrix of the raw data is recovered as the product of the lower Cholesky and its transform.

This Cholesky or lower-triangle decomposition allows a model which is both sure to be solvable, and also to account for all the variance (with some restrictions) in the data.

This figure also contains the key to understanding how to modify models that umxACE produces. read the "Matrices and Labels in ACE model" section in details below...

NOTE: Scroll down to details for how to use the function, a figure and multiple examples.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
umxACE(
  name = "ACE",
  selDVs,
  selCovs = NULL,
  dzData = NULL,
  mzData = NULL,
  sep = NULL,
  data = NULL,
  zyg = "zygosity",
  type = c("Auto", "FIML", "cov", "cor", "WLS", "DWLS", "ULS"),
  numObsDZ = NULL,
  numObsMZ = NULL,
  boundDiag = 0,
  allContinuousMethod = c("cumulants", "marginals"),
  autoRun = getOption("umx_auto_run"),
  intervals = FALSE,
  tryHard = c("no", "yes", "ordinal", "search"),
  optimizer = NULL,
  covMethod = c("fixed", "random"),
  dzAr = 0.5,
  dzCr = 1,
  weightVar = NULL,
  equateMeans = TRUE,
  addStd = TRUE,
  addCI = TRUE
)

Arguments

name

The name of the model (defaults to"ACE").

selDVs

The variables to include from the data: preferably, just "dep" not c("dep_T1", "dep_T2").

selCovs

(optional) covariates to include from the data (do not include sep in names)

dzData

The DZ dataframe.

mzData

The MZ dataframe.

sep

The separator in twin variable names, often "_T", e.g. "dep_T1". Simplifies selDVs.

data

If provided, dzData and mzData are treated as levels of zyg to select() MZ and DZ data sets (default = NULL)

zyg

If data provided, this column is used to select rows by zygosity (Default = "zygosity")

type

Analysis method one of c("Auto", "FIML", "cov", "cor", "WLS", "DWLS", "ULS")

numObsDZ

Number of DZ twins: Set this if you input covariance data.

numObsMZ

Number of MZ twins: Set this if you input covariance data.

boundDiag

Numeric lbound for diagonal of the a, c, and e matrices. Defaults to 0 since umx version 1.8

allContinuousMethod

"cumulants" or "marginals". Used in all-continuous WLS data to determine if a means model needed.

autoRun

Whether to run the model (default), or just to create it and return without running.

intervals

Whether to run mxCI confidence intervals (default = FALSE)

tryHard

Default ('no') uses normal mxRun. "yes" uses mxTryHard. Other options: "ordinal", "search"

optimizer

Optionally set the optimizer (default NULL does nothing).

covMethod

How to treat covariates: "fixed" (default) or "random".

dzAr

The DZ genetic correlation (defaults to .5, vary to examine assortative mating).

dzCr

The DZ "C" correlation (defaults to 1: set to .25 to make an ADE model).

weightVar

If provided, a vector objective will be used to weight the data. (default = NULL).

equateMeans

Whether to equate the means across twins (defaults to TRUE).

addStd

Whether to add the algebras to compute a std model (defaults to TRUE).

addCI

Whether to add intervals to compute CIs (defaults to TRUE).

Details

Data Input The function flexibly accepts raw data, and also summary covariance data (in which case the user must also supple numbers of observations for the two input data sets).

The type parameter can select how you want the model data treated. "FIML" is the normal treatment. "cov" and "cor" will turn raw data into cor data for analysis, or check that you've provided cor data as input.

Types "WLS", "DWLS", and "ULS" will process raw data into WLS data of these types.

The default, "Auto" will treat data as the type they are provided as.

Ordinal Data In an important capability, the model transparently handles ordinal (binary or multi-level ordered factor data) inputs, and can handle mixtures of continuous, binary, and ordinal data in any combination. An experimental feature is under development to allow Tobit modeling.

The function also supports weighting of individual data rows. In this case, the model is estimated for each row individually, then each row likelihood is multiplied by its weight, and these weighted likelihoods summed to form the model-likelihood, which is to be minimized. This feature is used in the non-linear GxE model functions.

Additional features The umxACE function supports varying the DZ genetic association (defaulting to .5) to allow exploring assortative mating effects, as well as varying the DZ “C” factor from 1 (the default for modeling family-level effects shared 100% by twins in a pair), to .25 to model dominance effects.

Matrices and Labels in ACE model

Matrices 'a', 'c', and 'e' contain the path loadings of the Cholesky ACE factor model.

So, labels relevant to modifying the model are of the form "a_r1c1", "c_r1c1" etc.

Variables are in rows, and factors are in columns. So to drop the influence of factor 2 on variable 3, you would say:

m2 = umxModify(m1, update = "c_r3c2")

Less commonly-modified matrices are the mean matrix expMean. This has 1 row, and the columns are laid out for each variable for twin 1, followed by each variable for twin 2.

So, in a model where the means for twin 1 and twin 2 had been equated (set = to T1), you could make them independent again with this script:

m1$top$expMean$labels[1, 4:6] = c("expMean_r1c4", "expMean_r1c5", "expMean_r1c6")

note: Only one of C or D may be estimated simultaneously. This restriction reflects the lack of degrees of freedom to simultaneously model C and D with only MZ and DZ twin pairs (Eaves et al. 1978, p267).

Value

References

See Also

Other Twin Modeling Functions: plot.MxModelTwinMaker(), power.ACE.test(), umxACE_cov_fixed(), umxACEcov(), umxACEv(), umxCP(), umxDoCp(), umxDoC(), umxGxE_window(), umxGxEbiv(), umxGxE(), umxIP(), umxRotate.MxModelCP(), umxSexLim(), umxSimplex(), umxTwinAddMeansModel(), umx

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
# ============================
# = How heritable is height? =
# ============================
require(umx)
data(twinData) # ?twinData from Australian twins.
# Pick the variables
# 1. Height has a tiny variance, and this makes solution finding very hard.
# We'll scale height up by 10x to make the Optimizer's task easier.
twinData[, c("ht1", "ht2")] = twinData[, c("ht1", "ht2")] * 10

# 2. umxACE picks the variables it needs from the data.
mzData = twinData[twinData$zygosity %in% "MZFF", ]
dzData = twinData[twinData$zygosity %in% "DZFF", ]

# 3. umxACE can figure out variable names using sep: 
#    e.g. selVars = "wt" + sep= "_T" -> "wt_T1" "wt_T2"
m1 = umxACE(selDVs = "ht", sep = "", dzData = dzData, mzData = mzData)

# tip: with report = "html", umxSummary can print the table to your browser!
umxSummary(m1, std = FALSE) # un-standardized
# tip: plot gives a figure of the model and parameters
# plot(m1)


# =============================================================
# = ADE: Evidence for dominance ? (DZ correlation set to .25) =
# =============================================================
m2 = umxACE(selDVs = "ht", sep = "", dzData = dzData, mzData = mzData, dzCr = .25)
umxCompare(m2, m1) # ADE is better
umxSummary(m2, comparison = m1) 
# nb: Although summary is smart enough to print d, the underlying 
#     matrices are still called a, c & e.

# ===================================================
# = WLS example using diagonal weight least squares =
# ===================================================
m3 = umxACE(selDVs = "ht", sep = "", dzData = dzData, mzData = mzData, 
	type = "DWLS", allContinuousMethod='marginals'
)

# ==============================
# = Univariate model of weight =
# ==============================

# Things to note:

# 1. Weight has a large variance, and this makes solution finding very hard.
# Here, we scale wt to make the Optimizer's task easier.

data(twinData)
tmp = umx_residualize(c("wt", "ht"), cov = "age", suffixes= c(1, 2), data = twinData)
mzData = tmp[tmp$zygosity %in% "MZFF", ]
dzData = tmp[tmp$zygosity %in% "DZFF", ]

# You might also want to explore re-scaling the variable
# tmp = twinData$wt1[!is.na(twinData$wt1)]
# car::powerTransform(tmp, family="bcPower"); hist(tmp^-0.6848438)
# twinData$wt1 = twinData$wt1^-0.6848438
# twinData$wt2 = twinData$wt2^-0.6848438

# 4. note: the default boundDiag = 0 lower-bounds a, c, and e at 0.
#    Prevents mirror-solutions. If not desired: set boundDiag = NULL.

m2 = umxACE(selDVs = "wt", dzData = dzData, mzData = mzData, sep = "", boundDiag = NULL)

# A short cut (which is even shorter for "_T" twin data with "MZ"/"DZ" data in zygosity column is:
m1 = umxACE(selDVs = "wt", sep = "", data = twinData,
	dzData = c("DZMM", "DZFF", "DZOS"), mzData = c("MZMM", "MZFF"))
# |   |   a1|c1 |   e1|
# |:--|----:|:--|----:|
# |wt | 0.93|.  | 0.38|

# MODEL MODIFICATION
# We can modify this model, e.g. test shared environment. 
# Set comparison to modify, and show effect in one step.

m2 = umxModify(m1, update = "c_r1c1", name = "no_C", comparison = TRUE)
# nb: You can see names of free parameters with parameters(m2)

# =========================================================
# = Well done! Now you can make modify twin models in umx =
# =========================================================

## Not run: 
# =====================================
# = Bivariate height and weight model =
# =====================================
data(twinData)
# We'll scale height (ht1 and ht2) and weight
twinData = umx_scale_wide_twin_data(data = twinData, varsToScale = c("ht", "wt"), sep = "")
mzData = twinData[twinData$zygosity %in% c("MZFF", "MZMM"),]
dzData = twinData[twinData$zygosity %in% c("DZFF", "DZMM", "DZOS"), ]
mzData = mzData[1:80,] # quicker run to keep CRAN happy
dzData = dzData[1:80,]
m1 = umxACE(selDVs = c("ht", "wt"), sep = '', dzData = dzData, mzData = mzData)
umxSummary(m1)

# ===================
# = Ordinal example =
# ===================
require(umx)
data(twinData)
twinData= umx_scale_wide_twin_data(data=twinData,varsToScale=c("wt"),sep="")
# Cut BMI column to form ordinal obesity variables
obLevels = c('normal', 'overweight', 'obese')
cuts = quantile(twinData[, "bmi1"], probs = c(.5, .2), na.rm = TRUE)
twinData$obese1=cut(twinData$bmi1, breaks=c(-Inf,cuts,Inf), labels=obLevels)
twinData$obese2=cut(twinData$bmi2, breaks=c(-Inf,cuts,Inf), labels=obLevels)
# Make the ordinal variables into umxFactors
ordDVs = c("obese1", "obese2")
twinData[, ordDVs] = umxFactor(twinData[, ordDVs])
mzData = twinData[twinData$zygosity %in% "MZFF", ]
dzData = twinData[twinData$zygosity %in% "DZFF", ]
mzData = mzData[1:80, ] # Just top 80 pairs to run fast
dzData = dzData[1:80, ]
str(mzData) # make sure mz, dz, and t1 and t2 have the same levels!

# Data-prep done - here's the model and summary!:
m1 = umxACE(selDVs = "obese", dzData = dzData, mzData = mzData, sep = '')
# umxSummary(m1)

# ============================================
# = Bivariate continuous and ordinal example =
# ============================================
data(twinData)
twinData= umx_scale_wide_twin_data(data=twinData,varsToScale="wt",sep= "")
# Cut BMI column to form ordinal obesity variables
obLevels   = c('normal', 'overweight', 'obese')
cuts       = quantile(twinData[, "bmi1"], probs = c(.5, .2), na.rm = TRUE)
twinData$obese1=cut(twinData$bmi1,breaks=c(-Inf,cuts,Inf),labels=obLevels)
twinData$obese2=cut(twinData$bmi2,breaks=c(-Inf,cuts,Inf),labels=obLevels)
# Make the ordinal variables into mxFactors
ordDVs = c("obese1", "obese2")
twinData[, ordDVs] = umxFactor(twinData[, ordDVs])
mzData = twinData[twinData$zygosity %in% "MZFF",] 
dzData = twinData[twinData$zygosity %in% "DZFF",]
mzData = mzData[1:80,] # just top 80 so example runs in a couple of secs
dzData = dzData[1:80,]
m1 = umxACE(selDVs= c("wt","obese"), dzData= dzData, mzData= mzData, sep='')

# =======================================
# = Mixed continuous and binary example =
# =======================================
require(umx)
data(twinData)
twinData= umx_scale_wide_twin_data(data= twinData,varsToScale= "wt", sep="")
# Cut to form category of 20% obese subjects
# and make into mxFactors (ensure ordered is TRUE, and require levels)
obLevels   = c('normal', 'obese')
cuts       = quantile(twinData[, "bmi1"], probs = .2, na.rm = TRUE)
twinData$obese1= cut(twinData$bmi1, breaks=c(-Inf,cuts,Inf), labels=obLevels) 
twinData$obese2= cut(twinData$bmi2, breaks=c(-Inf,cuts,Inf), labels=obLevels) 
ordDVs = c("obese1", "obese2")
twinData[, ordDVs] = umxFactor(twinData[, ordDVs])

selDVs = c("wt", "obese")
mzData = twinData[twinData$zygosity %in% "MZFF",]
dzData = twinData[twinData$zygosity %in% "DZFF",]
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData, sep = '')
umxSummary(m1)

# ===================================
# Example with covariance data only =
# ===================================

require(umx)
data(twinData)
twinData= umx_scale_wide_twin_data(data=twinData, varsToScale= "wt", sep="")
selDVs = c("wt1", "wt2")
mz = cov(twinData[twinData$zygosity %in%  "MZFF", selDVs], use = "complete")
dz = cov(twinData[twinData$zygosity %in%  "DZFF", selDVs], use = "complete")
m1 = umxACE(selDVs=selDVs, dzData=dz, mzData=mz, numObsDZ=569, numObsMZ=351)
umxSummary(m1)
plot(m1)

## End(Not run)

tbates/umx documentation built on March 30, 2020, 7:56 a.m.