corMLPE: Correlation structure for symmetric relational data
View source: R/corMLPE_class.R

Description

Correlation structure for symmetric relational data

Usage

corMLPE(value = 0.1, form = ~1, fixed = FALSE)

Arguments

`value`	Starting value for the correlation parameter
`form`	A formula including two variables that give the unordered pair of elements associated with each observation, and optionally a grouping factor that indicates the set to which the elements belong. See 'Details'.
`fixed`	Optional logical. If TRUE, the correlation parameter is held fixed at its starting value rather than estimated.

Details

"Maximum likelihood population effects" (MLPE) is a correlation structure for dyadic, symmetric relational data, in which each observation is a measurement on an unordered pair of elements from a set. For two different elements i and j, let E[y_{i,j}] be the expectation of the response variable (perhaps conditional on some random effects), and

y_{i,j} = E[y_{i,j}] + α_{i} + α_{j} + ε_{i,j},

where the α are associated with unique elements of the set and are i.i.d. zero-mean Gaussian random variables with standard deviation τ, and the ε are i.i.d. zero-mean Gaussian errors with standard deviation σ. Marginally (after integrating out α), the covariance between two distinct observations y_{i,j} and y_{k,l} is

cov(y_{i,j}, y_{k,l}) = τ^2 (δ(i,k) + δ(i,l) + δ(j,k) + δ(j,l)),

where the function δ evaluates to 1 when its arguments are equal and zero otherwise, and we order the indices so that i < j, k < l for convenience.

The marginal variance is var(y_{i,j}) = 2τ^2 + σ^2. The corresponding correlation structure has a single parameter, ρ = τ^2 / (2τ^2 + σ^2), which is the correlation between two observations that share exactly one element and is constrained to lie between 0 and 0.5.
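As a rough illustration of the structure described above, the following base R sketch builds the implied correlation matrix by counting the elements that each pair of observations shares. The helper `mlpe_cor` and the values of `tau` and `sigma` are hypothetical and chosen only for this example; this is not the package's internal implementation.

```r
# Hypothetical helper, not part of corMLPE: correlation matrix implied by the
# model above for observations on unordered pairs (elem1, elem2).
mlpe_cor <- function(elem1, elem2, rho) {
  stopifnot(length(elem1) == length(elem2),
            all(elem1 != elem2),          # no "self" comparisons
            rho >= 0, rho <= 0.5)
  # Number of elements shared by each pair of observations (0, 1 or 2)
  shared <- outer(elem1, elem1, "==") + outer(elem1, elem2, "==") +
            outer(elem2, elem1, "==") + outer(elem2, elem2, "==")
  R <- rho * shared
  diag(R) <- 1                            # var(y) / var(y) on the diagonal
  R
}

tau   <- 1                                # assumed element-effect SD
sigma <- 2                                # assumed residual SD
rho   <- tau^2 / (2 * tau^2 + sigma^2)    # 1/6 for these values

pairs <- t(combn(4, 2))                   # rows: (1,2) (1,3) (1,4) (2,3) (2,4) (3,4)
R <- mlpe_cor(pairs[, 1], pairs[, 2], rho)
round(R, 3)
```

Here observations (1,2) and (2,3) share element 2 and so have correlation ρ, while (1,2) and (3,4) share nothing and are uncorrelated.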

The "form" argument of a corMLPE object must contain two variables that indicate the pair of elements associated with each observation, and can optionally contain a grouping factor that indicates the set to which the elements belong. Elements from different sets are treated as distinct even if they have the same label, and thus there is always a zero correlation between measurements across different sets.

For example, if "data.frame(elem1 = c(1,1,2), elem2 = c(2,3,4), grp = c(1,1,2))" were used as data with "form = ~elem1 + elem2 | grp", then the first two observations would be correlated, because they are from the same group and share the element "1". Both would be uncorrelated with the third observation, which belongs to the second set even though it involves an element "2" that is labelled identically to an element from the first set. The ordering within a pair does not matter. Multiple observations of the same pair of elements are allowed, as are missing combinations of pairs, but "self" comparisons (where both elements of a pair are the same) are not.
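The grouping behaviour in this example can be checked by hand with a small base R sketch. It assumes an arbitrary illustrative value for ρ and makes labels distinct across sets by prefixing them with the group; this mimics the described behaviour and is not the package's own code.

```r
dat <- data.frame(elem1 = c(1, 1, 2), elem2 = c(2, 3, 4), grp = c(1, 1, 2))
rho <- 0.3   # arbitrary illustrative value between 0 and 0.5

# Prefix each label with its group so that identical labels in different
# sets are treated as distinct elements
a <- paste(dat$grp, dat$elem1, sep = ":")
b <- paste(dat$grp, dat$elem2, sep = ":")

# Count shared elements between every pair of observations
shared <- outer(a, a, "==") + outer(a, b, "==") +
          outer(b, a, "==") + outer(b, b, "==")
R <- rho * shared
diag(R) <- 1
R
```

The resulting matrix has ρ in positions (1,2) and (2,1), and zeros between the third observation and the other two.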

Note that this correlation structure does not directly incorporate a (dis)similarity metric, which could instead be included as a covariate in the regression model. Rather, it accounts for the dependence between pairwise measurements that involve the same elements.
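A minimal sketch of how this might look in practice: data are simulated from the model in base R, with a distance-like covariate entering the mean, and the structure is then passed to the "correlation" argument of nlme::gls. The variable names (pop1, pop2, dist, y) and all parameter values are assumptions made for illustration, and the fit only runs if both nlme and corMLPE are installed.

```r
set.seed(1)
n_elem <- 8
pairs  <- t(combn(n_elem, 2))             # all unordered pairs, no self comparisons
dat <- data.frame(pop1 = pairs[, 1], pop2 = pairs[, 2])

dat$dist <- runif(nrow(dat))              # hypothetical dissimilarity covariate
alpha    <- rnorm(n_elem, sd = 0.5)       # element effects (tau = 0.5, assumed)
dat$y    <- 1 + 2 * dat$dist +            # mean depends on the covariate
  alpha[dat$pop1] + alpha[dat$pop2] +     # shared element effects
  rnorm(nrow(dat), sd = 1)                # residual error (sigma = 1, assumed)

# Fit only when the required packages are available
if (requireNamespace("nlme", quietly = TRUE) &&
    requireNamespace("corMLPE", quietly = TRUE)) {
  library(nlme)
  library(corMLPE)
  fit <- gls(y ~ dist, correlation = corMLPE(form = ~ pop1 + pop2), data = dat)
  print(summary(fit))
}
```

The covariate carries the (dis)similarity information, while the correlation structure handles only the dependence induced by shared elements.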

References

Clarke et al. 2002. Confidence limits for regression relationships between distance matrices: estimating gene flow with distance. Journal of Agricultural, Biological, and Environmental Statistics 7: 361-372.