mvlm: Conduct multivariate multiple regression and MANOVA with...


Description

mvlm is used to fit linear models with a multivariate outcome. It uses the asymptotic null distribution of the multivariate linear model test statistic to compute p-values (McArtor et al., under review). It therefore removes the need to rely on approximate p-values based on Wilks' Lambda, Pillai's Trace, the Hotelling-Lawley Trace, or Roy's Greatest Root.

Usage

mvlm(formula, data, n.cores = 1, start.acc = 1e-20,
  contr.factor = "contr.sum", contr.ordered = "contr.poly")

Arguments

formula

An object of class formula where the outcome (e.g. the Y in the formula Y ~ x1 + x2) is an n x q matrix, where q is the number of outcome variables being regressed onto the set of predictors included in the formula.

data

Mandatory data.frame containing all of the predictors passed to formula.

n.cores

Number of cores to use in parallelization through the parallel package.

start.acc

Starting accuracy of the Davies (1980) algorithm implemented in the davies function in the CompQuadForm package (Duchesne & De Micheaux, 2010) that mvlm uses to compute multivariate linear model p-values.

contr.factor

The type of contrasts used to test unordered categorical variables that have type factor. Must be a string taking one of the following values: ("contr.sum", "contr.treatment", "contr.helmert").

contr.ordered

The type of contrasts used to test ordered categorical variables that have type ordered. Must be a string taking one of the following values: ("contr.poly", "contr.sum", "contr.treatment", "contr.helmert").
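
To make the contrast options concrete, the following sketch prints the coding matrices that the base R functions of the same names produce for a three-level factor. It uses only base R and does not call mvlm; how mvlm applies these codings internally is not shown here.

```r
# Coding matrices for a factor with k = 3 levels (base R, stats package).
k <- 3

sum_coding <- contr.sum(k)              # deviation (sum-to-zero) coding
treatment_coding <- contr.treatment(k)  # dummy coding, level 1 as reference
helmert_coding <- contr.helmert(k)      # each level vs. mean of earlier levels
poly_coding <- contr.poly(k)            # orthogonal polynomials (linear, quadratic)

print(sum_coding)
print(treatment_coding)

# Every coding has k - 1 columns, matching a k - 1 degree-of-freedom test.
stopifnot(ncol(sum_coding) == k - 1, ncol(poly_coding) == k - 1)
```

Whichever coding is chosen, a k-level factor contributes k - 1 columns to the model matrix.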

Details

Importantly, the outcome of formula must be a matrix, and the object passed to data must be a data frame containing all of the variables that are named as predictors in formula.
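
As a minimal sketch of these two requirements, the code below builds a simulated outcome matrix and predictor data frame and checks their types before fitting. The variable names (y1, x1, and so on) and the simulated values are hypothetical, and the final mvlm call is left commented out because it assumes the MVLM package is installed.

```r
set.seed(1)
n <- 20

# Hypothetical outcome variables, bound column-wise into an n x q matrix (q = 3).
Y <- cbind(y1 = rnorm(n), y2 = rnorm(n), y3 = rnorm(n))

# Hypothetical predictors, stored in a data frame.
X <- data.frame(
  x1 = rnorm(n),
  x2 = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# Check the input requirements before fitting.
stopifnot(is.matrix(Y), is.data.frame(X), nrow(Y) == nrow(X))

# With the MVLM package installed, the model could then be fit as:
# res <- MVLM::mvlm(Y ~ x1 + x2, data = X)
```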

The conditional effects of variables of type factor or ordered in data are computed based on the type of contrasts specified by contr.factor and contr.ordered. If data contains an (ordered or unordered) factor with k levels, a k-1 degree-of-freedom test will be conducted corresponding to that factor and the specified contrast structure. If, instead, the user wants to assess the k-1 separate single degree-of-freedom tests that comprise this omnibus effect (similar to the approach taken by lm), then the appropriate model matrix should be formed in advance and passed to mvlm directly in the data parameter. See the package vignette for an example by calling vignette('mvlm-vignette').
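
One way to form such a model matrix in advance is sketched below with base R's model.matrix. The data and the variable names (cont, cat) are hypothetical, and this is not necessarily the approach taken in the vignette; the mvlm call is commented out because it assumes the MVLM package and an outcome matrix Y.

```r
set.seed(2)
n <- 30

# Hypothetical predictors: one continuous variable and one 3-level factor.
X <- data.frame(
  cont = rnorm(n),
  cat = factor(rep(c("a", "b", "c"), length.out = n))
)

# Expand the factor into its k - 1 = 2 contrast columns using sum contrasts.
mm <- model.matrix(~ cont + cat, data = X,
                   contrasts.arg = list(cat = "contr.sum"))

# Drop the intercept column and keep the rest as a data frame of numeric columns.
X.single <- as.data.frame(mm[, -1, drop = FALSE])
print(names(X.single))

# Each contrast column is now an ordinary numeric predictor, so each would
# receive its own single degree-of-freedom test, e.g.:
# res <- MVLM::mvlm(Y ~ cont + cat1 + cat2, data = X.single)
```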

Value

An object with nine elements and a summary function. Calling summary(mvlm.res) produces a data frame comprised of:

Statistic

Value of the corresponding test statistic.

Numer DF

Numerator degrees of freedom for each test statistic.

Pseudo R2

Size of the corresponding (omnibus or conditional) effect on the multivariate outcome. Note that the intercept term does not have an estimated effect size.

p-value

The p-value for each (omnibus or conditional) effect.

In addition to the information in the four columns comprising summary(mvlm.res), the mvlm.res object also contains:

p.prec

A data.frame reporting the precision of each p-value. These are the maximum error bounds of the p-values, as reported by the davies function in CompQuadForm.

y.rsq

A matrix containing in its first row the overall variance explained by the model for each variable comprising Y (columns). The remaining rows list the variance of each outcome that is explained by the conditional effect of each predictor.

beta.hat

Estimated regression coefficients.

adj.n

Adjusted sample size used to determine whether or not the asymptotic properties of the model are likely to hold. See McArtor et al. (under review) for more detail.

data

Original input data and the model.matrix used to fit the model.

formula

The formula passed to mvlm.

Note that the printed output of summary(mvlm.res) will truncate p-values to the smallest trustworthy values, but the object returned by summary(mvlm.res) will contain the p-values as computed. If the error bound of the Davies algorithm is larger than the p-value, the only conclusion that can be drawn with certainty is that the p-value is smaller than (or equal to) the error bound.
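
The comparison between a p-value and its error bound can be illustrated with a few lines of base R. The numbers below are invented, and p.value and p.prec are plain numeric vectors standing in for the corresponding columns of the fitted object; this is a sketch of the reasoning, not the package's own printing code.

```r
# Hypothetical p-values and Davies error bounds for three effects.
p.value <- c(x1 = 3e-04, x2 = 2e-12, x3 = 0.21)
p.prec  <- c(x1 = 1e-10, x2 = 1e-10, x3 = 1e-10)

# Flag the effects whose error bound exceeds the computed p-value.
bounded <- p.prec > p.value

# For flagged effects, report only that p is at most the error bound.
reported <- ifelse(bounded,
                   paste0("<= ", format(p.prec, scientific = TRUE)),
                   format(p.value, scientific = TRUE))
names(reported) <- names(p.value)
print(reported)
```

Here only x2 is flagged, because its computed p-value is smaller than the error bound attached to it.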

Author(s)

Daniel B. McArtor (dmcartor@nd.edu) [aut, cre]

References

Davies, R. B. (1980). The Distribution of a Linear Combination of chi-square Random Variables. Journal of the Royal Statistical Society. Series C (Applied Statistics), 29(3), 323-333.

Duchesne, P., & De Micheaux, P. L. (2010). Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods. Computational Statistics and Data Analysis, 54(4), 858-862.

McArtor, D. B., Grasman, R. P. P. P., Lubke, G. H., & Bergeman, C. S. (under review). A new approach to conducting linear model hypothesis tests with a multivariate outcome.

Examples

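library(MVLM)

# Load the example data, which provides Y.mvlm and X.mvlm
data(mvlmdata)

# The outcome on the left-hand side of the formula must be a matrix
Y <- as.matrix(Y.mvlm)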

# Main effects model
mvlm.res <- mvlm(Y ~ Cont + Cat + Ord, data = X.mvlm)
summary(mvlm.res)

# Include two-way interactions
mvlm.res.int <- mvlm(Y ~ .^2, data = X.mvlm)
summary(mvlm.res.int)

dmcartor/MVLM documentation built on May 15, 2019, 9:19 a.m.