dissmfacw: Multi-factor ANOVA from a dissimilarity matrix


dissmfacw    R Documentation

Multi-factor ANOVA from a dissimilarity matrix

Description

Perform a multi-factor analysis of variance from a dissimilarity matrix.

Usage

dissmfacw(formula, data, R = 1000, gower = FALSE, squared = FALSE,
    weights = NULL)
    
gower_matrix(diss, squared=TRUE, weights=NULL)

## S3 method for class 'dissmultifactor'
print(x, pvalue.confint=0.95, digits = NULL, ...)

Arguments

formula

A regression-like formula. The left hand side term should be a dissimilarity matrix or a dist object.

data

A data frame from which the variables in formula should be taken.

R

Number of permutations used to assess significance.

gower

Logical: Is the dissimilarity matrix already a Gower matrix?

squared

Logical: Should we square the provided dissimilarities?

weights

Optional numerical vector of case weights.

diss

Dissimilarity matrix

x

a dissmultifactor object as returned by dissmfacw

pvalue.confint

Real in range [0,1]. Confidence level of the confidence intervals reported for the p-values.

digits

Integer or NULL. Number of digits to print.

...

Other generic print arguments.

Details

Function dissmfacw generalizes dissassoc to several explanatory variables. It computes the share of discrepancy explained by the covariates specified in the formula. For each covariate it reports the Type-II effect, i.e. the effect measured by removing that covariate from the full model that includes all variables.

(The returned F values may differ slightly from those obtained with TraMineR versions older than 1.8-9. Since 1.8-9, the within sum of squares in the denominator is divided by n-m instead of n-m-1, where n is the sample size and m the total number of predictors and/or contrasts used to represent categorical factors.)

For a single factor, dissmfacw is slower than dissassoc. Moreover, dissassoc also tests for homogeneity of the within-group discrepancies (equality of variances) with generalizations of Levene's and Bartlett's statistics.

Part of the function is based on the Multivariate Matrix Regression algorithm with QR decomposition written in SciPy-Python by Ondrej Libiger and Matt Zapala (see Zapala and Schork, 2006, for a full reference). The algorithm has been adapted for Type-II effects and extended to account for case weights.
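To give a feel for this kind of computation, the base R sketch below obtains the share of discrepancy explained by a design matrix through a QR-based hat matrix, then derives Type-II contributions as the difference between the full model and the model without each term. It is an illustration under assumptions only: pseudo_r2 is a hypothetical helper, the toy data are simulated, case weights are ignored, and the code is not the implementation used by dissmfacw.

```r
## Hypothetical helper (not a TraMineR function): share of discrepancy
## explained by the columns of a design matrix X, given a Gower matrix G.
pseudo_r2 <- function(G, X) {
  q <- qr(X)
  # Orthonormal basis of the column space of X (rank-deficient X is handled)
  Q <- qr.Q(q)[, seq_len(q$rank), drop = FALSE]
  # The hat matrix is H = Q Q'. Since H is idempotent, tr(H G H) = tr(Q' G Q)
  ss_explained <- sum(diag(crossprod(Q, G %*% Q)))
  ss_explained / sum(diag(G))
}

## Simulated toy data (assumption: Euclidean distances on a single variable)
set.seed(1)
n <- 40
dat <- data.frame(a = gl(2, n / 2), b = rnorm(n))
y <- 2 * (dat$a == "2") + dat$b + rnorm(n)
D <- as.matrix(dist(y))

## Gower-centre the squared distances: G = -1/2 J D^2 J, with J = I - 11'/n
J <- diag(n) - matrix(1 / n, n, n)
G <- -0.5 * J %*% D^2 %*% J

## Full model, then Type-II share of each term (full minus model without it)
r2_full <- pseudo_r2(G, model.matrix(~ a + b, dat))
type2 <- c(
  a = r2_full - pseudo_r2(G, model.matrix(~ b, dat)),
  b = r2_full - pseudo_r2(G, model.matrix(~ a, dat))
)
r2_full
type2
```

Because the toy distances are Euclidean on a single variable, r2_full coincides with the R-squared of lm(y ~ a + b, data = dat), which gives a simple sanity check of the sketch.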

Function gower_matrix transforms the provided dissimilarity matrix into a Gower matrix.
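As an illustration of what a Gower matrix is, the following base R sketch double-centres a matrix of squared dissimilarities. The helper gower_center is hypothetical and deliberately minimal: it expects dissimilarities that are already squared, ignores case weights, and is not the code behind gower_matrix.

```r
## Hypothetical helper (not the TraMineR implementation, no case weights).
## d2 must contain SQUARED dissimilarities.
gower_center <- function(d2) {
  d2 <- as.matrix(d2)
  n <- nrow(d2)
  J <- diag(n) - matrix(1 / n, n, n)   # centering matrix I - 11'/n
  -0.5 * J %*% d2 %*% J
}

set.seed(42)
pts <- matrix(rnorm(30), nrow = 10)    # 10 simulated points in 3 dimensions
d2 <- as.matrix(dist(pts))^2           # squared Euclidean distances
G <- gower_center(d2)

## Rows (and columns) of a Gower matrix sum to zero, up to rounding error
max(abs(rowSums(G)))

## The trace equals the total sum of squares, i.e. the sum of squared
## pairwise dissimilarities (each pair counted once) divided by n
c(trace = sum(diag(G)), from_pairs = sum(d2[upper.tri(d2)]) / nrow(d2))
```

For Euclidean distances, as in this toy example, the result is the matrix of inner products of the centred coordinates, which is why sums of squares can be read from its trace.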

Value

A dissmultifactor object with the following components:

mfac

The share of variance explained by each variable (comparing the full model to the model without that variable) and its significance, assessed by a permutation test.

call

Function call

perms

Permutation values as a boot object
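The idea of assessing significance by permutation can be sketched in base R for the simplest case of a single grouping factor. Everything here is an assumption made for illustration: pseudo_f is a hypothetical helper, the data are simulated, the p-value convention (adding one to numerator and denominator) is one common choice, and none of this is the code used by dissmfacw, which stores its permutation values in a boot object.

```r
## Hypothetical helper: pseudo-F for ONE grouping factor g, computed from a
## matrix d2 of squared dissimilarities (k denotes the number of groups).
pseudo_f <- function(d2, g) {
  n <- length(g)
  k <- nlevels(g)
  ss_total <- sum(d2) / (2 * n)
  # Within-group sum of squares: same formula applied inside each group
  ss_within <- sum(vapply(levels(g), function(l) {
    idx <- g == l
    sum(d2[idx, idx]) / (2 * sum(idx))
  }, numeric(1)))
  ((ss_total - ss_within) / (k - 1)) / (ss_within / (n - k))
}

## Simulated toy data: two groups with shifted means
set.seed(123)
g <- gl(2, 15)
y <- rnorm(30) + 1.5 * (g == "2")
d2 <- as.matrix(dist(y))^2

f_obs <- pseudo_f(d2, g)

## Permutation distribution: shuffle the group labels R times
R <- 999
f_perm <- replicate(R, pseudo_f(d2, sample(g)))

## Share of permuted statistics at least as large as the observed one
p_value <- (1 + sum(f_perm >= f_obs)) / (R + 1)
p_value
```

With Euclidean distances on a single variable, f_obs equals the classical one-way ANOVA F statistic, so the sketch can be checked against anova(lm(y ~ g)).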

Author(s)

Matthias Studer (with Gilbert Ritschard for the help page)

References

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2011). Discrepancy analysis of state sequences, Sociological Methods and Research, Vol. 40(3), 471-510, doi: 10.1177/0049124111415372.

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2010). Discrepancy analysis of complex objects using dissimilarities. In F. Guillet, G. Ritschard, D. A. Zighed and H. Briand (Eds.), Advances in Knowledge Discovery and Management, Studies in Computational Intelligence, Volume 292, pp. 3-19. Berlin: Springer.

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2009). Analyse de dissimilarités par arbre d'induction. In EGC 2009, Revue des Nouvelles Technologies de l'Information, Vol. E-15, pp. 7-18.

Anderson, M. J. (2001). A new method for non-parametric multivariate analysis of variance. Austral Ecology 26, 32-46.

McArdle, B. H. and M. J. Anderson (2001). Fitting multivariate models to community data: A comment on distance-based redundancy analysis. Ecology 82(1), 290-297.

Zapala, M. A. and N. J. Schork (2006). Multivariate regression analysis of distance matrices for testing associations between gene expression patterns and related variables. Proceedings of the National Academy of Sciences of the United States of America 103(51), 19430-19435.

See Also

dissvar to compute a pseudo variance from dissimilarities and for a basic introduction to concepts of discrepancy analysis.
dissassoc to test association between objects represented by their dissimilarities and a covariate.
disstree for an induction tree analysis of objects characterized by a dissimilarity matrix.
disscenter to compute the distance of each object to its group center from pairwise dissimilarities.

Examples

## Load the package and define the state sequence object
library(TraMineR)
data(mvad)
mvad.seq <- seqdef(mvad[, 17:86])
## Here, we use only the first 100 sequences
mvad.seq <- mvad.seq[1:100,]

## Compute dissimilarities (any dissimilarity measure can be used)
mvad.ham <- seqdist(mvad.seq, method="HAM")

## And now the multi-factor analysis
print(dissmfacw(mvad.ham ~ male + Grammar + funemp +
	gcse5eq + fmpr + livboth, data=mvad[1:100,], R=10))
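As a complement, the result can be stored instead of printed directly, which gives access to the components listed under Value and to the arguments of the print method. This is a minimal sketch that assumes TraMineR is installed; the object name mfac.res, the reduced formula, and the chosen values for pvalue.confint and digits are arbitrary illustrations.

```r
library(TraMineR)

## Same setup as above: first 100 sequences of the mvad data
data(mvad)
mvad.seq <- seqdef(mvad[, 17:86])
mvad.seq <- mvad.seq[1:100, ]
mvad.ham <- seqdist(mvad.seq, method = "HAM")

## Store the result (R = 10 keeps the example fast; use more in practice)
mfac.res <- dissmfacw(mvad.ham ~ male + Grammar, data = mvad[1:100, ], R = 10)

## Components documented in the Value section
mfac.res$mfac           # explained share and significance per variable
mfac.res$call           # the function call
class(mfac.res$perms)   # permutation values, documented as a boot object

## Print with a different confidence level and number of digits
print(mfac.res, pvalue.confint = 0.99, digits = 3)
```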

TraMineR documentation built on May 29, 2024, 5 a.m.