# GlobalAncova: Global test for differential gene expression

## Description

Computation of an F-test for the association between expression values and clinical entities. In many cases a two-way layout with gene and a dichotomous group as factors will be considered. However, adjustment for other covariates and the analysis of arbitrary clinical variables, interactions, gene co-expression, time series data and so on are also possible. The test is carried out by comparing the corresponding full and reduced linear models via the extra sum of squares principle. Theoretical p-values, permutation p-values and/or asymptotic p-values are reported.
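As a rough orientation, the extra sum of squares principle can be written as follows. This is a sketch in assumed notation, not a formula quoted from the package: $p$ is the number of genes in the set, $n$ the number of samples, $r_{\text{full}}$ and $r_{\text{red}}$ the ranks of the full and reduced design matrices, and $\mathrm{RSS}_{\text{full}}$, $\mathrm{RSS}_{\text{red}}$ the residual sums of squares of the two models summed over all $p$ genes. The scaling by degrees of freedom follows the usual F-test convention and may differ from the package's internal scaling.

$$
F = \frac{\left(\mathrm{RSS}_{\text{red}} - \mathrm{RSS}_{\text{full}}\right) / \left(p\,(r_{\text{full}} - r_{\text{red}})\right)}{\mathrm{RSS}_{\text{full}} / \left(p\,(n - r_{\text{full}})\right)}
$$

For a single gene ($p = 1$) this reduces to the ordinary F-statistic for comparing two nested linear models.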

There are three ways of using `GlobalAncova`. The general way is to define formulas for the full and the reduced model, where the formula terms correspond to variables in `model.dat`. An alternative is to specify the full model and the names of the model terms that are to be tested for differential expression. To remain compatible with the function call in the first version of the package, there is also a method where only a group variable (and possibly covariate information) has to be given. This is probably the simplest usage when no special effects, such as interactions, are of interest.
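To make the comparison of a full and a reduced model concrete, the following base R sketch computes a global F-statistic from gene-wise residual sums of squares on simulated data. It is an illustration only and not the package's implementation: the helper names `rss_and_rank` and `global_F`, the simulated `xx` and `model.dat`, and the degrees-of-freedom scaling are all assumptions made for this example.

```r
# Sketch only: base R, not the GlobalAncova implementation.
# All helper names and the simulated data are hypothetical.
set.seed(1)
n_genes <- 5
n_samples <- 12
xx <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)  # rows = genes, columns = samples
model.dat <- data.frame(
  group = factor(rep(c("a", "b"), each = n_samples / 2)),
  covar = rnorm(n_samples)
)

# Residual sum of squares summed over all genes, plus the design rank
rss_and_rank <- function(xx, formula, data) {
  X <- model.matrix(formula, data)
  fit <- lm.fit(X, t(xx))  # one least-squares fit per gene (columns of t(xx))
  list(rss = sum(fit$residuals^2), rank = fit$rank)
}

# Extra sum of squares F-statistic for nested full and reduced models
global_F <- function(xx, formula.full, formula.red, data) {
  full <- rss_and_rank(xx, formula.full, data)
  red <- rss_and_rank(xx, formula.red, data)
  p <- nrow(xx)
  n <- ncol(xx)
  df_extra <- p * (full$rank - red$rank)  # assumed scaling
  df_resid <- p * (n - full$rank)         # assumed scaling
  ((red$rss - full$rss) / df_extra) / (full$rss / df_resid)
}

F_all <- global_F(xx, ~ group + covar, ~ covar, model.dat)
```

Here `group` plays the role of the term of interest: it appears in the full formula but not in the reduced one, while `covar` is adjusted for in both.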

## Usage

```
## S4 method for signature 'matrix,formula,formula,ANY,missing,missing,missing'
GlobalAncova(xx, formula.full, formula.red, model.dat, test.genes,
    method = c("permutation","approx","both","Fstat"), perm = 10000,
    max.group.size = 2500, eps = 1e-16, acc = 50)

## S4 method for signature
## 'matrix,formula,missing,ANY,missing,missing,character'
GlobalAncova(xx, formula.full, model.dat, test.terms, test.genes,
    method = c("permutation","approx","both","Fstat"), perm = 10000,
    max.group.size = 2500, eps = 1e-16, acc = 50)

## S4 method for signature 'matrix,missing,missing,missing,ANY,ANY,missing'
GlobalAncova(xx, group, covars = NULL, test.genes,
    method = c("permutation","approx","both","Fstat"), perm = 10000,
    max.group.size = 2500, eps = 1e-16, acc = 50)
```

## Arguments

| Argument | Description |
| --- | --- |
| `xx` | Matrix of gene expression data, where columns correspond to samples and rows to genes. The data should be properly normalized beforehand (and log- or otherwise transformed). Missing values are not allowed. Gene and sample names can be included as the row and column names of `xx`. |
| `formula.full` | Model formula for the full model. |
| `formula.red` | Model formula for the reduced model (that does not contain the terms of interest). |
| `model.dat` | Data frame that contains all the variable information for each sample. |
| `group` | Vector with the group membership information. |
| `covars` | Vector or matrix which contains the covariate information for each sample. |
| `test.terms` | Character vector that contains names of the terms of interest. |
| `test.genes` | Vector of gene names or a list where each element is a vector of gene names. |
| `method` | p-values can be calculated permutation-based (`"permutation"`) or by means of an approximation for a mixture of chi-square distributions (`"approx"`). Both p-values are provided when specifying `method = "both"`. With option `"Fstat"` only the global F-statistics are returned without p-values or further information. |
| `perm` | Number of permutations to be used for the permutation approach. The default is 10,000. |
| `max.group.size` | Maximum size of a gene set for which the asymptotic p-value is calculated. For bigger gene sets the permutation approach is used. |
| `eps` | Resolution of the asymptotic p-value. |
| `acc` | Accuracy parameter needed for the approximation. Higher values indicate higher accuracy. |

## Value

If `test.genes = NULL`, a list with the following components is returned:

| Component | Description |
| --- | --- |
| `effect` | Name(s) of the tested effect(s) |
| `ANOVA` | ANOVA table |
| `test.result` | F-value, theoretical p-value, permutation-based and/or asymptotic p-value |
| `terms` | Names of all model terms |

If a collection of gene sets is provided in `test.genes`, a matrix is returned whose columns show the number of genes, the value of the F-statistic, the theoretical p-value, and the permutation-based and/or asymptotic p-value for each of the gene sets.

## Methods

xx = "matrix", formula.full = "formula", formula.red = "formula", model.dat = "ANY", group = "missing", covars = "missing", test.terms = "missing"

In this method, besides the expression matrix `xx`, model formulas for the full and the reduced model and a data frame `model.dat` containing the corresponding model terms have to be given. The terms that are included in the full model but not in the reduced model are the ones tested for association with differential expression. The arguments `group`, `covars` and `test.terms` are `"missing"` since they are not needed for this method.

xx = "matrix", formula.full = "formula", formula.red = "missing", model.dat = "ANY", group = "missing", covars = "missing", test.terms = "character"

In this method, besides the expression matrix `xx`, a model formula for the full model and a data frame `model.dat` containing the corresponding model terms are required. The character argument `test.terms` names the terms of interest that are tested for association with differential expression. The basic idea behind this method is that one can select single terms, possibly from the list of terms provided by previous `GlobalAncova` output, and test them without having to specify a model formula for the reduced model each time. The arguments `formula.red`, `group` and `covars` are `"missing"` since they are not needed for this method.

xx = "matrix", formula.full = "missing", formula.red = "missing", model.dat = "missing", group = "ANY", covars = "ANY", test.terms = "missing"

Besides the expression matrix `xx`, a clinical variable `group` is required. Covariate adjustment is possible via the argument `covars`, but more complex models have to be specified with the methods described above. This method emulates the function call in the first version of the package. The arguments `formula.full`, `formula.red`, `model.dat` and `test.terms` are `"missing"` since they are not needed for this method.
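The relation between the first two methods can be illustrated with base R alone: naming the terms of interest in `test.terms` corresponds to a reduced formula in which those terms have been dropped from the full formula. The sketch below shows one way to derive such a reduced formula with `update()`. It is an assumption for illustration and is not taken from the package source, which may construct the reduced model differently; the term names `metastases` and `ERstatus` are borrowed from the package example.

```r
# Sketch only: deriving a reduced formula from a full formula and the
# names of the terms to be tested, using base R's update().
# This is not taken from the GlobalAncova source code.
formula.full <- ~ metastases + ERstatus
test.terms <- "metastases"

# Build a right-hand side such as ". - metastases" and apply it
drop_rhs <- paste(". -", paste(test.terms, collapse = " - "))
formula.red <- update(formula.full, as.formula(paste("~", drop_rhs)))

# Remaining terms of the reduced model
attr(terms(formula.red), "term.labels")
```

With several entries in `test.terms`, all of them are removed at once, so the comparison tests the named terms jointly.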

## Note

This work was supported by the NGFN project 01 GR 0459, BMBF, Germany.

## Author(s)

Reinhard Meister
Ulrich Mansmann
Manuela Hummel
with contributions from Sven Knueppel

## References

Mansmann, U. and Meister, R., 2005, Testing differential gene expression in functional groups, Methods Inf Med 44 (3).

## See Also

`Plot.genes`, `Plot.subjects`, `GlobalAncova.closed`, `GAGO`, `GlobalAncova.decomp`

## Examples

```
library(GlobalAncova)

data(vantVeer)
data(phenodata)
data(pathways)

# Full and reduced model given as formulas
GlobalAncova(xx = vantVeer, formula.full = ~metastases + ERstatus,
    formula.red = ~ERstatus, model.dat = phenodata,
    test.genes = pathways[1], method = "both", perm = 100)

# Full model plus the name of the term to be tested
GlobalAncova(xx = vantVeer, formula.full = ~metastases + ERstatus,
    test.terms = "metastases", model.dat = phenodata,
    test.genes = pathways[1], method = "both", perm = 100)

# Group variable with covariate adjustment
GlobalAncova(xx = vantVeer, group = phenodata$metastases,
    covars = phenodata$ERstatus, test.genes = pathways[1],
    method = "both", perm = 100)
```