# estPI: Calculate Probabilistic Index (gMWT: Generalized Mann-Whitney Type Tests)

## Description

This function calculates the probabilistic indices \hat{P}_t, \hat{P}_{tt'}, and \hat{P}_{tt't''} to compare the groups of observations.

## Usage

    estPI(X, g, type="pair", goi=NULL, mc=1, order=TRUE, alg="Cnaive")

## Arguments

| Argument | Description |
| --- | --- |
| X | Matrix or vector with observations. The rows refer to individuals, the columns to variables. |
| g | Vector of group numbers for the observations in X. Its length must equal the number of observations in X. |
| type | Type of probabilistic index, see Details. |
| goi | Groups of interest, see Details. |
| mc | Multiple cores: the number of cores to use for parallel calculation (only available on Linux). |
| order | Boolean. If TRUE, the probabilistic index is calculated only for the natural order of groups given in g; if FALSE, for all possible orders. |
| alg | Internal option: the algorithm used to calculate the probabilistic index. |

## Details

The matrix X contains the data. Each column refers to a variable, each row to an observation. The group memberships of the observations are given in g. In the case of one-dimensional data, X is a vector.

The probabilistic indices (PIs) can also be calculated for only a subset of the groups. In that case, pass the labels of the groups of interest to the goi option.

Different types of PI can be calculated: "single" calculates the probability \hat{P}_t for each group, "pair" produces the probabilistic indices \hat{P}_{tt'} for all pairs of groups t<t', and "triple" provides the probabilities \hat{P}_{tt't''} for all triples of groups t<t'<t''. See Fischer et al. (2013) for more details.
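To make the pair and triple indices concrete, here is a minimal base-R sketch of naive estimators. It assumes that \hat{P}_{tt'} estimates P(x_t < x_{t'}) and that \hat{P}_{tt't''} estimates P(x_t < x_{t'} < x_{t''}), each as the proportion of observation tuples in the stated order. The helper names `pi_pair` and `pi_triple` are hypothetical and are not part of gMWT; the package's own algorithms and its handling of ties may differ.

```r
# Hypothetical helpers (not part of gMWT): naive U-statistic-style estimates.

# Proportion of pairs (a_i, b_j), a from group t1 and b from group t2, with a_i < b_j
pi_pair <- function(x, g, t1, t2) {
  a <- x[g == t1]
  b <- x[g == t2]
  mean(outer(a, b, "<"))
}

# Proportion of triples (a_i, b_j, c_k) with a_i < b_j < c_k
pi_triple <- function(x, g, t1, t2, t3) {
  a  <- x[g == t1]
  b  <- x[g == t2]
  cc <- x[g == t3]
  # For each b_j: number of a_i below it and number of c_k above it;
  # their product counts the increasing triples that pass through b_j
  below <- colSums(outer(a, b, "<"))
  above <- rowSums(outer(b, cc, "<"))
  sum(below * above) / (length(a) * length(b) * length(cc))
}

x <- c(1, 4, 2, 5, 3, 6)
g <- c(1, 1, 2, 2, 3, 3)
p12  <- pi_pair(x, g, 1, 2)       # 3 of the 4 pairs are increasing
p123 <- pi_triple(x, g, 1, 2, 3)  # 4 of the 8 triples are increasing
```

Counting through the middle group avoids enumerating every triple explicitly, which keeps the sketch short; it is an illustration of the definition, not a statement about how estPI is implemented.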

Note that the PIs are calculated using the group numbering as given in g. See also the function createGroups for renumbering the group labels. By specifying the option order=FALSE the PIs for all possible group orders will be calculated. The default is that the PI is only calculated for the natural order given by g.
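The difference between the natural order and all possible orders can be illustrated for the pair case with a small base-R sketch. It assumes the pair index estimates P(x_t < x_{t'}); the helper `pi_all_orders` is hypothetical, and the layout of its result is not meant to mirror the output of estPI.

```r
# Hypothetical helper (not part of gMWT): naive pair PI for every ordered pair of groups
pi_all_orders <- function(x, g) {
  labs <- sort(unique(g))
  # All ordered pairs (t1, t2) with t1 != t2
  pairs <- expand.grid(t1 = labs, t2 = labs)
  pairs <- pairs[pairs$t1 != pairs$t2, ]
  pairs$pi <- mapply(
    function(t1, t2) mean(outer(x[g == t1], x[g == t2], "<")),
    pairs$t1, pairs$t2
  )
  pairs
}

x <- c(1, 4, 2, 5, 3, 6)
g <- c(1, 1, 2, 2, 3, 3)
res <- pi_all_orders(x, g)

# The natural order covers only the pairs with t1 < t2
natural <- res[res$t1 < res$t2, ]
```

With three groups this gives six ordered pairs, of which three follow the natural order. For data without ties, the two orderings of the same pair sum to one under this naive estimator.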

If this code is executed on a Linux machine and X is a data matrix, the calculation can be parallelized by using the option mc to specify the number of cores to use.
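As a rough illustration of column-wise parallelization, the sketch below computes a naive pair index for each variable with `parallel::mclapply`. Whether gMWT uses this mechanism internally is an assumption, not something stated here; the helper `pi_by_column` is hypothetical, and the estimator assumes the pair index is P(x_1 < x_2).

```r
library(parallel)

# Hypothetical helper (not part of gMWT): naive pair PI of group 1 vs. group 2,
# computed separately for each column of X and distributed over `cores` workers
pi_by_column <- function(X, g, cores = 1) {
  out <- mclapply(seq_len(ncol(X)), function(j) {
    mean(outer(X[g == 1, j], X[g == 2, j], "<"))
  }, mc.cores = cores)
  setNames(unlist(out), colnames(X))
}

set.seed(1)
X <- matrix(rnorm(200), ncol = 4, dimnames = list(NULL, letters[1:4]))
g <- rep(1:2, each = 25)

# cores = 1 runs anywhere; values above 1 rely on process forking,
# which is not available on Windows
pis <- pi_by_column(X, g, cores = 1)
```

Each variable is handled independently, so the work splits cleanly across columns; for a single vector there is nothing to distribute.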

Different algorithms are available for the calculation of the PIs. The default is the fastest available option, and the user should not change it; the other algorithms are provided only for validation and testing purposes. The options are currently Cnaive, Rnaive, Rgrid, Rsub, and Csub, but not all combinations of type and alg are available.

## Value

A list with class 'estPI' containing the following components:

| Component | Description |
| --- | --- |
| probs | Matrix or vector of the PIs. |
| type | String, storing the type of PI. |
| goi | Vector, the groups of interest, as given in the function call. |
| order | Boolean, PI just for the specified order or for all orders. |
| obs | Matrix, the original data matrix. |
| g | Vector, the original group vector. |
| alg | String, the requested algorithm. |

## Author(s)

Daniel Fischer

## References

Fischer, D., Oja, H., Schleutker, J., Sen, P.K., Wahlfors, T. (2013): Generalized Mann-Whitney Type Tests for Microarray Experiments, Scandinavian Journal of Statistics, to appear.

Fischer, D., Oja, H. (2013): Mann-Whitney Type Tests for Microarray Experiments: The R Package gMWT, submitted article.

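## See Also

createGroups

## Examples

    library(gMWT)

    # One-dimensional data: 15 observations in four groups
    X <- c(sample(15))
    g <- c(1,1,1,2,2,2,2,3,3,3,4,4,4,4,4)
    estPI(X,g,type="single")

    # Data matrix: 1500 observations (rows) of 10 variables (columns) in three groups
    X <- matrix(c(rnorm(5000,1.5,2),rnorm(6000,2,2),rnorm(4000,3.5,1)),byrow=TRUE, ncol=10)
    colnames(X) <- letters[1:10]
    g <- c(rep(1,500),rep(2,600),rep(3,400))

    estPI(X,g,type="single",mc=1)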