estPI: Calculate Probabilistic Index

View source: R/estPI.R

estPI R Documentation

Calculate Probabilistic Index


This function calculates the probabilistic indices \hat{P}_t , \hat{P}_{tt'} and \hat{P}_{tt't''} to compare the groups of observations.


  estPI(X, g, type="pair", goi=NULL, mc=1, order=TRUE, alg="Cnaive")



X       Matrix or vector with observations. The rows refer to individuals, the columns to variables.

g       Vector of group numbers for the observations in X. Its length has to equal the number of observations in X.

type    Type of probabilistic index, see details.

goi     Groups Of Interest, see details.

mc      Multiple Cores, the number of cores to use for parallel calculation (only available for Linux).

order   Boolean. If TRUE (the default), the probabilistic index is calculated only for the natural order of the groups as given by g; if FALSE, it is calculated for all possible orders.

alg     Algorithm used internally to calculate the probabilistic index, see details.


The matrix X contains the data. Each column refers to a variable, each row to an observation. The group memberships of the observations are given in g. In the case of one dimensional data, X is a vector.

The probabilistic indices (PI) can also be calculated for a subset of the groups only. In that case, pass the corresponding group labels to the goi option.

Different types of PI can be calculated: "single" calculates the probability \hat{P}_t for each group, "pair" produces the probabilistic indices \hat{P}_{tt'} for all pairs of groups t<t', and "triple" provides the probabilities \hat{P}_{tt't''} for all triples of groups t<t'<t''. See Fischer et al. (2013) for more details.
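To illustrate what the pairwise and triple indices measure, the following base R sketch estimates them naively for one-dimensional data. It assumes that \hat{P}_{tt'} estimates P(X_t < X_{t'}), that \hat{P}_{tt't''} estimates P(X_t < X_{t'} < X_{t''}), and that the data contain no ties. The helper names pairPI and triplePI are hypothetical and not part of gMWT; this is not the package's implementation.

```r
# Naive estimate of P(X_t < X_t') from two samples (hypothetical helper).
pairPI <- function(x, y) {
  # outer() compares every x with every y; the mean is the
  # proportion of (x, y) pairs with x < y.
  mean(outer(x, y, "<"))
}

# Naive estimate of P(X_t < X_t' < X_t'') from three samples (hypothetical helper).
triplePI <- function(x, y, z) {
  count <- 0
  for (yj in y) {
    # Number of (x, z) pairs with x < yj < z for this middle value yj.
    count <- count + sum(x < yj) * sum(yj < z)
  }
  count / (length(x) * length(y) * length(z))
}

x1 <- c(1, 2, 3)
x2 <- c(2.5, 4)
x3 <- c(3.5, 5)

p12  <- pairPI(x1, x2)        # 5 of the 6 pairs satisfy x1 < x2
p123 <- triplePI(x1, x2, x3)  # 7 of the 12 triples satisfy x1 < x2 < x3
```

A value close to 1 indicates that observations of the first group tend to be smaller than those of the later groups; a value near 0.5 (pair) suggests no ordering between the two groups.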

Note that the PIs are calculated using the group numbering as given in g. See also the function createGroups for renumbering the group labels. By specifying the option order=FALSE the PIs for all possible group orders will be calculated. The default is that the PI is only calculated for the natural order given by g.
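The difference between the natural order and all possible orders can be sketched in base R for the pairwise case. The sketch assumes that the pairwise index estimates P(X_t < X_{t'}) and that there are no ties; pairPI and piFor are hypothetical helpers and not part of gMWT.

```r
# Naive pairwise index, assuming it estimates P(X_t < X_t') (hypothetical helper).
pairPI <- function(x, y) mean(outer(x, y, "<"))

x <- c(1, 4, 2, 3, 6, 5, 7, 9, 8)
g <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
groups <- sort(unique(g))

# Natural order only: all pairs (t, t') with t < t'.
natural <- t(combn(groups, 2))

# All orders: every ordered pair (t, t') with t != t'.
allOrders <- as.matrix(expand.grid(t1 = groups, t2 = groups))
allOrders <- allOrders[allOrders[, 1] != allOrders[, 2], , drop = FALSE]

# Evaluate the naive index for each row (t, t') of a pair matrix.
piFor <- function(pairs) {
  apply(pairs, 1, function(p) pairPI(x[g == p[1]], x[g == p[2]]))
}

piNatural <- piFor(natural)    # 3 values
piAll     <- piFor(allOrders)  # 6 values
```

Under the no-ties assumption, the index for a reversed pair is the complement of the original one, so computing all orders adds no new information for pairs; for triples the different orders are not simple complements of each other.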

If the code is executed on a Linux machine and X is a data matrix, the calculation can be parallelized by using the option mc to specify the number of cores to use.
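The idea of parallelizing over the columns of a data matrix can be sketched with the parallel package that ships with R. This is only an illustration under assumptions: it is not known from this page how gMWT distributes the work internally, the naive helper pairPI is hypothetical, and the index is assumed to estimate P(X_1 < X_2).

```r
library(parallel)

# Naive pairwise index for one variable (hypothetical helper,
# not the gMWT implementation).
pairPI <- function(x, y) mean(outer(x, y, "<"))

set.seed(1)
X <- matrix(rnorm(200), ncol = 4)  # 50 observations, 4 variables
g <- rep(1:2, each = 25)           # two groups of 25 observations

# mclapply() relies on forking, which is not available on Windows,
# so fall back to a single core there.
mc <- if (.Platform$OS.type == "windows") 1L else 2L

# One task per column (variable); the tasks are independent of each other.
piByVar <- unlist(mclapply(seq_len(ncol(X)), function(j) {
  pairPI(X[g == 1, j], X[g == 2, j])
}, mc.cores = mc))
```

Because each variable is processed independently, the result does not depend on the number of cores; only the run time changes.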

Different algorithms are available for the calculation of the PIs. The default is the fastest available option and should not be changed by the user; the other algorithms are provided only for validation and testing purposes. The options currently available are Cnaive, Rnaive, Rgrid, Rsub and Csub, but not all combinations of type and alg are available.


A list with class 'estPI' containing the following components:


Matrix or vector of the PIs.


String, storing the type of PI.


Vector, the Groups Of Interest, as given in the function call.


Boolean, PI just for the specified order or for all orders.


Matrix, the original data matrix.


Vector, the original group vector.


String, the requested algorithm.


Daniel Fischer


Fischer, D., Oja, H., Schleutker, J., Sen, P.K., Wahlfors, T. (2013): Generalized Mann-Whitney Type Tests for Microarray Experiments, Scandinavian Journal of Statistics, to appear.

Fischer, D., Oja, H. (2013): Mann-Whitney Type Tests for Microarray Experiments: The R Package gMWT, submitted article.


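library(gMWT)

# One-dimensional data: 15 observations in 4 groups
X <- sample(15)
g <- c(1,1,1,2,2,2,2,3,3,3,4,4,4,4,4)
estPI(X, g, type="pair")

# Data matrix: 1500 observations (rows) on 10 variables (columns) in 3 groups
X <- matrix(c(rnorm(5000,1.5,2),rnorm(6000,2,2),rnorm(4000,3.5,1)),byrow=TRUE, ncol=10)
colnames(X) <- letters[1:10]
g <- c(rep(1,500),rep(2,600),rep(3,400))
estPI(X, g, type="pair")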


gMWT documentation built on Oct. 29, 2022, 1:14 a.m.