rowwisePermGroupBalanced: Permute rows in a group-balanced way


Description

Random permutation of the values on each row of the input data frame, preserving the proportions between the two groups of labeled columns: the group of interest (GOI) and the "others".

Each row thus contains exactly the same values as in the original expression matrix, but there should be no specific distinction between groups.
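The idea can be sketched in a few lines of base R. This is only an illustration, not the package code: the helper name balancedPermRow is hypothetical, and it assumes that "balanced" means that each resampled group draws its values from both original groups in proportion to the original group sizes (with rounding).

```r
## Hypothetical helper (NOT the stats4bioinfo implementation): permute one row
## so that each resampled group receives values from both original groups,
## in proportion to the original group sizes.
balancedPermRow <- function(values, is.goi) {
  n <- length(values)
  n.goi <- sum(is.goi)
  n.other <- n - n.goi
  ## Number of original GOI values that stay in the resampled GOI group
  k <- round(n.goi^2 / n)
  ## Shuffle each original group separately
  goi.values <- values[is.goi][sample.int(n.goi)]
  other.values <- values[!is.goi][sample.int(n.other)]
  permuted <- values
  ## Resampled GOI group: k GOI values + (n.goi - k) "other" values
  permuted[is.goi] <- c(head(goi.values, k), head(other.values, n.goi - k))
  ## Resampled "other" group: all the remaining values
  permuted[!is.goi] <- c(tail(goi.values, n.goi - k),
                         tail(other.values, n.other - (n.goi - k)))
  permuted
}

## Made-up data: 4 rows, 6 samples labeled "A" (GOI) and 4 labeled "B"
set.seed(1)
x <- matrix(rnorm(40), nrow = 4)
g <- rep(c("A", "B"), times = c(6, 4))
is.goi <- g == "A"

## Apply the permutation independently to each row
perm <- t(apply(x, 1, balancedPermRow, is.goi = is.goi))
```

Each row of perm holds the same values as the corresponding row of x, only in a different order, so sorting both rows gives identical vectors.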

ATTENTION! This procedure may give rise to a surprising bias. A priori it seemed to me that a balanced representation of the original groups between the permuted samples would be a good idea. I had sometimes observed that permutation tests with small sample sizes return too many significant results, because the resampled groups sometimes contain different proportions of the original groups. I thus implemented balanced permutation to suppress this effect.

However, I noticed the opposite effect: when the effect size is very strong in a given dataset, the balanced permuted set has an *under-representation* of low p-values (e.g. 0 <= pval <= 30%). The reason for this surprising behaviour is that the balanced permutations ensure that the resampled groups have the same expected mean, but if the original groups have very different means, the resampled distributions are bimodal, and thus have a high variance. The consequence is to artificially inflate the denominator of the t statistic, and thus to reduce its observed value ($t_{obs}$).
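The bias can be reproduced with a small simulation in base R. This is a sketch under stated assumptions, not taken from the package: the data are simulated (two groups of 10 normal values whose means differ by 5 standard deviations), each balanced resampled group takes exactly half of its values from each original group, and the Welch test is run with t.test(). An unrestricted permutation of the same kind of data is included for comparison.

```r
set.seed(42)
n <- 10        # samples per group (assumed even for this sketch)
n.rows <- 500  # number of simulated rows

## Balanced permutation: each resampled group gets n/2 values
## from each original group
balanced.pval <- replicate(n.rows, {
  goi <- sample(rnorm(n, mean = 5))    # strong effect: means differ by 5 SD
  other <- sample(rnorm(n, mean = 0))
  new.goi <- c(goi[1:(n / 2)], other[1:(n / 2)])
  new.other <- c(goi[(n / 2 + 1):n], other[(n / 2 + 1):n])
  ## Welch test (t.test uses var.equal = FALSE by default)
  t.test(new.goi, new.other)$p.value
})

## Unrestricted permutation of data simulated in the same way
unrestricted.pval <- replicate(n.rows, {
  values <- sample(c(rnorm(n, mean = 5), rnorm(n, mean = 0)))
  t.test(values[1:n], values[(n + 1):(2 * n)])$p.value
})

## Fraction of p-values <= 30% under each resampling scheme
balanced.low <- mean(balanced.pval <= 0.3)
unrestricted.low <- mean(unrestricted.pval <= 0.3)
print(c(balanced = balanced.low, unrestricted = unrestricted.low))
```

In the balanced case the two resampled means differ only by within-group noise, whereas the bimodal mixture makes the within-group variance large, so low p-values become rare.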

Usage

rowwisePermGroupBalanced(x, g, goi = g[1])

Arguments

x

A matrix or data frame

g

Group labels

goi

Group of interest. If not specified, the first label is taken as the group of interest. If g contains more than two distinct labels, the group labels are converted to "GOI" (for the group of interest) and "other" (for all the other labels).
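As an illustration of this recoding, the conversion to two groups might look as follows. The labels are made up, and the function may implement the conversion differently.

```r
## Made-up labels with three distinct groups
g <- c("Bh", "Bo", "T", "Bh", "T")
goi <- g[1]   # default: the first label is the group of interest

## Recode every label as either "GOI" or "other"
two.group.labels <- ifelse(g == goi, "GOI", "other")
print(two.group.labels)
```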

Details

First version: 2015-03. Last modification: 2015-03.

Value

A data frame of the same dimensions as the input matrix/data frame, with row-wise group-balanced permuted values.

Author(s)

Jacques van Helden (Jacques.van-Helden@univ-amu.fr)

Examples

## Run example for rowwiseSample, in order to load 
## the data and parameters
example(rowwiseSample)

## Permute the values of denboer2009
balanced.perm.profiles <- rowwisePermGroupBalanced(
    x=denboer2009.expr[selected.samples],
    g=selected.labels,
    goi=group1)

## Run Welch test on the row-wise permuted values
balanced.perm.welch <- meanEqualityTests(
    balanced.perm.profiles, 
    g=selected.labels, goi=group1,
    selected.tests="welch")
                 
## Draw a volcano plot of the Welch test results obtained with the permuted values.
## NOTE: we already see that the negative control is "too good":
## the highest significances are at -2 instead of 0.
meanEqualityTests.plotVolcano(balanced.perm.welch, 
   test="welch", 
   legend.corner='topright',
   main="Permuted Den Boer 2009, Welch volcano")

## Plot the p-value distribution for the balanced row-wise permuted dataset.
## NOTE: this plot clearly shows the bias of balanced permutation:
## the low p-values (<= 30%) are under-represented because when the
## population means differ, the balanced resampling creates groups
## with the same expected mean, but increased variance.
mulitpleTestingCorrections.plotPvalDistrib(
   balanced.perm.welch$welch.multicor, legend.corner="bottomright",
   col='#FFBBBB')

jvanheld/stats4bioinfo documentation built on May 20, 2019, 5:16 a.m.