Test whether the leading members of ordered lists significantly overlap.

```
cumOverlap(ol1, ol2)
```

| Argument | Description |
|---|---|
| `ol1` | vector containing first ordered list. Duplicate values not allowed. |
| `ol2` | vector containing second ordered list. Should contain the same values as found in `ol1`. |

The function compares the top `n` members of each list, for every possible `n`, and conducts a hypergeometric test for overlap. The function returns the value of `n` giving the smallest p-value.
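
The comparison described above can be sketched in a few lines of base R. This is a minimal illustration, not the limma implementation: `top_overlap_pvalues()` is a hypothetical helper, it assumes both lists already contain the same unique values, and it returns unadjusted p-values only.

```r
# Illustrative sketch only; top_overlap_pvalues() is a hypothetical helper,
# not part of limma. Assumes ol1 and ol2 hold the same unique values.
top_overlap_pvalues <- function(ol1, ol2) {
  N <- length(ol1)
  n <- seq_len(N)
  # Number of values shared by the top n of each list, for n = 1..N
  overlap <- vapply(n, function(k) {
    length(intersect(ol1[seq_len(k)], ol2[seq_len(k)]))
  }, integer(1))
  # Upper-tail hypergeometric probability P(X >= overlap) when n values are
  # drawn from N, of which n count as "successes"
  p <- phyper(overlap - 1, m = n, n = N - n, k = n, lower.tail = FALSE)
  data.frame(n = n, overlap = overlap, p.value = p)
}

ids <- paste0("Gene", 1:50)
# Second list: the same top 5 genes in a different order, then the rest
res <- top_overlap_pvalues(ids, c(rev(ids[1:5]), rev(ids[6:50])))
res[5, ]
```

In this toy input the top 5 members of both lists coincide, so the row for `n = 5` has an overlap of 5 and a p-value of `1 / choose(50, 5)`.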

The p-values are adjusted for multiple testing in a similar way to Bonferroni's method, but starting from the top of the ranked list instead of from the smallest p-values. This approach is designed to be sensitive to contexts where the number of IDs involved in the significant overlap is a small proportion of the total.
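
One way to read this adjustment is that the p-value for top table length `n` is multiplied by `n` rather than by the total number of tests. The sketch below works under that assumption, which is not stated explicitly in this page; `adjust_from_top()` is a hypothetical helper and the p-values for `n = 1..4` are made up for illustration.

```r
# Assumption: the p-value at top-table length n is multiplied by n and
# capped at 1. adjust_from_top() is a hypothetical name, not a limma function.
adjust_from_top <- function(p) pmin(p * seq_along(p), 1)

# Unadjusted p-value when the top 5 of 50 values overlap completely
p5 <- 1 / choose(50, 5)
p <- c(1, 1, 1, 1, p5)   # hypothetical p-values for n = 1..5

adj <- adjust_from_top(p)
adj[5]                   # 5 / choose(50, 5), about 2.36e-06
which.min(adj)           # top-table length with the smallest adjusted p-value

# Ordinary Bonferroni multiplies every p-value by the number of tests instead
p.adjust(p, method = "bonferroni")[5]
```

Under this assumption, a complete overlap of the top 5 out of 50 values gives an adjusted p-value of about 2.36e-06, which agrees with the `p.min` printed in the example output at the end of this page.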

The vectors `ol1` and `ol2` do not need to be of the same length, but only values in common between the two vectors will be used in the calculation.
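
As a rough illustration of what "values in common" means here, the following base R sketch reduces two lists of different lengths to their shared values while keeping each list's own ordering. The variable names and toy IDs are invented for the example, and the actual function may perform this step differently.

```r
# Toy ordered lists of different lengths (hypothetical IDs)
ol1 <- c("A", "B", "C", "D", "E")
ol2 <- c("D", "X", "A", "C")   # "X" is not in ol1; "B" and "E" are not in ol2

# Values present in both lists
common <- intersect(ol1, ol2)

# Keep only the shared values, preserving each list's original order
ol1.common <- ol1[ol1 %in% common]
ol2.common <- ol2[ol2 %in% common]

ol1.common
ol2.common
length(common)
```

After filtering, both vectors contain the same three values, so the shared count here plays the role that `n.total` has in the returned list.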

This method was described in Chapter 4 of Wu (2011).

List containing the following components:

| Component | Description |
|---|---|
| `n.total` | integer, total number of values in common between `ol1` and `ol2`. |
| `n.min` | integer, top table length leading to smallest adjusted p-value. |
| `p.min` | smallest adjusted p-value. |
| `n.overlap` | integer, number of overlapping IDs in first `n.min`. |
| `id.overlap` | vector giving the overlapping IDs in first `n.min`. |
| `p.value` | numeric, vector of p-values for each possible top table length. |
| `adj.p.value` | numeric, vector of Bonferroni adjusted p-values for each possible top table length. |

Gordon Smyth and Di Wu

Wu, D (2011). Finding hidden relationships between gene expression profiles with application to breast cancer biology. PhD thesis, University of Melbourne. http://hdl.handle.net/11343/36278

```
GeneIds <- paste0("Gene",1:50)
ol1 <- GeneIds
ol2 <- c(sample(GeneIds[1:5]), sample(GeneIds[6:50]))
coa <- cumOverlap(ol1, ol2)
coa$p.min
coa$id.overlap
```

```
[1] 2.359871e-06
[1] "Gene1" "Gene2" "Gene3" "Gene4" "Gene5"
```

