Modular optimal discovery procedure (mODP)

Description

kl_clust is an implementation of mODP that assigns genes to modules based on the Kullback-Leibler distance.

Usage

kl_clust(object, de.fit = NULL, n.mods = 50)

## S4 method for signature 'deSet,missing'
kl_clust(object, de.fit = NULL, n.mods = 50)

## S4 method for signature 'deSet,deFit'
kl_clust(object, de.fit = NULL, n.mods = 50)

Arguments

object

S4 object: deSet.

de.fit

S4 object: deFit.

n.mods

integer: number of modules (i.e., clusters).

Details

mODP uses a k-means clustering algorithm in which genes are assigned to clusters based on the Kullback-Leibler distance. Each gene is assigned a module-averaged parameter, which is used to calculate the ODP score (see Woo, Leek and Storey 2010 for details). mODP and the full ODP produce nearly identical results, but mODP is computationally faster.
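The clustering idea described above can be illustrated with a small base R sketch. This is not the package's internal code: it assumes each gene is summarized by a mean vector and a single variance under a normal model, uses the closed-form Kullback-Leibler divergence between two such normal densities, and updates module parameters by simple averaging. The helper names kl_normal and assign_modules and the toy data are hypothetical.

```r
# KL(p || q) between two normal densities with independent components:
# p = N(mu_p, var_p * I) and q = N(mu_q, var_q * I).
kl_normal <- function(mu_p, var_p, mu_q, var_q) {
  n <- length(mu_p)
  0.5 * (n * var_p / var_q + sum((mu_q - mu_p)^2) / var_q - n +
           n * log(var_q / var_p))
}

# Assign each gene (row of mu, entry of v) to the module whose parameters
# give the smallest KL divergence from the gene's own parameters.
assign_modules <- function(mu, v, mod_mu, mod_v) {
  vapply(seq_len(nrow(mu)), function(i) {
    d <- vapply(seq_len(nrow(mod_mu)), function(k) {
      kl_normal(mu[i, ], v[i], mod_mu[k, ], mod_v[k])
    }, numeric(1))
    which.min(d)
  }, integer(1))
}

# Toy data (assumption): 6 genes, 4 samples, two well-separated groups.
set.seed(1)
mu <- rbind(matrix(rnorm(12, mean = 0, sd = 0.1), 3, 4),
            matrix(rnorm(12, mean = 5, sd = 0.1), 3, 4))
v <- rep(1, 6)

# k-means-style iterations, starting from genes 1 and 4 as module centers.
mod_mu <- mu[c(1, 4), , drop = FALSE]
mod_v <- v[c(1, 4)]
for (iter in 1:10) {
  members <- assign_modules(mu, v, mod_mu, mod_v)
  for (k in seq_len(nrow(mod_mu))) {
    idx <- members == k
    if (any(idx)) {
      # module-averaged parameters (simple averaging is an assumption)
      mod_mu[k, ] <- colMeans(mu[idx, , drop = FALSE])
      mod_v[k] <- mean(v[idx])
    }
  }
}
members
```

In this sketch every gene in a module shares the module-averaged mean and variance, so a score that depends on those parameters only needs to be evaluated once per module rather than once per gene, which is where the computational saving would come from.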

Value

A list with the following slots:

  • mu.full: cluster averaged fitted values from full model.

  • mu.null: cluster averaged fitted values from null model.

  • sig.full: cluster standard deviations from full model.

  • sig.null: cluster standard deviations from null model.

  • n.per.mod: total members in each cluster.

  • clustMembers: cluster membership for each gene.
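As a rough illustration of how the membership and size entries relate, the base R sketch below uses a hypothetical membership vector. It assumes that clustMembers holds one integer module label per gene; the vector and variable names here are made up for the example and are not output from kl_clust.

```r
# Hypothetical module labels for 6 genes and 3 modules (assumption).
members <- c(1L, 3L, 1L, 2L, 3L, 3L)

# Number of genes in each module, analogous to n.per.mod.
n_per_mod <- tabulate(members, nbins = 3)

# Indices of the genes that fall in module 3.
genes_in_mod3 <- which(members == 3L)
```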

Note

In our experience, the results are generally insensitive to the number of modules once n.mods is about 50 or more. Users are encouraged to experiment with the number of modules. If the number of modules equals the number of genes, the original ODP is implemented.

Author(s)

John Storey, Jeffrey Leek

References

Storey JD. (2007) The optimal discovery procedure: A new approach to simultaneous significance testing. Journal of the Royal Statistical Society, Series B, 69: 347-368.

Storey JD, Dai JY, and Leek JT. (2007) The optimal discovery procedure for large-scale significance testing, with applications to comparative microarray experiments. Biostatistics, 8: 414-432.

Woo S, Leek JT, and Storey JD. (2010) A computationally efficient modular optimal discovery procedure. Bioinformatics, 27(4): 509-515.

See Also

odp, fit_models

Examples

# import data
library(splines)
library(edge)  # assumed: kidney, build_models, fit_models and kl_clust come from the edge package
data(kidney)
age <- kidney$age
sex <- kidney$sex
kidexpr <- kidney$kidexpr
cov <- data.frame(sex = sex, age = age)

# create models
null_model <- ~sex
full_model <- ~sex + ns(age, df = 4)

# create deSet object from data
de_obj <- build_models(data = kidexpr, cov = cov, null.model = null_model,
                       full.model = full_model)

# mODP method
de_clust <- kl_clust(de_obj)

# change the number of clusters
de_clust <- kl_clust(de_obj, n.mods = 10)

# input a deFit object
de_fit <- fit_models(de_obj, stat.type = "odp")
de_clust <- kl_clust(de_obj, de.fit = de_fit)