parMIEstimate: Parallel Mutual Information Estimation

Description

Computes the mutual information between all pairs of rows of the matrix counts, or only for specified rows, using one of 10 estimators (the seven values of method, with "Bayes" covering four priors).

Usage

parMIEstimate(counts,
              method = c("ML", "MM", "Bayes", "CS", "Shrink", "KD", "KNN"),
              unit = c("bit", "ban", "nat"), nchips,
              priorHyperParam = c("Jeffreys", "BLUnif", "Perks", "MiniMax"),
              shrinkageTarget, k = 3, tfList = NULL, boot = F)

Arguments

counts

a numeric matrix (for the reconstruction of gene regulatory networks, genes on rows and samples on columns).

method

a character string indicating which estimate is to be computed. One of "ML" (Maximum Likelihood Estimator, default), "MM" (Miller-Madow corrected Estimator), "Bayes" (Bayesian Estimators), "CS" (Chao-Shen Estimator), "Shrink" (James-Stein shrinkage Estimator), "KD" (Kernel Density Estimator), or "KNN" (k-Nearest Neighbor Estimator); can be abbreviated. For "Bayes", also specify which priorHyperParam to use; for "Shrink", values for shrinkageTarget may optionally be given; for "KNN", also specify the number of nearest neighbors k.

unit

the unit in which mutual information is measured. One of "bit" (log2, default), "ban" (log10) or "nat" (natural units).

nchips

the number of CPUs to be used for the parallel calculation.

priorHyperParam

the prior distribution type for the Bayes estimation. One of "Jeffreys" (default, Jeffreys Prior, Krichevsky and Trofimov Estimator), "BLUnif" (Bayes-Laplace uniform Prior, Holste Estimator), "Perks" (Perks Prior, Schurmann and Grassberger Estimator), or "MiniMax" (MiniMax Prior), can be abbreviated.

shrinkageTarget

shrinkage target frequencies. If not specified (default) it is estimated in a James-Stein-type fashion (uniform distribution).

k

the number of nearest neighbors to consider for the estimate.

tfList

a character vector specifying which genes, taken from the row names of the counts matrix, are to be used as transcription factors for network reconstruction.

boot

logical (FALSE by default). Used for calculating a null distribution in order to evaluate whether an interaction is true or obtained by chance.
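
For reference, the quantity being estimated and the relation between the three values of unit can be written as below. This is the textbook definition for discrete variables, not a description of how the package computes any particular estimator; X and Y stand for two rows of counts, and treating them as discrete is an assumption of this sketch.

```latex
\[
I(X;Y) \;=\; \sum_{x}\sum_{y} p(x,y)\,\log_b \frac{p(x,y)}{p(x)\,p(y)},
\qquad b = 2 \ (\text{bit}),\quad b = 10 \ (\text{ban}),\quad b = e \ (\text{nat})
\]
\[
I_{\mathrm{bit}} \;=\; \frac{I_{\mathrm{nat}}}{\ln 2},
\qquad
I_{\mathrm{ban}} \;=\; \frac{I_{\mathrm{nat}}}{\ln 10}
\]
```

Changing unit therefore only rescales the estimate by a constant factor; it does not change the ranking of gene pairs.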

Value

The parMIEstimate function returns a matrix of mutual information estimates. By default the matrix is square, with dimension equal to the number of rows (number of genes) of the counts matrix; if tfList is given, the number of rows equals the length of tfList.

Author(s)

Luciano Garofano lucianogarofano88@gmail.com, Stefano Maria Pagnotta, Michele Ceccarelli

References

Paninski L. (2003). Estimation of Entropy and Mutual Information. Neural Computation, vol. 15 no. 6 pp. 1191-1253.

Meyer P.E., Laffitte F., Bontempi G. (2008). minet: A R/Bioconductor Package for Inferring Large Transcriptional Networks Using Mutual Information. BMC Bioinformatics 9:461.

Antos A., Kontoyiannis I. (2001). Convergence properties of functional estimates for discrete distributions. Random Structures and Algorithms, vol. 19 pp. 163-193.

Strong S., Koberle R., de Ruyter van Steveninck R.R., Bialek W. (1998). Entropy and Information in Neural Spike Trains. Physical Review Letters, vol. 80 pp. 197-202.

Miller G.A. (1955). Note on the bias of information estimates. Information Theory in Psychology, II-B pp. 95-100.

Jeffreys H. (1946). An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society of London, vol. 186 no. 1007 pp. 453-461.

Krichevsky R.E., Trofimov V.K. (1981). The performance of universal encoding. IEEE Transactions on Information Theory, vol. 27 pp. 199-207.

Holste D., Herzel H. (1998). Bayes' estimators of generalized entropies. Journal of Physics A, vol. 31 pp. 2551-2566.

Perks W. (1947). Some observations on inverse probability including a new indifference rule. Journal of the Institute of Actuaries, vol. 73 pp. 285-334.

Schurmann T., Grassberger P. (1996). Entropy estimation of symbol sequences. Chaos, vol. 6 pp. 414-427.

Trybula S. (1958). Some problems of simultaneous minimax estimation. The Annals of Mathematical Statistics, vol. 29 pp. 245-253.

Chao A., Shen T.J. (2003). Nonparametric estimation of Shannon's index of diversity when there are unseen species. Environmental and Ecological Statistics, vol. 10 pp. 429-443.

James W., Stein C. (1961). Estimation with Quadratic Loss. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1 pp. 361-379.

Moon Y., Rajagopalan B., Lall U. (1995). Estimation of mutual information using kernel density estimators. Physical Review E, vol. 52 no. 3 pp. 2318-2321.

Kraskov A., Stogbauer H., Grassberger P. (2004). Estimating mutual information. Physical Review E, vol. 69.

Sales G., Romualdi C. (2011). parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics.

See Also

parEntropyEstimate

Examples

simData <- simulatedData(p = 5, n = 10, mu = 100, sigma = 0.25,
                        ppower = 0.73, noise = FALSE)
counts <- simData$counts
adjMat <- simData$adjMat

miML <- parMIEstimate(counts, method = "ML", unit = "nat", nchips = 2)
miBJ <- parMIEstimate(counts, method = "Bayes", unit = "nat",
                      nchips = 2, priorHyperParam = "Jeffreys")
miSH <- parMIEstimate(counts, method = "Shrink", unit = "nat",
                      nchips = 2)
miKD <- parMIEstimate(counts, method = "KD", nchips = 2)
miKNN <- parMIEstimate(counts, method = "KNN", unit = "nat", k = 3,
                      nchips = 2)
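
To make the "ML" estimate concrete, the following base-R sketch computes the plug-in (maximum likelihood) mutual information between two discrete vectors. The helper miPlugin is hypothetical and written only for illustration; it is not part of synRNASeqNet and is not claimed to match the package's internal implementation or its handling of count data.

```r
# Plug-in (maximum likelihood) mutual information between two discrete vectors.
# Hypothetical helper for illustration only; not a synRNASeqNet function.
miPlugin <- function(x, y, unit = c("bit", "ban", "nat")) {
  unit <- match.arg(unit)
  stopifnot(length(x) == length(y))
  joint <- table(x, y) / length(x)   # empirical joint frequencies
  px <- rowSums(joint)               # empirical marginal of x
  py <- colSums(joint)               # empirical marginal of y
  expected <- outer(px, py)          # joint frequencies under independence
  nz <- joint > 0                    # empty cells contribute 0 to the sum
  miNat <- sum(joint[nz] * log(joint[nz] / expected[nz]))
  # Rescale from nats to the requested unit
  switch(unit,
         bit = miNat / log(2),
         ban = miNat / log(10),
         nat = miNat)
}

x <- c(0, 0, 1, 1, 2, 2, 0, 1)
y <- c(0, 0, 1, 1, 2, 2, 1, 0)
miPlugin(x, y, unit = "nat")
miPlugin(x, x, unit = "bit")   # MI of a vector with itself is its plug-in entropy
```

Applied to a vector and itself, the sketch returns the plug-in entropy of that vector, which gives a quick sanity check on small inputs.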

lucgar/synRNASeqNet documentation built on May 21, 2019, 8:54 a.m.