minerva

R package for Maximal Information-Based Nonparametric Exploration computation

Install

## Latest CRAN release
install.packages("minerva")

## Development version from GitHub
devtools::install_github('filosi/minerva')

Usage

library(minerva)

## Example data: a sine wave superimposed on a linear trend
x <- 0:200 / 200
y <- sin(10 * pi * x) + x

## Compute all the MINE statistics
mine(x, y, n.cores=1)

## Compute a single statistic (here MIC)
mine_stat(x, y, measure="mic")

## Compute MIC - R^2 by hand (r is the Pearson correlation)
r <- cor(x, y)
mm <- mine_stat(x, y, measure="mic")
mm - r^2

## Same value as: mine(x, y, n.cores=1)[[5]]

Compute statistic on matrices

## Two matrices with 100 samples (rows) and 10 features (columns) each
x <- matrix(rnorm(1000), ncol=10, nrow=100)
y <- matrix(rnorm(1000), ncol=10, nrow=100)

## Compare all pairs of features within the same matrix
pstats(x)

## Compare each feature of matrix x with each feature of matrix y
cstats(x, y)

Mictools pipeline

This pipeline is inspired by the original Python implementation by Albanese et al., available at https://github.com/minepy/mictools.

Read the data from the mictools repository

datasaurus <- read.table("https://raw.githubusercontent.com/minepy/mictools/master/examples/datasaurus.txt", 
header=TRUE, row.names=1, as.is=TRUE, stringsAsFactors=FALSE)
datasaurus.m <- t(datasaurus)
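
The t() call above turns the table read from the file, which appears to hold one variable per row, into a matrix with one variable per column. The following base R sketch shows the effect on a toy table; the object names and values are hypothetical and are not taken from the datasaurus file.

```r
## Hypothetical toy table: one variable per row, one sample per column
toy <- data.frame(s1 = c(1, 10), s2 = c(2, 20), s3 = c(3, 30),
                  row.names = c("var_a", "var_b"))

## After transposing, samples are on the rows and variables on the columns
toy.m <- t(toy)

dim(toy.m)        # number of samples, number of variables
colnames(toy.m)   # the variable names are now column names
is.matrix(toy.m)  # t() on a data.frame returns a matrix
```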

Compute null distribution for tic_e

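A single call to mictools computes the tic_e null distribution from permutations, the observed tic_e values and their distribution, and a p-value for each pair of variables: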

ticnull <- mictools(datasaurus.m, nperm=10000, seed=1234)

## Get the names of the named list
names(ticnull)
##[1]  "tic"      "nulldist" "obstic"   "obsdist"  "pval"


Null Distribution
ticnull$nulldist

| BinStart| BinEnd| NullCount| NullCumSum|
|--------:|------:|---------:|----------:|
|    0e+00|  1e-04|         0|      1e+05|
|    1e-04|  2e-04|         0|      1e+05|
|    2e-04|  3e-04|         0|      1e+05|
|    3e-04|  4e-04|         0|      1e+05|
|    4e-04|  5e-04|         0|      1e+05|
|    5e-04|  6e-04|         0|      1e+05|
|     ... |   ... |     .... |      .... |

Observed distribution
ticnull$obsdist

| BinStart| BinEnd| Count| CountCum|
|--------:|------:|-----:|--------:|
|    0e+00|  1e-04|     0|      325|
|    1e-04|  2e-04|     0|      325|
|    2e-04|  3e-04|     0|      325|
|    3e-04|  4e-04|     0|      325|
|    4e-04|  5e-04|     0|      325|
|    5e-04|  6e-04|     0|      325|
|     ... |   ... | .... |    .... |
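
To illustrate how a permutation null distribution can yield an empirical p-value, here is a minimal base R sketch. It is not the mictools implementation: the absolute Pearson correlation stands in for tic_e, and perm_pvalue() is a hypothetical helper introduced only for this example.

```r
## Toy illustration only: abs(cor()) stands in for tic_e, and
## perm_pvalue() is a hypothetical helper, not part of minerva.
perm_pvalue <- function(x, y, nperm = 1000, seed = 1234) {
  set.seed(seed)
  observed <- abs(cor(x, y))
  ## Null distribution: recompute the statistic after shuffling y,
  ## which breaks any association between x and y
  null <- replicate(nperm, abs(cor(x, sample(y))))
  ## Share of null values at least as large as the observed one;
  ## the +1 terms keep the estimate strictly above zero
  (sum(null >= observed) + 1) / (nperm + 1)
}

x <- 0:200 / 200
y_assoc <- sin(10 * pi * x) + x   # associated with x
set.seed(42)
y_noise <- rnorm(length(x))       # unrelated to x

p_assoc <- perm_pvalue(x, y_assoc)
p_noise <- perm_pvalue(x, y_noise)
c(associated = p_assoc, noise = p_noise)
```

The smallest p-value this scheme can report is 1 / (nperm + 1), so a larger number of permutations gives a finer resolution near zero.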

Plot the tic_e and p-value distributions.

hist(ticnull$tic)

hist(ticnull$pval$pval, breaks=50, freq=FALSE)

Set the p.adjust.method argument to choose a different p-value correction method, or use the qvalue package to compute Storey's q-values.

## Correct pvalues using qvalue
library(qvalue)
qobj <- qvalue(ticnull$pval$pval)

## Add column in the pval data.frame
ticnull$pval$qvalue <- qobj$qvalue
ticnull$pval

Same table as above with the qvalue column added at the end.

|   pval| I1| I2|Var1   |Var2         | adj.P.Val| qvalue|
|------:|--:|--:|:------|:------------|---------:|------:|
| 0.5202|  1|  2|away_x |bullseye_x   |      0.95|      1|
| 0.9533|  1|  3|away_x |circle_x     |      0.99|      1|
| 0.0442|  1|  4|away_x |dino_x       |      0.52|      0|
| 0.6219|  1|  5|away_x |dots_x       |      0.95|      1|
| 0.8922|  1|  6|away_x |h_lines_x    |      0.98|      1|
| 0.3972|  1|  7|away_x |high_lines_x |      0.91|      1|
|   ... |...|...| ...   | ...         |     ...  |  .... |
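
For comparison, the corrections available in base R can be applied to any vector of p-values with p.adjust. The sketch below uses only the six p-values printed in the table, so the adjusted values will not match the adj.P.Val column, which is presumably corrected over all variable pairs rather than these six rows.

```r
## The six p-values shown in the table (not the full set of pairs)
pval <- c(0.5202, 0.9533, 0.0442, 0.6219, 0.8922, 0.3972)

## Benjamini-Hochberg (FDR) and Bonferroni corrections from base R
adj <- data.frame(
  pval = pval,
  BH = p.adjust(pval, method = "BH"),
  bonferroni = p.adjust(pval, method = "bonferroni")
)
adj
```

Both corrections can only increase a p-value, and Bonferroni is never less conservative than Benjamini-Hochberg.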

Strength of the association (MIC)

## Use columns of indexes and FDR adjusted pvalue 
micres <- mic_strength(datasaurus.m, ticnull$pval, pval.col=c(6, 2, 3))

| TicePval| MIC | I1 | I2 |
|:--------|:----|:---|:---|
| 0.0457  | 0.42| 2  | 15 |
| 0.0000  | 0.63| 3  | 16 |
| 0.0196  | 0.50| 5  | 18 |
| 0.0162  | 0.36| 9  | 22 |
| 0.0000  | 0.63| 10 | 23 |
| 0.0000  | 0.57| 13 | 26 |
| ...     | ... | ...| ...|
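
The numeric pval.col=c(6, 2, 3) used above refers to column positions in the p-value data.frame. As a quick check, the base R sketch below rebuilds the first three rows of the table printed earlier (values copied from it) and shows which columns those positions select; pv is a stand-in name used only for this example.

```r
## First rows of the p-value table as printed earlier (pv is a stand-in name)
pv <- data.frame(
  pval = c(0.5202, 0.9533, 0.0442),
  I1 = c(1, 1, 1),
  I2 = c(2, 3, 4),
  Var1 = c("away_x", "away_x", "away_x"),
  Var2 = c("bullseye_x", "circle_x", "dino_x"),
  adj.P.Val = c(0.95, 0.99, 0.52),
  qvalue = c(1, 1, 0)
)

## Positions 6, 2 and 3: the adjusted p-value and the two index columns
sel <- names(pv)[c(6, 2, 3)]
sel

## Selecting by name picks the same columns
same <- identical(pv[, c(6, 2, 3)], pv[, c("adj.P.Val", "I1", "I2")])
same
```

Selecting by name is less fragile than selecting by position, since adding a column such as qvalue does not shift names the way it can shift indexes.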

Association strength computed based on the qvalue adjusted pvalue

## Use qvalue adjusted pvalue 
micresq <- mic_strength(datasaurus.m, ticnull$pval, pval.col=c("qvalue", "Var1", "Var2"))

| TicePval| MIC |I1         |I2         |
|:--------|:----|:----------|:----------|
| 0.0401  | 0.42|bullseye_x |bullseye_y |
| 0.0000  | 0.63|circle_x   |circle_y   |
| 0.0172  | 0.50|dots_x     |dots_y     |
| 0.0143  | 0.36|slant_up_x |slant_up_y |
| 0.0000  | 0.63|star_x     |star_y     |
| 0.0000  | 0.57|x_shape_x  |x_shape_y  |
| ...     | ... | ...       |...        |

Citing minepy/minerva and mictools

| | |
|:------------:|:------------|
| minepy2013 | Davide Albanese, Michele Filosi, Roberto Visintainer, Samantha Riccadonna, Giuseppe Jurman and Cesare Furlanello. minerva and minepy: a C engine for the MINE suite and its R, Python and MATLAB wrappers. Bioinformatics (2013) 29(3): 407-408, first published online December 14, 2012 |
| mictools2018 | Davide Albanese, Samantha Riccadonna, Claudio Donati, Pietro Franceschi. A practical tool for maximal information coefficient analysis. GigaScience (2018) |




minerva documentation built on June 17, 2021, 9:09 a.m.