BioinfoMonzino/DaMiRseq: Data Mining for RNA-seq data: normalization, feature selection and classification
Version 1.3.1

The DaMiRseq package offers a tidy pipeline of data mining procedures to identify transcriptional biomarkers and exploit them for classification purposes. The package accepts any kind of data presented as a table of raw counts and allows the inclusion of both continuous and factorial variables associated with the experimental setting. A series of functions enables the user to clean up the data by filtering genomic features and samples, to adjust the data by identifying and removing unwanted sources of variation (i.e. batches and confounding factors), and to select the best predictors for modeling. Finally, a "Stacking" ensemble learning technique is applied to build a robust classification model. Every step includes a checkpoint that the user may exploit to assess the effects of data management by looking at diagnostic plots, such as clustering and heatmaps, RLE boxplots, MDS plots, or correlation plots.
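The pipeline described above can be sketched roughly as follows. This is a minimal illustration modeled on the package vignette, not a definitive workflow: it assumes the example SummarizedExperiment object `SE` shipped with DaMiRseq, a class column named `class` in its sample annotation, and illustrative threshold values (`fSample`, `th.corr`, the number of surrogate variables, `n.pred`, `iter`). Function arguments may differ between package versions, so check the help page of each function before use.

```r
# Sketch of a DaMiRseq workflow; all thresholds below are illustrative assumptions.
library(DaMiRseq)
library(SummarizedExperiment)

set.seed(12345)   # several steps involve random sampling

# Example dataset bundled with the package (assumed to be available as `SE`)
data(SE)

# 1. Filter genomic features and normalize the raw counts
data_norm <- DaMiR.normalization(SE, minCounts = 10, fSample = 0.7, hyper = "no")

# 2. Filter out samples that correlate poorly with the others
data_filt <- DaMiR.sampleFilt(data_norm, th.corr = 0.9)

# 3. Identify surrogate variables (unwanted variation) and adjust the data
sv <- DaMiR.SV(data_filt)
n_sv <- min(4, ncol(sv))   # the number of surrogate variables is a user choice
data_adjust <- DaMiR.SVadjust(data_filt, sv, n.sv = n_sv)

# Checkpoint: diagnostic plots (heatmap, MDS, RLE boxplots)
DaMiR.Allplot(data_adjust, colData(data_adjust))

# 4. Select, reduce, and rank candidate predictors
data_clean <- DaMiR.transpose(assay(data_adjust))
df <- colData(data_adjust)
data_reduced <- DaMiR.FSelect(data_clean, df, th.corr = 0.4)
data_reduced <- DaMiR.FReduct(data_reduced$data)
df_importance <- DaMiR.FSort(data_reduced, df)
selected_features <- DaMiR.FBest(data_reduced, ranking = df_importance, n.pred = 5)

# 5. Build and evaluate the "Stacking" ensemble classifier
classification_res <- DaMiR.EnsembleLearning(selected_features$data,
                                             classes = df$class,
                                             fSample.tr = 0.5,
                                             fSample.tr.w = 0.5,
                                             iter = 30)
```

Each intermediate object (`data_norm`, `data_filt`, `data_adjust`) can be passed to the plotting functions, which is how the checkpoints mentioned above are meant to be used.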

Package details

Author: Mattia Chiesa <[email protected]>, Luca Piacentini <[email protected]>
Bioconductor views: Classification, RNASeq, Sequencing
Maintainer: Mattia Chiesa <[email protected]>
License: GPL (>= 2)
Version: 1.3.1
Package repository: View on GitHub
Installation

Install the latest version of this package by entering the following in R:
install.packages("devtools")
library(devtools)
install_github("BioinfoMonzino/DaMiRseq")
BioinfoMonzino/DaMiRseq documentation built on Dec. 15, 2017, 5:17 a.m.