ILoReg

Introduction

ILoReg is a novel tool for identifying cell populations from single-cell RNA-seq (scRNA-seq) data. In our study [1], we showed that ILoReg identified rare cell populations, both by unsupervised clustering and by visual inspection, that other scRNA-seq data analysis pipelines were unable to identify.

The figure below illustrates the workflows of ILoReg and a typical pipeline that applies feature selection prior to dimensionality reduction by principal component analysis (PCA).

*Figure: Analysis workflows of ILoReg and a feature-selection based approach*

In contrast to most scRNA-seq data analysis pipelines, ILoReg does not reduce the dimensionality of the gene expression matrix by feature selection. Instead, it performs probabilistic feature extraction using iterative clustering projection (ICP), which yields a probability matrix containing the probability of each of the N cells belonging to each of the k clusters. ICP is a novel self-supervised learning algorithm that iteratively seeks a clustering with k clusters that maximizes the adjusted Rand index (ARI) between the clustering and its projection by L1-regularized logistic regression.

In the ILoReg consensus approach, ICP is run L times, and the resulting L probability matrices are merged into a joint probability matrix, which is then transformed by PCA into a lower-dimensional matrix (the consensus matrix). The final clustering step uses hierarchical clustering with Ward's method, after which the user can extract a clustering with K consensus clusters. Two-dimensional visualization is supported by two popular nonlinear dimensionality reduction methods: t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP). Additionally, ILoReg provides user-friendly functions for identifying differentially expressed (DE) genes and visualizing gene expression.
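
The consensus step described above can be sketched in base R on simulated data. This is an illustration under stated assumptions, not ILoReg's own implementation: the probability matrices are random stand-ins for ICP output, the merge is assumed to be a column-wise concatenation, the values of N, k, L, p, and K are arbitrary, the helper `simulate_probability_matrix` is hypothetical, and `ward.D2` is assumed as the Ward variant.

```r
# Minimal sketch of the consensus step on simulated data (base R only).
set.seed(1)

N <- 60  # number of cells
k <- 5   # clusters per ICP run
L <- 4   # number of ICP runs
p <- 3   # principal components to keep (arbitrary choice for this sketch)
K <- 3   # consensus clusters to extract

# Hypothetical stand-in for one ICP run: a random N x k matrix whose rows
# sum to 1. Real probabilities would come from ICP, not from random numbers.
simulate_probability_matrix <- function(n_cells, n_clusters) {
  m <- matrix(runif(n_cells * n_clusters), nrow = n_cells)
  m / rowSums(m)
}

prob_matrices <- replicate(L, simulate_probability_matrix(N, k),
                           simplify = FALSE)

# Merge the L probability matrices column-wise into an N x (k * L) joint
# probability matrix (assumed form of the merge).
joint <- do.call(cbind, prob_matrices)

# Transform the joint matrix with PCA and keep the first p components;
# this plays the role of the consensus matrix.
consensus <- prcomp(joint, center = TRUE, scale. = FALSE)$x[, seq_len(p)]

# Hierarchical clustering with Ward's method on Euclidean distances.
tree <- hclust(dist(consensus), method = "ward.D2")

# Extract a clustering with K consensus clusters by cutting the tree.
clusters <- cutree(tree, k = K)
table(clusters)
```

Because the tree is built once, a different K can be extracted by calling `cutree` again without repeating the PCA or the clustering.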

Installation

The latest version of ILoReg can be installed from GitHub using the devtools R package:

devtools::install_github("elolab/ILoReg")

Example

Please follow this link to an example, in which a peripheral blood mononuclear cell (PBMC) dataset is analyzed using ILoReg.

Contact information

If you have questions related to ILoReg, please contact us here.

References

  1. Johannes Smolander, Sini Junttila, Mikko S Venäläinen, Laura L Elo. "ILoReg enables high-resolution cell population identification from single-cell RNA-seq data". Preprint at https://www.biorxiv.org/content/10.1101/2020.01.20.912675v1 (2020).


ILoReg documentation built on Nov. 8, 2020, 8:20 p.m.