zh542370159/dropSplit: Automatically identify cell-containing and empty droplets for droplet-based scRNAseq data using dropSplit

Automatically and accurately split droplets into 'Cell' and 'Empty' for droplet-based scRNAseq data using dropSplit. dropSplit is designed to accurately identify 'Cell' droplets for the droplet-based scRNAseq data. It consists of four main steps: * Pre-define droplets as 'Cell', 'Uncertain', 'Empty' and 'Discarded' droplets according to the RankMSE curve. * Simulate 'Cell' and 'Uncertain' droplets under a depth of 'Empty' used for model construction and prediction. * Iteratively buid the model, classify 'Uncertain' droplets, and update the training set use newly predicted 'Empty'. * Make classification with the optimal model. dropSplit provides some special droplet QC metrics such as CellEntropy or CellGini and plot function, which can help quickly check quality of the prediction. In general, user can use the predefined parameters in the XGBoost. It also provides a automatic XGBoost hyperparameters-tuning function to optimize the model.

Getting started

Package details

AuthorHao Zhang
MaintainerHao Zhang <542370159@qq.com>
LicenseAGPL-3
Version1.0.0
Package repositoryView on GitHub
Installation Install the latest version of this package by entering the following in R:
install.packages("remotes")
remotes::install_github("zh542370159/dropSplit")
zh542370159/dropSplit documentation built on June 19, 2022, 2:49 p.m.