scrublet_R: Scrublet


View source: R/scrublet.R

Description

See the preprint: Samuel L Wolock, Romain Lopez, Allon M Klein. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. bioRxiv 357368; doi: https://doi.org/10.1101/357368

Usage

scrublet_R(
  data,
  python_home = system("which python", intern = TRUE),
  return_results_only = FALSE,
  min_counts = 2,
  min_cells = 3,
  expected_doublet_rate = 0.1,
  min_gene_variability_pctl = 85,
  n_prin_comps = 30,
  sim_doublet_ratio = 2,
  n_neighbors = NULL
)

Arguments

data

Expression matrix.

python_home

The Python home directory where Scrublet is installed. Defaults to the path returned by system("which python", intern = TRUE).

return_results_only

bool (optional, default=FALSE). Return only a list containing the scrublet output.

min_counts

int (optional, default=2). Gene-filtering threshold applied before PCA: genes with fewer than min_counts counts in fewer than min_cells cells are excluded. See the Scrublet reference.

min_cells

int (optional, default=3). Used together with min_counts for gene filtering. See the Scrublet reference.

expected_doublet_rate

float (optional, default=0.1). The fraction of transcriptomes that are doublets, typically 0.05-0.1. Results are not particularly sensitive to this parameter. In the Scrublet reference example, the expected doublet rate comes from the Chromium User Guide: https://support.10xgenomics.com/permalink/3vzDu3zQjY0o2AqkkkI4CC

min_gene_variability_pctl

int (optional, default=85). Percentile cutoff for keeping only the most highly variable genes before PCA. See the Scrublet reference.

n_prin_comps

int (optional, default=30). Number of principal components to use. See the Scrublet reference.

sim_doublet_ratio

float (optional, default=2). The number of doublets to simulate, relative to the number of observed transcriptomes. This should be high enough that all doublet states are well represented by simulated doublets. Setting it too high is computationally expensive. The default value is 2, though values as low as 0.5 give very similar results for the datasets that have been tested.

n_neighbors

int (optional, default=NULL). Number of neighbors used to construct the KNN classifier of observed transcriptomes and simulated doublets. The default value of round(0.5*sqrt(n_cells)) generally works well.
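The size-related quantities in the argument descriptions above follow directly from the number of observed cells. The sketch below works them out for a hypothetical dataset of 10,000 cells. The helper name scrublet_derived_sizes is an illustrative assumption and is not part of scPy2R or Scrublet. Scrublet itself may truncate or round these values differently.

```r
# Illustrative helper (not part of scPy2R or Scrublet): compute the quantities
# described in the argument list from the number of observed cells.
scrublet_derived_sizes <- function(n_cells,
                                   expected_doublet_rate = 0.1,
                                   sim_doublet_ratio = 2) {
  list(
    # Default neighbourhood size quoted in the n_neighbors description
    n_neighbors = round(0.5 * sqrt(n_cells)),
    # Simulated doublets, relative to the number of observed transcriptomes
    n_simulated_doublets = sim_doublet_ratio * n_cells,
    # Rough number of observed transcriptomes expected to be doublets
    n_expected_doublets = expected_doublet_rate * n_cells
  )
}

sizes <- scrublet_derived_sizes(n_cells = 10000)
str(sizes)
```

With the defaults, 10,000 cells give 50 neighbors and 20,000 simulated doublets. Lowering sim_doublet_ratio to 0.5 would cut the simulation to 5,000 doublets without changing n_neighbors in this sketch.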

Value

The doublet_score output from scrublet.
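A minimal sketch of a call, using only the arguments listed in Usage. Several things here are assumptions rather than documented behaviour: that the matrix has cells in rows and genes in columns (the layout Scrublet's Python API expects, though this wrapper may want the transpose), that scrublet_R is exported from the scPy2R package, and that a Python interpreter with Scrublet installed is found on the PATH. The simulated Poisson counts are placeholders, not real data, so the resulting scores carry no biological meaning.

```r
# Simulated count matrix: 500 cells (rows) x 2000 genes (columns).
# The orientation is an assumption; check what the wrapper expects.
set.seed(1)
n_cells <- 500
n_genes <- 2000
counts <- matrix(
  rpois(n_cells * n_genes, lambda = 1),
  nrow = n_cells, ncol = n_genes,
  dimnames = list(paste0("cell", seq_len(n_cells)),
                  paste0("gene", seq_len(n_genes)))
)

# Only attempt the call when the package is installed.
if (requireNamespace("scPy2R", quietly = TRUE)) {
  doublet_scores <- scPy2R::scrublet_R(
    data = counts,
    # Sys.which() is a portable stand-in for system("which python")
    python_home = unname(Sys.which("python")),
    expected_doublet_rate = 0.1,
    n_prin_comps = 30
  )
  str(doublet_scores)
}
```

Passing python_home explicitly avoids relying on the shell command in the default, which may not be available on every platform.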


anirudhpatir/scPy2R documentation built on Feb. 28, 2021, 7:42 a.m.