scsampler: Run scSampler to subsample a matrix


scsampler    R Documentation

Run scSampler to subsample a matrix

Description

Perform subsampling with the scSampler Python package.

Usage

scsampler(mat, N, random_split = 1, seed = 0)

Arguments

mat

m x n matrix. Samples (the dimension along which to subsample) should be in the rows, features in the columns.

N

Numeric scalar, the number of samples to retain.

random_split

Numeric scalar, the number of parts to randomly split the data into before subsampling within each part. A larger value speeds up computation but gives less optimal results.

seed

Numeric scalar, passed to scsampler to seed the random number generator.

Details

The first time this function is run, it creates a conda environment containing the scSampler package. This is done via the basilisk R/Bioconductor package; see that package's documentation for troubleshooting.

Value

A numeric vector with the indices of the samples (rows of mat) to retain.
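
A minimal sketch of how the returned indices might be used to extract the retained samples. It assumes the indices are 1-based row positions in mat (the usual R convention, not stated explicitly on this page) and that scsampler is exported by the geosketchR package. The helper subset_rows is hypothetical and exists only for illustration.

```r
# Hypothetical helper (not part of the package): keep only the rows of `mat`
# selected by `idx`, assuming `idx` holds 1-based row indices.
subset_rows <- function(mat, idx) {
    stopifnot(is.matrix(mat), is.numeric(idx),
              all(idx >= 1), all(idx <= nrow(mat)))
    mat[idx, , drop = FALSE]
}

x <- matrix(rnorm(500), nrow = 100)  # 100 samples (rows), 5 features (columns)

# Only call scsampler() if the package is installed. The first call may take
# a while because the conda environment has to be created.
if (requireNamespace("geosketchR", quietly = TRUE)) {
    idx <- geosketchR::scsampler(mat = x, N = 10)
    x_sub <- subset_rows(x, idx)
    print(dim(x_sub))
}
```

Using drop = FALSE keeps the result a matrix even if only a single sample is retained.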

Author(s)

Charlotte Soneson, Michael Stadler

References

Song et al. (2022): scSampler: fast diversity-preserving subsampling of large-scale single-cell transcriptomic data. bioRxiv, doi:10.1101/2022.01.15.476407

Examples

# 100 samples (rows) with 5 features (columns)
x <- matrix(rnorm(500), nrow = 100)
# Retain 10 of the 100 samples
scsampler(mat = x, N = 10)
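
The random_split and seed arguments could be explored along the following lines. This is a sketch only: the matrix size, the values N = 40, random_split = 4 and seed = 42, and the variable names are arbitrary choices for illustration, and the call is guarded because it assumes the geosketchR package is installed.

```r
set.seed(1)  # seeds only the simulated data; scsampler() takes its own `seed`
x_large <- matrix(rnorm(5000), nrow = 1000)  # 1000 samples, 5 features

if (requireNamespace("geosketchR", quietly = TRUE)) {
    # Subsample the full matrix in a single pass
    idx_full <- geosketchR::scsampler(mat = x_large, N = 40,
                                      random_split = 1, seed = 42)
    # Split the samples into 4 random parts first (faster, less optimal)
    idx_split <- geosketchR::scsampler(mat = x_large, N = 40,
                                       random_split = 4, seed = 42)
    # Number of samples selected by both runs
    print(length(intersect(idx_full, idx_split)))
}
```

Comparing the two index vectors gives a rough sense of how much the selection changes when the data are split before subsampling.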


csoneson/geosketchR documentation built on Nov. 5, 2024, 4:35 a.m.