sketchR

sketchR provides a simple interface to the geosketch and scSampler Python packages, which implement the subsampling algorithms described in Hie et al. (2019) and Song et al. (2022), respectively. It uses the basilisk package to handle the interaction between R and Python.

Installation

You can install sketchR from Bioconductor (release 3.19 onwards) using:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("sketchR")

Example

library(sketchR)

## Create an example data matrix. Rows represent "samples" (the unit of 
## downsampling), columns represent features (e.g., principal components).
mat <- matrix(rnorm(5000), nrow = 500)

## Run geosketch. The output is a vector of indices, which you can use 
## to subset the rows of the input matrix.
idx <- geosketch(mat, N = 100)

## Run scSampler. As for geosketch, the output is a vector of indices.
idx2 <- scsampler(mat, N = 100)
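
The returned indices are only useful once they are applied to the data. The sketch below shows one way to do that with base R subsetting. To keep it runnable without a Python environment, `idx` is a randomly drawn stand-in (an assumption for illustration); in practice it would be the vector returned by `geosketch()` or `scsampler()`. The names `sub` and `rest` are hypothetical.

```r
set.seed(42)

## Same shape as the example above: 500 samples (rows), 10 features (columns).
mat <- matrix(rnorm(5000), nrow = 500)

## Stand-in for the output of geosketch()/scsampler(): 100 distinct row
## indices. This is NOT a sketch of the data, only a placeholder.
idx <- sort(sample(nrow(mat), 100))

## Keep only the selected rows. drop = FALSE guarantees that the result
## stays a matrix even if a single row is selected.
sub <- mat[idx, , drop = FALSE]

## The rows that were not selected, e.g. for a later comparison.
rest <- mat[-idx, , drop = FALSE]

dim(sub)
dim(rest)
```

Because the indices refer to rows of the input matrix, the same `idx` can also be used to subset any other object whose rows or columns are in the same sample order as `mat`.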


csoneson/sketchR documentation built on May 2, 2024, 1:39 a.m.