Function for splitting SummarizedExperiment into separate RDS files


The splitSwish function splits the y object along genes and, if snakefile is given, writes a Snakefile that can be used with Snakemake to distribute swish runs across genes. This workflow is primarily designed for large single-cell datasets, so by default length correction is not performed within the distributed jobs. See the alevin section of the vignette for an example. See the Snakemake documentation for details on how to run and customize a Snakefile: https://snakemake.readthedocs.io


splitSwish(y, nsplits, prefix = "swish", snakefile = NULL, overwrite = FALSE)



y: a SummarizedExperiment


nsplits: integer, how many pieces to break y into


prefix: character, the path prefix of the RDS files to write out, e.g. prefix="/path/to/swish" will generate swish1.rds, swish2.rds, ... at this path


snakefile: character, the path of a Snakemake file, e.g. Snakefile, that should be written out. If NULL, then no Snakefile is written out


overwrite: logical, whether the snakefile and RDS files (swish1.rds, ...) should overwrite existing files


nothing, files are written out
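
The splitting and file naming described above can be illustrated with a minimal base R sketch. This is not the fishpond implementation: a plain count matrix stands in for the SummarizedExperiment, split_rows_to_rds is a hypothetical helper invented for this illustration, and both the contiguous chunking and the overwrite check are assumptions about reasonable behavior rather than statements about what splitSwish does internally.

```r
# Hypothetical stand-in data: a plain matrix replaces the SummarizedExperiment 'y'.
set.seed(1)
y <- matrix(rpois(40, lambda = 10), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("cell", 1:4)))

# Hypothetical helper (not part of fishpond): write row-wise pieces of 'y'
# to <prefix>1.rds, <prefix>2.rds, ...
split_rows_to_rds <- function(y, nsplits, prefix = "swish", overwrite = FALSE) {
  stopifnot(nsplits >= 1, nsplits <= nrow(y))
  # Assign each row (gene) to one of 'nsplits' contiguous chunks.
  idx <- sort(rep(seq_len(nsplits), length.out = nrow(y)))
  files <- paste0(prefix, seq_len(nsplits), ".rds")
  # Assumed behavior: refuse to clobber existing files unless asked to.
  if (!overwrite && any(file.exists(files))) {
    stop("output files exist; set overwrite = TRUE to replace them")
  }
  for (i in seq_len(nsplits)) {
    saveRDS(y[idx == i, , drop = FALSE], file = files[i])
  }
  invisible(files)
}

prefix <- file.path(tempdir(), "swish")
files <- split_rows_to_rds(y, nsplits = 3, prefix = prefix, overwrite = TRUE)
basename(files)  # "swish1.rds" "swish2.rds" "swish3.rds"

# Reading the pieces back and stacking them recovers the original rows.
pieces <- lapply(files, readRDS)
stopifnot(identical(do.call(rbind, pieces), y))
```

With the real function, the analogous call would follow the usage line, e.g. splitSwish(y, nsplits, prefix = "swish", snakefile = "Snakefile"), after which each piece can be processed by a separate job.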


Compression and splitting across jobs:

Van Buren, S., Sarkar, H., Srivastava, A., Rashid, N.U., Patro, R., Love, M.I. (2020) Compression of quantification uncertainty for scRNA-seq counts. bioRxiv. https://doi.org/10.1101/2020.07.06.189639


Koster, J., Rahmann, S. (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics. https://doi.org/10.1093/bioinformatics/bts480

