In martijnvanattekum/cleanse: Tidyverse functions for SummarizedExperiment

cleanse

The SummarizedExperiment (se) class offers a useful way to store multiple row and column metadata fields alongside the values from an experiment and is widely used in computational biology. Although se's can be subset with base R notation (i.e., using []), they cannot be manipulated using the grammar of the tidyverse. As a consequence, se's cannot be manipulated in pipelines using the pipe operator.

This package contains a number of wrapper functions to extend the usage of se's:

- dplyr functions: to use dplyr's grammar of data manipulation
- arithmetic functions: to perform arithmetic on 2 se's
- write functions: to print the options of a se and to write se's to delimited files

As an example, compare how cleanse is used to subset rows for gene_group NOTCH and then arrange the columns by patient:

| Using native syntax | Using cleanse |
|:---|:---|
| `rowdata <- rowData(se)`<br>`se <- se[rowdata$gene_group == "NOTCH", ]`<br>`se <- se[, order(se$patient)]` | `se <- se %>%`<br>`filter(row, gene_group == "NOTCH") %>%`<br>`arrange(col, patient)` |
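To try the native-syntax side of this comparison without real data, the following minimal sketch builds a toy se and applies the same two steps. The matrix values, the gene and sample names, and the patient numbers are all made up for illustration; only the `SummarizedExperiment` package is assumed to be installed.

```r
library(SummarizedExperiment)

# Toy data: 4 genes x 3 samples (all names and values are hypothetical)
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))
se <- SummarizedExperiment(
  assays  = list(counts = counts),
  rowData = DataFrame(gene_group = c("NOTCH", "WNT", "NOTCH", "WNT"),
                      row.names = rownames(counts)),
  colData = DataFrame(patient = c(3, 1, 2),
                      row.names = colnames(counts))
)

# Native syntax: keep the NOTCH rows, then order the columns by patient
rowdata <- rowData(se)
se <- se[rowdata$gene_group == "NOTCH", ]
se <- se[, order(se$patient)]

dim(se)
colnames(se)
```

The cleanse pipeline in the right-hand column is meant to produce the same subset and ordering in a single chained expression.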

Usage information can be found by reading the vignettes: browseVignettes("cleanse").

Functions that subset the se based on the rowData or colData

- filter() picks rows/cols based on the se's attached rowData/colData
- slice() picks rows/cols by position
- arrange() changes the ordering of the rows/cols
- sample_slice() picks a random portion of rows or cols from the se

Functions that change the se's rowData or colData

- select() selects variables
- rename() renames variables
- mutate() adds new variables that are functions of existing variables
- drop_metadata() drops all rowData and colData having only 1 unique value

Arithmetic functions that operate on the assay values of se's

- `-` subtracts values from the assays in 2 se's
- `+` adds values from the assays in 2 se's
- `/` divides values from the assays in 2 se's
- `*` multiplies values from the assays in 2 se's
- `round` rounds the assay values of a se
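As a rough illustration of what subtracting two se's amounts to, the sketch below does the subtraction by hand with the standard `assay()` accessor from `SummarizedExperiment`. The objects `se_t4` and `se_t0` and their values are hypothetical. How cleanse itself reconciles the rowData and colData of the two operands is not shown here and may differ.

```r
library(SummarizedExperiment)

# Two toy se's with identical dimensions (hypothetical values)
m <- matrix(c(5, 6, 7, 8), nrow = 2,
            dimnames = list(c("gene1", "gene2"), c("s1", "s2")))
se_t4 <- SummarizedExperiment(assays = list(counts = m))
se_t0 <- SummarizedExperiment(assays = list(counts = m - 1))

# Hand-written analogue of `se_t4 - se_t0`: subtract the assay matrices
# element-wise and store the result in a copy of the first object
diff_se <- se_t4
assay(diff_se, "counts") <- assay(se_t4, "counts") - assay(se_t0, "counts")

assay(diff_se, "counts")
```

With the arithmetic wrappers, the same operation can be written as a single expression on the se's, which keeps it usable inside a pipeline.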

Functions that write a se to a file

- write_csv() writes a se to csv
- write_tsv() writes a se to tsv
- write_delim() writes a se to a delimited file

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("cleanse")

library(cleanse)

# -- An example se called seq_se is provided

# Example pipe
data(seq_se)
seq_se %>%
  filter(row, gene_group == "NOTCH") %>%
  filter(col, site %in% c("brain", "skin")) %>%
  arrange(col, patient) %>%
  round(3)

# Example sampling
data(seq_se)
seq_se %>% slice_sample(row, prop=.2)

# Example arithmetic subtracting the expression values at T=0 from T=4
data(seq_se)
(filter(seq_se, col, time == 4)) - (filter(seq_se, col, time == 0))

If you encounter a clear bug, please file a minimal reproducible example on GitHub.

martijnvanattekum/cleanse documentation built on Nov. 20, 2023, 8:28 p.m.
