README.md

GRUFFI

Published in Gruffi: an algorithm for computational removal of stressed cells from brain organoid transcriptomic datasets

status: active

The Gruffi R package helps you (1) to identify stressed cells in single-cell RNA-seq datasets using granular funcitonal filtering, or (2) you can use it to analyze any gene set's pathway activity (GO-term or custom defined), and define a cell population based on their combined activity.

Gruffi integrates into any Seurat analysis pipelione & it comes with a graphical user interface.

News

Added functionality

Installation

You can install dependencies from CRAN, Bioconductor and GitHub via devtools:


# Install CRAN & Bioconductor dependencies
install.packages('BiocManager')
BiocManager::install("DOSE")
BiocManager::install("org.Hs.eg.db")
BiocManager::install("sparseMatrixStats")
BiocManager::install("biomaRt")
BiocManager::install("raster")
BiocManager::install("rgl")


# Install custom dependencies
install.packages('devtools')
devtools::install_github(repo = "vertesy/Stringendo", upgrade = F)
devtools::install_github(repo = "vertesy/CodeAndRoll2", upgrade = F)
devtools::install_github(repo = "vertesy/ReadWriter", upgrade = F)
devtools::install_github(repo = "vertesy/MarkdownHelpers", upgrade = F)
devtools::install_github(repo = "vertesy/Markdownreports", upgrade = F)
devtools::install_github(repo = "vertesy/ggExpress", upgrade = F)
devtools::install_github(repo = "vertesy/Seurat.utils", upgrade = F)


# Install gruffi
devtools::install_github(repo = "jn-goe/gruffi", upgrade = F)

NOTE: If you type 'all' when R asks to update dependencies, you may get into installation errors / infinite loops. If updating fails, type 'no' when prompted.

Usage

image

1. Setup

Load your Seurat single-cell object of interest (in the following combined.obj), perform basic analysis steps (Integration, PCA, Clustering) and calculate 2D and 3D umaps.

library(gruffi)

combined.obj <- Seurat.utils::SetupReductionsNtoKdimensions(obj = combined.obj, nPCs = 50, dimensions=3:2, reduction="umap")
# Note that this will recalculate 2D and 3D umaps, and back them up in combined.obj@misc$reductions.backup. 
# See FAQ.md, if you don't want to overwrite the 2D UMAP.

If you want to store multiple UMAP's, you have to keep them in backup slot (in combined.obj@misc). Use combined.obj <- RecallReduction(obj = combined.obj, dim = 2, reduction="umap") to access them.

Prepare GO-terms and gene-sets

go1 <- "GO:0006096" # Glycolysis
go2 <- "GO:0034976" # ER-stress
go3 <- "GO:0042063" # Gliogenesis, negative filtering

2. Granule partitioning

Gruffi works best if you partition cells into groups of 100-200 cells. Find the corresponding clustering resolution by:

combined.obj <- AutoFindGranuleResolution(obj = combined.obj)

The optimal resolution found is stored in:

(granule.res.4.gruffi <- GetGruffiClusteringName(combined.obj)) # Recalled from @misc$gruffi$'optimal.granule.res.

Some granules have too few cells, therfore their scoring is not robust statistically. Use ReassignSmallClusters to assign these cells to the nearest large-enough granule:

combined.obj <- ReassignSmallClusters(combined.obj, ident = granule.res.4.gruffi) # Will be stored in meta data column as "seurat_clusters.reassigned".
````

Above, granules with <30 cells are cell-by-cell re-assigned to a neighboring granule (by default based on Euclidean distance between the mean of cell groups in 3dim UMAP space). The reassigned granules are suffixed as :

```r
(granule.res.4.gruffi <- GetGruffiClusteringName(combined.obj))

3. Pathway scoring

After finding the right granule resolution, first GO scores per cells, then average GO-scores per granule (normalized for respective cell numbers) are computed. Within the GO score computation an ensembl database will be loaded (that you created above as ensembl).

# Glycolytic process    GO:0006096
combined.obj <- AssignGranuleAverageScoresFromGOterm(obj = combined.obj, GO_term = go1, save.UMAP = TRUE, new_GO_term_computation = T, clustering = granule.res.4.gruffi, plot.each.gene = F)

# ER stress     GO:0034976
combined.obj <- AssignGranuleAverageScoresFromGOterm(obj = combined.obj, GO_term = go2, save.UMAP = TRUE, new_GO_term_computation = T, clustering = granule.res.4.gruffi, plot.each.gene = F)

# Gliogenesis       GO:0042063
combined.obj <- AssignGranuleAverageScoresFromGOterm(obj = combined.obj, GO_term = go3, save.UMAP = TRUE, new_GO_term_computation = T, clustering = granule.res.4.gruffi, plot.each.gene = F)

These functions store the resulting scores in combined.obj@meta.data.

4. Stress filtering

We will now call a Shiny Interface to auto-estimate and/or manually adjust the stress assignment threeholds via FindThresholdsShiny().

Example code for filtering cells high in glycolytic process and ER stress but low in gliogenesis:

# Create score names:
(i1 <- ParseGruffiGranuleScoreName(goID = go1))
(i2 <- ParseGruffiGranuleScoreName(goID = go2))
(i3 <- ParseGruffiGranuleScoreName(goID = go3))

# Call Shiny app
combined.obj <- FindThresholdsShiny(obj = combined.obj,
                                stress.ident1 = i1,
                                stress.ident2 = i2,
                                notstress.ident3 = i3)

"Dont forget to click the button in the app: Save New Thresholds"

The interface allows automatic estimation and manual adjustment of thresholds for stress scores:

ShinyInterface_Vel Org7 d90 ImpV

After pushing the Save new thresholds button in the Shiny graphical user interface, thresholds are saved in combined.obj@misc$gruffi and the stress assignment is stored as a new @meta.data column, is.Stressed. Check results with:

5. Visualize results

StressUMAP(combined.obj)
StressBarplotPerCluster(combined.obj,  group.by = GetClusteringRuns()[1])
GrScoreHistogram(combined.obj, colname = i1, miscname = "thresh.stress.ident1")
# and more

6. Remove stressed cells

cellIDs.keep <- which_names(!combined.obj$'is.Stressed')
subset.obj <- subset(x = combined.obj, cells = cellIDs.keep)  

StressUMAP(subset.obj, title = "No Stressed Cells Remain After Gruffi", suffix = "after.cleanup")

Troubleshooting

For Errors, Problems & Questions: please see the FAQ or raise an issue in the github repository. We try to answer.



jn-goe/gruffi documentation built on Nov. 7, 2024, 10:38 p.m.