README.md

Single CEll Variational Aneuploidy aNalysis

Article: A variational algorithm to detect the clonal copy number substructure of tumors from scRNA-seq data

Introduction

SCEVAN is an R package that, starting from the raw count matrix of scRNA-seq data, automatically classifies the cells in a biopsy, separating the non-malignant cells of the tumor microenvironment from the malignant cells, and characterizes the clonal structure of the malignant cells. It identifies cell subpopulations with different copy number architectures and reports the alterations that are specific to each subpopulation and those that are shared. The aim of the tool is to automate the entire analysis so that it can be performed simply and in a completely unsupervised way. Analyses carried out on 106 samples and 93,332 cells show better classification, with an F1 score over all samples of 0.90 compared to 0.63 obtained with state-of-the-art tools. It also exploits a greedy multichannel segmentation algorithm, making it particularly fast even for large datasets.

Installation

library(devtools)
install_github("miccec/yaGST")
install_github("AntonioDeFalco/SCEVAN")
library(SCEVAN)

Usage

Single-sample analysis

A single call to pipelineCNA runs the entire analysis: classification of the cells and characterization of the clonal structure.

results <- pipelineCNA(count_mtx)
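
Before running the pipeline it can help to confirm that the input looks like a raw count matrix. The sketch below assumes genes on rows and cells on columns; `check_count_mtx` is a hypothetical helper written for this example, not a SCEVAN function, and the toy matrix is simulated.

```r
# Hypothetical helper (not part of SCEVAN): basic sanity checks on a
# raw count matrix, assuming genes on rows and cells on columns.
check_count_mtx <- function(count_mtx) {
  stopifnot(
    length(dim(count_mtx)) == 2L,        # two-dimensional input
    !is.null(rownames(count_mtx)),       # gene identifiers present
    !is.null(colnames(count_mtx)),       # cell barcodes present
    !anyDuplicated(colnames(count_mtx)), # barcodes are unique
    all(count_mtx >= 0),                 # counts are non-negative
    all(count_mtx == round(count_mtx))   # raw counts are whole numbers
  )
  invisible(TRUE)
}

# Simulated toy matrix: 4 genes x 3 cells.
set.seed(1)
toy_mtx <- matrix(
  rpois(12, lambda = 5), nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
)
ok <- check_count_mtx(toy_mtx)
```

A normalized or log-transformed matrix would fail the whole-number check, which is a quick way to notice that the input is not raw counts.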

Multi-sample analysis

A single call to multiSampleComparisonClonalCN compares the clonal profiles of multiple samples.

multiSampleComparisonClonalCN(listCountMtx)
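
The multi-sample call takes a list of count matrices, one per sample. A minimal sketch of building such a list follows. The sample names and matrices are invented for illustration, and naming the list elements by sample is an assumption rather than a documented requirement.

```r
# Toy per-sample count matrices (genes x cells); values are simulated.
set.seed(42)
make_toy_mtx <- function(n_cells, prefix) {
  matrix(
    rpois(5 * n_cells, lambda = 3), nrow = 5,
    dimnames = list(paste0("gene", 1:5),
                    paste0(prefix, "_cell", seq_len(n_cells)))
  )
}

# One element per sample (names are hypothetical).
listCountMtx <- list(
  sampleA = make_toy_mtx(4, "A"),
  sampleB = make_toy_mtx(6, "B")
)

# Genes present in every sample, useful to inspect before comparing.
shared_genes <- Reduce(intersect, lapply(listCountMtx, rownames))

# With real data, the list would then be passed on:
# multiSampleComparisonClonalCN(listCountMtx)
```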

Integration with Seurat

The information obtained with SCEVAN (aneuploid/diploid classification, subclones) can be added to a Seurat object as cell metadata.

results <- pipelineCNA(count_mtx)

# Create a Seurat object with SCEVAN info
seurObj <- Seurat::CreateSeuratObject(count_mtx, meta.data = results)

# or add SCEVAN info to an existing Seurat object
seurObj <- Seurat::AddMetaData(seurObj, metadata = results)
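
Metadata supplied as a data frame is matched to cells by row name, so the rows of the results table should be named by cell barcode. The base-R sketch below shows that alignment on a toy stand-in for the results. The column names `class` and `subclone` and all values are assumptions made for illustration.

```r
# Toy stand-in for the pipeline output: one row per cell, rownames are
# cell barcodes. Column names `class` and `subclone` are assumptions.
cells <- paste0("cell", 1:5)
results <- data.frame(
  class     = c("tumor", "normal", "tumor", "filtered", "tumor"),
  subclone  = c(1, NA, 2, NA, 1),
  row.names = cells
)

# Cells in the order they appear in a (hypothetical) count matrix.
mtx_cells <- c("cell3", "cell1", "cell5", "cell2", "cell4")

# Reorder the metadata so that row i describes column i of the matrix.
aligned <- results[mtx_cells, , drop = FALSE]

# Number of cells per class.
class_counts <- table(aligned$class)
```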

If you want to plot CN information at the single-cell level, you can take the coordinates of the alteration of interest from the *.seg file and plot the inferred CN ratio from the CNA matrix, for example like this:

load("output/MGH106_count_mtx_annot.RData")
load("output/MGH106_CNAmtx.RData")

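# Mean inferred CN ratio per cell over the region chr3:158644278-194498364
in_region <- count_mtx_annot$seqnames == 3 &
  count_mtx_annot$start >= 158644278 &
  count_mtx_annot$end <= 194498364
chr3 <- apply(CNA_mtx_relat[in_region, ], 2, mean)

# Match the values to the cells of an existing Seurat object (seur_obj)
chr3 <- chr3[rownames(seur_obj@meta.data)]
names(chr3) <- rownames(seur_obj@meta.data)
chr3 <- as.data.frame(chr3)

seur_obj <- Seurat::AddMetaData(seur_obj, metadata = chr3)
Seurat::FeaturePlot(seur_obj, "chr3", cols = c("gray", "red"))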
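
The region averaging above can be tried on toy data without Seurat or any SCEVAN output. In this sketch the annotation table, the CN matrix, and the region chr3:100-300 are all invented; only the object names mirror those used above.

```r
# Toy annotation: one row per gene, with chromosome and coordinates.
count_mtx_annot <- data.frame(
  seqnames = c(3, 3, 3, 7),
  start    = c(100, 200, 900, 150),
  end      = c(150, 260, 950, 180)
)

# Toy relative CN matrix: rows follow the annotation, columns are cells.
CNA_mtx_relat <- matrix(
  c(0.2, 0.4, 0.0, -0.1,   # values for cellA
    0.0, 0.0, 0.5,  0.3),  # values for cellB
  nrow = 4,
  dimnames = list(NULL, c("cellA", "cellB"))
)

# Genes fully contained in the toy region chr3:100-300.
in_region <- count_mtx_annot$seqnames == 3 &
  count_mtx_annot$start >= 100 &
  count_mtx_annot$end <= 300

# Mean inferred CN ratio per cell over the selected genes.
region_mean <- colMeans(CNA_mtx_relat[in_region, , drop = FALSE])
```

Using `drop = FALSE` keeps the subset a matrix even when a single gene falls in the region, so the per-cell mean is still computed column by column.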


Usage examples (vignettes)

Docker

We provide a ready-to-run Docker container that includes the SCEVAN R package and its dependencies. Example usage:

docker run -v /Users/antonio/SCEVAN_vignette1:/home/SCEVAN_vignette1 -it anthonyphis/r_scevan:latest Rscript /home/SCEVAN_vignette1/script_vignette1.R

Sample Datasets

We provide some pre-processed samples used in the examples (vignettes):

Citation

@article{DeFalco2023,
  author  = {De Falco, Antonio and Caruso, Francesca and Su, Xiao-Dong and Varone, Antonio and Ceccarelli, Michele},
  title   = {A variational algorithm to detect the clonal copy number substructure of tumors from scRNA-seq data},
  year    = {2023},
  month   = {02},
  pages   = {1074},
  volume  = {14},
  doi     = {10.1038/s41467-023-36790-9},
  URL     = {https://www.nature.com/articles/s41467-023-36790-9},
  eprint  = {https://www.nature.com/articles/s41467-023-36790-9.pdf},
  journal = {Nature Communications}
}



AntonioDeFalco/SCEVAN documentation built on April 16, 2024, 10:56 a.m.