README.md

scCNAutils

Build Status codecov

A cleaned up version of the functions I used to analyze copy number aberrations in single-cell data. Still in development.

Main information below, other information in the docs folder, e.g.:

Citation

The code from this package was written and used primarily for the following study. Please cite the following if you use this package:

Couturier, C.P., Ayyadhury, S., Le, P.U. et al. Single-cell RNA-seq reveals that glioblastoma recapitulates a normal neurodevelopmental hierarchy. Nat Commun 11, 3406 (2020). https://doi.org/10.1038/s41467-020-17186-5

See the data-natcom2020 folder for data used in this paper (list of cancer/non-cancer cells).

Install

To make sure both CRAN and Bioconductor dependencies are installed, the easiest is to use the biocLite function from Bioconductor:

source('http://bioconductor.org/biocLite.R')
biocLite('jmonlong/scCNAutils')

More instructions about installing locally or on a HPC in the docs folder.

Usage

library(scCNAutils)
cells.df = auto_cna_signal(c('sampleA', 'sampleB'), 'genes.tsv', prefix='example', cell_cycle='cc_genes.tsv')
load('example-ge-coord-norm.RData')
cna.df = auto_cna_call(data, cells.df, prefix='example')

where

More instructions about how to get the genes.tsv and cc_genes.tsv files in the docs folder.

Instead of a list of folder paths, the input can also be a list of data.frames with a symbol column for gene names and then one column per cell. If named, the names of the list can be used as sample names. Otherwise use sample_names=.

Output

cells.df contains information at the cell level:

cna.df contains information about CNAs at the community level:

Graphs

Graphs are automatically generated by the auto_* functions but can also be produced from their output. For example, to customize the ggplot2 graphs. The output of the following functions are lists of ggplot2 objects.

qc.ggp = plot_qc_cells(cells.df)
comm.ggp = plot_communities(cells.df)
tsne.ggp = plot_tsne(cells.df)
cna.ggp = plot_cna(cna.df)

## For example changing colors of sample labels in tSNE
tsne.ggp$sample + scale_colour_brewer(palette='Set1')

Interactive graph

Instead of producing many times the same tSNE graph with different coloring, and to be able to zoom or investigate outlier cells, a simple Shiny app with a bokeh graph has been implemented. In addition to the zoom and color change, the user can hover over a cell to get its information.

tsne_browser(cells.df)

Workflow

From gene expression to CNA signal

The auto_cna_signal function calls the appropriate functions to go from raw expression to communities and tSNE based on CNA signal. The internal workflow is as follow:

From communities to CNA

The auto_cna_call function creates metacells per community and call CNAs. The internal workflow is as follow:

Next



jmonlong/scCNAutils documentation built on May 3, 2022, 4:34 a.m.