knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
scWGCNA is an adaptation of WGCNA to work with single-cell datasets. The functionality is presented in Feregrino & Tschopp 2021
The new version of the package allows for a better workflow, and more interaction with the data. The package no longer produces markdown html reports, nor it saves files by itself.
scWGCNA works with Seurat objects.
To install scWGCNA run in R:
devtools::install_github("cferegrino/scWGCNA", ref="main")
A few words on data:
This package is an adaptation of WGCNA, which was originally intended to find modules of co-expression in RNA-seq bulk data. We are therefore not sure of how it would behave with multi-batch data, and batch effects. We suggest to calculate modules based on one sample / library.
If you are planning to compare the modules of co-expression across samples (conditions / species) try to restrict the analysis to genes that are present and expressed across all samples, to obtain the best results.
This package works with Seurat objects, we therefore suggest to bare in mind and control what your default assay is, since this is used by the different functions.
library(scWGCNA) # A seurat object we use as example my.small_MmLimbE155 # Calculate the pseudocells MmLimbE155.pcells = calculate.pseudocells(s.cells = my.small_MmLimbE155, # Single cells in Seurat object seeds=0.2, # Fraction of cells to use as seeds to aggregate pseudocells nn = 10, # Number of neighbors to aggregate reduction = "pca", # Reduction to use dims = 1:10) # The dimensions to use
# Run scWGCNA MmLimbE155.scWGCNA = run.scWGCNA(p.cells = MmLimbE155.pcells, # Pseudocells (recommended), or Seurat single cells s.cells = my.small_MmLimbE155, # single cells in Seurat format is.pseudocell = T, # We are using single cells twice this time features = rownames(my.small_MmLimbE155)) # Recommended: variable genes
# Plot the gene / modules dendrogram scW.p.dendro(scWGCNA.data = MmLimbE155.scWGCNA) #Look at the membership tables names(MmLimbE155.scWGCNA$modules) #Let's look at the first module, "blue" head(MmLimbE155.scWGCNA$modules$`1_blue`) # We use gene unique identifiers. For this reason we need to translate them to names my.module = MmLimbE155.scWGCNA$modules$`1_blue` # We have gene names translation in the misc slot of this seurat object. my.gnames = my.small_MmLimbE155@misc$gnames head(my.gnames) # Look at the table again my.module$gname = my.gnames[rownames(my.module), "Gene.name"] head(my.module) # Here we can see what is the expression of each co-expression module, per cell. This is in one of the list items head(MmLimbE155.scWGCNA[["sc.MEList"]]$averageExpr) # If we want to see the average module expression per pseudocell instead of single cells, we find it here head(MmLimbE155.scWGCNA[["MEList"]]$averageExpr)
# Here we can calculate the different eigengenes, using the single cell data from the mouse. MmLimb.eigengenes.sc = scW.eigen(modules = MmLimbE155.scWGCNA, seurat = my.small_MmLimbE155) # We can then find the expression of the different modules in this slot head(MmLimb.eigengenes.sc$Module.Eigengenes$averageExpr) # we can explore the membership of each gene to its module, by examining the scWGCNA data list object head(MmLimbE155.scWGCNA[["modules"]]$`1_blue`) # And we can access the membership and pvalues calculated from another data set, by looking at these tables head(MmLimb.eigengenes.sc$Gene.Module.Membership) head(MmLimb.eigengenes.sc$Module.Membership.Pvalue)
# Plot the expression of all modules at once scW.p.expression(s.cells = my.small_MmLimbE155, # Single cells in Seurat format scWGCNA.data = MmLimbE155.scWGCNA, # scWGCNA list dataset modules = "all", # Which modules to plot? reduction = "tsne", # Which reduction to plot? ncol=3) # How many columns to use, in case we're plotting several? #Plot only the expression of the first module, "blue" scW.p.expression(s.cells = my.small_MmLimbE155, scWGCNA.data = MmLimbE155.scWGCNA, modules = 1)
# First generate the networks in the scWCGNA object MmLimbE155.scWGCNA = scWGCNA.networks(scWGCNA.data = MmLimbE155.scWGCNA) # Plot the module "blue" as a network scW.p.network(MmLimbE155.scWGCNA, module=1) # Plot the module "blue" as a network. Again, we are using gene unique identifiers, not very informative. # For this, we use the gene names translation we had before. scW.p.network(MmLimbE155.scWGCNA, module=1, gnames = my.gnames)
# A Seurat object with chicken limb cells my.small_GgLimbHH29 # We calculate pseudocells for this object Gg.ps=calculate.pseudocells(my.small_GgLimbHH29, dims = 1:10) # Our Seurat objects contain gene names equivalencies in the misc slot. # We use them to biuld a table of orthologous genes my.ortho = merge(my.small_MmLimbE155@misc$gnames,my.small_GgLimbHH29@misc$gnames, by = "Gene.name") head(my.ortho) my.ortho=my.ortho[,2:3] # We then run the comparative analysis MmvGg.comparative = scWGNA.compare(scWGCNA.data = MmLimbE155.scWGCNA, test.list = list(Gg.ps), test.names = c("Gg"), ortho = my.ortho, # not needed unless reference and tests have different gene names ortho.sp = c(2))
We can now check if any genes were "lost", due to not being present as 1-to-1 orthologues (if it's the case), or not being expressed in all samples.
Very important, we suggest to make the calculation of modules, with genes that you will already find in other samples from the very beginning. Nonetheless, we show here how to go about in case you didn't.
# We can see how many genes are present in each module, at each category. In the misc slot of the scWGCNA comparative list object MmvGg.comparative$misc$modulefrac # And the identity of the lost genes is found under the same slot MmvGg.comparative$misc$geneslost # We can also plot these fractions as barplots, for each module we used for comparisons scW.p.modulefrac(MmvGg.comparative)
Here, we can plot 4 different aspects, the zscore of the overall preservation, which according to the original authors of WGCNA, is understood as a threshold. Values under 2 mean no evidence of preservation in the test sample, over 10 mean strong evidence of preservation. Moreover, we can plot the median rank, which we can use to compare the preservation of the different modules. And, to go into details of the modules, we can plot the zscore of density and connectivity preservation.
# Here we can plot the overall preservation and median rank. scW.p.preservation(scWGCNA.comp.data = MmvGg.comparative, to.plot=c("preservation", "median.rank")) # We can also plot the preservation of density and connectivity. scW.p.preservation(scWGCNA.comp.data = MmvGg.comparative, to.plot=c("density", "connectivity"))
While these are the only aspects we plot here, all other aspects of the preservation can be explored in the list object we obtained. We refer the user to read the original publications: Langfelder P, Luo R, Oldham MC, Horvath S (2011) Is My Network Module Preserved and Reproducible?. PLOS Computational Biology 7(1): e1001057
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.