knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

scWGCNA V 1.0.0

scWGCNA is an adaptation of WGCNA to work with single-cell datasets. The functionality is presented in Feregrino & Tschopp 2021

The new version of the package allows for a better workflow, and more interaction with the data. The package no longer produces markdown html reports, nor it saves files by itself.

scWGCNA works with Seurat objects.

References

Langfelder, P., Horvath, S. (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559

Langfelder P, Luo R, Oldham MC, Horvath S (2011) Is My Network Module Preserved and Reproducible?. PLOS Computational Biology 7(1): e1001057

Feregrino, C, Tschopp, P (2021) Assessing evolutionary and developmental transcriptome dynamics in homologous cell types. bioRxiv 2021.02.09.430383

Installation

To install scWGCNA run in R:

devtools::install_github("cferegrino/scWGCNA", ref="main")

Basic scWGCNA workflow

A few words on data:

Pseudocell calculation

library(scWGCNA)

# A seurat object we use as example
my.small_MmLimbE155

# Calculate the pseudocells
MmLimbE155.pcells = calculate.pseudocells(s.cells = my.small_MmLimbE155, # Single cells in Seurat object
                                          seeds=0.2, # Fraction of cells to use as seeds to aggregate pseudocells
                                          nn = 10, # Number of neighbors to aggregate
                                          reduction = "pca", # Reduction to use
                                          dims = 1:10) # The dimensions to use

single-cell WGCNA

# Run scWGCNA
MmLimbE155.scWGCNA = run.scWGCNA(p.cells = MmLimbE155.pcells, # Pseudocells (recommended), or Seurat single cells
                                 s.cells = my.small_MmLimbE155, # single cells in Seurat format
                                 is.pseudocell = T, # We are using single cells twice this time
                                 features = rownames(my.small_MmLimbE155)) # Recommended: variable genes

Modules of co-expression

# Plot the gene / modules dendrogram
scW.p.dendro(scWGCNA.data = MmLimbE155.scWGCNA)

#Look at the membership tables
names(MmLimbE155.scWGCNA$modules)

#Let's look at the first module, "blue"
head(MmLimbE155.scWGCNA$modules$`1_blue`)

# We use gene unique identifiers. For this reason we need to translate them to names
my.module = MmLimbE155.scWGCNA$modules$`1_blue`

# We have gene names translation in the misc slot of this seurat object.
my.gnames = my.small_MmLimbE155@misc$gnames
head(my.gnames)

# Look at the table again
my.module$gname = my.gnames[rownames(my.module), "Gene.name"]
head(my.module)

# Here we can see what is the expression of each co-expression module, per cell. This is in one of the list items
head(MmLimbE155.scWGCNA[["sc.MEList"]]$averageExpr)

# If we want to see the average module expression per pseudocell instead of single cells, we find it here
head(MmLimbE155.scWGCNA[["MEList"]]$averageExpr)
# Here we can calculate the different eigengenes, using the single cell data from the mouse.
MmLimb.eigengenes.sc = scW.eigen(modules = MmLimbE155.scWGCNA, seurat = my.small_MmLimbE155)

# We can then find the expression of the different modules in this slot
head(MmLimb.eigengenes.sc$Module.Eigengenes$averageExpr)

# we can explore the membership of each gene to its module, by examining the scWGCNA data list object
head(MmLimbE155.scWGCNA[["modules"]]$`1_blue`)

# And we can access the membership and pvalues calculated from another data set, by looking at these tables
head(MmLimb.eigengenes.sc$Gene.Module.Membership)

head(MmLimb.eigengenes.sc$Module.Membership.Pvalue)
# Plot the expression of all modules at once
scW.p.expression(s.cells = my.small_MmLimbE155, # Single cells in Seurat format
                       scWGCNA.data = MmLimbE155.scWGCNA, # scWGCNA list dataset
                       modules = "all", # Which modules to plot?
                       reduction = "tsne", # Which reduction to plot?
                       ncol=3) # How many columns to use, in case we're plotting several?

#Plot only the expression of the first module, "blue"
scW.p.expression(s.cells = my.small_MmLimbE155,
                       scWGCNA.data = MmLimbE155.scWGCNA,
                       modules = 1)
# First generate the networks in the scWCGNA object
MmLimbE155.scWGCNA = scWGCNA.networks(scWGCNA.data = MmLimbE155.scWGCNA)

# Plot the module "blue" as a network
scW.p.network(MmLimbE155.scWGCNA, module=1)

# Plot the module "blue" as a network. Again, we are using gene unique identifiers, not very informative.
# For this, we use the gene names translation we had before.
scW.p.network(MmLimbE155.scWGCNA, module=1, gnames = my.gnames)

Module comparison accross samples

# A Seurat object with chicken limb cells
my.small_GgLimbHH29

# We calculate pseudocells for this object
Gg.ps=calculate.pseudocells(my.small_GgLimbHH29, dims = 1:10)

# Our Seurat objects contain gene names equivalencies in the misc slot.
# We use them to biuld a table of orthologous genes
my.ortho = merge(my.small_MmLimbE155@misc$gnames,my.small_GgLimbHH29@misc$gnames, by = "Gene.name")
head(my.ortho)
my.ortho=my.ortho[,2:3]

# We then run the comparative analysis
MmvGg.comparative = scWGNA.compare(scWGCNA.data = MmLimbE155.scWGCNA,
                                   test.list = list(Gg.ps),
                                   test.names = c("Gg"),
                                   ortho = my.ortho, # not needed unless reference and tests have different gene names
                                   ortho.sp = c(2))
# We can see how many genes are present in each module, at each category. In the misc slot of the scWGCNA comparative list object
MmvGg.comparative$misc$modulefrac

# And the identity of the lost genes is found under the same slot
MmvGg.comparative$misc$geneslost

# We can also plot these fractions as barplots, for each module we used for comparisons
scW.p.modulefrac(MmvGg.comparative)

Here, we can plot 4 different aspects, the zscore of the overall preservation, which according to the original authors of WGCNA, is understood as a threshold. Values under 2 mean no evidence of preservation in the test sample, over 10 mean strong evidence of preservation. Moreover, we can plot the median rank, which we can use to compare the preservation of the different modules. And, to go into details of the modules, we can plot the zscore of density and connectivity preservation.

# Here we can plot the overall preservation and median rank.
scW.p.preservation(scWGCNA.comp.data = MmvGg.comparative,
                   to.plot=c("preservation", "median.rank"))

# We can also plot the preservation of density and connectivity.
scW.p.preservation(scWGCNA.comp.data = MmvGg.comparative,
                   to.plot=c("density", "connectivity"))

While these are the only aspects we plot here, all other aspects of the preservation can be explored in the list object we obtained. We refer the user to read the original publications: Langfelder P, Luo R, Oldham MC, Horvath S (2011) Is My Network Module Preserved and Reproducible?. PLOS Computational Biology 7(1): e1001057



CFeregrino/scWGCNA documentation built on Nov. 21, 2022, 2:31 a.m.