knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE
)

Load data

library (TBdev)
library (WGCNA)

Load scRNAseq data

root <- '.'
data_dir <- paste (root, 'data', sep='/')
save_dir <- paste (root, 'result', sep='/')
all_data <- get (load (paste (data_dir, 'final_merged_tb.Robj', sep='/') ))

# For simplicity we will only use a subset of the dataset, i.e., the in vivo
# set
all_data <- all_data [, all_data$date != 'in_vitro']

WGCNA

This function implements most of the functionality of WGCNA. This includes selection of an appropriate power for scale-free network, using this weight power to calculate connectivity matrix, and hierarchical clustering of the matrix. The eigen-gene of each cluster in each cell is returned. Owing to the various results it generates, we make it compulsory to store the results in a directory. A directory called WGCNA will be created under the directory.

find_eigengene (all_data, save_dir, 
                minModuleSize =25, #can tune the final number of clusters
                cluster_num='all' #use the entire dataset
)

Read the stored cluster information. WGCNA tends to name clusters by color names. To avoid confusion, here they are renamed as 'GC1', 'GC2' etc.

color_row <- utils::read.csv ( paste ( save_dir, 
                        'WGCNA/module_genes.csv' , sep='/'), row.names=1)
gene_list <- lapply (as.list (colnames (color_row) ), function (x) {
                             unique (color_row [, x]) })
names (gene_list) <- colors2labels (colnames (color_row), prefix='GC')

We recommend calculating the module scores instead of using eigengene to quantify the expression level of particular WGNCA clusters

WG_data <- get_module_score (all_data, paste (save_dir, 
                             'WGCNA/Data_module_score.csv', sep='/'), 
                             pgenes=gene_list, 
                             append_meta=T #return a seurat object
)

Visualise in heatmap

AP <- list (pointsize=3, legend_point_size=3, fontsize=11, point_fontsize=4,
            font_fam= 'sans')
seurat_heat (WG_data, cluster_rows=T,
             group.by=c('broad_type','date'),
             main_width=15, main_height=10,
             column_rotation=90,
             grid_height=4, center_scale=T, AP=AP,
             automatic=F
)

View gene-gene correlation network in a graph format. The edge thickness represents the correlation value. The node side represents average gene expression level.

For example, let us look at the genes in cluster 4, whose appears elevated in extravillous trophoblast (EVT).

set.seed (100)
plot_WGCNA_net (all_data, gene_list$GC4, 
                threshold=0.75,
                #only show gene pair whose correlation is above a threshold
                scale_node_size=0.05) #the smaller the value the larger the size


Yutong441/TBdev documentation built on Dec. 18, 2021, 8:22 p.m.