How to merge/standardize GatingSets

How to merge/standardize GatingSets

Usage

gs_split_by_tree(x)
gs_check_redundant_nodes(x)
gs_remove_redundant_nodes(x,toRemove)
gs_remove_redundant_channels(gs, ...)

Arguments

x
'GatingSet' objects or or list of groups (each group member is a list of 'GatingSet`)

toRemove list of the node sets to be removed. its length must equals to the length of argument x

... other arguments

library(flowWorkspace)
flowDataPath <- system.file("extdata", package = "flowWorkspaceData")
gs <- load_gs(file.path(flowDataPath,"gs_manual"))
gs1 <- gs_clone(gs)
sampleNames(gs1) <- "1.fcs"

# simply the tree
nodes <- gs_get_pop_paths(gs1)
for(toRm in nodes[grepl("CCR", nodes)])
  gs_pop_remove(gs1, toRm)

# remove two terminal nodes
gs2 <- gs_clone(gs1)
sampleNames(gs2) <- "2.fcs"
gs_pop_remove(gs2, "DPT")
gs_pop_remove(gs2, "DNT")

# remove singlets gate
gs3 <- gs_clone(gs2)
gs_pop_remove(gs3, "singlets")
gs_pop_add(gs3, gs_pop_get_gate(gs2, "CD3+"), parent = "not debris")
for(tsub in c("CD4", "CD8"))
  {
    gs_pop_add(gs3, gs_pop_get_gate(gs2, tsub), parent = "CD3+")
    for(toAdd in gs_pop_get_children(gs2, tsub))
    {
        thisParent <- gs_pop_get_parent(gs2[[1]], toAdd,path="auto")
        gs_pop_add(gs3, gs_pop_get_gate(gs2, toAdd), parent = thisParent) 
    }
  }
sampleNames(gs3) <- "3.fcs"

# spin the branch to make it isomorphic
gs4 <- gs_clone(gs3)
# rm cd4 branch first
gs_pop_remove(gs4, "CD4")
# add it back
gs_pop_add(gs4, gs_pop_get_gate(gs3, "CD4"), parent = "CD3+")
# add all the chilren back
for(toAdd in gs_pop_get_children(gs3, "CD4"))
{
    thisParent <- gs_pop_get_parent(gs3[[1]], toAdd)
    gs_pop_add(gs4, gs_pop_get_gate(gs3, toAdd), parent = thisParent)
}
sampleNames(gs4) <- "4.fcs"

gs5 <- gs_clone(gs4)
# add another redundant node
gs_pop_add(gs5, gs_pop_get_gate(gs, "CD4/CCR7+ 45RA+")[[1]], parent = "CD4")
gs_pop_add(gs5, gs_pop_get_gate(gs, "CD4/CCR7+ 45RA-")[[1]], parent = "CD4")
sampleNames(gs5) <- "5.fcs"

library(knitr)
opts_chunk$set(fig.show = 'hold', fig.width = 4, fig.height = 4, results= 'asis')

Remove the redudant leaf/terminal nodes

plot(gs1)
plot(gs2)

Leaf nodes DNT and DPT are redudant for the analysis and should be removed before merging.

Hide the non-leaf nodes

plot(gs2)
plot(gs3)

singlets node is not present in the second tree. But we can't remove it because it will remove all its descendants. We can hide it instead.

invisible(gs_pop_set_visibility(gs2, "singlets", FALSE))
plot(gs2)
plot(gs3)

Note that even gating trees look the same but singlets still physically exists so we must refer the populations by relative path (path = "auto") instead of full path.

gs_get_pop_paths(gs2)[5]
gs_get_pop_paths(gs3)[5]
gs_get_pop_paths(gs2, path = "auto")[5]
gs_get_pop_paths(gs3, path = "auto")[5]

Isomorphism

#restore gs2
invisible(gs_pop_set_visibility(gs2, "singlets", TRUE))
plot(gs3)
plot(gs4)

These two trees are not identical due to the different order of CD4 and CD8. However they are still mergable thanks to the reference by gating path instead of by numeric indices

convenient wrapper for merging

To ease the process of merging large number of batches of experiments, here is some internal wrappers to make it semi-automated.

Grouping by tree structures

gslist <- list(gs1, gs2, gs3, gs4, gs5)
gs_groups <- gs_split_by_tree(gslist)
length(gs_groups)

This divides all the GatingSets into different groups, each group shares the same tree structure. Here we have 4 groups,

Check if the discrepancy can be resolved by dropping leaf nodes

res <- try(gs_check_redundant_nodes(gs_groups), silent = TRUE)
print(res[[1]])

Apparently the non-leaf node (singlets) fails this check, and it is up to user to decide whether to hide this node or keep this group separate from further merging.Here we try to hide it.

for(gp in gs_groups)
  plot(gp[[1]])

Based on the tree structure of each group (usually there aren't as many groups as GatingSet objects itself), we will hide singlets for group 2 and group 4.

for(i in c(2,4))
  for(gs in gs_groups[[i]])
    invisible(gs_pop_set_visibility(gs, "singlets", FALSE))

Now check again with .gs_check_redundant_nodes

toRm <- gs_check_redundant_nodes(gs_groups)
toRm

Based on this, these groups can be consolidated by dropping CCR7+ 45RA+ and CCR7+ 45RA- from group 1. DNT and DPT from group 2.

Sometime it could be difficult to inspect and distinguish tree difference by simply plotting the enire gating tree or looking at this simple flat list of nodes (especially when the entire subtree is missing from cerntain groups). It is helpful to visualize and highlight only the tree hierarchy difference with the helper function gs_plot_diff_tree

gs_plot_diff_tree(gs_groups)

To proceed the deletion of these nodes, .gs_remove_redundant_nodes can be used instead of doing it manually

gs_remove_redundant_nodes(gs_groups, toRm)

Now they can be merged into a single GatingSetList.

GatingSetList(gslist)

Remove the redundant channels from GatingSet

Sometime there may be the extra channels in one data set that prevents it from being merged with other. If these channels are not used by any gates, then they can be safely removed.

gs_remove_redundant_channels(gs1)


Try the flowWorkspace package in your browser

Any scripts or data that you put into this service are public.

flowWorkspace documentation built on Nov. 8, 2020, 8:08 p.m.