knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) options(knitr.kable.NA = '', width = 80) library(flowWorkspace) library(CytoML)
Install necessary packages in RStudio:
install.packages("devtools") devtools::install_github("Close-your-eyes/fcexpr") install.packages("tidyverse")
library(fcexpr) library(tidyverse)
Many information are stored in FlowJos wsp file which is an xml file. This includes gated event counts, statistics like MFI and meta data from the FCS files like keywords. These information can be accessed without a FlowJo dongle - e.g. with the xml2 package in R.
Based on that I wrote functions that extract information from a FlowJo 10 wsp. After you gated your populations, save the wsp and run the fcexpr::wsx_get_popstats function. See an example below.
dir <- base::system.file("extdata", package = "fcexpr") utils::untar(base::system.file("extdata", "Part_3.tgz", package = "fcexpr"), exdir = dir) wd <- file.path(dir, "Part_3") ws <- list.files(file.path(wd, "FJ_workspace"), full.names = T) sd <- openxlsx::read.xlsx(file.path(dir, "Part_3", "sampledescription.xlsx")) ps <- fcexpr::wsx_get_popstats(ws, return_stats = T)
path_to_workspace_on_disk <- "/Users/user/Documents/07-Oct-2021.wsp" ps <- fcexpr::wsx_get_popstats(ws = path_to_workspace_on_disk, return_stats = TRUE)
ps now contains counts from gated populations optionally calculated statistics on index 1 and 2, respectively. You may write the list-content to individual variables. This will allow to inspect the data frames manually in R studio.
ps_counts <- ps[["counts"]] # access by name ps_stats <- ps[[2]] # access by index
Most likely additional information (meta data) are necessary to plot the data or analyze them statistically. You may create such table manually or have it created in a standardized, automatize fashion with fcexpr::sync_sampldescription (this comes a lot of flavour though). Alternatively, your wsp may hold annotations in the form of keywords. These can be extracted as well:
keys <- fcexpr::wsx_get_keywords(ws)
This returns a list of data frames. One entry for every FCS file. In this example there are 48 FCS files. To join the keywords to ps_counts or ps_stats we have to make it a data frame first.
# make it a data.frame keys_df <- dplyr::bind_rows(keys) # create a FileName-column keys_df$FileName <- rep(names(keys), sapply(keys,nrow)) # select relevant keywords; example only: Cytometer name ($CYT) and Operator ($OP) keys_df <- dplyr::filter(keys_df, name %in% c("$CYT", "$OP")) # dplyr-style # make it a wider data.frame #pivot_wider and pivot_longer are a bit difficult to get your head around but are very powerful keys_df <- tidyr::pivot_wider(keys_df, names_from = name, values_from = value)
Now we have one data frame of keywords for every FCS file. Join these meta data to the phenotype data. CYT and OP are just two examples but are not very informative. Hence, we also join another table (sd) with more relevant meta data about the conducted transduction experiment.
ps_counts <- dplyr::left_join(ps_counts, keys_df) # sd (sampledescription) is data frame with meta data names(sd) ps_counts <- dplyr::left_join(ps_counts, sd)
ps_counts which now holds meta data and phenotype data may be saved to excel for further processing.
openxlsx::write.xlsx(ps_counts, "/Users/user/Documents/7-Oct-2021.xlsx")
It is preferable though to tidy up, filter, select and plot in R. This can be done with base R or with functions from the tidyverse. It requires a little bit of practice. For a number of cases though there are a few standard procedures that you may learn quickly. Some arbitrary operations and plots are shown below.
# filter for the GFP+ population only ps_counts <- ps_counts %>% dplyr::filter(Population == "GFP+") # plot the frequency of GFP+ events with respect to the parent gate ggplot(ps_counts, aes(x = incubation.time.h, y = FractionOfParent)) + geom_jitter(width = 0.1) + theme_bw() + ylab("% GFP+ in DAPI-")
ps_counts$incubation.time.h <- factor(ps_counts$incubation.time.h, levels = sort(as.numeric(unique(ps_counts$incubation.time.h)))) ps_counts$EGF.ng.ml <- factor(ps_counts$EGF.ng.ml, levels = sort(as.numeric(unique(ps_counts$EGF.ng.ml)))) ps_counts$seeding.density.dilution <- factor(ps_counts$seeding.density.dilution, levels = sort(as.numeric(unique(ps_counts$seeding.density.dilution)))) ps_counts$viral.sup.vol.µl <- factor(ps_counts$viral.sup.vol.µl, levels = sort(as.numeric(unique(ps_counts$viral.sup.vol.µl))))
The incubation time of viral supernatant (x-axis) seems to play a role for the transduction rate (y-axis). But there are more factors which can be visualized in another plot.
ggplot(ps_counts, aes(x = viral.sup.vol.µl, y = seeding.density.dilution, fill = FractionOfParent)) + geom_tile(color = "black") + theme_classic() + scale_fill_viridis_c() + facet_grid(cols = vars(incubation.time.h), rows = vars(FCS.percent))
unlink(wd, recursive = T)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.