readme.md

T Cell Cytometery Analysis

Immdonor is a GraphQL API of Immport Organ Donor metadata hosted on Hasura Cloud. This API lets you request exactly what data you need from the Immport data model and returns a JSON output which can be converted to a R dataframe!

This is a fast way to link Flow Cytometry Standard (.FCS) files to their metadata and enables more robust analysis with Cytoverse. You will need to have the FCS files saved locally (see quick_fetch.R).

Connect to API

You can connect using your preferred GraphQL R client. Here we use rOpenSci’s ghql package

library(ghql)

#By connecting to this API you agree to the Immport.org Terms of Service
con <- GraphqlClient$new(url = "https://resolved-lab-57.hasura.app/v1/graphql")

GET

qry <- Query$new()

studies<- qry$query('studies','{
  shared_data_study {
    study_accession
    actual_enrollment
    minimum_age
    maximum_age
  }
}')
x <- con$exec(qry$queries$studies)

ANSWER

studies<- as.data.frame (jsonlite::fromJSON(x))
study\_accession actual\_enrollment minimum\_age maximum\_age SDY702 56 3.00 73.00 SDY1041 20 21.00 70.00 SDY1097 43 .20 63.00 SDY1389 51 9.00 76.00

Organ Donor Metadata

The studies listed above feature cytometry data generously donated by organ donors. We can build queries to return exactly what metadata we would like to know about these organ donors.

qry <- Query$new()

files<- qry$query('files','{
  resultFiles {
    filePath
    studyAccession
    subjectAccession
    maxSubjectAge
    gender
    biosampleType
    parameters
    panel_id
    markers

  }
}')

x <- con$exec(qry$queries$files)
files<- as.data.frame (jsonlite::fromJSON(x))

Biosample types

Primary Lymphoid Organs Count Bone Marrow 150 Thymus 31 Secondary Lymphoid Organs Count Inguinal lymph node 102 Lung lymph node 246 Mesenteric lymph node 120 Spleen 182 Tonsil 12 Mucosal Count Colon 76 Ileum 61 Jejunum 58 Lung 130 PBMC 136 Whole blood 31

Panels

Here are the top 5 staining panels with the most biosamples

panel\_id parameters markers n FCM-19 14 FITC-A-CD25|Qdot 605-A-CD45RA|Qdot 655-A-CD3|QDot 705-A-CD127|APC-A-CD19|PE-Cy7-A-CD4|PE-A-FOXP3 81 FCM-56 20 APC-A-CMV|BV570-A-CD3|Qdot 655-A-CD4|Qdot 605 UV-A-CD8|Qdot 605-A-CD45RA|Alexa Fluor 488-A-CCR7|QDot 705-A-NA|BV421-A-CD95|Alexa Fluor 700-A-INFg|PerCP-Cy5-5-A-TNFa|PE-Cy7-A-IL2|PE-A-CD107a|DAPI-A-DEAD 78 FCM-21 18 FITC-A-CD25|PE-Cy5-A-CTLA-4|BV421-A-CD69|Qdot 605-A-CD45RA|Qdot 655-A-CD3|QDot 705-A-CD127|APC-A-CD103|Alexa Fluor 700-A-KI67|APC-Cy7-A-CD8|PE-Cy7-A-CD4|PE-A-FOX-p3 57 FCM-1 14 Alexa Fluor 488-A-CCR7|PerCP-Cy5-5-A-CD45RO|BV421-A-CD69|Qdot 605-A-CD45RA|Qdot 655-A-CD3|QDot 705-A-CD127|APC-A-CD31|APC-Cy7-A-CD8|PE-Texas Red-A-CD19|PE-Cy7-A-CD4|PE-A-CD28 54 FCM-2 18 Alexa Fluor 488-A-CCR7|PerCP-Cy5-5-A-CD45RO|BV421-A-CD69|Qdot 605-A-CD45RA|Qdot 655-A-CD3|QDot 705-A-CD127|APC-A-CD31|APC-Cy7-A-CD8|PE Alexa Fluor 610-A-CD19|PE-Cy7-A-CD4|PE-A-CD28 52

T Cell Surface Panel Analysis

panel\_id parameters markers n FCM-2 18 Alexa Fluor 488-A-CCR7|PerCP-Cy5-5-A-CD45RO|BV421-A-CD69|Qdot 605-A-CD45RA|Qdot 655-A-CD3|QDot 705-A-CD127|APC-A-CD31|APC-Cy7-A-CD8|PE Alexa Fluor 610-A-CD19|PE-Cy7-A-CD4|PE-A-CD28 52 FCM-3 18 Alexa Fluor 488-A-NA|PerCP-Cy5-5-A-NA|BV421-A-NA|Qdot 605-A-NA|Qdot 655-A-NA|QDot 705-A-NA|APC-A-NA|APC-Cy7-A-NA|PE Alexa Fluor 610-A-NA|PE-Cy7-A-NA|PE-A-NA 27

Link cytoset to metadata

In order to use all of OpenCyto’s features like collapseDataForGating we need to annotate the donor metadata into the cytoset’s pData.

# Load cytoset 
fcs<- files %>% filter(data.resultFiles.panel_id == "FCM-2"|data.resultFiles.panel_id == "FCM-3")
fn<- fcs$data.resultFiles.filePath
cs<- load_cytoset_from_fcs(files = fn)

#Annotate immdonor metadata to pData
p<- pData(cs)
m<- cbind(p,fcs)
pData(cs)<- m

# Harmonize marker names
channels <- colnames(cs)
markers <- as.vector(pData(parameters(cs[[1]]))$desc)
names(markers)<- channels
markernames(cs) <- markers

Compensate and Transform

# Apply file internal compensation
comps <- lapply(cs, function(cf) spillover(cf)$SPILL)
cs_comp <- compensate(cs, comps)

# Transform fluorescent channels with FCSTrans
channels_to_exclude <- c(grep(colnames(cs), pattern="FSC"),
                         grep(colnames(cs), pattern="SSC"),
                         grep(colnames(cs), pattern="Time"))
chnls <- colnames(cs)[-channels_to_exclude]

fcstrans<- FCSTransTransform(transformationId = "defaultFCSTransTransform", channelrange = 2^18, channeldecade = 4.5, range = 4096, cutoff = -111, w = 0.5, rescale = TRUE)
transList <- transformList(chnls, fcstrans)

cs_trans<- transform(cs_comp,transList)

cs<- save_cytoset(cs_trans, "cytosets/tcell)

Build Autogating strategy

Since we annotated the cytoset with each file’s age, gender and biosample type we can use the collapseDataForGating feature and groupby biosampleType to improve our autogating.

cs<- load_cytoset(path = "cytosets/tcell")
gs<- GatingSet(cs)

gs_add_gating_method(gs, alias = "nonDebris",
                     pop = "+",
                     parent = "root",
                     dims = "FSC-A",
                     gating_method = "mindensity",
                     gating_args = "min = 20000, max=50000",
                     collapseDataForGating = "TRUE",
                     groupBy = "data.resultFiles.biosampleType") 
## ...
## Warning in .gating_gtMethod(x, y, ...): NAs introduced by coercion
## done

gs_add_gating_method(gs, alias = "singlets",
                     pop = "+",
                     parent = "nonDebris",
                     dims = "FSC-A,FSC-H",
                     gating_method = "singletGate")
## ...
## done


gs_add_gating_method(gs, alias = "bcells",
                     pop = "+/-",
                     parent = "singlets",
                     dims = "CD19",
                     gating_method = "mindensity",
                     collapseDataForGating = "TRUE",
                     groupBy = "data.resultFiles.biosampleType")
## ...
## Warning in .gating_gtMethod(x, y, ...): NAs introduced by coercion
## done

gs_add_gating_method(gs, alias = "CD4",
                     pop = "+/-+/-",
                     parent = "CD19-",
                     dims = "CD3,CD4",
                     gating_method = "mindensity",
                     gating_args = "min = 1500, max=2500",
                     collapseDataForGating = "TRUE",
                     groupBy = "data.resultFiles.biosampleType")
## ...
## Warning in .gating_gtMethod(x, y, ...): NAs introduced by coercion

## Warning in .gating_gtMethod(x, y, ...): NAs introduced by coercion
## done

gs_add_gating_method(gs, alias = "Tmem",
                     pop = "+/-+/-",
                     parent = "CD4+",
                     dims = "CCR7,CD45RA",
                     gating_method = "mindensity",
                     gating_args = "min = 1750, max=2500",
                     collapseDataForGating = "TRUE",
                     groupBy = "data.resultFiles.biosampleType")
## ...
## Warning in .gating_gtMethod(x, y, ...): NAs introduced by coercion

## Warning in .gating_gtMethod(x, y, ...): NAs introduced by coercion
## done


lln<- subset(gs, data.resultFiles.biosampleType == "Lung lymph node")
lung<- subset(gs, data.resultFiles.biosampleType == "Lung")
spleen<- subset(gs, data.resultFiles.biosampleType == "Spleen")
blood<- subset(gs, data.resultFiles.biosampleType == "Whole blood")

CD3+CD4 T cells

Lung Lymph node

ggcyto(lln, aes(x = "CD4", y = "CD3")) + geom_gate("CD4+") + geom_hex(bins = 128)+ ggcyto_par_set(limits = "instrument")
## Coordinate system already present. Adding new coordinate system, which will replace the existing one.

Lung

ggcyto(lung, aes(x = "CD4", y = "CD3")) + geom_gate("CD4+") + geom_hex(bins = 128)+ ggcyto_par_set(limits = "instrument")
## Coordinate system already present. Adding new coordinate system, which will replace the existing one.

Spleen

ggcyto(spleen, aes(x = "CD4", y = "CD3")) + geom_gate("CD4+") + geom_hex(bins = 128)+ ggcyto_par_set(limits = "instrument")
## Coordinate system already present. Adding new coordinate system, which will replace the existing one.

CD4 memory subsets

Lung Lymph Node

ggcyto(gs_pop_get_data(lln ,"CD4+"), aes(x = "CCR7", y = "CD45RA")) + geom_hex(bins = 128)

Lung

ggcyto(gs_pop_get_data(lung ,"CD4+"), aes(x = "CCR7", y = "CD45RA")) + geom_hex(bins = 128)

Spleen

ggcyto(gs_pop_get_data(spleen ,"CD4+"), aes(x = "CCR7", y = "CD45RA")) + geom_hex(bins = 128)

Metacyto analysis

work in progress…

fcs<- files %>% filter(data.resultFiles.panel_id == "FCM-2"|data.resultFiles.panel_id == "FCM-3")

fcs<- fcs %>%
  dplyr::filter(data.resultFiles.biosampleType == "Lung lymph node"|data.resultFiles.biosampleType == "Lung"|data.resultFiles.biosampleType == "Spleen"|data.resultFiles.biosampleType == "PBMC"|data.resultFiles.biosampleType == "Whole blood"|data.resultFiles.biosampleType == "Inguinal lymph node"|data.resultFiles.biosampleType == "Colon"|data.resultFiles.biosampleType == "Ileum"|data.resultFiles.biosampleType == "Jejunum")

fcs$fcs_files<- fcs$data.resultFiles.filePath
fcs$study_id <- fcs$data.resultFiles.biosampleType

fcs_info<- fcs %>% 
  dplyr::select(fcs_files, study_id)

sample_info <- fcs %>%
  dplyr::select(fcs_files, data.resultFiles.maxSubjectAge,data.resultFiles.biosampleType, data.resultFiles.gender)

preprocessing.batch(inputMeta = fcs_info,
                    assay = "FCM",
                    outpath = "metacyto/panel/preprocess_output",
                    b = 1/150,
                    excludeTransformParameters=c("FSC-A","FSC-W","FSC-H","SSC-A","SSC-W","SSC-H","Time"))
## Study ID =  Colon  Preprocessing 
## Study ID =  Ileum  Preprocessing 
## Study ID =  Jejunum  Preprocessing 
## Study ID =  Whole blood  Preprocessing 
## Study ID =  Lung lymph node  Preprocessing 
## Study ID =  Inguinal lymph node  Preprocessing 
## Study ID =  Lung  Preprocessing 
## Study ID =  Spleen  Preprocessing 
## Study ID =  PBMC  Preprocessing 
## Preprocess result stored in the folder: metacyto/panel/preprocess_output

files=list.files("metacyto/panel",pattern="processed_sample",recursive=TRUE,full.names=TRUE)

cs<- load_cytoset(path = "cytosets/tcell")
new<- markernames(cs)
channels_to_exclude <- c(grep(colnames(cs), pattern="FSC"),
                         grep(colnames(cs), pattern="SSC"),
                         grep(colnames(cs), pattern="Time"))
old <- colnames(cs)[-channels_to_exclude]
old<- toupper(old)
oldplus<-paste0(old,"+")
oldminus<-paste0(old,"-")

nameUpdator(oldNames=old, newNames=new, files=files)
nameUpdator(oldNames=oldplus, newNames=new, files=files)
nameUpdator(oldNames=oldminus, newNames=new, files=files)


#define parameters that we don't want to cluster
excludeClusterParameters=c("FSC-A","FSC-W","FSC-H","SSC-A","SSC-W","SSC-H","Time")

cluster_label=autoCluster.batch(preprocessOutputFolder="metacyto/panel/preprocess_output",
                                excludeClusterParameters=excludeClusterParameters,
                                labelQuantile=0.95,
                                minPercent = 0.05,
                                clusterFunction=flowSOM.MC)
## Clustering , study ID =  Colon
## Building SOM
## Mapping data to SOM
## Building MST
## Clustering , study ID =  Ileum
## Building SOM
## Mapping data to SOM
## Building MST
## Clustering , study ID =  Jejunum
## Building SOM
## Mapping data to SOM
## Building MST
## Clustering , study ID =  Whole blood
## Building SOM
## Mapping data to SOM
## Building MST
## Clustering , study ID =  Lung lymph node
## Building SOM
## Mapping data to SOM
## Building MST
## Clustering , study ID =  Inguinal lymph node
## Building SOM
## Mapping data to SOM
## Building MST
## Clustering , study ID =  Lung
## Building SOM
## Mapping data to SOM
## Building MST
## Clustering , study ID =  Spleen
## Building SOM
## Mapping data to SOM
## Building MST
## Clustering , study ID =  PBMC
## Building SOM
## Mapping data to SOM
## Building MST

searchCluster.batch(preprocessOutputFolder="metacyto/panel/preprocess_output",
                    outpath="metacyto/panel/search_output",
                    clusterLabel=cluster_label)
## Searching , study ID =  Colon
## Searching , study ID =  Ileum
## Searching , study ID =  Jejunum
## Searching , study ID =  Whole blood
## Searching , study ID =  Lung lymph node
## Searching , study ID =  Inguinal lymph node
## Searching , study ID =  Lung
## Searching , study ID =  Spleen
## Searching , study ID =  PBMC

files=list.files("metacyto/panel/search_output",pattern="cluster_stats_in_each_sample",recursive=TRUE,full.names=TRUE)
fcs_stats=collectData(files,longform=TRUE)


# join the cluster summary statistics with sample information
all_data=inner_join(fcs_stats,sample_info,by="fcs_files")

# See the fraction of what clusters are affected by Gender (while controlling for age)
GA=glmAnalysis(value="value",
               variableOfInterst="data.resultFiles.gender",
               parameter="fraction", 
               otherVariables=c("data.resultFiles.maxSubjectAge"),
               studyID="study_id",label="label",
               data=all_data,CILevel=0.95,ifScale=c(TRUE,FALSE))

GA=GA[order(GA$`Effect_size`),]

GA$`label`=as.character(GA$`label`)
w = which((GA$`label`)>15)
## Warning: Use of `GA$label` is discouraged. Use `label` instead.

## Warning: Use of `GA$Effect_size` is discouraged. Use `Effect_size` instead.

Top 20 largest effect size

label Effect\_size SE t\_value p\_value lower upper N 47 CD28+|CD3+|CD4+|CD45RO+ 0.8031651 0.2278789 3.524526 0.0008629 0.3464856 1.2598446 66 184 CD28+|CD4+|CD45RO+ 0.7426466 0.2428331 3.058260 0.0034352 0.2559982 1.2292949 66 85 CCR7+|CD19+|CD31+|CD4+|CD45RO+|CD8+ 0.7358015 0.2545410 2.890700 0.0056343 0.2247892 1.2468138 61 128 CCR7+|CD19+|CD3+|CD31+|CD4+|CD45RO+|CD8+ 0.7305022 0.2551306 2.863248 0.0060710 0.2183061 1.2426982 61 205 CD4+|CD45RO+ 0.7237409 0.2444161 2.961102 0.0045181 0.2339201 1.2135616 66 203 CD28+|CD3+|CD4+|CD45RO+|CD8+ 0.7039717 0.2343360 3.004113 0.0041215 0.2335226 1.1744207 61 30 CCR7+|CD19+ 0.6178449 0.2168472 2.849217 0.0061552 0.1832733 1.0524165 66 78 CCR7+|CD19+|CD45RO+ 0.6075617 0.2304315 2.636626 0.0108627 0.1457666 1.0693569 66 103 CCR7+|CD19+|CD31+|CD4+|CD45RO+ 0.5967158 0.2418360 2.467439 0.0167492 0.1120655 1.0813660 66 11 CD127-|CD19+|CD4+|CD45RO+|CD8+ 0.5930820 0.2192813 2.704663 0.0095541 0.1516916 1.0344725 55 104 CCR7+|CD19+|CD31+|CD4+ 0.5903663 0.2410285 2.449363 0.0175242 0.1073343 1.0733983 66 132 CCR7+|CD3+|CD45RO+ 0.5903451 0.2647291 2.229997 0.0298485 0.0598160 1.1208741 66 74 CCR7+|CD19+|CD28+|CD31+|CD4+|CD45RA+|CD45RO+|CD8+ 0.5826280 0.2760517 2.110576 0.0397328 0.0284311 1.1368248 61 213 CD19+|CD3+|CD45RA+|CD45RO+ 0.5573624 0.2245925 2.481660 0.0161614 0.1072689 1.0074559 66 151 CD4+ 0.5528965 0.2464158 2.243754 0.0288942 0.0590682 1.0467248 66 17 CCR7+|CD19+|CD28+|CD45RO+|CD8+ 0.5446699 0.2295150 2.373134 0.0214444 0.0838993 1.0054405 61 95 CCR7+|CD19+|CD28+|CD31+|CD4+|CD45RA+|CD8+ 0.5394233 0.2465378 2.187994 0.0329366 0.0453505 1.0334961 66 120 CCR7+|CD19+|CD28+|CD3+|CD4+|CD45RA+ 0.5315369 0.2320631 2.290484 0.0258510 0.0664721 0.9966016 66 136 CD19+|CD45RA+ 0.5297685 0.2182681 2.427146 0.0185209 0.0923495 0.9671875 66 226 CCR7+|CD19+|CD28+|CD3+|CD31+|CD4+|CD45RA+|CD45RO+|CD69+|CD8+ 0.5258602 0.2900366 1.813083 0.0763471 \-0.0579531 1.1096735 55

mrj31/immdonor documentation built on Nov. 1, 2021, 4:29 a.m.