Krieg_Anti_PD_1 | R Documentation |
Mass cytometry (CyTOF) dataset from Krieg et al. (2018), consisting of 20 baseline samples (prior to treatment) of peripheral blood from melanoma skin cancer patients subsequently treated with anti-PD-1 immunotherapy. The samples are split across 2 conditions (non-responders and responders) and 2 batches. This dataset can be used to benchmark differential analysis algorithms used to test for differentially abundant rare cell populations.
Krieg_Anti_PD_1_SE(metadata = FALSE)
Krieg_Anti_PD_1_flowSet(metadata = FALSE)
metadata |
|
This is a mass cytometry (CyTOF) dataset from Krieg et al. (2018), who used mass cytometry to characterize immune cell subsets in peripheral blood from melanoma skin cancer patients treated with anti-PD-1 immunotherapy. This study found that the frequency of CD14+CD16-HLA-DRhi monocytes in baseline samples (taken from patients prior to treatment) was a strong predictor of survival in response to immunotherapy treatment. In particular, the frequency of a small subpopulation of CD14+CD33+HLA-DRhiICAM-1+CD64+CD141+CD86+CD11c+CD38+PD-L1+CD11b+ monocytes in baseline samples was strongly associated with responder status following immunotherapy treatment. Note that this dataset contains a strong batch effect, due to sample acquisition on two different days (Krieg et al., 2018).
This dataset can be used to benchmark differential analysis algorithms used to test for differentially abundant rare cell populations (i.e. the small subpopulation of CD14+CD33+HLA-DRhiICAM-1+CD64+CD141+CD86+CD11c+CD38+PD-L1+CD11b+ monocytes).
The dataset contains 20 baseline samples (i.e. samples taken prior to treatment), from patients subsequently classified into 2 groups (9 non-responders and 11 responders). Samples are also split across 2 batches ('batch23' and 'batch29'), due to sample acquisition on two different days. The total number of cells is 85,715.
There are 24 'cell type' markers used to characterize cell subpopulations. (One additional
cell type marker – CD45 – is also available, but should be excluded from most analyses since
almost all cells show very high expression of CD45; so it does not help distinguish
subpopulations, and may dominate other signals. Therefore, CD45 has been classified as 'none'
in the marker_info
table.)
The dataset is provided in two Bioconductor object formats: SummarizedExperiment
and flowSet
. In each case, cells are stored in rows, and protein markers in
columns (this is the usual format used for flow and mass cytometry data).
For the link{SummarizedExperiment}
, row and column metadata can be accessed with the
rowData
and colData
accessor functions from the
SummarizedExperiment
package. The row data contains group IDs, batch IDs, and
sample IDs. The column data contains channel names, protein marker names, and a
factor marker_class
to identify the class of each protein marker ('cell type', 'cell state',
as well as 'none' for any non protein marker columns that are not needed for downstream analyses;
for this dataset, all proteins are cell type markers). The expression values for each cell can be
accessed with assay
. The expression values are formatted as a single table.
For the flowSet
, the expression values are stored in a separate table for each
sample. Each sample is represented by one flowFrame
object within the overall
flowSet
. The expression values can be accessed with the exprs
function from the
flowCore
package. Row metadata is stored as additional columns of data within
the flowFrame
for each sample; note that factor values are converted to numeric values,
since the expression tables must be numeric matrices. Channel names are stored in the column names
of the expression tables. Additional row and column metadata is stored in the description
slots, which can be accessed with the description
accessor function for the
individual flowFrames
; this includes filenames, additional sample information, and
additional marker information.
Prior to performing any downstream analyses, the expression values should be transformed.
A standard transformation used for mass cytometry data is the asinh
with
cofactor = 5
.
File sizes: 12.3 MB (SummarizedExperiment
), 12.6 MB (flowSet
).
Original source: Krieg et al. (2018): https://www.ncbi.nlm.nih.gov/pubmed/29309059
Original link to raw data (FlowRepository, FR-FCM-ZY34): http://flowrepository.org/id/FR-FCM-ZY34
This dataset was previously used to benchmark algorithms for differential analysis in our article, Weber et al. (2019): https://www.ncbi.nlm.nih.gov/pubmed/31098416. (For additional details on the dataset, see Supplementary Note 1: Benchmark datasets.)
Data files are also available from FlowRepository (FR-FCM-ZYL8): http://flowrepository.org/id/FR-FCM-ZYL8
Returns a SummarizedExperiment
or flowSet
object.
Krieg et al. (2018), "High-dimensional single-cell analysis predicts response to anti-PD-1 immunotherapy." Nature Medicine, 24, 144-153: https://www.ncbi.nlm.nih.gov/pubmed/29309059
Weber et al. (2019). "diffcyt: Differential discovery in high-dimensional cytometry via high-resolution clustering." Communications Biology, 2:183: https://www.ncbi.nlm.nih.gov/pubmed/31098416
Krieg_Anti_PD_1_SE()
Krieg_Anti_PD_1_flowSet()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.