R/leduc2022_pSCoPE.R

##' Leduc et al. 2022 - pSCoPE (bioRxiv): melanoma cells vs monocytes
##'
##' Single-cell proteomics data acquired by the Slavov Lab. This is
##' the dataset associated with the third version of the preprint. It
##' contains quantitative information of melanoma cells and monocytes
##' at the PSM, peptide and protein levels. This version of the data
##' was acquired using the pSCoPE MS acquisition approach.
##'
##' @format A [QFeatures] object with 138 assays, each assay being a
##' [SingleCellExperiment] object:
##'
##' - Assay 1-134: PSM data acquired with a TMT-18plex protocol, hence
##'   those assays contain 18 columns. Columns hold quantitative
##'   information from single-cell channels, carrier channels,
##'   reference channels, empty (negative control) channels and
##'   unused channels.
##' - `peptides`: peptide data containing quantitative data for 20,804
##'   peptides and 1,556 single cells. These data have been filtered
##'   to keep high-quality PSMs, all batches have been normalized to
##'   the reference channel, PSMs were aggregated to peptides, and
##'   single cells with low median coefficient of variation were kept.
##' - `peptides_log`: peptide data containing quantitative data for
##'   12,284 peptides and 1,543 single cells. The `peptides` data were
##'   further normalized, highly missing peptides were removed and the
##'   quantifications were log-transformed.
##' - `proteins_norm2`: protein data containing quantitative data for
##'   2,844 proteins and 1,543 single cells. The peptides from
##'   `peptides_log` were aggregated to proteins and normalized.
##' - `proteins_processed`: protein data containing quantitative data
##'   for 2,844 proteins and 1,543 single cells. The `proteins_norm2`
##'   data were imputed, batch corrected and normalized.
##'
##' The `colData(leduc2022_pSCoPE())` contains the cell type annotation,
##' the LC batch information, the TMT label and the MS run ID. We also
##' added the sample prep annotations provided by the cellenONE
##' dispensing device (only for single cells): the time stamp of cell
##' isolation by the device, the diameter and elongation of the cell,
##' the ID of the sample glass slide (4 slides in total), the field
##' within the glass slide (each slide is divided into 4 fields), the
##' pooled well ID (each field contains 9 pools), and the x and y
##' coordinates of each cell dropped in a field and of each cell pool
##' upon pickup. Finally, we also retrieved the melanoma subpopulation
##' generated by the authors upon data analysis. The main population
##' is encoded as `A` while the small population is encoded as `B`.
##' The description of the `rowData` fields for the PSM data can be
##' found in the
##' [`MaxQuant` documentation](http://www.coxdocs.org/doku.php?id=maxquant:table:evidencetable).
##'
##' @section Acquisition protocol:
##'
##' The data were acquired using the following setup. More information
##' can be found in the source article (see `References`).
##'
##' - **Cell isolation**: CellenONE cell sorting.
##' - **Sample preparation**: performed with the improved SCoPE2
##'   protocol on the CellenONE liquid handling system. nPOP cell
##'   lysis (DMSO) + trypsin digestion + TMT-18plex
##'   labeling and pooling. A target library was also generated to
##'   perform prioritized DDA (Huffman et al. 2022) using MaxQuant.Live
##'   (2.0.3).
##' - **Separation**: online nLC (Dionex UltiMate 3000 UHPLC with a
##'   25cm x 75um IonOpticks Aurora Series UHPLC column; 200nL/min).
##' - **Ionization**: ESI (1,800V).
##' - **Mass spectrometry**: Thermo Scientific Q-Exactive (MS1
##'   resolution = 70,000; MS2 accumulation time = 300ms; MS2
##'   resolution = 70,000). Prioritized data acquisition was performed
##'   using the pSCoPE protocol (Huffman et al. 2022).
##' - **Data analysis**: MaxQuant (1.6.17.0) + DART-ID.
##'
##' @section Data collection:
##'
##' The PSM data were collected from a shared Google Drive folder that
##' is accessible from the SlavovLab website (see `Source` section).
##' The folder contains the following files of interest:
##'
##' - `ev_updated.txt`: the MaxQuant/DART-ID output file
##' - `annotation.csv`: sample annotation
##' - `batch.csv`: batch annotation
##' - `t0.csv`: the processed data table containing the `peptides` data
##' - `t3.csv`: the processed data table containing the `peptides_log`
##'   data
##' - `t4b.csv`: the processed data table containing the
##'   `proteins_norm2` data
##' - `t6.csv`: the processed data table containing the
##'   `proteins_processed` data
##'
##' We combined the sample annotation and the batch annotation in
##' a single table. We also formatted the quantification table so that
##' its columns match those of the annotations. Both the annotation and
##' quantification tables were then combined in a single [QFeatures]
##' object using the [scp::readSCP()] function.
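##'
##' As a rough illustration only, combining a sample annotation with a
##' batch annotation can be sketched in base R as below. The tables,
##' column names (`Set`, `Channel`, `SampleType`, `lcbatch`) and values
##' are hypothetical toy data, not the actual content of
##' `annotation.csv` or `batch.csv`, and this is not the script used to
##' build the dataset.
##'
##' ```r
##' # Toy sample annotation: one row per MS run and TMT channel
##' # (hypothetical columns and values)
##' sample_annot <- data.frame(
##'   Set = c("run1", "run1", "run2", "run2"),
##'   Channel = c("RI1", "RI2", "RI1", "RI2"),
##'   SampleType = c("Carrier", "Melanoma", "Carrier", "Monocyte"),
##'   stringsAsFactors = FALSE
##' )
##' # Toy batch annotation: one row per MS run (hypothetical values)
##' batch_annot <- data.frame(
##'   Set = c("run1", "run2"),
##'   lcbatch = c("A", "B"),
##'   stringsAsFactors = FALSE
##' )
##' # Combine both tables on the shared run identifier, so that every
##' # channel carries the batch information of its MS run
##' annot <- merge(sample_annot, batch_annot, by = "Set")
##' annot
##' ```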
##'
##' The 4 CSV files were loaded and formatted as [SingleCellExperiment]
##' objects and the sample metadata were matched to the column names
##' (the mapping is retrieved after running the authors' original R
##' script) and stored in the `colData`.
##' These objects were then added to the [QFeatures] object (containing
##' the PSM assays) and the rows of the peptide data were linked to the
##' rows of the PSM data based on the peptide sequence information
##' through an `AssayLink` object.
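##'
##' The idea behind linking rows by peptide sequence can be sketched
##' with toy data in base R. This only illustrates the one-to-many
##' relationship between a peptide and its PSMs; it is not how
##' `AssayLink` objects are implemented, and the identifiers and
##' sequences below are made up.
##'
##' ```r
##' # Toy PSM table: several PSMs may share one peptide sequence
##' psm <- data.frame(
##'   id = c("psm1", "psm2", "psm3", "psm4"),
##'   Sequence = c("AAK", "AAK", "LLR", "GGK"),
##'   stringsAsFactors = FALSE
##' )
##' # Toy peptide rows, identified by their sequence
##' peptides <- c("AAK", "GGK", "LLR")
##' # For each peptide, collect the PSM rows with the same sequence
##' links <- split(psm$id, factor(psm$Sequence, levels = peptides))
##' links
##' ```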
##'
##' @source
##' The data were downloaded from the
##' [Slavov Lab](https://scp.slavovlab.net/Leduc_et_al_2022) website.
##' The raw data and the quantification data can also be found in the
##' MassIVE repository `MSV000089159`:
##' ftp://massive.ucsd.edu/MSV000089159.
##'
##' @references
##' Andrew Leduc, Gray Huffman, and Nikolai Slavov. 2022. “Droplet
##' Sample Preparation for Single-Cell Proteomics Applied to the Cell
##' Cycle.” bioRxiv. [Link to article](https://doi.org/10.1101/2021.04.24.441211).
##'
##' Gray Huffman, Andrew Leduc, Christoph Wichmann, Marco di Gioia,
##' Francesco Borriello, Harrison Specht, Jason Derks, et al. 2022.
##' “Prioritized Single-Cell Proteomics Reveals Molecular and
##' Functional Polarization across Primary Macrophages.” bioRxiv.
##' [Link to article](https://doi.org/10.1101/2022.03.16.484655).
##'
##' @seealso
##' [leduc2022_plexDIA]
##'
##' @examples
##' \donttest{
##' leduc2022_pSCoPE()
##' }
##'
##' @keywords datasets
##'
"leduc2022_pSCoPE"
UCLouvain-CBIO/scpdata documentation built on May 6, 2024, 6:17 a.m.