proteomic: proteomic_20Q2

proteomic    R Documentation

proteomic_20Q2

Description

The 'proteomic' dataset contains the 20Q2 quantitative protein profiling by mass spectrometry from the Gygi lab. It covers 12399 proteins measured in 375 cell lines, spanning 24 primary diseases and 27 lineages. The columns of this dataset include: 'depmap_id', a foreign key identifying the cancer cell line; 'cell_line', the common CCLE name of the cancer cell line; 'gene_name', the HUGO gene name; 'entrez_id', containing only the Entrez ID; and 'protein_expression', the normalized protein expression for each cancer cell line. This dataset can be loaded into the R environment with the 'depmap_proteomic' function.

Usage

proteomic

Format

A data frame with 24963776 rows (cell lines) and 12 variables:

depmap_id

Cell line foreign key (e.g. "ACH-000956")

cell_line

Name of cancer cell line (e.g. "22RV1_PROSTATE")

gene_name

HUGO symbol (e.g. "TSPAN6")

entrez_id

Ensembl ID (e.g. ENSG00000044574)

protein_expression

Normalized protein expression

protein

Protein name with TenPx (e.g. MDAMB468_BREAST_TenPx01)

protein_id

Protein ID (e.g. sp|P55011|S12A2_HUMAN)

desc

Description (e.g. S12A2_HUMAN Solute carrier family 12 member 2)

group_id

Group ID

uniprot

UniProt ID (e.g. S12A2_HUMAN)

uniprot_acc

UniProt accession ID (e.g. P55011)

TenPx

TenPx number (e.g. TenPx01)
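As a rough illustration of working with these columns, the sketch below builds a small toy data frame with a subset of the variables listed above and computes the mean expression of one gene per cell line. All values are made up, the 'ACH-000000' identifier is a placeholder, and the toy object stands in for the real dataset; only the column names are taken from this page.

```r
# Toy stand-in for a few columns of the 'proteomic' dataset.
# Expression values are invented; "ACH-000000" is a placeholder ID.
toy <- data.frame(
  depmap_id = c("ACH-000956", "ACH-000956", "ACH-000000", "ACH-000000"),
  cell_line = c("22RV1_PROSTATE", "22RV1_PROSTATE",
                "MDAMB468_BREAST", "MDAMB468_BREAST"),
  gene_name = c("TSPAN6", "SLC12A2", "TSPAN6", "SLC12A2"),
  protein_expression = c(0.5, -1.0, NA, 2.0),
  stringsAsFactors = FALSE
)

# Keep rows for one gene and drop missing measurements.
slc <- subset(toy, gene_name == "SLC12A2" & !is.na(protein_expression))

# Average the expression per cell line (useful if a line was measured more than once).
per_line <- aggregate(protein_expression ~ cell_line, data = slc, FUN = mean)
print(per_line)
```

The same filtering and aggregation should carry over to the full dataset, assuming it is first loaded with 'depmap_proteomic' and has the columns described here.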

Details

This data originates from the 'protein_quant_current_normalized.csv' file taken from the 20Q2 [Broad Institute](https://depmap.org/portal/download/) cancer dependency study. The derived dataset found in the 'depmap' package adds a foreign key, 'depmap_id', as the first column; it was taken from the 'metadata' dataset. The dataset has been converted to a long-format tibble. Variable names from the original dataset were converted to lower case, put in snake case, and abbreviated where feasible.
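The wide-to-long conversion described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code used to build the package: the toy wide table, its values, and the way the sample label is split into 'cell_line' and 'TenPx' are all assumed for the example.

```r
# Toy wide table: one row per protein, one column per cell-line sample.
# Values are invented; the layout of the original CSV is assumed.
wide <- data.frame(
  gene_name = c("TSPAN6", "SLC12A2"),
  `22RV1_PROSTATE_TenPx01` = c(0.12, -0.40),
  MDAMB468_BREAST_TenPx01 = c(NA, 1.05),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

sample_cols <- setdiff(names(wide), "gene_name")

# Stack the sample columns so each row is one (protein, sample) pair.
long <- data.frame(
  gene_name = rep(wide$gene_name, times = length(sample_cols)),
  protein = rep(sample_cols, each = nrow(wide)),
  protein_expression = unlist(wide[sample_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Split the sample label into the cell line name and the TenPx batch.
long$cell_line <- sub("_TenPx[0-9]+$", "", long$protein)
long$TenPx <- sub("^.*_(TenPx[0-9]+)$", "\\1", long$protein)
print(long)
```

In the long layout, missing measurements remain as explicit NA rows, so they can be filtered or counted like any other value.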

Change log

- 20Q2: Initial dataset consisted of a data frame with 24963776 rows (cell lines) and 12 variables

- 20Q3: no change, no further releases are scheduled at this time.

- 20Q4: no change, no further releases are scheduled at this time.

- 21Q1: no change, no further releases are scheduled at this time.

- 21Q2: no change, no further releases are scheduled at this time.

- 21Q3: no change, no further releases are scheduled at this time.

- 21Q4: no change, no further releases are scheduled at this time.

- 22Q1: no change, no further releases are scheduled at this time.

- 22Q2: no change, no further releases are scheduled at this time.

Source

DepMap, Broad Institute: https://depmap.org/portal/download/

References

David P. Nusinow, John Szpyt, Mahmoud Ghandi, Christopher M. Rose, E. Robert McDonald III, Marian Kalocsay, Judit Jané-Valbuena, Ellen Gelfand, Devin K. Schweppe, Mark Jedrychowski, Javad Golji, Dale A. Porter, Tomas Rejtar, Y. Karen Wang, Gregory V. Kryukov, Frank Stegmeier, Brian K. Erickson, Levi A. Garraway, William R. Sellers, Steven P. Gygi (2020). Quantitative Proteomics of the Cancer Cell Line Encyclopedia. Cell 180, 2.

Examples

## Not run: 
depmap_proteomic()

## End(Not run)


UCLouvain-CBIO/depmap documentation built on Aug. 18, 2024, 9:46 p.m.