mutationCalls | R Documentation |
The 'mutationCalls' dataset contains the merged 22Q2 mutation calls (coding region, germline filtered) and includes data from 18784 genes, 1771 cell lines, 33 primary diseases and 30 lineages. It can be considered the metadata dataset for mutations and does not contain any dependency data. It can be loaded into the R environment with the 'depmap_mutationCalls' function.
mutationCalls
A data frame with 1235466 rows and 32 variables:
depmap_id
Hugo Symbol denotes a unique and meaningful name for each gene (e.g. SAP25)
Gene ID for NCBI Entrez gene database, (e.g. 100316904)
NCBI Build (i.e. reference genome)
Chromosome
Gene start position
Gene end position
Strand location of gene
Variant Classification
Variant Type
Reference Allele
Tumor Seq Allele1
Single Nucleotide Polymorphism Database (dbSNP) reference cluster
dbSNP Val Status
Genome Change
Annotation Transcript
change in cDNA
Codon_Change
Protein_Change
Status of gene knockout on cell lineage
isTCGAhotspot
TCGAhsCnt
isCOSMIChotspot
COSMIChsCnt
ExAC_AF
CGA_WES_AC
SangerWES_AC
RNAseq_AC
HC_AC
RD_AC
WGS_AC
Variant_annotation
This data represents the 'CCLE_mutations.csv' file taken from the 22Q2 [Broad Institute](https://depmap.org/portal/download/) cancer dependency study. The derived dataset in the 'depmap' package adds a foreign key, 'depmap_id', as the first column; this key was taken from the 'metadata' dataset. This dataset has been converted to a long format tibble. Variable names from the original dataset were converted to lower case, put in snake case, and abbreviated where feasible.
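The renaming described above can be illustrated with a minimal base R sketch. The helper `to_snake_case` is hypothetical and is not part of the 'depmap' package; it only covers the lower-casing and snake-casing steps, not the abbreviations applied to the real column names.

```r
# Hypothetical helper (an assumption for illustration, not the package's code):
# lower-case a column name and join its parts with underscores.
to_snake_case <- function(x) {
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)  # split lower-to-upper boundaries
  x <- gsub("[^A-Za-z0-9]+", "_", x)            # any other separator becomes "_"
  tolower(x)
}

# A few names as they appear in the variable list of this page
original_names <- c("Codon_Change", "Protein_Change", "Variant_annotation", "RD_AC")
snake_names <- to_snake_case(original_names)
print(snake_names)
```

The names actually used in the package may differ from this output because of the additional abbreviation step.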
- 19Q1: Initial dataset for package consisted of dataframe with 1243145 rows and 35 variables representing 18755 genes, 1601 cell lines, 37 primary diseases and 33 lineages.
- 19Q2: adds 30 cell lines, 1 primary disease and 1 lineage. This version has different columns than the previous version: the variable "VA_WES_AC" is no longer present in this dataset. Some minor alterations to the original file were made. The first column of the original dataset (ID, Sample number) was removed, as it contained only the row number and did not serve any unique identifying purpose.
- 19Q3: adds 1 gene, 25 cell lines and removes 1 primary disease.
- 19Q4: adds 1 gene, 10 cell lines, 0 primary diseases and 2 lineages.
- 20Q1: adds 4 genes, 31 cell lines, 1 lineage.
- 20Q2: adds 44 cell lines, 1 lineage.
- 20Q3: no change.
- 20Q4: removes 13 genes, adds 8 cell lines and 1 lineage. Columns 'tumor_sample_barcode' and 'sanger_recalib_WES_AC' were removed.
- 21Q1: removes 11 genes and 2 cell lines.
- 21Q2: removes 1 gene and adds 3 cell lines.
- 21Q3: removes 3 genes, 4 cell lines and 1 lineage.
- 21Q4: adds 9 cell lines.
- 22Q1: adds 4 cell lines and 1 lineage. The variable 'tumor_seq_allele1' was renamed 'alt_allele'.
- 22Q2: adds 12 cell lines and removes 2 primary diseases and 8 lineages.
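As a consistency check, the cell line changes listed per release can be accumulated from the 19Q1 baseline. The sketch below uses only the figures stated in the changelog, counting the 21Q3 entry once; the variable names are illustrative.

```r
# Cell line counts: 19Q1 baseline followed by the per-release changes
# stated in the changelog (negative values are removals).
cell_line_changes <- c(
  "19Q1" = 1601, "19Q2" = 30, "19Q3" = 25, "19Q4" = 10,
  "20Q1" = 31,   "20Q2" = 44, "20Q3" = 0,  "20Q4" = 8,
  "21Q1" = -2,   "21Q2" = 3,  "21Q3" = -4, "21Q4" = 9,
  "22Q1" = 4,    "22Q2" = 12
)

# Running total of cell lines after each release
running_total <- cumsum(cell_line_changes)
print(running_total)
```

The 22Q2 total comes to 1771, matching the number of cell lines given in the dataset description.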
DepMap, Broad Institute: https://depmap.org/portal/download/
Tsherniak, A., Vazquez, F., Montgomery, P. G., Weir, B. A., Kryukov, G., Cowley, G. S., ... & Meyers, R. M. (2017). Defining a cancer dependency map. Cell, 170(3), 564-576.
DepMap, Broad (2019): DepMap Achilles 19Q1 Public. https://figshare.com/articles/DepMap_Achilles_19Q1_Public/7655150
Robin M. Meyers, Jordan G. Bryan, James M. McFarland, Barbara A. Weir, ... David E. Root, William C. Hahn, Aviad Tsherniak. Computational correction of copy number effect improves specificity of CRISPR-Cas9 essentiality screens in cancer cells. Nature Genetics 2017 October 49:1779–1784.
Mahmoud Ghandi, Franklin W. Huang, Judit Jané-Valbuena, Gregory V. Kryukov, ... Todd R. Golub, Levi A. Garraway & William R. Sellers. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019).
## Not run:
depmap_mutationCalls()
## End(Not run)
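Once loaded, the data can be summarised by its 'depmap_id' key. The following is a sketch only: `mutations_per_cell_line` is a hypothetical helper, and the toy data frame with made-up IDs stands in for the real download so that the code runs without network access.

```r
# Hypothetical helper (not part of the 'depmap' package): count the
# mutation calls recorded for each cell line, most frequent first.
mutations_per_cell_line <- function(mut) {
  counts <- table(mut$depmap_id)
  sort(counts, decreasing = TRUE)
}

# Toy stand-in for the real dataset; these IDs are invented for illustration
toy <- data.frame(
  depmap_id = c("ACH-000001", "ACH-000001", "ACH-000002"),
  stringsAsFactors = FALSE
)
toy_counts <- mutations_per_cell_line(toy)
print(toy_counts)

# With the real data (requires the 'depmap' package and a download):
# mut <- depmap::depmap_mutationCalls()
# head(mutations_per_cell_line(mut))
```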