knitr::opts_chunk$set(dpi = 300)
knitr::opts_chunk$set(cache = FALSE)
knitr::opts_chunk$set(fig.width = 6, fig.height = 6, width = 8)

library(TCGAbiolinks)
library(MMRFBiolinks)
library(SummarizedExperiment)
library(dplyr)
library(DT)
- **GDC Data Portal**: Information about the MMRF-CoMMpass data available at the GDC Data Portal can be retrieved with the *MMRFGDC_QuerySummary* function. MMRF-CoMMpass data can be queried and downloaded using *TCGAbiolinks::GDCquery* and *TCGAbiolinks::GDCdownload*, respectively, both from the [TCGAbiolinks package](https://bioconductor.org/packages/TCGAbiolinks/).
- **MMRF-CoMMpass Researcher Gateway**: MMRF-CoMMpass data can be downloaded directly from the MMRF-CoMMpass Researcher Gateway, which holds more information than the GDC Data Portal (e.g. best overall response); the GDC data are a subset of the Researcher Gateway data. Once logged in, the user can download data from the Researcher Gateway and import them into R as a data frame for further analysis.
*MMRFGDC_QuerySummary* returns summary information (including the number of cases) about a query obtained from the *GDCquery* function of the TCGAbiolinks package.
# Query the open-access FPKM gene expression files of the MMRF-COMMPASS project
query.mm <- GDCquery(project = "MMRF-COMMPASS",
                     data.category = "Transcriptome Profiling",
                     data.type = "Gene Expression Quantification",
                     workflow.type = "HTSeq - FPKM")

# Download
GDCdownload(query.mm)

# Summarize the query and show the summary as an interactive table
summary <- MMRFGDC_QuerySummary(query.mm)
datatable(summary, rownames = TRUE)
The useful arguments for searching MMRF-CoMMpass Project data are:
Arguments | Description
-----|-----
data.category | A valid data category from the list below
data.type | A data type to filter the files to download
workflow.type | GDC workflow type
access | Filter by access type. Possible values: controlled, open
experimental.strategy | Filter by experimental strategy
barcode | A list of barcodes to filter the files to download
sample.type | A sample type to filter the files to download
The available options for filtering MMRF-COMMPASS data are listed below:

| data.category | data.type | workflow.type | access | experimental.strategy |
|----------------------------- | ----------------------------------- | ------------------------------- | -------------------- |------------------------ |
| Transcriptome Profiling | Gene Expression Quantification | HTSeq - Counts | Open / Controlled | RNA-Seq |
| | | HTSeq - FPKM-UQ | Open / Controlled | RNA-Seq |
| | | HTSeq - FPKM | Open / Controlled | RNA-Seq |
| | | STAR - Counts | Open / Controlled | RNA-Seq |
| | Splice Junction Quantification | STAR - Counts | Open / Controlled | RNA-Seq |
| Simple Nucleotide Variation | Raw Simple Somatic Mutation | MuSE | Controlled | WXS |
| | | SomaticSniper | Controlled | WXS |
| | | VarScan2 | Controlled | WXS |
| | | Pindel | Controlled | WXS |
| | | MuTect2 | Controlled | WXS |
| | Annotated Somatic Mutation | MuSE Annotation | Controlled | WXS |
| | | VarScan2 Annotation | Controlled | WXS |
| | | Pindel Annotation | Controlled | WXS |
| | | MuTect2 Annotation | Controlled | WXS |
| | | SomaticSniper Annotation | Controlled | WXS |
| Sequencing Reads | Aligned Reads | BWA with Mark Duplicates and BQSR | Controlled | WXS / RNA-Seq / WGS |
| | | STAR 2-Pass Genome | Controlled | WXS / RNA-Seq / WGS |
| | | STAR 2-Pass Transcriptome | Controlled | WXS / RNA-Seq / WGS |
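
The combinations in the table above can be encoded as a small lookup table and checked before a query is sent. The following is a minimal sketch in base R; the names `mmrf_options` and `is_valid_mmrf_combo` are hypothetical and are not part of MMRFBiolinks or TCGAbiolinks.

```r
# Valid data.type / workflow.type pairs, transcribed from the table above.
# `mmrf_options` and `is_valid_mmrf_combo` are illustrative names only.
mmrf_options <- data.frame(
  data.type = c(
    rep("Gene Expression Quantification", 4),
    "Splice Junction Quantification",
    rep("Raw Simple Somatic Mutation", 5),
    rep("Annotated Somatic Mutation", 5),
    rep("Aligned Reads", 3)
  ),
  workflow.type = c(
    "HTSeq - Counts", "HTSeq - FPKM-UQ", "HTSeq - FPKM", "STAR - Counts",
    "STAR - Counts",
    "MuSE", "SomaticSniper", "VarScan2", "Pindel", "MuTect2",
    "MuSE Annotation", "VarScan2 Annotation", "Pindel Annotation",
    "MuTect2 Annotation", "SomaticSniper Annotation",
    "BWA with Mark Duplicates and BQSR", "STAR 2-Pass Genome",
    "STAR 2-Pass Transcriptome"
  ),
  stringsAsFactors = FALSE
)

# TRUE if the pair appears in the lookup table, FALSE otherwise.
is_valid_mmrf_combo <- function(data.type, workflow.type) {
  any(mmrf_options$data.type == data.type &
        mmrf_options$workflow.type == workflow.type)
}

is_valid_mmrf_combo("Gene Expression Quantification", "HTSeq - FPKM")
is_valid_mmrf_combo("Gene Expression Quantification", "MuSE")
```

A check like this only reflects the table as written here; the set of workflows available at the GDC may change over time.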
Below we provide further details about the sample.type and barcode arguments.

The options for the sample.type field in the MMRF-COMMPASS project are:

sample_type.code | sample_type.def
-----|-----
TRBM | Recurrent Blood Derived Cancer - Bone Marrow
TBM | Primary Blood Derived Cancer - Bone Marrow
NB | Blood Derived Normal
TB | Primary Blood Derived Cancer - Peripheral Blood
TRB | Recurrent Blood Derived Cancer - Peripheral Blood
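
Because the sample.type argument takes the full definition rather than the short code, a named vector can translate between the two. This is a minimal sketch in base R; `sample_type_def` and `sample_type_from_code` are hypothetical names introduced for illustration and are not provided by MMRFBiolinks.

```r
# Sample type codes and definitions, transcribed from the table above.
sample_type_def <- c(
  TRBM = "Recurrent Blood Derived Cancer - Bone Marrow",
  TBM  = "Primary Blood Derived Cancer - Bone Marrow",
  NB   = "Blood Derived Normal",
  TB   = "Primary Blood Derived Cancer - Peripheral Blood",
  TRB  = "Recurrent Blood Derived Cancer - Peripheral Blood"
)

# Hypothetical helper: translate codes to definitions and stop on unknown codes.
sample_type_from_code <- function(codes) {
  unknown <- setdiff(codes, names(sample_type_def))
  if (length(unknown) > 0) {
    stop("Unknown sample type code(s): ", paste(unknown, collapse = ", "))
  }
  unname(sample_type_def[codes])
}

sample_type_from_code(c("TB", "NB"))
```

The returned strings can then be passed as the sample.type value of a query.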
Example:
# library(DT)
query.mm <- GDCquery(project = "MMRF-COMMPASS",
                     data.category = "Transcriptome Profiling",
                     data.type = "Gene Expression Quantification",
                     workflow.type = "HTSeq - FPKM",
                     sample.type = "Primary Blood Derived Cancer - Peripheral Blood")

getResults(query.mm, cols = c("cases.submitter_id", "sample_type", "cases")) %>%
  DT::datatable(options = list(scrollX = TRUE, keys = TRUE))
Example:
library(DT)
query.mm <- GDCquery(project = "MMRF-COMMPASS",
                     data.category = "Transcriptome Profiling",
                     data.type = "Gene Expression Quantification",
                     workflow.type = "HTSeq - FPKM",
                     barcode = c("MMRF_2473", "MMRF_2111", "MMRF_2362", "MMRF_1824",
                                 "MMRF_1458", "MMRF_1361", "MMRF_2203", "MMRF_2762",
                                 "MMRF_2680", "MMRF_1797"))

getResults(query.mm, cols = c("cases.submitter_id", "sample_type", "cases")) %>%
  datatable(options = list(scrollX = TRUE, keys = TRUE))
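
A mistyped barcode is easy to miss in a long vector, so it can help to check the format before querying. The sketch below assumes that patient barcodes consist of `MMRF_` followed by four digits; that pattern is inferred only from the examples in this section and may not cover every identifier. The helper name `check_mmrf_barcodes` is hypothetical.

```r
# Hypothetical helper: split barcodes into those that match the assumed
# "MMRF_" + four digits pattern and those that do not.
check_mmrf_barcodes <- function(barcodes) {
  ok <- grepl("^MMRF_[0-9]{4}$", barcodes)
  list(valid = barcodes[ok], invalid = barcodes[!ok])
}

checked <- check_mmrf_barcodes(c("MMRF_2473", "MMRF_2111", "MMRF2362"))
checked$invalid
```

Only the `valid` element would then be passed to the barcode argument, after reviewing anything reported as `invalid`.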
Once logged in to the MMRF Researcher Gateway web portal, you can download the dataset you are interested in and import it into R as a data frame for further analysis. Note: currently, the files containing the data used by MMRFBiolinks functions are:

File name | Description
-----|-----
MMRF_CoMMpass_IA14_STAND_ALONE_TRTRESP.csv | Data about the response to treatment
MMRF_CoMMpass_IA14a_All_Canonical_Variants.txt | Data about variants
MMRF_CoMMpass_IA14_PER_PATIENT.csv | Per-patient data (e.g. age, sex, date of death, date of last follow-up)
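
Importing such files as data frames can be sketched with base R as follows. The helper `read_commpass_file` is hypothetical, and it assumes that `.csv` files are comma-separated and `.txt` files are tab-separated, which should be checked against the downloaded files. The toy file and its columns are made up solely to exercise the helper and do not reflect the real file layout.

```r
# Hypothetical helper: choose the field separator from the file extension.
# Assumption: ".csv" is comma-separated and ".txt" is tab-separated.
read_commpass_file <- function(path) {
  ext <- tolower(tools::file_ext(path))
  sep <- switch(ext,
                csv = ",",
                txt = "\t",
                stop("Unsupported file extension: ", ext))
  read.table(path, header = TRUE, sep = sep, quote = "\"",
             stringsAsFactors = FALSE, check.names = FALSE,
             comment.char = "")
}

# Toy file with made-up columns and values, used only to exercise the helper.
toy_path <- file.path(tempdir(), "toy_per_patient.csv")
writeLines(c("patient_id,age", "P1,60", "P2,71"), toy_path)
toy <- read_commpass_file(toy_path)
str(toy)
```

With the real downloads, `path` would point to one of the files listed above.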