UCSCXenaTools is an R package for accessing genomics data from the UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq. Public omics data from UCSC Xena are served through multiple turn-key Xena Hubs, a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others. The databases are normalized so they can be combined, linked, filtered, explored and downloaded.
Who is the target audience and what are the scientific applications of this package?
Install the stable release from CRAN with:
install.packages("UCSCXenaTools")
You can also install the development version of UCSCXenaTools from GitHub with:
# install.packages("remotes")
remotes::install_github("ropensci/UCSCXenaTools")
To build the vignettes locally, add two options:
remotes::install_github("ropensci/UCSCXenaTools", build_vignettes = TRUE, dependencies = TRUE)
All datasets are available at https://xenabrowser.net/datapages/.
Currently, UCSCXenaTools supports the following data hubs of UCSC Xena.
Users can manually update the dataset list to the newest version of UCSC Xena with the XenaDataUpdate() function, followed by restarting R and running library(UCSCXenaTools).
If the URL of any data hub changes or a new data hub comes online, please let me know by emailing w_shixiang@163.com or opening an issue on GitHub.
Downloading UCSC Xena datasets and loading them into R with UCSCXenaTools is a five-step workflow: generate, filter, query, download and prepare. These steps are implemented as the XenaGenerate, XenaFilter, XenaQuery, XenaDownload and XenaPrepare functions, respectively. They are straightforward to use and combine well with other packages such as dplyr.
To show the basic usage of UCSCXenaTools, we will download clinical data for LUNG, LUAD and LUSC from the TCGA (hg19 version) data hub. Users can learn more about UCSCXenaTools by running browseVignettes("UCSCXenaTools") to read the vignettes.
UCSCXenaTools uses XenaData, a data.frame object built into the package, to generate an instance of the XenaHub class, which records information about all datasets in the UCSC Xena Data Hubs.
You can load XenaData after loading UCSCXenaTools into R.
library(UCSCXenaTools)
data(XenaData)
head(XenaData)
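To see which data hubs the bundled XenaData table covers, one option is to tabulate its XenaHostNames column, the same column used for subsetting later in this README. This is only a minimal sketch; the variable names hub_counts and hub_names are illustrative and not part of the package.

```r
library(UCSCXenaTools)
data(XenaData)

# Count how many datasets each hub contributes.
# XenaHostNames is the column the filtering example in this README subsets on.
hub_counts <- sort(table(XenaData$XenaHostNames), decreasing = TRUE)
print(hub_counts)

# Hub names that could be passed to XenaGenerate(subset = XenaHostNames == ...)
hub_names <- names(hub_counts)
print(hub_names)
```

The printed counts depend on the version of XenaData shipped with your installed package.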
Select datasets.
# The options in XenaFilter function support Regular Expression
XenaGenerate(subset = XenaHostNames == "tcgaHub") %>%
  XenaFilter(filterDatasets = "clinical") %>%
  XenaFilter(filterDatasets = "LUAD|LUSC|LUNG") -> df_todo

df_todo
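Before querying, it can help to check what the selection actually contains. The sketch below rebuilds the same selection and inspects it with the package's hosts(), cohorts() and datasets() accessor functions; it works on the bundled XenaData table and should not need network access.

```r
library(UCSCXenaTools)

# Rebuild the selection from the step above
df_todo <- XenaGenerate(subset = XenaHostNames == "tcgaHub") %>%
  XenaFilter(filterDatasets = "clinical") %>%
  XenaFilter(filterDatasets = "LUAD|LUSC|LUNG")

# Inspect the XenaHub object piece by piece
hosts(df_todo)     # hub URL(s) behind the selection
cohorts(df_todo)   # cohorts the selected datasets belong to
datasets(df_todo)  # dataset identifiers that would be queried

# Number of datasets selected
length(datasets(df_todo))
```

If the dataset list is longer or shorter than expected, adjusting the regular expressions passed to filterDatasets is usually the first thing to try.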
Query and download.
XenaQuery(df_todo) %>% XenaDownload() -> xe_download
Prepare data into R for analysis.
cli = XenaPrepare(xe_download)
class(cli)
names(cli)
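Assuming cli is a named list of data frames, as the names(cli) call above suggests, a quick overview of each table's size can be produced with a small helper. The helper summarise_tables() and the toy list below are hypothetical stand-ins for illustration; they are not part of UCSCXenaTools and contain no real Xena data. If only one file was downloaded, the result may be a single table rather than a list, in which case wrap it in list() first.

```r
# Hypothetical helper: report rows and columns for each table in a named list
summarise_tables <- function(tables) {
  data.frame(
    name = names(tables),
    rows = unname(vapply(tables, nrow, integer(1))),
    cols = unname(vapply(tables, ncol, integer(1))),
    row.names = NULL
  )
}

# Toy stand-in for `cli`, so the sketch runs without downloading anything
toy <- list(
  table_a = data.frame(id = 1:3, value = c("a", "b", "c")),
  table_b = data.frame(id = 1:2)
)

overview <- summarise_tables(toy)
print(overview)
```

With real data, summarise_tables(cli) would give one row per downloaded file.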
If you use UCSCXenaTools, please cite the following paper.
Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq. Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627

# For BibTex
@article{Wang2019UCSCXenaTools,
  journal = {Journal of Open Source Software},
  doi = {10.21105/joss.01627},
  issn = {2475-9066},
  number = {40},
  publisher = {The Open Journal},
  title = {The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq},
  url = {https://dx.doi.org/10.21105/joss.01627},
  volume = {4},
  author = {Wang, Shixiang and Liu, Xuesong},
  pages = {1627},
  date = {2019-08-05},
  year = {2019},
  month = {8},
  day = {5},
}
To cite UCSC Xena itself, use the following paper.
Goldman, Mary, et al. "The UCSC Xena Platform for cancer genomics data visualization and interpretation." BioRxiv (2019): 326470.
Anyone who wants to contribute should follow these guidelines:
Open UCSCXenaTools.Rproj with RStudio.
Run devtools::check(), and fix all errors, warnings and notes.
This package is based on XenaR; thanks to Martin Morgan for his work.