knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE )
# devtools::load_all()
# library(flexmix)
# library(QDNAseq)
# library(NMF)
if (!require("VSHunter")) {
  devtools::load_all()
}
The goal of VSHunter is to capture variation signatures from genomic data. For now, it decodes copy number patterns from absolute copy number profiles. This package collects the R code from the paper "Copy number signatures and mutational processes in ovarian carcinoma" and tidies it into an open source R package for the bioinformatics community.
Before using this tool, you need to obtain absolute copy number profiles for your samples with software such as ABSOLUTE v2 or QDNAseq.
knitr::include_graphics(path = "https://media.springernature.com/m685/springer-static/image/art%3A10.1038%2Fs41588-018-0179-8/MediaObjects/41588_2018_179_Fig1_HTML.png")
You can install VSHunter from GitHub with:
# install.packages("devtools")
devtools::install_github("ShixiangWang/VSHunter", build_vignettes = TRUE)
Load package.
library(VSHunter)
Load example data:
load(system.file("extdata/example_cn_list.RData", package = "VSHunter"))
tcga_segTabs is a list containing the absolute copy number profiles of multiple samples; each element of the list is a data.frame for one sample.
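As a rough illustration of that layout, the sketch below builds a small list by hand. The sample names, the values, and the column names (chromosome, start, end, segVal) are assumptions made for illustration; run str(tcga_segTabs[[1]]) on the example data to see the columns the package actually uses.

```r
# Hypothetical input: one data.frame of copy number segments per sample.
# Column names here are assumed, not taken from the package documentation.
seg_tabs <- list(
  sample_A = data.frame(
    chromosome = c("1", "1", "2"),
    start      = c(1, 5000001, 1),
    end        = c(5000000, 9000000, 7000000),
    segVal     = c(2, 4, 1),
    stringsAsFactors = FALSE
  ),
  sample_B = data.frame(
    chromosome = c("1", "2"),
    start      = c(1, 1),
    end        = c(9000000, 7000000),
    segVal     = c(3, 2),
    stringsAsFactors = FALSE
  )
)

# Number of segments per sample
n_segments <- vapply(seg_tabs, nrow, integer(1))
print(n_segments)
```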
Obtain CNV summary info.
tcga_frac = cnv_getLengthFraction(tcga_segTabs)
tcga_features = cnv_derivefeatures(CN_data = tcga_segTabs, cores = 1, genome_build = "hg19")
tcga_components = cnv_fitMixModels(CN_features = tcga_features, cores = 4)
Generate a sample-by-component matrix representing the sum of posterior probabilities of each copy-number event being assigned to each component.
tcga_sample_component_matrix = cnv_generateSbCMatrix(tcga_features, tcga_components, cores = 4)
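To make the summation concrete, here is a minimal base R sketch with made-up numbers. The posterior matrix, the component names, and the sample labels are all hypothetical; the package derives them from the fitted mixture models, and this is not its implementation.

```r
# Hypothetical posterior probabilities: one row per copy-number event,
# one column per mixture component (each row sums to 1).
posterior <- matrix(
  c(0.9, 0.1,
    0.2, 0.8,
    0.6, 0.4,
    0.3, 0.7),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("comp1", "comp2"))
)

# Sample that each event belongs to (assumed labels)
event_sample <- c("sample_A", "sample_A", "sample_B", "sample_B")

# Sum the posterior probabilities of the events within each sample
sample_by_component <- rowsum(posterior, group = event_sample)
print(sample_by_component)
```

In this toy case each row of the result adds up to the number of events in that sample, because every event contributes a total probability of 1.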
tcga_sig_choose = cnv_chooseSigNumber(tcga_sample_component_matrix, nrun = 10, cores = 4)
Skip the test on randomised data to save time.
tcga_sig_choose2 = cnv_chooseSigNumber(tcga_sample_component_matrix, nrun = 10, cores = 4, testRandom = FALSE)
tcga_signatures = cnv_extractSignatures(tcga_sample_component_matrix, nsig = 3, cores = 4)
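Signature extraction is based on non-negative matrix factorisation (NMF). The sketch below is a toy base R version using multiplicative updates, meant only to show the idea of factoring a non-negative matrix V into W %*% H. The function name nmf_sketch, its arguments, the random input, and the matrix orientation are hypothetical, and this is not the algorithm or code that cnv_extractSignatures() runs.

```r
# Toy NMF by multiplicative updates (Lee and Seung), minimising the
# squared reconstruction error. Illustration only.
nmf_sketch <- function(V, rank, n_iter = 200, eps = 1e-9) {
  W <- matrix(runif(nrow(V) * rank), nrow(V), rank)
  H <- matrix(runif(rank * ncol(V)), rank, ncol(V))
  for (i in seq_len(n_iter)) {
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
  }
  list(W = W, H = H)
}

set.seed(1)
# Hypothetical non-negative input: 6 components (rows) x 5 samples (columns)
V <- matrix(runif(30), nrow = 6, ncol = 5)
fit <- nmf_sketch(V, rank = 3)

# In this assumed orientation W holds one column per signature
# and H holds one row per signature.
print(dim(fit$W))
print(dim(fit$H))

# Reconstruction error (Frobenius norm of the residual)
recon_error <- sqrt(sum((V - fit$W %*% fit$H)^2))
print(recon_error)
```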
The function cnv_autoCaptureSignatures() performs the three steps above (choosing the best number of signatures, extracting signatures, and quantifying exposure) in an automated way. Its arguments are the same as those of cnv_chooseSigNumber().
tcga_results = cnv_autoCaptureSignatures(tcga_sample_component_matrix, nrun=10, cores = 4)
The result object is a list that contains everything needed for downstream analysis, including the NMF result for the best rank value, the signature matrix, the absolute and relative exposure (contribution), and the best rank survey.
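The difference between absolute and relative exposure can be sketched in a few lines of base R. The matrix below is invented, and the orientation (signatures in rows, samples in columns) is an assumption; check the returned object for its actual layout.

```r
# Hypothetical absolute exposures: signatures in rows, samples in columns
absolute_exposure <- matrix(
  c(2, 6,
    1, 3,
    1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("Sig", 1:3), c("sample_A", "sample_B"))
)

# Relative exposure: scale each sample (column) so its contributions sum to 1
relative_exposure <- sweep(absolute_exposure, 2, colSums(absolute_exposure), "/")
print(relative_exposure)
```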
This feature is implemented in the cnv_pipe() function.
Visualize CNV distribution by normalized CN length or chromosome.
cnv_plotDistributionProfile(tcga_frac)
cnv_plotDistributionProfile(tcga_frac, mode = "cd")
cnv_plotDistributionProfile(tcga_frac, mode = "cd", fill = TRUE)
Plot functions:
cnv_plotDistributionProfile()
cnv_plotFeatureDistribution()
cnv_plotMixComponents()
cnv_plotSignatures()
If you want to acknowledge this package, you can also cite the following (and include a link to the package, https://github.com/ShixiangWang/VSHunter):