In ShixiangWang/absCopyNumber: Tumor Purity, Ploidy and Absolute Copy Number Estimation

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-",
  cache = TRUE
)

absCopyNumber: Tumor Purity, Ploidy and Absolute Copy Number Estimator

suppressPackageStartupMessages({
  if("absCopyNumber" %in% installed.packages()[,1]) {
    library(absCopyNumber)
  } else {
    devtools::load_all("~/Documents/GitHub/AbsoluteCopyNumber/")
  }
})

The goal of absCopyNumber is to Estimate tumor purity, ploidy and absolute copy numbers from NGS (WGS, WES, Target Sequencing) and Microarray (SNP, aCGH etc.) data, based on statistical method from paper — "AbsCN-seq: a statistical method to estimate tumor purity, ploidy and absolute copy numbers from next-generation sequencing data".

Currently, most of code come from absCNseq package (not maintained), which used for NGS data. I modified some code and extend it for microarray data, like SNP array. The statistical method come from AbsCN-seq is very robust and easy to use, I think it can definitely handle microarray data.

Installation

You can install absCopyNumber from github with:

# install.packages("devtools")
devtools::install_github("ShixiangWang/absCopyNumber")

Example

The following I will show a basic example which shows you how to use absCopyNumber.

Load absCopyNumber

library(absCopyNumber)

Run absCopyNumber from Local Files

This step should identify local path of copy ratio and snv (optional) files. Here we use example files within package.

example_cn = system.file("extdata", "example.cn.txt.gz", package = "absCopyNumber")
example_snv = system.file("extdata", "example.snv.txt.gz", package = "absCopyNumber")

Bacause R can handle zipped files, so gzipped file is okay here. Of note, the input copy number file and somatic mutation file should following some rules:

1. Input copy ratio file ("example.cn.txt")

The input copy ratio file contains segmented copy ratio data, which can be generated by popular segmentation algorithms like 
the "DNAcopy" R package.
It should be a tab-delimited text file with five columns.  
Please use the EXACT header names as below.
1) "chrom"
The chromosome number of a segment. Must be a integer number from 1 to 22.
2) "loc.start"
The start position of a segment.
3) "loc.end"
The end position of a segment.
4) "eff.seg.len"
For exome sequencing, due to the nature of highly uneven coverage (zero coverage for introns), this column gives the number of base pairs with actual observed coverage. 
It can be derived by concatenate all the VARSCAN bins within a segment. 
Note that this length is usually much smaller than the length of the segment which is loc.end-loc.start+1   
5) "normalized.ratio"
The mean copy ratio (tumor DNA vs. germline DNA) of a segment. 
Note that the copy ratio should be normalized to eliminate any sequencing throughput difference between tumor and germline DNA. 
For example, samtools can be used to count total reads that were properly paired/aligned for tumor and germline DNA. 
The difference then need to be adjusted accordingly.  


2. Input SNV file ("example.snv.txt")

The input SNV file contains allele frequency data for somatic mutations.
It should be a tab-delimited text file with three columns.
Please use the EXACT header names as below.
1) "chrom"
The chromosome number of a somatic SNV. Must be a integer number from 1 to 22.
2) "position"
The genomic position of a somatic SNV.
3) "tumor_var_freq"
The proportion of reads supporting the somatic SNV allele. Must be a fraction number.

The example data come from WES platform, next we specify parameter to run absCopyNumber. We use run_fromLocal() function to process data specified as file path.

my.res.list <- run_fromLocal(seg.fn = example_cn, snv.fn = example_snv, platform="WES", min.seg.len=200, verbose = TRUE)

The result is a list contains all solutions and other orignal and model information. Solution result is a data.frame named searchRes in list.

knitr::kable(my.res.list$searchRes, align = 'c', caption = "Solution Data.Frame")

Generally we use top 1 solution, so the corresponding result has been stored at absCN data.frame of my.res.list. We can also mannually select best solution using our own knowledge. After selecting solution, we calculate absolute copy number by get_absCopyNumber function based on purity $\alpha$ and ploidy $\tau$.

# select i-th solution
i = 1
seg.CN <- get_absCopyNumber(my.res.list$seg.dat, my.res.list$searchRes[i,"alpha"], my.res.list$searchRes[i,"tau"])

Last, we plot orignal copy number (Raw) and absolute copy number (Absolute) on a same figure.

plot_absCopyNumber(seg.CN, chromnum=1)

Run absCopyNumber from data.frame

Sometimes, we process data in R console, so absCopyNumber provide function run_fromDF to substitute run_fromLocal. Similar arguments and procedure as above, thus we do not show details here.

cn_df = read.table(example_cn, sep = "\t", header = TRUE,  stringsAsFactors = FALSE)
snv_df = read.table(example_snv, sep = "\t", header = TRUE, stringsAsFactors = FALSE)

Run.

my.res.list2 = run_fromDF(seg.df = cn_df, snv.df = snv_df, platform = "WES", min.seg.len=200)

identical(my.res.list, my.res.list2)

ShixiangWang/absCopyNumber documentation built on May 28, 2019, 5:42 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

ShixiangWang/absCopyNumber
Tumor Purity, Ploidy and Absolute Copy Number Estimation

In ShixiangWang/absCopyNumber: Tumor Purity, Ploidy and Absolute Copy Number Estimation

absCopyNumber: Tumor Purity, Ploidy and Absolute Copy Number Estimator

Installation

Example

Load absCopyNumber

Run absCopyNumber from Local Files

Run absCopyNumber from data.frame

R Package Documentation

Browse R Packages

We want your feedback!

ShixiangWang/absCopyNumber Tumor Purity, Ploidy and Absolute Copy Number Estimation

In ShixiangWang/absCopyNumber: Tumor Purity, Ploidy and Absolute Copy Number Estimation

absCopyNumber: Tumor Purity, Ploidy and Absolute Copy Number Estimator

Installation

Example

Load absCopyNumber

Run absCopyNumber from Local Files

Run absCopyNumber from data.frame

R Package Documentation

Browse R Packages

We want your feedback!

ShixiangWang/absCopyNumber
Tumor Purity, Ploidy and Absolute Copy Number Estimation