knitr::opts_chunk$set(echo = TRUE) devtools::load_all(".") library(HRDX) data("sub01.ploidy", package = "HRDX") data("sub01.segments", package = "HRDX")
\
HRDX comes with a sample data set, the result of running Sequenza VX.X on TCGA subject Y. "sub01.segments" contains information on the segments of the genome, and copynumber status. "sub01.ploidy" contains ploidy estimates. This data can be used for HRD or aneuploid calculation.
head(sub01.segments) print(sub01.ploidy)
\
Preprocessing
The package needs to define the centromere and telomere positions for each chromosome; the overall chromosome size; the length of each segment; the length of the gaps between segments; the p, q, and cross arms; and allelic imbalance. This is all achieved using the function "preprocessHRD". All positions are stored in the reference data within the package; both grch37 and grch38 are supported. The other stuff is calculated from the data.
seq.dat <- sub01.segments[ sub01.segments$chromosome == "chr1",] preprocessHRD( seq.dat, "grch38" )
\
Quality Control
Part of preprocessing is identifying errors in defining segments. For example, on chromosome 14:
cols <- c(1,2,3,10,11,12) seq.dat <- sub01.segments[ sub01.segments$chromosome == "chr14", ] seq.dat[,cols]
notice that the last two segments have a very small gap between them, with identical copynumber status. This suggests that this is really one full segment being incorrectly identified as two. The function combineSegs is called within preprocessHRD to correct these:
seq.dat <- preprocessHRD( seq.dat, "grch38" ) seq.dat[,cols]
After the preprocessing function runs, the last two segments are combined into one.
\ NTAI
is normalized by removing CN segments (why)
HRD Measures
Once the data has been preprocessed, you can calculate HRD measures LST, LOH, TAI, and NTAI. Each metric is retrieved with a corresponding function.
\newline
\
LST
LST is the number of large state transition events.
seq.dat <- preprocessHRD( sub01.segments, "grch38" ) getLST( seq.dat )
\newline
\
LOH
LOH is the number of loss of heterozygosity events.
getLOH( seq.dat )
\
NTAI
Raw NTAI is the number of nontelomeric allelic imbalance events. NTAI is normalized by removing main copynumber segments.
CN.dat <- getCNt( seq.dat ) getNTAI.raw( seq.dat ) getNTAI.norm( seq.dat, CN.dat)
\
HRD Score
An HRD Score is some linear combination of raw or normalized values LST, LOH, NTAI (or TAI). It is intended to return the HRD score without having to specify all the individual arguments. Optionally, you can rescale the data to fit a range of 0 - 100. For a single subject, an HRD score can be easily be calculated from raw data in just a few steps:
seq.dat <- preprocessHRD( sub01.segments, "grch38" ) CN.dat <- getCNt( seq.dat ) HRD <- getHRD.Score( seq.dat, CN.dat ) print(HRD)
\
Ploidy Normalization
HRD will increase with ploidy, but the relationship may vary by cancer type. It is common to observe a logarithmic relationship between the two. In this case, ploidy-effects can be accounted for by fitting a linear model of ploidy on the log of the HRD score, and taking the residuals. This correction is implemented in the function ploidyNorm.log.
Aneuploidy processing uses different criteria than HRD. Therefore, you should run getAneuploidy on the raw data- dont run preprocessHRD.
ap.chr1 <- getAneuploidy(sub01.segments, sub01.ploidy, 1) # summary data ap.chr1[[1]] # full data ap.chr1[[2]]
ap.dat <- getAneuploidyGenome(sub01.segments, sub01.ploidy) ap.dat[[1]] getAneuploidyScore(ap.dat[[1]]) plotAneuploidy(ap.dat[[2]])
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.