chordPredict: Predict the probability of homologous recombination...

View source: R/chordPredict.R

chordPredict    R Documentation

Predict the probability of homologous recombination deficiency using mutational signatures

Description

A wrapper for predict.randomForest() from the randomForest package

Usage

chordPredict(
  features,
  rf.model = CHORD,
  hrd.cutoff = 0.5,
  trans.func = NULL,
  min.indel.load = 50,
  min.sv.load = 30,
  min.msi.indel.rep = 14000,
  do.bootstrap = F,
  bootstrap.iters = 20,
  bootstrap.quantiles = c(0.05, 0.5, 0.95),
  detailed.remarks = T,
  show.features = F,
  verbose = T
)

Arguments

features

The output of extractSigsChord(), which is a dataframe containing the SNV, indel and SV context counts.

rf.model

The random forest model. Defaults to CHORD.

hrd.cutoff

Default=0.5. Samples with an HRD probability greater than or equal to this cutoff will be marked as HRD (is_hrd==TRUE).

trans.func

Function used to transform raw features. Raw features should be in the format: list(snv=matrix(), indel=matrix(), sv=matrix())

min.indel.load

Default=50. The minimum number of indels required to make an accurate HRD prediction. Samples with fewer indels than this value will be marked as is_hrd==NA (HR status could not be confidently determined).

min.sv.load

Default=30. The minimum number of SVs required to make an accurate prediction of BRCA1-type vs. BRCA2-type HRD. Samples with fewer SVs than this value will be marked as hrd_type==NA (HRD type could not be confidently determined).

min.msi.indel.rep

Default=14000 (changing this value is not advised). Samples with more indels within repeats than this threshold will be considered to have microsatellite instability.

do.bootstrap

Test the stability of prediction probabilities? NOTE: this is computationally expensive. Resamples the feature vector for each sample (the number of times is given by bootstrap.iters) and calculates HRD probabilities for each iteration. Returns the probabilities at the quantiles specified in bootstrap.quantiles.

bootstrap.iters

Number of resampling iterations for determining the confidence intervals

bootstrap.quantiles

A numeric vector specifying the quantiles used to calculate the confidence intervals. Default=c(0.05, 0.5, 0.95).

detailed.remarks

If TRUE, shows min.indel.load and min.sv.load numbers in the remarks columns

show.features

If TRUE, appends features to output

verbose

Show messages?
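
The trans.func argument takes a function that operates on raw features in the list(snv=matrix(), indel=matrix(), sv=matrix()) format. The sketch below shows one possible transform in base R: scaling counts to relative contributions within each mutation type. The function name to_relative, the sample names, the column names and the counts are all hypothetical. What chordPredict() expects trans.func to return is not stated here, so the single combined matrix returned by this sketch is an assumption.

```r
# Raw features in the list(snv, indel, sv) format named above.
# All values and dimnames are made up for illustration.
raw <- list(
  snv   = matrix(c(10, 30, 60,  5, 5, 10), nrow = 2, byrow = TRUE,
                 dimnames = list(c("S1", "S2"), c("C>A", "C>G", "C>T"))),
  indel = matrix(c(8, 2,  1, 3), nrow = 2, byrow = TRUE,
                 dimnames = list(c("S1", "S2"), c("del.mh", "ins.rep"))),
  sv    = matrix(c(4, 6,  0, 0), nrow = 2, byrow = TRUE,
                 dimnames = list(c("S1", "S2"), c("DEL", "DUP")))
)

# Hypothetical transform (not part of CHORD): divide each row by its total
# within a mutation type, then bind the three matrices column-wise.
# Rows that sum to zero are kept as zeros instead of becoming NaN.
to_relative <- function(features) {
  scaled <- lapply(features, function(m) {
    totals <- rowSums(m)
    totals[totals == 0] <- 1
    m / totals  # a vector of length nrow(m) recycles down each column
  })
  do.call(cbind, unname(scaled))
}

transformed <- to_relative(raw)
```

A function like this could then be passed as trans.func = to_relative, provided its return value matches what the model was trained on.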

Value

A dataframe containing the HRD probabilities, the bootstrap probabilities (if do.bootstrap=TRUE), and the input features (if show.features=TRUE)
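
As a rough illustration of how hrd.cutoff, min.indel.load and min.sv.load relate to the is_hrd column, the following base R sketch applies the rules from the argument descriptions to a made-up table. The column names p_hrd, n_indel, n_sv and sv_load_ok are hypothetical and are not taken from the actual chordPredict() output; only is_hrd is named in this documentation.

```r
# Hypothetical per-sample values. p_hrd, n_indel and n_sv are illustrative
# column names, not the real chordPredict() output columns.
pred <- data.frame(
  sample  = c("A", "B", "C"),
  p_hrd   = c(0.82, 0.50, 0.91),
  n_indel = c(120, 300, 20),
  n_sv    = c(45, 80, 10)
)

hrd.cutoff     <- 0.5
min.indel.load <- 50
min.sv.load    <- 30

# A probability at or above the cutoff is called HRD
pred$is_hrd <- pred$p_hrd >= hrd.cutoff

# Fewer indels than min.indel.load: HR status could not be confidently determined
pred$is_hrd[pred$n_indel < min.indel.load] <- NA

# Flag samples with enough SVs for a BRCA1-type vs. BRCA2-type call
pred$sv_load_ok <- pred$n_sv >= min.sv.load
```

Sample B sits exactly on the cutoff and is still called HRD, while sample C has a high probability but too few indels, so its status is NA.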

Examples

## Extract mutation contexts
vcf_dir <- '/path_to_vcfs/'
vcf_snv <- paste0(vcf_dir,'SampleX_post_processed_v2.2.vcf.gz')
vcf_indel <- paste0(vcf_dir,'SampleX_post_processed_v2.2.vcf.gz')
vcf_sv <- paste0(vcf_dir,'SampleX_somaticSV_bpi.vcf.gz')
contexts <- extractSigsChord(vcf_snv, vcf_indel, vcf_sv, sample.name='SampleX')

## Predict HRD probability with CHORD
chordPredict(contexts)
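
The idea behind do.bootstrap (resample each sample's feature vector, score every resample, report quantiles) can be sketched in base R without the CHORD package. This is a minimal illustration under stated assumptions, not the package's implementation: the multinomial resampling scheme, the feature names and the toy_score function standing in for the random forest are all hypothetical.

```r
set.seed(1)

# Hypothetical context counts for one sample (names are illustrative)
counts <- c(del.mh = 40, del.rep = 25, ins.rep = 15, del.none = 20)

# Stand-in for the random forest: NOT the CHORD model, just a toy score
toy_score <- function(x) unname(x["del.mh"] / sum(x))

bootstrap.iters     <- 20
bootstrap.quantiles <- c(0.05, 0.5, 0.95)

# Resample the counts while keeping the total mutation load fixed.
# Multinomial resampling is an assumption about the scheme.
resampled <- rmultinom(bootstrap.iters, size = sum(counts),
                       prob = counts / sum(counts))
rownames(resampled) <- names(counts)

# Score each resampled vector (one per column), then summarise the
# scores at the requested quantiles
p_boot <- apply(resampled, 2, toy_score)
boot_q <- quantile(p_boot, probs = bootstrap.quantiles)
```

A narrow spread between the lowest and highest quantile suggests the score is stable under resampling; a wide spread suggests it depends on only a few mutations.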

luannnguyen/CHORD documentation built on Aug. 25, 2023, 10:04 a.m.