run_UNMASC: run_UNMASC

View source: R/UNMASC.R

run_UNMASC    R Documentation

run_UNMASC

Description

This function runs the UNMASC workflow, from processing a formatted data frame of variants to writing out annotated variants.

Usage

run_UNMASC(
  tumorID,
  outdir,
  vcf = NULL,
  tBAM_fn = NULL,
  bed_centromere_fn,
  dict_chrom_fn,
  qscore_thres = 30,
  exac_thres = 0.005,
  ad_thres = 5,
  rd_thres = 10,
  cut_BAF = 0.05,
  minBQ = 13,
  minMQ = 40,
  eps_thres = 0.5,
  psi_thres = 0.02,
  hg = "19",
  binom = TRUE,
  gender = NA,
  flag_samp_depth_thres = c(1000, 50),
  ncores = 1
)

Arguments

tumorID

A character string unique to a tumor's output directory and files.

outdir

A character string of the full path and working directory specifying where intermediate and final files are stored.

vcf

If run_UNMASC() has been run to create an image.rds file, this can be set to NULL. Otherwise, it is a data frame containing the following columns:

"Chr": chromosome (e.g. 'chr1')
"Position": position (e.g. 1000)
"Ref": reference allele (e.g. "A")
"Alt": alternate allele (e.g. "T")
"Qscore": variant quality score (e.g. 30)
"GeneName": Hugo symbol (e.g. 'TP53')
"TARGET": on/off target status (e.g. 'YES', 'NO')
"EXONIC": exonic/intronic status (e.g. 'YES', 'NO')
"IMPACT": snpEff impact (e.g. 'MODIFIER', 'LOW', 'MODERATE', 'HIGH')
"STUDYNUMBER": normal control ID (e.g. "normal_1")
"CosmicOverlaps": number of COSMIC overlaps (e.g. 10)
"ThsdG_AF": 1000 Genomes population allele frequency (e.g. 0.5)
"EXAC_AF": ExAC population allele frequency (e.g. 0.1)
"nAD": control alternate depth (e.g. 10)
"nRD": control reference depth (e.g. 20)
"tAD": tumor alternate depth (e.g. 10)
"tRD": tumor reference depth (e.g. 20)
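As a rough illustration of the expected layout, the sketch below builds a two-row data frame with the documented columns in base R. All values are invented (including the placeholder gene name 'GENE1'), and the column types are an assumption inferred from the example values, since the types are not stated explicitly.

```r
# Toy variant table with the columns documented for `vcf`.
# Values are made up; column types are assumed from the documented examples.
vcf <- data.frame(
  Chr            = c("chr1", "chr17"),
  Position       = c(1000, 7577120),
  Ref            = c("A", "C"),
  Alt            = c("T", "T"),
  Qscore         = c(30, 45),
  GeneName       = c("GENE1", "TP53"),   # 'GENE1' is a placeholder name
  TARGET         = c("NO", "YES"),
  EXONIC         = c("NO", "YES"),
  IMPACT         = c("MODIFIER", "HIGH"),
  STUDYNUMBER    = c("normal_1", "normal_1"),
  CosmicOverlaps = c(0, 10),
  ThsdG_AF       = c(0.5, 0),
  EXAC_AF        = c(0.1, 0),
  nAD            = c(10, 0),
  nRD            = c(20, 30),
  tAD            = c(10, 15),
  tRD            = c(20, 25),
  stringsAsFactors = FALSE
)

# Check that every documented column is present before passing `vcf` on.
required <- c("Chr", "Position", "Ref", "Alt", "Qscore", "GeneName",
              "TARGET", "EXONIC", "IMPACT", "STUDYNUMBER", "CosmicOverlaps",
              "ThsdG_AF", "EXAC_AF", "nAD", "nRD", "tAD", "tRD")
missing_cols <- setdiff(required, names(vcf))
stopifnot(length(missing_cols) == 0)
```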

tBAM_fn

A character string specifying the full path to the tumor's BAM file. Can be set to NULL if strand.rds already exists.

bed_centromere_fn

Centromere regions filename. This should be tab delimited without headers containing columns contig (e.g. 'chr1'), start position (e.g. 100), and end position (e.g. 100000).
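A minimal sketch of writing such a file from R is shown here. The contigs and coordinates are placeholders rather than real centromere annotations, and the temporary file path is only for illustration.

```r
# Placeholder centromere intervals (not real annotations).
# Integer columns keep positions from being written in scientific notation.
centromeres <- data.frame(
  contig = c("chr1", "chr2"),
  start  = c(100L, 200L),
  end    = c(100000L, 200000L),
  stringsAsFactors = FALSE
)

# Write a tab-delimited file without a header row.
bed_centromere_fn <- tempfile(fileext = ".bed")
write.table(centromeres, file = bed_centromere_fn, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read the file back to confirm it has three unnamed columns.
chk <- read.table(bed_centromere_fn, sep = "\t", header = FALSE,
                  stringsAsFactors = FALSE)
```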

dict_chrom_fn

Chromosome lengths file. This can be constructed from the output of 'samtools view -H' applied to a BAM file.
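The sketch below shows one way to pull contig names and lengths out of header text of the kind 'samtools view -H' prints, using the standard @SQ lines with SN and LN tags. The header lines are placeholders, and the exact file layout that dict_chrom_fn must follow is not specified here, so this only covers the extraction step.

```r
# Placeholder header lines in SAM header format; the lengths are invented.
header <- c(
  "@HD\tVN:1.6\tSO:coordinate",
  "@SQ\tSN:chr1\tLN:1000",
  "@SQ\tSN:chr2\tLN:2000",
  "@PG\tID:bwa\tPN:bwa"
)

# Keep only the sequence dictionary lines and split them on tabs.
sq <- grep("^@SQ", header, value = TRUE)
fields <- strsplit(sq, "\t", fixed = TRUE)

# Helper (hypothetical name): return the value of one TAG:value field.
get_tag <- function(x, tag) {
  hit <- grep(paste0("^", tag, ":"), x, value = TRUE)
  sub(paste0("^", tag, ":"), "", hit[1])
}

chrom_lengths <- data.frame(
  contig = vapply(fields, get_tag, character(1), tag = "SN"),
  length = as.integer(vapply(fields, get_tag, character(1), tag = "LN")),
  stringsAsFactors = FALSE
)
```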

qscore_thres

A numeric value specifying a minimum allowed Qscore or QUAL value for variant calls.

exac_thres

A numeric value specifying a maximum allowed ExAC allele frequency for variant calls.

ad_thres

A numeric value specifying a minimum allowed number of alternate read counts for variant calls.

rd_thres

A numeric value specifying a minimum allowed total read depth for variant calls.
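To make the four thresholds above concrete, here is a hedged sketch that treats them as simple row filters on a toy table. It assumes the depth thresholds apply to the tumor counts (tAD, tRD) and that total depth is tAD + tRD. The documentation does not state which counts are used or in what order filters run, so this is an illustration rather than UNMASC's actual implementation.

```r
# Toy variant calls; all values are invented.
calls <- data.frame(
  Qscore  = c(50, 20, 40, 35),
  EXAC_AF = c(0, 0, 0.2, 0.001),
  tAD     = c(8, 12, 9, 3),
  tRD     = c(30, 40, 25, 50)
)

# Default threshold values from the Usage section.
qscore_thres <- 30
exac_thres   <- 0.005
ad_thres     <- 5
rd_thres     <- 10

# Assumed filter logic: minimums use >=, the ExAC maximum uses <=.
keep <- calls$Qscore >= qscore_thres &
  calls$EXAC_AF <= exac_thres &
  calls$tAD >= ad_thres &
  (calls$tAD + calls$tRD) >= rd_thres

filtered <- calls[keep, ]
```

In this toy table only the first row passes: the second fails the Qscore minimum, the third exceeds the ExAC maximum, and the fourth has too few alternate reads.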

cut_BAF

A numeric value specifying the BAF threshold to remove variants before running tumor VAF segmentation. By default, cut_BAF is set to 0.05.

minBQ

An integer for minimum base quality.

minMQ

An integer for minimum mapping quality.

eps_thres

A numeric value specifying the threshold for determining an H2M segment.

psi_thres

A numeric value specifying the threshold for determining an H2M segment.

hg

A character string giving the human genome build (e.g. "19"). This is used for labeling output plots.

binom

A logical value. If TRUE, read counts are modeled as binomial distributed; if FALSE, as beta-binomial distributed.

gender

A single string specifying the subject's gender. Valid inputs are "MALE" and "FEMALE". By default, tumor variant read counts on chromosomes 1 through 22 are segmented. If gender = "FEMALE", chromosome X is also segmented.

flag_samp_depth_thres

A vector of two integer thresholds. The first is the minimum number of unique loci whose total tumor read depth must exceed the second. For example, if flag_samp_depth_thres = c(1000, 50) and fewer than 1000 unique loci in the tumor have total depths greater than 50, the sample is flagged and UNMASC exits.
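The rule described above can be sketched in a few lines of base R. This assumes that a locus is identified by its Chr and Position and that total tumor depth is tAD + tRD. The toy data are invented, and the variable names n_deep and flagged are illustrative only.

```r
flag_samp_depth_thres <- c(1000, 50)

# Toy tumor counts; the first two rows describe the same locus.
tumor <- data.frame(
  Chr      = c("chr1", "chr1", "chr2", "chr2"),
  Position = c(100, 100, 200, 300),
  tAD      = c(30, 30, 5, 40),
  tRD      = c(40, 40, 10, 20),
  stringsAsFactors = FALSE
)

# Keep one row per unique locus (assumed to be Chr + Position).
loci <- tumor[!duplicated(tumor[, c("Chr", "Position")]), ]

# Count loci whose total depth exceeds the second threshold.
depth  <- loci$tAD + loci$tRD
n_deep <- sum(depth > flag_samp_depth_thres[2])

# The sample is flagged when too few loci reach that depth.
flagged <- n_deep < flag_samp_depth_thres[1]
```

With these toy values, two of the three unique loci have depth above 50, which is far below 1000, so the sample would be flagged.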

ncores

A positive integer for the number of threads to use for calculating strand-specific read counts.

Value

The function returns NULL. UNMASC results are written to files.


pllittle/UNMASC documentation built on June 1, 2025, 1 p.m.