battenberg: Run the Battenberg pipeline

View source: R/battenberg.R

Description

Run the Battenberg pipeline

Usage

battenberg(
  tumourname,
  normalname,
  tumour_data_file,
  normal_data_file,
  imputeinfofile,
  g1000prefix,
  problemloci,
  gccorrectprefix = NULL,
  repliccorrectprefix = NULL,
  g1000allelesprefix = NA,
  ismale = NA,
  data_type = "wgs",
  impute_exe = "impute2",
  allelecounter_exe = "alleleCounter",
  nthreads = 8,
  platform_gamma = 1,
  phasing_gamma = 1,
  segmentation_gamma = 10,
  segmentation_kmin = 3,
  phasing_kmin = 1,
  clonality_dist_metric = 0,
  ascat_dist_metric = 1,
  min_ploidy = 1.6,
  max_ploidy = 4.8,
  min_rho = 0.1,
  min_goodness = 0.63,
  uninformative_BAF_threshold = 0.51,
  min_normal_depth = 10,
  min_base_qual = 20,
  min_map_qual = 35,
  calc_seg_baf_option = 3,
  skip_allele_counting = FALSE,
  skip_preprocessing = FALSE,
  skip_phasing = FALSE,
  externalhaplotypefile = NA,
  usebeagle = FALSE,
  beaglejar = NA,
  beagleref.template = NA,
  beagleplink.template = NA,
  beaglemaxmem = 10,
  beaglenthreads = 1,
  beaglewindow = 40,
  beagleoverlap = 4,
  javajre = "java",
  write_battenberg_phasing = TRUE,
  multisample_relative_weight_balanced = 0.25,
  multisample_maxlag = 100,
  segmentation_gamma_multisample = 5,
  snp6_reference_info_file = NA,
  apt.probeset.genotype.exe = "apt-probeset-genotype",
  apt.probeset.summarize.exe = "apt-probeset-summarize",
  norm.geno.clust.exe = "normalize_affy_geno_cluster.pl",
  birdseed_report_file = "birdseed.report.txt",
  heterozygousFilter = "none",
  prior_breakpoints_file = NULL,
  GENOMEBUILD = "hg19"
)
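
A minimal sketch of a WGS call is given below. The sample names, the reference directory and every file name in it are hypothetical placeholders, and the package name Battenberg is assumed; only the argument names come from the signature above. The call is guarded so that the snippet does nothing unless the input files exist and the package is installed.

```r
# Hypothetical reference directory; replace with the real Battenberg reference data.
ref_dir <- "/path/to/battenberg_reference"

# Arguments for a whole-genome run. All paths and sample names are placeholders.
wgs_args <- list(
  tumourname          = "tumour01",
  normalname          = "normal01",
  tumour_data_file    = "/path/to/tumour01.bam",
  normal_data_file    = "/path/to/normal01.bam",
  imputeinfofile      = file.path(ref_dir, "impute_info.txt"),
  g1000prefix         = file.path(ref_dir, "1000genomes_loci_chr"),
  g1000allelesprefix  = file.path(ref_dir, "1000genomes_alleles_chr"),
  problemloci         = file.path(ref_dir, "problem_loci.txt"),
  gccorrectprefix     = file.path(ref_dir, "gc_content_chr"),
  repliccorrectprefix = file.path(ref_dir, "replication_timing_chr"),
  ismale              = TRUE,
  data_type           = "wgs",
  nthreads            = 4
)

# Only run when both BAM files are present and the package is available.
inputs_present <- all(file.exists(wgs_args$tumour_data_file,
                                  wgs_args$normal_data_file))
if (inputs_present && requireNamespace("Battenberg", quietly = TRUE)) {
  do.call(Battenberg::battenberg, wgs_args)
}
```

Collecting the arguments in a list first makes it easy to reuse the same reference settings across samples and to change only the sample-specific entries.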

Arguments

tumourname

Tumour identifier, used as a prefix for the output files. If allele counts are supplied separately, they are expected to have this identifier as a prefix.

normalname

Matched normal identifier, used as a prefix for the output files. If allele counts are supplied separately, they are expected to have this identifier as a prefix.

tumour_data_file

A BAM or CEL file for the tumour

normal_data_file

A BAM or CEL file for the normal

imputeinfofile

Full path to a Battenberg impute info file with pointers to Impute2 reference data

g1000prefix

Full prefix path to 1000 Genomes SNP loci data, as part of the Battenberg reference data

problemloci

Full path to a problem loci file that contains SNP loci that should be filtered out

gccorrectprefix

Full prefix path to GC content files, as part of the Battenberg reference data, not required for SNP6 data (Default: NULL)

repliccorrectprefix

Full prefix path to replication timing files, as part of the Battenberg reference data, not required for SNP6 data (Default: NULL)

g1000allelesprefix

Full prefix path to 1000 Genomes SNP alleles data, as part of the Battenberg reference data, not required for SNP6 data (Default: NA)

ismale

A boolean set to TRUE if the donor is male, set to FALSE if female, not required for SNP6 data (Default: NA)

data_type

String specifying the type of input data, either wgs or snp6 (Default: wgs)

impute_exe

Pointer to the Impute2 executable (Default: impute2, i.e. expected in $PATH)

allelecounter_exe

Pointer to the alleleCounter executable (Default: alleleCounter, i.e. expected in $PATH)

nthreads

The number of concurrent processes to use while running the Battenberg pipeline (Default: 8)

platform_gamma

Platform scaling factor; suggested values are 1 for wgs and 0.55 for snp6 (Default: 1)

phasing_gamma

Gamma parameter used when correcting phasing mistakes (Default: 1)

segmentation_gamma

The gamma parameter controls the size of the penalty of starting a new segment during segmentation. It is therefore the key parameter for controlling the number of segments (Default: 10)

segmentation_kmin

Kmin represents the minimum number of probes/SNPs that a segment should consist of (Default: 3)

phasing_kmin

Kmin used when correcting for phasing mistakes (Default: 1)

clonality_dist_metric

Distance metric to use when choosing purity/ploidy combinations (Default: 0)

ascat_dist_metric

Distance metric to use when choosing purity/ploidy combinations (Default: 1)

min_ploidy

Minimum ploidy to be considered (Default: 1.6)

max_ploidy

Maximum ploidy to be considered (Default: 4.8)

min_rho

Minimum purity to be considered (Default: 0.1)

min_goodness

Minimum goodness of fit required for a purity/ploidy combination to be accepted as a solution (Default: 0.63)

uninformative_BAF_threshold

The threshold beyond which BAF becomes uninformative (Default: 0.51)

min_normal_depth

Minimum depth required in the matched normal for a SNP to be considered as part of the wgs analysis (Default: 10)

min_base_qual

Minimum base quality required for a read to be counted when allele counting (Default: 20)

min_map_qual

Minimum mapping quality required for a read to be counted when allele counting (Default: 35)

calc_seg_baf_option

Sets how BAF is calculated per segment: 1=mean, 2=median, 3=mean if the median is exactly 0 or 1, otherwise the median (Default: 3)

skip_allele_counting

Provide TRUE when allele counting can be skipped (i.e. it has already been done) (Default: FALSE)

skip_preprocessing

Provide TRUE when preprocessing is already complete (Default: FALSE)

skip_phasing

Provide TRUE when phasing is already complete (Default: FALSE)

externalhaplotypefile

VCF file containing externally obtained haplotype blocks (Default: NA)

usebeagle

Set to TRUE to use Beagle5 instead of Impute2 (Default: FALSE)

beaglejar

Full path to the Beagle Java jar file (Default: NA)

beagleref.template

Full path template to the Beagle reference files, where the chromosome is replaced by 'CHROMNAME' (Default: NA)

beagleplink.template

Full path template to the Beagle plink files, where the chromosome is replaced by 'CHROMNAME' (Default: NA)

beaglemaxmem

Integer, maximum Java heap size for Beagle in GB (Default: 10)

beaglenthreads

Integer, number of threads used by Beagle5 (Default: 1)

beaglewindow

Integer, size of the genomic window used by Beagle5, in cM (Default: 40)

beagleoverlap

Integer, size of the overlap between Beagle5 windows (Default: 4)

javajre

Path to the Java JRE executable, only required for haplotype reconstruction with Beagle (Default: java, i.e. expected in $PATH)

write_battenberg_phasing

Write the Battenberg phasing results to disk as a VCF file, e.g. for multisample cases (Default: TRUE)

multisample_relative_weight_balanced

Relative weight to give to haplotype info from a sample without allelic imbalance in the region (Default: 0.25)

multisample_maxlag

Maximal number of upstream SNPs used in the multisample haplotyping to inform the haplotype at another SNP (Default: 100)

segmentation_gamma_multisample

The gamma parameter controls the size of the penalty of starting a new segment during multisample segmentation. It is the key parameter for controlling the number of segments (Default: 5)

snp6_reference_info_file

Reference files for the SNP6 pipeline only (Default: NA)

apt.probeset.genotype.exe

Helper tool for extracting data from CEL files, SNP6 pipeline only (Default: apt-probeset-genotype)

apt.probeset.summarize.exe

Helper tool for extracting data from CEL files, SNP6 pipeline only (Default: apt-probeset-summarize)

norm.geno.clust.exe

Helper tool for extracting data from CEL files, SNP6 pipeline only (Default: normalize_affy_geno_cluster.pl)

birdseed_report_file

Sex inference output file, SNP6 pipeline only (Default: birdseed.report.txt)

heterozygousFilter

Legacy option to set a heterozygous SNP filter, SNP6 pipeline only (Default: "none")

prior_breakpoints_file

A two column file with prior breakpoints to be used during segmentation (Default: NULL)

GENOMEBUILD

Genome build on which the 1000 Genomes SNP coordinates are based (Default: "hg19"; options: "hg19" or "hg38")
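
The three settings of calc_seg_baf_option listed under Arguments can be illustrated with a small helper. The function name segment_baf and the example values are hypothetical; this is a sketch of the rule as described in the argument text, not the code the package itself runs.

```r
# Summarise the BAF values of one segment according to the chosen option:
#   1 = mean, 2 = median, 3 = mean if the median is exactly 0 or 1, else median.
segment_baf <- function(baf, option = 3) {
  seg_mean   <- mean(baf)
  seg_median <- median(baf)
  switch(as.character(option),
    "1" = seg_mean,
    "2" = seg_median,
    "3" = if (seg_median == 0 || seg_median == 1) seg_mean else seg_median,
    stop("option must be 1, 2 or 3")
  )
}

# A segment whose median sits at the boundary value 0 falls back to the mean.
boundary_segment <- c(0, 0, 0, 0.4)
segment_baf(boundary_segment, option = 3)

# A segment with an interior median keeps the median.
interior_segment <- c(0.2, 0.3, 0.7)
segment_baf(interior_segment, option = 3)
```

Option 3 behaves like the median for most segments and only switches to the mean when the median is pinned at 0 or 1, where it would carry no information about the spread of the values.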
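
The 'CHROMNAME' placeholder used by beagleref.template and beagleplink.template can be pictured as a plain string substitution per chromosome. The helper name expand_chrom_template and the template path below are hypothetical, and the snippet only illustrates the substitution; it does not claim to mirror how the package resolves the templates internally.

```r
# Replace the literal placeholder CHROMNAME with each chromosome name in turn.
expand_chrom_template <- function(template, chroms) {
  vapply(
    chroms,
    function(chrom) sub("CHROMNAME", chrom, template, fixed = TRUE),
    character(1),
    USE.NAMES = FALSE
  )
}

# Hypothetical template path; the real file layout depends on the reference bundle.
ref_template <- "/path/to/beagle/chrCHROMNAME.reference.vcf.gz"
ref_files <- expand_chrom_template(ref_template, c("1", "2", "X"))
print(ref_files)
```

Checking the expanded paths with file.exists before starting a run is a cheap way to catch a mistyped template.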
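
To give some intuition for segmentation_gamma and segmentation_kmin, the toy example below fits a piecewise-constant signal by minimising the squared error plus a penalty of gamma per segment, with every segment at least kmin points long. The function toy_segment and its data are hypothetical, and Battenberg's own segmentation code is not reproduced here; the sketch only shows how a larger gamma leads to fewer segments.

```r
# Penalised least-squares segmentation by dynamic programming.
# Returns the start index of every segment in the best segmentation.
toy_segment <- function(y, gamma, kmin = 1) {
  n <- length(y)
  stopifnot(kmin >= 1, n >= kmin)
  cs  <- c(0, cumsum(y))      # prefix sums of y
  cs2 <- c(0, cumsum(y^2))    # prefix sums of y^2
  # Sum of squared deviations from the mean for y[i..j].
  sse <- function(i, j) {
    s <- cs[j + 1] - cs[i]
    (cs2[j + 1] - cs2[i]) - s^2 / (j - i + 1)
  }
  best <- c(0, rep(Inf, n))             # best[j + 1]: optimal cost for y[1..j]
  last_start <- rep(NA_integer_, n + 1) # start of the final segment of y[1..j]
  for (j in kmin:n) {
    for (i in seq_len(j - kmin + 1)) {  # candidate final segment y[i..j]
      cost <- best[i] + sse(i, j) + gamma
      if (cost < best[j + 1]) {
        best[j + 1] <- cost
        last_start[j + 1] <- i
      }
    }
  }
  # Walk back from the end to recover the segment starts.
  starts <- integer(0)
  j <- n
  while (j > 0) {
    i <- last_start[j + 1]
    starts <- c(i, starts)
    j <- i - 1
  }
  starts
}

# Noise-free step signal: ten values at 0 followed by ten values at 1.
y <- c(rep(0, 10), rep(1, 10))
toy_segment(y, gamma = 0.5, kmin = 3)  # small penalty: the step is kept (starts 1 and 11)
toy_segment(y, gamma = 20, kmin = 3)   # large penalty: a single segment (start 1)
```

In this toy setting the step is split off only while the reduction in squared error (5 for this signal) exceeds the extra penalty of one more segment.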

Author(s)

sd11, jdemeul, Naser Ansari-Pour


Wedge-Oxford/battenberg documentation built on Aug. 4, 2023, 6:27 p.m.