battenberg | R Documentation |
Run the Battenberg pipeline
battenberg(
tumourname,
normalname,
tumour_data_file,
normal_data_file,
imputeinfofile,
g1000prefix,
problemloci,
gccorrectprefix = NULL,
repliccorrectprefix = NULL,
g1000allelesprefix = NA,
ismale = NA,
data_type = "wgs",
impute_exe = "impute2",
allelecounter_exe = "alleleCounter",
nthreads = 8,
platform_gamma = 1,
phasing_gamma = 1,
segmentation_gamma = 10,
segmentation_kmin = 3,
phasing_kmin = 1,
clonality_dist_metric = 0,
ascat_dist_metric = 1,
min_ploidy = 1.6,
max_ploidy = 4.8,
min_rho = 0.1,
min_goodness = 0.63,
uninformative_BAF_threshold = 0.51,
min_normal_depth = 10,
min_base_qual = 20,
min_map_qual = 35,
calc_seg_baf_option = 3,
skip_allele_counting = F,
skip_preprocessing = F,
skip_phasing = F,
externalhaplotypefile = NA,
usebeagle = FALSE,
beaglejar = NA,
beagleref.template = NA,
beagleplink.template = NA,
beaglemaxmem = 10,
beaglenthreads = 1,
beaglewindow = 40,
beagleoverlap = 4,
javajre = "java",
write_battenberg_phasing = T,
multisample_relative_weight_balanced = 0.25,
multisample_maxlag = 100,
segmentation_gamma_multisample = 5,
snp6_reference_info_file = NA,
apt.probeset.genotype.exe = "apt-probeset-genotype",
apt.probeset.summarize.exe = "apt-probeset-summarize",
norm.geno.clust.exe = "normalize_affy_geno_cluster.pl",
birdseed_report_file = "birdseed.report.txt",
heterozygousFilter = "none",
prior_breakpoints_file = NULL,
GENOMEBUILD = "hg19"
)
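A call for a whole-genome sequencing pair might look like the sketch below. Every sample name and path is a hypothetical placeholder, and the package name `Battenberg` in the namespace check is an assumption. Arguments that are not listed keep the defaults shown in the usage above. The call is guarded so it only runs when the package is installed and both BAM files exist.

```r
# Hypothetical sample identifiers and reference paths; replace with your own.
args <- list(
  tumourname          = "tumour_01",
  normalname          = "normal_01",
  tumour_data_file    = "/path/to/tumour_01.bam",
  normal_data_file    = "/path/to/normal_01.bam",
  imputeinfofile      = "/path/to/reference/impute_info.txt",
  g1000prefix         = "/path/to/reference/1000G_loci_chr",
  g1000allelesprefix  = "/path/to/reference/1000G_alleles_chr",
  gccorrectprefix     = "/path/to/reference/GC_correction_chr",
  repliccorrectprefix = "/path/to/reference/replic_correction_chr",
  problemloci         = "/path/to/reference/problem_loci.txt",
  ismale              = TRUE,
  data_type           = "wgs",
  nthreads            = 4,
  GENOMEBUILD         = "hg19"
)

# Run only when the package (name assumed) is installed and the inputs exist.
if (requireNamespace("Battenberg", quietly = TRUE) &&
    file.exists(args$tumour_data_file) &&
    file.exists(args$normal_data_file)) {
  do.call(Battenberg::battenberg, args)
}
```

Collecting the arguments in a list first makes it easy to inspect or log the configuration before starting a long-running pipeline.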
tumourname |
Tumour identifier, used as a prefix for the output files. If allele counts are supplied separately, they are expected to have this identifier as prefix. |
normalname |
Matched normal identifier, used as a prefix for the output files. If allele counts are supplied separately, they are expected to have this identifier as prefix. |
tumour_data_file |
A BAM or CEL file for the tumour |
normal_data_file |
A BAM or CEL file for the normal |
imputeinfofile |
Full path to a Battenberg impute info file with pointers to Impute2 reference data |
g1000prefix |
Full prefix path to 1000 Genomes SNP loci data, as part of the Battenberg reference data |
problemloci |
Full path to a problem loci file that contains SNP loci that should be filtered out |
gccorrectprefix |
Full prefix path to GC content files, as part of the Battenberg reference data, not required for SNP6 data (Default: NULL) |
repliccorrectprefix |
Full prefix path to replication timing files, as part of the Battenberg reference data, not required for SNP6 data (Default: NULL) |
g1000allelesprefix |
Full prefix path to 1000 Genomes SNP alleles data, as part of the Battenberg reference data, not required for SNP6 data (Default: NA) |
ismale |
A boolean set to TRUE if the donor is male, set to FALSE if female, not required for SNP6 data (Default: NA) |
data_type |
String that contains either wgs or snp6 depending on the supplied input data (Default: wgs) |
impute_exe |
Pointer to the Impute2 executable (Default: impute2, i.e. expected in $PATH) |
allelecounter_exe |
Pointer to the alleleCounter executable (Default: alleleCounter, i.e. expected in $PATH) |
nthreads |
The number of concurrent processes to use while running the Battenberg pipeline (Default: 8) |
platform_gamma |
Platform scaling factor; suggested values are 1 for wgs and 0.55 for snp6 (Default: 1) |
phasing_gamma |
Gamma parameter used when correcting phasing mistakes (Default: 1) |
segmentation_gamma |
The gamma parameter controls the size of the penalty of starting a new segment during segmentation. It is therefore the key parameter for controlling the number of segments (Default: 10) |
segmentation_kmin |
Kmin represents the minimum number of probes/SNPs that a segment should consist of (Default: 3) |
phasing_kmin |
Kmin used when correcting for phasing mistakes (Default: 1) |
clonality_dist_metric |
Distance metric to use when choosing purity/ploidy combinations (Default: 0) |
ascat_dist_metric |
Distance metric to use when choosing purity/ploidy combinations (Default: 1) |
min_ploidy |
Minimum ploidy to be considered (Default: 1.6) |
max_ploidy |
Maximum ploidy to be considered (Default: 4.8) |
min_rho |
Minimum purity to be considered (Default: 0.1) |
min_goodness |
Minimum goodness of fit required for a purity/ploidy combination to be accepted as a solution (Default: 0.63) |
uninformative_BAF_threshold |
The threshold beyond which BAF becomes uninformative (Default: 0.51) |
min_normal_depth |
Minimum depth required in the matched normal for a SNP to be considered as part of the wgs analysis (Default: 10) |
min_base_qual |
Minimum base quality required for a read to be counted when allele counting (Default: 20) |
min_map_qual |
Minimum mapping quality required for a read to be counted when allele counting (Default: 35) |
calc_seg_baf_option |
Sets how BAF is calculated per segment: 1=mean, 2=median, 3=mean if the median is 0 or 1, otherwise median (Default: 3) |
skip_allele_counting |
Provide TRUE when allele counting can be skipped (i.e. it's already done) (Default: FALSE) |
skip_preprocessing |
Provide TRUE when preprocessing is already complete (Default: FALSE) |
skip_phasing |
Provide TRUE when phasing is already complete (Default: FALSE) |
externalhaplotypefile |
VCF file containing externally obtained haplotype blocks (Default: NA) |
usebeagle |
Whether to use Beagle5 instead of Impute2 (Default: FALSE) |
beaglejar |
Full path to the Beagle Java jar file (Default: NA) |
beagleref.template |
Full path template to Beagle reference files where the chromosome is replaced by 'CHROMNAME' (Default: NA) |
beagleplink.template |
Full path template to Beagle plink files where the chromosome is replaced by 'CHROMNAME' (Default: NA) |
beaglemaxmem |
Integer, Beagle maximum heap size in Gb (Default: 10) |
beaglenthreads |
Integer, number of threads used by Beagle5 (Default: 1) |
beaglewindow |
Integer, size of the genomic window for Beagle5 in cM (Default: 40) |
beagleoverlap |
Integer, size of the overlap between Beagle5 windows (Default: 4) |
javajre |
Path to the Java JRE executable, only required for haplotype reconstruction with Beagle (Default: java, i.e. expected in $PATH) |
write_battenberg_phasing |
Write the Battenberg phasing results to disk as VCF, e.g. for multisample cases (Default: TRUE) |
multisample_relative_weight_balanced |
Relative weight to give to haplotype info from a sample without allelic imbalance in the region (Default: 0.25) |
multisample_maxlag |
Maximal number of upstream SNPs used in the multisample haplotyping to inform the haplotype at another SNP (Default: 100) |
segmentation_gamma_multisample |
The gamma parameter controls the size of the penalty of starting a new segment during multisample segmentation. It is the key parameter for controlling the number of segments (Default: 5) |
snp6_reference_info_file |
Reference files for the SNP6 pipeline only (Default: NA) |
apt.probeset.genotype.exe |
Helper tool for extracting data from CEL files, SNP6 pipeline only (Default: apt-probeset-genotype) |
apt.probeset.summarize.exe |
Helper tool for extracting data from CEL files, SNP6 pipeline only (Default: apt-probeset-summarize) |
norm.geno.clust.exe |
Helper tool for extracting data from CEL files, SNP6 pipeline only (Default: normalize_affy_geno_cluster.pl) |
birdseed_report_file |
Sex inference output file, SNP6 pipeline only (Default: birdseed.report.txt) |
heterozygousFilter |
Legacy option to set a heterozygous SNP filter, SNP6 pipeline only (Default: "none") |
prior_breakpoints_file |
A two column file with prior breakpoints to be used during segmentation (Default: NULL) |
GENOMEBUILD |
Genome build upon which the 1000G SNP coordinates were obtained (Default: hg19; options: "hg19" or "hg38") |
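The three `calc_seg_baf_option` settings described above can be illustrated with a small stand-alone sketch. The helper `segment_baf` is hypothetical and is not part of Battenberg. It only mirrors the documented rule (1=mean, 2=median, 3=mean when the median is exactly 0 or 1, otherwise median) and may differ from the package's internal implementation.

```r
# Illustrative helper (not a Battenberg function): summarise the BAF values
# of a single segment according to the documented option codes.
segment_baf <- function(baf, option = 3) {
  seg_mean <- mean(baf)
  seg_median <- median(baf)
  if (option == 1) {
    seg_mean
  } else if (option == 2) {
    seg_median
  } else if (option == 3) {
    # Fall back to the mean when the median sits on the boundary (0 or 1)
    if (seg_median == 0 || seg_median == 1) seg_mean else seg_median
  } else {
    stop("option must be 1, 2 or 3")
  }
}

# Made-up BAF values for one segment whose median is exactly 1
baf <- c(1, 1, 1, 0.9, 0.8)
segment_baf(baf, option = 1)  # mean of the segment
segment_baf(baf, option = 2)  # median of the segment
segment_baf(baf, option = 3)  # median is 1, so the mean is returned
```

Option 3 avoids reporting a segment BAF of exactly 0 or 1 when only part of the SNPs sit at the boundary, which is presumably why it is the default.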
sd11, jdemeul, Naser Ansari-Pour