PCAWG_read_table_and_evaluate_DBS: Read a table that specifies how to process VCFs and mini-BAMS...


View source: R/PCAWG_read_table_and_evaluate_DBS.R

Description

Read a table that specifies how to process VCFs and miniBAMs to evaluate DBS (doublet base substitution) calls in PCAWG Collaboratory data.

Usage

PCAWG_read_table_and_evaluate_DBS(
  in.table,
  in.vcf.dir,
  minibam.dir,
  out.vcf.dir = in.vcf.dir,
  bam.suffix = "_dbs_srt",
  verbose = 1
)
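
A minimal sketch of a call, assuming the function is exported from the DBSverify package. All paths below are hypothetical placeholders, not locations from the package. The call is guarded so that the snippet does nothing harmful when the package or the inputs are absent.

```r
# Hypothetical paths; replace them with the real locations on your system.
args <- list(
  in.table    = file.path("data-raw", "short_collaboratory_bams.csv"),
  in.vcf.dir  = "dbs_vcfs",        # directory holding the DBS VCF files
  minibam.dir = "minibams",        # directory holding the downloaded miniBAMs
  out.vcf.dir = "evaluated_vcfs",  # the default would be in.vcf.dir
  bam.suffix  = "_dbs_srt",
  verbose     = 1
)

# Only attempt the call if the inputs and the package are actually present.
inputs.ok <- file.exists(args$in.table) &&
  dir.exists(args$in.vcf.dir) &&
  dir.exists(args$minibam.dir)

if (inputs.ok && requireNamespace("DBSverify", quietly = TRUE)) {
  # Creating the output directory up front is a precaution; whether the
  # function creates it itself is not stated in this page.
  dir.create(args$out.vcf.dir, showWarnings = FALSE)
  # Assumes the function is exported from the package namespace.
  do.call(DBSverify::PCAWG_read_table_and_evaluate_DBS, args)
} else {
  message("Skipping: DBSverify or the input files are not available.")
}
```

Passing a separate out.vcf.dir keeps the evaluated VCFs apart from the inputs; with the default, they are written to in.vcf.dir.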

Arguments

in.table

The file path of the table to process. In production this is .../DBSverify/data-raw/collaboratory_bams_full_2021_07_13.csv; for testing, .../DBSverify/data-raw/short_collaboratory_bams.csv.

in.vcf.dir

The path to the directory containing the DBS VCF files.

minibam.dir

The path to the directory containing the miniBAMs.

out.vcf.dir

The path to the directory in which to put the "evaluated" DBS VCF files.

bam.suffix

String to add to the end of each BAM file name; it depends on the conventions used by the script (run on the Collaboratory) that generated the miniBAMs.

verbose

If > 0, generate some progress messages.

Details

This is a specialized function for processing PCAWG data from the ICGC (International Cancer Genome Consortium) "Collaboratory" cloud computing system, after the miniBAMs have been created in the Collaboratory and downloaded. The in.table and associated BED files were used to specify the contents of the miniBAMs. The result consists of the "evaluated" DBS VCF files. The contents of in.table govern the naming of the input and output VCF files and of the miniBAMs: the VCF file names incorporate the aliquot_id, and the miniBAM names are based on the icgc_donor_id, the T_Specimen ID, and the N_Specimen ID.
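
Because the file names are derived from columns of in.table, it can help to check the table before a long run. The following is a rough sketch only: the exact column spellings (in particular "T_Specimen ID" and "N_Specimen ID") are assumptions based on the description above, and the rows are made-up stand-ins, not real PCAWG identifiers.

```r
# Tiny in-memory stand-in for in.table; all values are invented for illustration.
csv.text <- c(
  "icgc_donor_id,aliquot_id,T_Specimen ID,N_Specimen ID",
  "DO0001,aliquot-a,SP0001,SP0002",
  "DO0002,,SP0003,SP0004"
)

# check.names = FALSE keeps column names that contain spaces unchanged.
tab <- read.csv(text = csv.text, check.names = FALSE,
                stringsAsFactors = FALSE)

# Columns the naming scheme is described as depending on (assumed spellings).
required <- c("icgc_donor_id", "aliquot_id", "T_Specimen ID", "N_Specimen ID")
missing.cols <- setdiff(required, names(tab))

if (length(missing.cols) > 0) {
  stop("in.table lacks column(s): ", paste(missing.cols, collapse = ", "))
}

# Flag rows in which any naming-related identifier is NA or empty.
incomplete <- apply(
  tab[, required, drop = FALSE], 1,
  function(r) any(is.na(r) | r == "")
)

if (any(incomplete)) {
  message("Rows with missing identifiers: ",
          paste(which(incomplete), collapse = ", "))
}
```

With a real table, replace the `text = csv.text` argument with the path that would be passed as in.table.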


steverozen/DBSverify documentation built on Dec. 23, 2021, 5:34 a.m.