docs/analysis/compile-reads.md

description: Step 1 -- intra-patient genotyping

Compile Reads

The first step of the pipeline is to genotype all variants of interest in the included samples (plasma, buffy coat, DMP tumor, DMP normal, and donor samples). Once read counts have been obtained at every locus in every sample, the next step generates a table of VAFs and call status for each variant across all samples within a patient.
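As a rough illustration of how a read count relates to a VAF, the sketch below assumes the common definition of VAF as alternate-allele reads divided by total depth at the locus. The variable names and counts are hypothetical, and the pipeline's own calculation in the next step may differ in detail.

```shell
# Hypothetical read counts for one variant in one sample (illustrative values only).
ALT_COUNT=12
TOTAL_DEPTH=400

# Assumed definition: VAF = alt-supporting reads / total depth.
# Report NA when the locus has no coverage, to avoid dividing by zero.
VAF=$(awk -v alt="$ALT_COUNT" -v depth="$TOTAL_DEPTH" \
  'BEGIN { if (depth > 0) printf "%.4f", alt / depth; else printf "NA" }')

echo "VAF: $VAF"
```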

Usage

Rscript R/compile_reads.R -h                                        
usage: R/compile_reads.R [-h] [-m MASTERREF] [-o RESULTSDIR]
                         [-pb POOLEDBAMDIR] [-fa FASTAPATH]
                         [-gt GENOTYPERPATH] [-dmp DMPDIR] [-mb MIRRORBAMDIR]
                         [-dmpk DMPKEYPATH]

optional arguments:
  -h, --help            show this help message and exit
  -m MASTERREF, --masterref MASTERREF
                        File path to master reference file
  -o RESULTSDIR, --resultsdir RESULTSDIR
                        Output directory
  -pb POOLEDBAMDIR, --pooledbamdir POOLEDBAMDIR
                        Directory for all pooled bams [default]
  -fa FASTAPATH, --fastapath FASTAPATH
                        Reference fasta path [default]
  -gt GENOTYPERPATH, --genotyperpath GENOTYPERPATH
                        Genotyper executable path [default]
  -dmp DMPDIR, --dmpdir DMPDIR
                        Directory of clinical DMP IMPACT repository [default]
  -mb MIRRORBAMDIR, --mirrorbamdir MIRRORBAMDIR
                        Mirror BAM file directory [default]
  -dmpk DMPKEYPATH, --dmpkeypath DMPKEYPATH
                        DMP mirror BAM key file [default]
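A minimal invocation might look like the sketch below. It sets only `-m` and `-o`, on the assumption that the options marked `[default]` in the help text can be left at their defaults. Both paths and the master reference file name are hypothetical placeholders. The command is printed rather than executed so it can be checked first.

```shell
# Hypothetical paths; replace with your own project locations.
MASTER_REF="/path/to/master_ref.csv"   # assumed file name, not from the docs
RESULTS_DIR="/path/to/results"         # assumed output directory

# Only -m and -o are given; the other options are left at their defaults.
CMD="Rscript R/compile_reads.R -m $MASTER_REF -o $RESULTS_DIR"

# Dry run: print the command for inspection instead of running it.
echo "$CMD"
```

Once the printed command looks right, run it directly from the repository root so that the relative path `R/compile_reads.R` resolves.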

Default

Default options can be found here

What compile_reads.R does

For each patient

| Sample_Barcode | duplex_bams | simplex_bams | standard_bam | Sample_Type | dmp_patient_id |
| :--- | :--- | :--- | :--- | :--- | :--- |
| plasma sample id | /duplex/bam | /simplex/bam | NA | duplex | P-xxxxxxx |
| buffy coat id | NA | NA | /unfiltered/bam | unfilterednormal | P-xxxxxxx |
| DMP Tumor ID | NA | NA | /DMP/bam | DMP_Tumor | P-xxxxxxx |
| DMP Normal ID | NA | NA | /DMP/bam | DMP_Normal | P-xxxxxxx |

Afterwards, for donor bams



msk-access/access_data_analysis documentation built on Nov. 13, 2023, 12:43 p.m.