docs/setup/setup-for-running-analysis.md

description: Master reference file descriptions

Setup for Running Analysis

Master reference file

An example of this file can be found in the data/ folder.

{% hint style="danger" %} For columns that are not required, leave the cell blank if you don't have the information. {% endhint %}

| Column Names | Information Specified | Specified format (If any) | Notes | Required |
| --- | --- | --- | --- | --- |
| cmo_patient_id | Patient ID | None | Results are presented per unique patient ID | Y |
| cmo_sample_id_plasma | Plasma Sample ID | None | | Y |
| cmo_sample_id_normal | Buffy Coat Sample ID | None | | N |
| bam_path_normal | Unfiltered buffy coat bam | Absolute file paths | | N |
| paired | Whether the plasma has buffy coat | Paired/Unpaired | | Y |
| sex | Sex | M/F | Unrequired | N |
| collection_date | Collection time points for graphing | dates (m/d/y) OR character strings (i.e. the sample IDs) | the format should be consistent within the file | Y |
| dmp_patient_id | DMP patient ID | *Patient IDs* | All DMP samples from this patient ID will be pulled | N |
| bam_path_plasma_duplex | Duplex bam | Absolute file paths | | Y |
| bam_path_plasma_simplex | Simplex bam | Absolute file paths | | Y |
| maf_path | maf file | Absolute file paths | fillout_filtered.maf (required columns listed below) | Y |
| cna_path | cna file | Absolute file paths | sample level cna file (helper script included) | N |
| sv_path | sv file | Absolute file paths | | N |
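
The rules in the table can be checked before running the analysis. The following is a minimal sketch, not part of the package: it assumes the master reference file is saved as CSV (the file format is not stated here), and `validate_master_ref`, `REQUIRED_COLUMNS`, and `PATH_COLUMNS` are hypothetical names introduced only for illustration. The sample IDs and paths in the usage example are made up.

```python
import csv
import io
import posixpath

# Columns marked "Y" in the Required column of the table.
REQUIRED_COLUMNS = [
    "cmo_patient_id", "cmo_sample_id_plasma", "paired", "collection_date",
    "bam_path_plasma_duplex", "bam_path_plasma_simplex", "maf_path",
]
# Columns whose specified format is "Absolute file paths".
PATH_COLUMNS = [
    "bam_path_normal", "bam_path_plasma_duplex", "bam_path_plasma_simplex",
    "maf_path", "cna_path", "sv_path",
]


def validate_master_ref(handle):
    """Return a list of problems found in a master reference file.

    Assumption: the file is comma-separated with a header row.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    problems = [
        f"missing required column: {c}" for c in REQUIRED_COLUMNS if c not in header
    ]
    if problems:
        return problems

    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column in REQUIRED_COLUMNS:
            if not (row[column] or "").strip():
                problems.append(f"row {row_number}: {column} is blank")

        paired = (row["paired"] or "").strip()
        if paired and paired not in ("Paired", "Unpaired"):
            problems.append(f"row {row_number}: paired must be Paired or Unpaired")

        sex = (row.get("sex") or "").strip()
        if sex and sex not in ("M", "F"):
            problems.append(f"row {row_number}: sex must be M or F")

        # Optional path columns may be blank; filled ones must be absolute.
        for column in PATH_COLUMNS:
            value = (row.get(column) or "").strip()
            if value and not posixpath.isabs(value):
                problems.append(f"row {row_number}: {column} is not an absolute path")
    return problems


# Usage with made-up values; a real file would be opened with open(path, newline="").
example = io.StringIO(
    "cmo_patient_id,cmo_sample_id_plasma,paired,collection_date,"
    "bam_path_plasma_duplex,bam_path_plasma_simplex,maf_path\n"
    "P1,S1,yes,1/2/2020,relative/duplex.bam,/data/simplex.bam,/data/s1.maf\n"
)
for problem in validate_master_ref(example):
    print(problem)
```

The check for consistent `collection_date` formatting is left out, since the accepted date and string forms are only loosely specified here.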

{% hint style="warning" %} Creating this file by hand can be tedious. A helper script could be written to automate it. {% endhint %}

Required Columns for maf file

Hugo_Symbol,Chromosome,Start_Position,End_Position,Tumor_Sample_Barcode,Variant_Classification,HGVSp_Short,Reference_Allele,Tumor_Seq_Allele2,D_t_alt_count_fragment
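
A quick way to confirm that a maf file carries these columns is to inspect its header line. The sketch below is illustrative only: it assumes the maf file is tab-delimited and may begin with `#`-prefixed comment lines (as in the common MAF layout), and `missing_maf_columns` is a hypothetical helper, not a function from the package.

```python
import csv
import io

REQUIRED_MAF_COLUMNS = [
    "Hugo_Symbol", "Chromosome", "Start_Position", "End_Position",
    "Tumor_Sample_Barcode", "Variant_Classification", "HGVSp_Short",
    "Reference_Allele", "Tumor_Seq_Allele2", "D_t_alt_count_fragment",
]


def missing_maf_columns(handle):
    """Return the required columns that are absent from the maf header.

    Assumption: tab-delimited text; leading lines starting with '#' are comments.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if row and not row[0].startswith("#"):
            # The first non-comment, non-empty line is treated as the header.
            return [c for c in REQUIRED_MAF_COLUMNS if c not in row]
    # No header line found at all: every required column is missing.
    return list(REQUIRED_MAF_COLUMNS)


# Usage with an in-memory header; a real file would be opened with open(path, newline="").
header_only = io.StringIO("#version 2.4\n" + "\t".join(REQUIRED_MAF_COLUMNS[:-1]) + "\n")
print(missing_maf_columns(header_only))
```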


msk-access/access_data_analysis documentation built on Nov. 13, 2023, 12:43 p.m.