read_in_maf: Load MAF somatic mutation data

View source: R/internal_read_maf.R

read_in_maf    R Documentation

Load MAF somatic mutation data

Description

Load MAF data from a text file or data table into your CESAnalysis. If column names don't match MAF format specifications (Chromosome, Start_Position, etc., with Tumor_Sample_Barcode used as the sample ID column), you can supply your own column names. When your CESAnalysis has defined sample groups (see ?CESAnalysis), specify "group_col". By default, data is assumed to be derived from whole-exome sequencing. Whole-genome data and targeted sequencing data are also supported when the "coverage" option is specified. If the data you are loading is from a different genome build than your CESAnalysis, you can use the "chain_file" option to supply a UCSC-style chain file, and your MAF coordinates will be automatically converted with rtracklayer's version of liftOver.
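As a rough illustration of the custom column names mentioned above, the following sketch builds a tiny MAF-like table with non-standard headers and maps them to the column arguments listed under Usage. The table contents, the header names (chrom, pos, and so on), and the helper missing_maf_cols are invented for illustration and are not part of cancereffectsizeR. The check only confirms that each named column exists before the names are handed to the loader.

```r
# Toy MAF-like table with non-standard headers (hypothetical data).
my_maf <- data.frame(
  chrom = c("1", "7", "17"),
  pos = c(100L, 200L, 300L),
  ref = c("C", "A", "C"),
  alt = c("T", "T", "T"),
  patient = c("P01", "P01", "P02"),
  stringsAsFactors = FALSE
)

# Column-name arguments as they might be supplied (argument names from Usage).
col_args <- list(
  chr_col = "chrom",
  start_col = "pos",
  ref_col = "ref",
  tumor_allele_col = "alt",
  sample_col = "patient"
)

# Hypothetical helper: return requested column names that are absent from the table.
missing_maf_cols <- function(maf, col_args) {
  requested <- unlist(col_args, use.names = FALSE)
  setdiff(requested, names(maf))
}

missing_cols <- missing_maf_cols(my_maf, col_args)
if (length(missing_cols) > 0) {
  stop("Columns not found in MAF: ", paste(missing_cols, collapse = ", "))
}
```

Running a check like this before loading surfaces a mistyped column name early, before any genome-specific processing starts.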

Usage

read_in_maf(
  maf,
  refset_env,
  chr_col = "Chromosome",
  start_col = "Start_Position",
  ref_col = "Reference_Allele",
  tumor_allele_col = "guess",
  sample_col = "Unique_Patient_Identifier",
  more_cols = NULL,
  chain_file = NULL,
  separate_old_problems = FALSE
)

Arguments

maf

Path of tab-delimited text file in MAF format, or an MAF in data.table or data.frame format

refset_env

a refset data environment

chr_col

column name with chromosome data (Chromosome)

start_col

column name with start position (Start_Position)

ref_col

column name with reference allele data (Reference_Allele)

tumor_allele_col

column name with alternate allele data; by default ("guess"), values from the Tumor_Seq_Allele2 and Tumor_Seq_Allele1 columns are used.

sample_col

column name with sample ID data (Tumor_Sample_Barcode or Unique_Patient_Identifier)

chain_file

a LiftOver chain file (text format, name ends in .chain) to convert MAF records to the genome build used in the CESAnalysis.

separate_old_problems

When TRUE (as used by load_maf), respect existing problem annotations that appear to have come from cancereffectsizeR (typically from preload_maf). These are separated out as "old_problem", and the affected records are not re-checked. chain_file must be NULL when this option is used.
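The default for tumor_allele_col draws on both Tumor_Seq_Allele2 and Tumor_Seq_Allele1. The exact rule read_in_maf applies is not described on this page. The sketch below shows one plausible rule, assumed purely for illustration: take Tumor_Seq_Allele2 unless it matches the reference allele, in which case fall back to Tumor_Seq_Allele1. The helper name guess_tumor_allele and the toy data are hypothetical.

```r
# Hypothetical helper illustrating one possible "guess" rule (an assumption,
# not necessarily what cancereffectsizeR does internally).
guess_tumor_allele <- function(ref, allele1, allele2) {
  # Prefer allele2; where it equals the reference allele, use allele1 instead.
  ifelse(allele2 != ref, allele2, allele1)
}

# Toy records (made-up data).
toy <- data.frame(
  Reference_Allele = c("C", "G", "A"),
  Tumor_Seq_Allele1 = c("C", "T", "A"),
  Tumor_Seq_Allele2 = c("T", "G", "G"),
  stringsAsFactors = FALSE
)

toy$Tumor_Allele <- guess_tumor_allele(
  toy$Reference_Allele,
  toy$Tumor_Seq_Allele1,
  toy$Tumor_Seq_Allele2
)
```

If your data stores the alternate allele in a single known column, passing that column name as tumor_allele_col avoids relying on any guessing at all.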

Value

data.table with core MAF columns, any other requested columns, and a "problem" column
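A common next step is to split the returned table on the "problem" column. The sketch below assumes that records without issues carry NA in that column, which this page does not state explicitly. The table contents and the problem label are made up, and a plain data.frame stands in for the data.table that the function returns.

```r
# Stand-in for the returned table (hypothetical contents).
result <- data.frame(
  Chromosome = c("1", "1", "2"),
  Start_Position = c(100L, 250L, 300L),
  Reference_Allele = c("A", "C", "G"),
  problem = c(NA, "example_problem", NA),
  stringsAsFactors = FALSE
)

# Assumption: NA in "problem" marks a record that passed all checks.
clean <- result[is.na(result$problem), ]
flagged <- result[!is.na(result$problem), ]

# Tally flagged records by problem label to see what was excluded and why.
problem_counts <- table(flagged$problem)
```

Reviewing the tally before discarding flagged records helps catch systematic issues, such as a genome build mismatch, that would otherwise silently remove many records.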


Townsend-Lab-Yale/cancereffectsizeR documentation built on April 28, 2024, 6:14 p.m.