RM2: Regression models for localised mutations: Evaluating...

View source: R/running_glm.R

RM2 R Documentation

Regression models for localised mutations: Evaluating differential mutation rates across classes of sites

Description

RM2() uses negative binomial regression to compare local mutation frequencies and mutational processes at sites of the same class with those in flanking control regions.
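As a rough illustration of the kind of test described here, the sketch below fits negative binomial models with and without a site indicator and compares them with a likelihood ratio test. It is not the package's implementation: it uses MASS::glm.nb on simulated counts, and the variable names (counts, is_site, h0, h1) and all simulation settings are assumptions made for this example.

```r
library(MASS)  # glm.nb() fits negative binomial GLMs

set.seed(1)

# Simulated, overdispersed mutation counts (illustrative only):
# 100 flanking windows (is_site = 0) and 100 site windows (is_site = 1)
n <- 200
is_site <- rep(c(0, 1), each = n / 2)
counts <- rnbinom(n, mu = exp(1 + 0.5 * is_site), size = 2)
d <- data.frame(counts = counts, is_site = is_site)

# Null model: one mutation rate for sites and flanks
h0 <- glm.nb(counts ~ 1, data = d)
# Alternative model: rate differs between sites and flanks
h1 <- glm.nb(counts ~ is_site, data = d)

# Likelihood ratio test with one degree of freedom for the is_site term
lr_stat <- 2 * as.numeric(logLik(h1) - logLik(h0))
pp <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Log-scale coefficient of the site indicator
this_coef <- unname(coef(h1)["is_site"])
```

In this sketch a positive this_coef would indicate more mutations in sites than in flanks, and pp is the p-value for that difference.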

Usage

RM2(
  maf,
  sites,
  mut_class_columns = NA,
  cofactor_column = NA,
  window_size = 100,
  n_min_mut = 100,
  n_bin = 10
)

Arguments

maf

Data frame of mutations (prepared by get_mut_trinuc_strand) containing the following information:

chr

autosomal chromosomes as chr1 to chr22 and sex chromosomes as chrX and chrY

start

the start position of the mutation in 1-based coordinates

end

the end position of the mutation in 1-based coordinates

ref

the reference allele as a string containing the bases A, T, C or G

alt

the alternate allele as a string containing the bases A, T, C or G

mut_trinuc

trinucleotide context, in which the middle base is C or T, together with the alternate allele

mut_strand

character indicating the Watson (w) or Crick (c) strand

ref_alt

character indicating single-base substitution
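The column requirements above can be checked before running an analysis. The following is a minimal sketch, assuming that chr, start, end, ref and alt are supplied by the user and that mut_trinuc, mut_strand and ref_alt are added later by get_mut_trinuc_strand. The helper check_maf_input and the toy values are hypothetical and not part of the package.

```r
# Chromosome names accepted according to the argument description
valid_chr <- paste0("chr", c(1:22, "X", "Y"))

# Hypothetical helper: returns one TRUE/FALSE per row of the mutation table
check_maf_input <- function(maf) {
  required <- c("chr", "start", "end", "ref", "alt")
  missing_cols <- setdiff(required, colnames(maf))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  ok_chr <- maf$chr %in% valid_chr
  ok_pos <- maf$start >= 1 & maf$start <= maf$end  # 1-based positions
  ok_ref <- grepl("^[ACGT]+$", maf$ref)            # only A, C, G, T
  ok_alt <- grepl("^[ACGT]+$", maf$alt)
  ok_chr & ok_pos & ok_ref & ok_alt
}

# Toy single-base substitutions (illustrative values only)
maf <- data.frame(
  chr = c("chr1", "chr1", "chrX"),
  start = c(1000L, 2500L, 40000L),
  end = c(1000L, 2500L, 40000L),
  ref = c("C", "T", "G"),
  alt = c("T", "G", "A"),
  stringsAsFactors = FALSE
)

maf_ok <- check_maf_input(maf)
```

Rows for which maf_ok is FALSE would need to be fixed or dropped before the table is passed on.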

sites

Data frame of site coordinates

chr

autosomal chromosomes as chr1 to chr22 and sex chromosomes as chrX and chrY

start

the start position of the site in 0-based coordinates

end

the end position of the site in 0-based coordinates
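A sites table with these three columns might look like the sketch below. The values are invented, and treating the intervals as half-open (start included, end excluded, as in BED files) is an assumption, since the documentation only states that site coordinates are 0-based.

```r
# Toy site coordinates (illustrative values only)
sites <- data.frame(
  chr = c("chr1", "chr2", "chrX"),
  start = c(999L, 49999L, 120000L),
  end = c(1050L, 50020L, 120040L),
  stringsAsFactors = FALSE
)

# Basic sanity checks on the table
valid_chr <- paste0("chr", c(1:22, "X", "Y"))
sites_ok <- sites$chr %in% valid_chr &
  sites$start >= 0 &
  sites$start < sites$end

# Site widths under the half-open assumption stated above
site_width <- sites$end - sites$start
```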

mut_class_columns

Character vector naming the column(s) of mutation classes for grouped analysis

cofactor_column

Character naming the column of binary cofactors

window_size

Integer indicating the half-width of sites and flanking regions; it is added to the left and to the right to give the full width (default 100)

n_min_mut

Integer indicating the minimum number of mutations required to perform analysis (default 100)

n_bin

Integer indicating the number of megabase bins to use (default 10)
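To make the half-width meaning of window_size concrete, the sketch below derives a site window and two flanking windows from one site. The geometry is an assumption for illustration: the site window is taken as the site midpoint plus and minus window_size, and each flank is an adjacent window of the same full width. The documentation does not spell out the exact layout, so the package may place its windows differently.

```r
window_size <- 100L  # half-width, as in the default

# One toy site (illustrative coordinates)
site_start <- 10000L
site_end <- 10050L
mid <- floor((site_start + site_end) / 2)

# Site window: midpoint +/- window_size.
# Flanks (assumed): adjacent windows of the same full width on each side.
regions <- data.frame(
  region = c("upstream_flank", "site", "downstream_flank"),
  start = c(mid - 3L * window_size, mid - window_size, mid + window_size),
  end = c(mid - window_size, mid + window_size, mid + 3L * window_size),
  stringsAsFactors = FALSE
)
regions$width <- regions$end - regions$start
```

With window_size = 100, every window in this sketch is 200 bases wide, which is the "full width" referred to in the argument description.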

Value

Data frame containing the regression estimates and likelihood ratio test output, with the following columns: mut_type, pp, this_coef, obs_mut, exp_mut, exp_mut_lo, exp_mut_hi, fc, pp_cofac, this_coef_cofac, n_sites_tested

mut_type

A string identifying the mutation class

pp

The p-value from the likelihood ratio test

this_coef

The regression coefficient of the is_site term

obs_mut

The total number of observed mutations of that class

exp_mut

The expected number of mutations determined by the model

exp_mut_lo

Lower bound of the 95% confidence interval of the expected number of mutations

exp_mut_hi

Upper bound of the 95% confidence interval of the expected number of mutations

fc

Observed mutations divided by expected mutations

pp_cofac

The p-value from the likelihood ratio test of site:cofactor interaction

this_coef_cofac

The coefficient from the site:cofactor interaction term

n_sites_tested

The number of sites that were tested (all sites if no downsampling was applied)
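The relationship between obs_mut, exp_mut and fc can be shown with a short worked example. The numbers are invented and do not come from any real analysis. Only the fold-change formula is taken from the column descriptions above.

```r
# Invented values for a single mutation class
obs_mut <- 150     # observed mutations
exp_mut <- 100     # expected mutations from the model
exp_mut_lo <- 80   # lower 95% bound of the expectation
exp_mut_hi <- 125  # upper 95% bound of the expectation

# Fold change: observed divided by expected
fc <- obs_mut / exp_mut

# Whether the observed count falls above the expected interval
above_expected <- obs_mut > exp_mut_hi
```

Here fc is 1.5, meaning 50% more mutations than the model expects, and the observed count lies above the upper bound of the expected interval.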


reimandlab/RM2 documentation built on Aug. 13, 2022, 12:22 p.m.