vcf2spec: Construct the spectrum from a VCF file

View source: R/preprocess.R

vcf2spec R Documentation

Construct the spectrum from a VCF file

Description

The function converts a VCF file into a spectrum matrix of mutation counts catalogued by sequence context. Because DNA has two complementary strands, mutations at A or G reference bases are folded into the equivalent mutations at T or C on the opposite strand.
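The strand folding described above can be sketched in base R. This is an illustrative sketch only, not the package's implementation (which delegates to bedtools); the helper names `rev_comp` and `fold_mutation` are hypothetical.

```r
# Complement lookup for single bases.
comp <- c(A = "T", C = "G", G = "C", T = "A")

# Reverse-complement a DNA string (uppercase A/C/G/T only).
rev_comp <- function(s) {
  bases <- strsplit(s, "")[[1]]
  paste(rev(comp[bases]), collapse = "")
}

# Fold a mutation so that the reference base is always C or T.
# `context` is the reference sequence centred on the mutated base
# (odd length); `alt` is the alternate base.
fold_mutation <- function(context, alt) {
  mid <- (nchar(context) + 1) / 2
  ref <- substr(context, mid, mid)
  if (ref %in% c("A", "G")) {
    # Report the same event as seen from the opposite strand.
    context <- rev_comp(context)
    alt <- unname(comp[alt])
  }
  list(context = context, alt = alt)
}

# G>A in the context TGA is reported as C>T in the context TCA.
fold_mutation("TGA", "A")
```

A mutation whose reference base is already C or T is returned unchanged.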

It is a wrapper around bedtools (https://github.com/arq5x/bedtools2/releases).

Usage

vcf2spec(
  bedtools_path = "bedtools",
  vcf_meta,
  ref_genome,
  output_file,
  context_length = 1,
  overwrite = F
)

Arguments

bedtools_path

Path to the bedtools executable, e.g. "~/bedtools/bin/bedtools"

vcf_meta

A comma- (CSV), space-, or tab-delimited meta file in which each line has the form "vcf_path[comma, space or tab]sample_name". Alternatively, supply only a list of paths to VCF files without sample_name; the VCF file names are then used as the sample names.
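As a rough illustration, a tab-delimited meta file in this form could be written from R as follows. The VCF paths and sample names are hypothetical placeholders, and the absence of a header line is an assumption, since the format above does not mention one.

```r
# Hypothetical VCF paths and sample names, for illustration only.
meta <- data.frame(
  vcf_path    = c("data/tumor_A.vcf", "data/tumor_B.vcf"),
  sample_name = c("tumor_A", "tumor_B")
)

# Write one "vcf_path<TAB>sample_name" line per sample.
# Assumption: no header row, no quotes, no row names.
meta_file <- file.path(tempdir(), "vcf_meta.tsv")
write.table(meta, meta_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Inspect the result.
readLines(meta_file)
```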

ref_genome

Path to a FASTA (.fa) file containing the reference genome sequence

output_file

Path of the output file that will contain the context matrix

context_length

The number of bases by which the context extends on each side of the mutation. The default is 1, which gives a trinucleotide (3-base) context.
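The relation between context_length and the size of the context can be worked out with a little arithmetic. The sketch below assumes one category per folded substitution type (six of them, with the reference base C or T) and per combination of flanking bases; whether the package's output uses exactly this set of categories is an assumption. The helper names are hypothetical.

```r
# Width of the sequence window centred on the mutated base.
window_width <- function(context_length) {
  2 * context_length + 1
}

# Number of mutation categories: 6 folded substitution types
# (C>A, C>G, C>T, T>A, T>C, T>G) times 4 choices for each flanking base.
n_contexts <- function(context_length) {
  6 * 4^(2 * context_length)
}

window_width(1)  # 3 bases: trinucleotide context
n_contexts(1)    # 96 categories
n_contexts(2)    # 1536 categories for a pentanucleotide context
```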

overwrite

Whether to overwrite output_file if it already exists

Value

Returns a matrix of mutation counts for each context in each sample
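A call might look like the following sketch. The argument names come from the Usage section; every file path is a hypothetical placeholder, and the guard simply skips the call when the package, bedtools, or the input files are not available.

```r
# All paths below are hypothetical placeholders.
args <- list(
  bedtools_path  = "bedtools",
  vcf_meta       = "vcf_meta.tsv",
  ref_genome     = "hg19.fa",
  output_file    = "spectrum.txt",
  context_length = 1,
  overwrite      = FALSE
)

# Only attempt the call when the package, bedtools and the inputs exist.
ready <- requireNamespace("siglasso", quietly = TRUE) &&
  nzchar(Sys.which(args$bedtools_path)) &&
  file.exists(args$vcf_meta) &&
  file.exists(args$ref_genome)

if (ready) {
  spec <- do.call(siglasso::vcf2spec, args)
  dim(spec)  # inspect the size of the returned count matrix
}
```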


gersteinlab/siglasso documentation built on Sept. 5, 2022, 8:45 p.m.