buildref: buildref


buildref R Documentation

buildref

Description

Builds a RefCDS object from a reference genome and a table of transcripts. A RefCDS object must be precomputed for any new species or assembly before running dndscv. This function writes an .rda file, which is then passed to dndscv through the refdb argument. When multiple CDS share the same gene name (second column of cdsfile), the longest CDS is chosen for the gene. CDS containing ambiguous bases (N) are not considered.
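The selection rule described above can be illustrated on a toy table. This is a minimal base R sketch, not the package's implementation: the column names (gene.id, gene.name, cds.id, seq, length) and the toy sequences are hypothetical, and the order of the two steps (dropping CDS with N first, then picking the longest) is an assumption.

```r
# Toy transcript table; all names and sequences are made up for illustration.
cds <- data.frame(
  gene.id   = c("G1", "G1", "G2", "G2"),
  gene.name = c("GENEA", "GENEA", "GENEB", "GENEB"),
  cds.id    = c("T1", "T2", "T3", "T4"),
  seq       = c("ATGAAATGA", "ATGAAACCCTGA", "ATGNNNTGA", "ATGTGA"),
  stringsAsFactors = FALSE
)
cds$length <- nchar(cds$seq)

# Step 1 (assumed order): discard CDS that contain ambiguous bases (N).
clean <- cds[!grepl("N", cds$seq, fixed = TRUE), ]

# Step 2: within each gene name, keep the longest remaining CDS.
ordered_cds <- clean[order(clean$gene.name, -clean$length), ]
chosen <- ordered_cds[!duplicated(ordered_cds$gene.name), ]

print(chosen[, c("gene.name", "cds.id", "length")])
```

In this toy table, T2 is kept for GENEA because it is the longest, and T4 is kept for GENEB because the longer T3 contains N.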

Usage

buildref(
  cdsfile,
  genomefile,
  outfile = "RefCDS.rda",
  numcode = 1,
  excludechrs = NULL,
  onlychrs = NULL,
  useids = F
)
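A typical call might look like the sketch below. The three file paths are hypothetical placeholders, the chromosome name "MT" is an assumption that depends on the assembly, and the call is skipped unless the dndscv package and both input files are available.

```r
# Hypothetical input and output paths: replace with your own files.
cdsfile    <- "my_species_cds_table.tsv"   # reference transcript table
genomefile <- "my_species_genome.fa"       # indexed reference genome
outfile    <- "RefCDS_my_species.rda"      # RefCDS object is written here

# Only run when the package and both inputs are actually present.
inputs_ready <- requireNamespace("dndscv", quietly = TRUE) &&
  file.exists(cdsfile) && file.exists(genomefile)

if (inputs_ready) {
  dndscv::buildref(
    cdsfile     = cdsfile,
    genomefile  = genomefile,
    outfile     = outfile,
    excludechrs = "MT"  # assumed mitochondrial chromosome name
  )
  # The resulting file is then given to dndscv through refdb, for example:
  # dndscv::dndscv(mutations, refdb = outfile)
  # where `mutations` is a hypothetical mutation table loaded by the user.
} else {
  message("Skipping buildref: dndscv or the input files are not available.")
}
```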

Arguments

cdsfile

Path to the reference transcript table.

genomefile

Path to the indexed reference genome file.

outfile

Output file name (default = "RefCDS.rda").

numcode

NCBI genetic code number (default = 1; standard genetic code). To see the list of supported genetic codes, use: ?seqinr::translate

excludechrs

Vector or string with chromosome names to be excluded from the RefCDS object (default: no chromosome will be excluded). The mitochondrial chromosome should be excluded, as it has a different genetic code and different mutation rates, either by using the excludechrs argument or by not including mitochondrial transcripts in cdsfile.

onlychrs

Vector of chromosome names to include; all other chromosomes are ignored (default: all chromosomes will be included).

useids

Combine gene IDs and gene names (columns 1 and 2 of the input table) as long gene names (default = F)
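To see why numcode matters, and why mitochondrial transcripts should not be mixed with nuclear ones, the sketch below translates the same short sequence under two NCBI genetic codes with seqinr::translate. The toy sequence is made up for illustration, and the example runs only if the seqinr package is installed. Under NCBI table 2 (vertebrate mitochondrial), TGA codes for tryptophan rather than a stop.

```r
# Toy coding sequence (ATG TGA) as a vector of single characters.
codon_seq <- c("a", "t", "g", "t", "g", "a")

if (requireNamespace("seqinr", quietly = TRUE)) {
  # numcode = 1: standard genetic code (the buildref default).
  standard <- seqinr::translate(codon_seq, numcode = 1)
  # numcode = 2: vertebrate mitochondrial genetic code.
  mito <- seqinr::translate(codon_seq, numcode = 2)

  print(standard)
  print(mito)
} else {
  message("seqinr is not installed; skipping the translation example.")
}
```

The second codon is read as a stop under the standard code and as an amino acid under the mitochondrial code, so a single numcode value cannot describe both sets of transcripts correctly.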

Details

Martincorena I, et al. (2017) Universal patterns of selection in cancer and somatic tissues. Cell. 171(5):1029-1041.

Author(s)

Inigo Martincorena (Wellcome Sanger Institute)


im3sanger/dndscv documentation built on Oct. 1, 2023, 1:05 p.m.