buildref | R Documentation |
Builds a RefCDS object from a reference genome and a table of transcripts. A RefCDS object must be precomputed for any new species or assembly before running dndscv. This function generates an .rda file that is passed to dndscv through the refdb argument. Note that when multiple CDS share the same gene name (second column of cdsfile), the longest CDS is chosen for that gene. CDS containing ambiguous bases (N) are not considered.
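The gene-name rule described above can be illustrated on a toy table. This is only a sketch in base R, not the code used inside buildref: the column names (gene.name, cds.id, sequence) and the toy sequences are made up for illustration, and the order in which the two filters are applied is an assumption.

```r
# Toy transcript table; column names and sequences are illustrative only.
cds <- data.frame(
  gene.name = c("GENE1", "GENE1", "GENE2", "GENE2"),
  cds.id    = c("cdsA", "cdsB", "cdsC", "cdsD"),
  sequence  = c("ATGAAATAA", "ATGAAACCCTAA", "ATGNNNTAA", "ATGTAA"),
  stringsAsFactors = FALSE
)

# Drop CDS that contain ambiguous bases (N).
cds <- cds[!grepl("N", cds$sequence, fixed = TRUE), ]

# Within each gene name, keep the longest remaining CDS:
# sort by gene name and decreasing length, then keep the first row per gene.
cds$len <- nchar(cds$sequence)
cds <- cds[order(cds$gene.name, -cds$len), ]
longest <- cds[!duplicated(cds$gene.name), ]

print(longest[, c("gene.name", "cds.id", "len")])
```

In this toy case GENE1 keeps cdsB, the longer of its two CDS, and GENE2 keeps cdsD, because cdsC contains N and is discarded before lengths are compared.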
buildref(
cdsfile,
genomefile,
outfile = "RefCDS.rda",
numcode = 1,
excludechrs = NULL,
onlychrs = NULL,
useids = F
)
cdsfile |
Path to the reference transcript table. |
genomefile |
Path to the indexed reference genome file. |
outfile |
Output file name (default = "RefCDS.rda"). |
numcode |
NCBI genetic code number (default = 1; standard genetic code). To see the list of supported genetic codes, run: ?seqinr::translate |
excludechrs |
Vector or string with chromosome names to be excluded from the RefCDS object (default: no chromosome will be excluded). The mitochondrial chromosome should be excluded because it has a different genetic code and different mutation rates, either by using the excludechrs argument or by not including mitochondrial transcripts in cdsfile. |
onlychrs |
Vector of valid chromosome names (default: all chromosomes will be included). |
useids |
Combine gene IDs and gene names (columns 1 and 2 of the input table) as long gene names (default = F). |
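A call using the arguments above might look like the following sketch. The file paths are hypothetical placeholders, and "MT" is an assumed mitochondrial chromosome name that depends on how the assembly labels its chromosomes. The call is guarded so that nothing runs unless the dndscv package and both input files are available.

```r
# Hypothetical input and output paths; replace with your own files.
cds_path    <- "transcript_table.txt"  # reference transcript table (cdsfile)
genome_path <- "genome.fa"             # indexed reference genome (genomefile)
out_path    <- "RefCDS_custom.rda"     # output .rda file (outfile)

# Only attempt the build if the package and both inputs are present.
inputs_ready <- requireNamespace("dndscv", quietly = TRUE) &&
  file.exists(cds_path) && file.exists(genome_path)

if (inputs_ready) {
  dndscv::buildref(
    cdsfile     = cds_path,
    genomefile  = genome_path,
    outfile     = out_path,
    numcode     = 1,     # standard genetic code
    excludechrs = "MT"   # assumed name of the mitochondrial chromosome
  )
  # The resulting file is then supplied to dndscv through refdb, e.g.
  # dndscv::dndscv(mutations, refdb = out_path)  # `mutations` is hypothetical
}
```

Excluding the mitochondrial chromosome here follows the recommendation given for excludechrs; leaving mitochondrial transcripts out of cdsfile would be the alternative.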
Martincorena I, et al. (2017) Universal patterns of selection in cancer and somatic tissues. Cell. 171(5):1029-1041.
Inigo Martincorena (Wellcome Sanger Institute)