get_coord: DNA base/codon sequence and coordinates represented on a...

get_coordR Documentation

DNA base/codon sequence and coordinates represented on a given Abelian group.

Description

Given a string denoting a codon or base from the DNA (or RNA) alphabet and a genetic-code Abelian group as given in reference (1), this function returns an object from CodonGroup-class carrying the DNA base/codon sequence and coordinates represented on the given Abelian group.

Usage

get_coord(x, ...)

## S4 method for signature 'BaseGroup_OR_CodonGroup'
get_coord(x, output = c("all", "matrix.list"))

## S4 method for signature 'DNAStringSet_OR_NULL'
get_coord(
  x,
  output = c("all", "matrix.list"),
  base_seq = TRUE,
  filepath = NULL,
  cube = "ACGT",
  group = "Z4",
  start = NA,
  end = NA,
  chr = 1L,
  strand = "+"
)

Arguments

x

An object from a BaseGroup-class, CodonGroup-class, DNAStringSet or DNAMultipleAlignment class carrying the DNA pairwise alignment of two sequences. Objects from BaseGroup-class and CodonGroup-class are generated with functions: base_coord and codon_coord, respectively.

...

Not in use.

output

See 'Value' section.

base_seq

Logical. Whether to return the base or codon coordinates on the selected Abelian group. If codon coordinates are requested, then the number of the DNA bases in the given sequences must be multiple of 3.

filepath

A character vector containing the path to a file in fasta format to be read. This argument must be given if codon & base arguments are not provided.

cube

A character string denoting one of the 24 Genetic-code cubes, as given in references (2 2 3).

group

A character string denoting the group representation for the given base or codon as shown in reference (1).

start, end, chr, strand

Optional parameters required to build a GRanges-class. If not provided the default values given for the function definition will be used.

Details

Symbols '-' and 'N' usually found in DNA sequence alignments to denote gaps and missing/unknown bases are represented by the number: '-1' on Z4 and '0' in Z5. In Z64 the symbol 'NA' will be returned for codons including symbols '-' and 'N'.

Although the CodonGroup-class object returned by functions codon_coord and base_coord are useful to store genomic information, the base and codon coordinates are not given on them as numeric magnitudes. Function get_coord provides the way to get the coordinates in a numeric object in object from and still to preserve the base/codon sequence information.

Value

An object from CodonGroup-class class is returned when output = 'all'. This has two slots, the first one carrying a list of matrices and the second one carrying the codon/base sequence information. That is, if x is an object from CodonGroup-class class, then a list of matrices of codon coordinate can be retrieved as x@CoordList and the information on the codon sequence as x@SeqRanges.

if output = 'matrix.list', then an object from MatrixList class is returned.

Examples

## Load a pairwise alignment
data("aln", package = "GenomAutomorphism")
aln

## DNA base representation in the Abelian group Z5
coord <- get_coord(
    x = aln,
    cube = "ACGT",
    group = "Z5"
)
coord ## A list of vectors

## Extract the coordinate list
coordList(coord)

## Extract the sequence list
seqRanges(coord)

## DNA codon representation in the Abelian group Z64
coord <- get_coord(
    x = aln,
    base_seq = FALSE,
    cube = "ACGT",
    group = "Z64"
)
coord

## Extract the coordinate list
coordList(coord)

## Extract the sequence list
seqRanges(coord)


genomaths/GenomAutomorphism documentation built on Jan. 2, 2025, 12:25 a.m.