In insect: Informatic Sequence Classification Trees

manipulate

R Documentation

Further bit-level manipulation of DNA and amino acid sequences.

Description

These functions provide additional methods to manipulate objects of class "DNAbin" and "AAbin" to supplement those available in the ape package.

Usage

## S3 method for class 'DNAbin'
duplicated(x, incomparables = FALSE, pointers = TRUE, ...)

## S3 method for class 'DNAbin'
unique(x, incomparables = FALSE, attrs = TRUE, drop = FALSE, ...)

## S3 method for class 'DNAbin'
subset(x, subset, attrs = TRUE, drop = FALSE, ...)

## S3 method for class 'AAbin'
duplicated(x, incomparables = FALSE, pointers = TRUE, ...)

## S3 method for class 'AAbin'
unique(x, incomparables = FALSE, attrs = TRUE, drop = FALSE, ...)

## S3 method for class 'AAbin'
subset(x, subset, attrs = TRUE, drop = FALSE, ...)

Arguments

`x`	a `"DNAbin"` or `"AAbin"` object.
`incomparables`	placeholder, not currently functional.
`pointers`	logical indicating whether the re-replication index key should be returned as a `"pointers"` attribute of the output vector (only applicable for `duplicated.DNAbin` and `duplicated.AAbin`). Note that this can increase the computational time for larger sequence lists.
`...`	further arguments to be passed between methods.
`attrs`	logical indicating whether the attributes of the input object whose length matches the object length (or number of rows if it is a matrix) should be retained and subsetted along with the object. This is useful if the input object has species, lineage and/or taxon ID metadata that need to be retained following the duplicate analysis.
`drop`	logical; indicates whether the input matrix (assuming one is passed) should be reduced to a vector if subset to a single sequence. Defaults to FALSE in keeping with the style of the `ape` package functions.
`subset`	logical vector giving the elements or rows to be kept.
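
To illustrate what `attrs = TRUE` is meant to preserve, here is a minimal base-R sketch. It is not the package's implementation: a plain list of character vectors stands in for a `"DNAbin"` object, and the `species` attribute is a hypothetical piece of per-sequence metadata. Base subsetting with `[` drops such attributes, so they have to be subsetted by hand with the same logical vector:

```r
# Stand-in for a sequence list; real "DNAbin" objects store raw bytes.
seqs <- list(s1 = c("a", "c", "g"),
             s2 = c("t", "t", "a"),
             s3 = c("a", "c", "g"))
# Hypothetical per-sequence metadata, same length as the list.
attr(seqs, "species") <- c("sp_A", "sp_B", "sp_A")

# s3 repeats s1, so it is flagged as a duplicate and excluded.
keep <- !duplicated(seqs)

# Plain `[` keeps the names but drops the "species" attribute.
kept <- seqs[keep]
is.null(attr(kept, "species"))

# Re-attach the metadata, subsetted with the same logical vector.
attr(kept, "species") <- attr(seqs, "species")[keep]
attr(kept, "species")
```

With `attrs = TRUE`, the `subset` and `unique` methods documented here are described as doing this bookkeeping for every attribute whose length matches the input.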

Value

`unique` and `subset` return a `"DNAbin"` or `"AAbin"` object. `duplicated` returns a logical vector.
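
One way to picture the logical vector and a `"pointers"`-style key is the base-R sketch below. It is an illustration under assumptions, not the code used by `duplicated.DNAbin`: sequences are collapsed to strings (the `keys` variable is hypothetical), and the key is assumed to be each sequence's position in the set of unique sequences, so that indexing the unique set with it re-creates the original set.

```r
# Three toy sequences; the last is identical to the second.
seqs <- list(s1 = c("a", "c", "g"),
             s2 = c("t", "t", "a"),
             s3 = c("t", "t", "a"))

# Collapse each sequence to one string so sequences can be compared directly.
keys <- vapply(seqs, paste, character(1), collapse = "")

# Logical vector with one entry per sequence.
dupes <- duplicated(keys)

# Assumed form of the index key: position within the unique set.
pointers <- match(keys, unique(keys))

# Indexing the unique set by the key re-creates the full set.
identical(unique(keys)[pointers], unname(keys))
```

Under this reading, two sequences that share a pointer value are identical, which is how the shared index in the Examples section would identify the last sequence with the second.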

Author(s)

Shaun Wilkinson

References

Paradis E, Claude J, Strimmer K (2004) APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20, 289-290.

Paradis E (2007) A bit-level coding scheme for nucleotides. https://emmanuelparadis.github.io/misc/BitLevelCodingScheme_20April2007.pdf.

Paradis E (2012) Analysis of Phylogenetics and Evolution with R (Second Edition). Springer, New York.

Examples

  library(insect)
  data(whales)
  duplicates <- duplicated.DNAbin(whales, pointers = TRUE)
  attr(duplicates, "pointers")
  ## returned indices show that the last sequence is
  ## identical to the second one.
  ## subset the reference sequence database to only include unique sequences
  whales_unique <- subset.DNAbin(whales, subset = !duplicates)
  ## this gives the same result as
  unique.DNAbin(whales)

insect documentation built on June 8, 2025, 10:37 a.m.