EmptyCells: Identify/Delete Spurious Rows and Columns from DNA Alignments

EmptyCells R Documentation

Identify/Delete Spurious Rows and Columns from DNA Alignments

Description

After subsetting (see e.g. DNAbin), DNA sequence alignments can contain rows and columns that consist entirely of missing and/or ambiguous character states. identifyEmptyCells identifies, and deleteEmptyCells deletes, all such rows (taxa) and columns (characters) in a DNA sequence alignment.

Usage

deleteEmptyCells(
  DNAbin,
  margin = c(1, 2),
  nset = c("-", "n", "?"),
  quiet = FALSE
)

identifyEmptyCells(
  DNAbin,
  margin = c(1, 2),
  nset = c("-", "n", "?"),
  quiet = FALSE
)

Arguments

DNAbin

An object of class DNAbin.

margin

A vector giving the subscripts the function will be applied over: 1 indicates rows, 2 indicates columns, and c(1, 2) indicates rows and columns.

nset

A vector of mode character; rows or columns that consist only of the characters given in nset will be identified (identifyEmptyCells) or deleted (deleteEmptyCells). Allowed are "-", "?", "n", "b", "d", "h", "v", "r", "y", "s", "w", "k", and "m".

quiet

Logical: if set to TRUE, screen output will be suppressed.

Details

For faster execution, deleteEmptyCells handles sequences in ape's bit-level coding scheme.
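
The criterion itself (a row or column is "empty" when every cell belongs to nset) can be illustrated on a plain character matrix. The following is a minimal base R sketch, not the package's implementation, which works on ape's bit-level coding instead. The helper find_empty_cells and the toy matrix m are hypothetical names introduced only for this illustration.

```r
# Hypothetical helper, NOT part of ips: flags rows/columns of a character
# matrix whose cells all belong to nset.
find_empty_cells <- function(m, margin = c(1, 2), nset = c("-", "n", "?")) {
  stopifnot(is.matrix(m), is.character(m))
  # TRUE where a cell holds one of the characters in nset
  is_empty <- matrix(m %in% nset, nrow = nrow(m))
  list(
    row = if (1 %in% margin) which(rowSums(is_empty) == ncol(m)) else integer(0),
    col = if (2 %in% margin) which(colSums(is_empty) == nrow(m)) else integer(0)
  )
}

# Toy alignment: 3 taxa (rows) x 4 sites (columns)
m <- rbind(
  t1 = c("a", "c", "-", "g"),
  t2 = c("n", "n", "-", "?"),
  t3 = c("t", "n", "-", "a")
)

id <- find_empty_cells(m)
id$row  # 2: taxon t2 contains only "n", "-" and "?"
id$col  # 3: site 3 is a gap in every taxon

# Drop the flagged rows and columns; setdiff() stays correct even when
# nothing was flagged
keep_rows <- setdiff(seq_len(nrow(m)), id$row)
keep_cols <- setdiff(seq_len(ncol(m)), id$col)
m_clean <- m[keep_rows, keep_cols, drop = FALSE]
dim(m_clean)  # 2 3
```

The sketch uses setdiff() rather than negative indexing because, in base R, indexing with a negated empty vector (e.g. m[-integer(0), ]) selects nothing instead of everything.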

Value

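deleteEmptyCells returns an object of class DNAbin. identifyEmptyCells returns the indices of the empty rows and columns; the example below accesses them as id$row and id$col.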

References

Cornish-Bowden, A. 1985. Nomenclature for incompletely specified bases in nucleic acid sequences: recommendations 1984. Nucleic Acids Res. 13: 3021–3030.

See Also

trimEnds, deleteGaps

Examples

  # COX1 sequences of bark beetles
  data(ips.cox1)
  # introduce completely ambiguous rows and columns
  x <- as.character(ips.cox1[1:6, 1:60])
  x[3, ] <- rep("n", 60)
  x[, 20:24] <- rep("-", 6)
  x <- as.DNAbin(x)
  image(x)
  # identify those rows and columns
  (id <- identifyEmptyCells(x))
  xx <- x[-id$row, -id$col]
  # delete those rows and columns
  x <- deleteEmptyCells(x)
  image(x)
  identical(x, xx)
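
The margin and nset arguments described above can also be set explicitly. The following sketch assumes the ape and ips packages are installed and reuses the same manipulated subset of ips.cox1; the object names y and z are arbitrary, and no output is shown because it depends on the data.

```r
library(ape)  # as.DNAbin()
library(ips)  # deleteEmptyCells(), ips.cox1

data(ips.cox1)
x <- as.character(ips.cox1[1:6, 1:60])
x[3, ] <- "n"      # one row consisting only of "n"
x[, 20:24] <- "-"  # five columns consisting only of gaps
x <- as.DNAbin(x)

# Columns only: the all-"n" row 3 is left in place
y <- deleteEmptyCells(x, margin = 2, quiet = TRUE)
dim(y)

# Rows only, additionally treating the ambiguity codes "r" (purine)
# and "y" (pyrimidine) as empty
z <- deleteEmptyCells(x, margin = 1,
                      nset = c("-", "n", "?", "r", "y"),
                      quiet = TRUE)
dim(z)
```

Restricting margin is useful when the taxon set must stay fixed (margin = 2) or when column positions must remain comparable with another alignment (margin = 1).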

heibl/ips documentation built on April 24, 2024, 3:19 a.m.