solveAmbiguousBases: Solve Ambiguous Bases in DNA Sequences

View source: R/DNA.R

solveAmbiguousBases R Documentation

Solve Ambiguous Bases in DNA Sequences

Description

Replaces ambiguous bases in DNA sequences (R, Y, W, ...) by A, G, C, or T.

Usage

solveAmbiguousBases(x, method = "columnwise", random = TRUE)

Arguments

x

a matrix of class "DNAbin"; a list is accepted and is converted into a matrix.

method

the method used (no other choice than the default for the moment; see details).

random

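a logical value: if TRUE (the default), replacements are drawn at random; if FALSE, the most frequent compatible base is used (see details).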

Details

Ambiguous bases are replaced columnwise. First, the base frequencies are counted in each column; if the column contains no ambiguous base, nothing is done. By default (i.e., if random = TRUE), each ambiguous base is replaced by random sampling among the compatible bases, using the observed frequencies of these non-ambiguous bases in the column as probabilities. For instance, if the ambiguous base is Y, it is replaced by either C or T using their observed frequencies as probabilities. If random = FALSE, the compatible base with the greatest observed frequency is used. If no compatible base is observed in the column, equal probabilities are used. For instance, if the ambiguous base is R and only C and T are observed, it is replaced by either A or G with equal probabilities.
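
The columnwise rule described above can be illustrated with a minimal base-R sketch. It is not the code used by solveAmbiguousBases: the helper resolve_column and the lookup list iupac are hypothetical names introduced here, the column is assumed to be a character vector of upper-case IUPAC codes rather than a "DNAbin" object, and the handling of ties when random = FALSE (the first compatible base is taken) is an assumption of the sketch.

```r
# Standard IUPAC ambiguity codes and the bases each one is compatible with.
# (Hypothetical lookup list for this sketch; not an object exported by ape.)
iupac <- list(R = c("A", "G"), Y = c("C", "T"), M = c("A", "C"),
              K = c("G", "T"), S = c("C", "G"), W = c("A", "T"),
              B = c("C", "G", "T"), D = c("A", "G", "T"),
              H = c("A", "C", "T"), V = c("A", "C", "G"),
              N = c("A", "C", "G", "T"))

# Hypothetical helper: resolve the ambiguity codes of a single column.
resolve_column <- function(col, random = TRUE) {
  bases <- c("A", "C", "G", "T")
  # Counts of the non-ambiguous bases observed in the column.
  freq <- table(factor(col[col %in% bases], levels = bases))
  # Visit only the ambiguous positions; gaps and plain bases are left alone.
  for (i in which(col %in% names(iupac))) {
    cand <- iupac[[col[i]]]
    w <- as.numeric(freq[cand])
    # No compatible base observed: fall back to equal weights.
    if (sum(w) == 0) w <- rep(1, length(cand))
    col[i] <- if (random) {
      sample(cand, 1, prob = w)   # weights are normalised by sample()
    } else {
      cand[which.max(w)]          # ties: first compatible base (assumption)
    }
  }
  col
}

resolve_column(c("A", "G", "G", "R"), random = FALSE)  # "A" "G" "G" "G"
```

In this sketch the frequencies are counted once per column, before any replacement, so an earlier replacement does not influence a later one in the same column.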

Alignment gaps are not changed; see the function latag2n to change the leading and trailing gaps.

Value

a matrix of class "DNAbin".

Author(s)

Emmanuel Paradis

See Also

base.freq, latag2n, dnds

Examples

X <- as.DNAbin(matrix(c("A", "G", "G", "R"), ncol = 1))
alview(solveAmbiguousBases(X)) # R replaced by either A or G
alview(solveAmbiguousBases(X, random = FALSE)) # R always replaced by G

ape documentation built on May 29, 2024, 10:50 a.m.