disambiguate: Disambiguate a Nucleic Sequence
In spgs: Statistical Patterns in Genomic Sequences

Description

Make a DNA/RNA sequence unambiguous by stripping out all symbols that do not uniquely specify nucleic acids. In other words, remove all symbols other than a's, c's, g's, t's or u's from the sequence.

Usage

## Default S3 method:
disambiguate(x, case=c("lower", "upper", "as is"), ...)
## S3 method for class 'SeqFastadna'
disambiguate(x, ...)
## S3 method for class 'list'
disambiguate(x, ...)

Arguments

`x`	A character vector, an object that can be coerced to a character vector, or a list of objects that can be converted to character vectors. This argument can also be a `SeqFastadna` object provided by the seqinr package.
`case`	Determines how the case of the symbols in `x` is handled when the sequence is made unambiguous. “`lower`”, the default, converts all symbols to lowercase, while “`upper`” converts them to uppercase. “`as is`” passes the symbols through unchanged, so that the case of each output symbol matches that of the corresponding input symbol.
`...`	Arguments to be passed from or to other functions.
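
The three `case` options can be pictured with a small base R sketch. The helper `apply_case` is hypothetical and is not part of spgs; it only mirrors the behaviour described for the `case` argument and makes no claim about how the package implements it.

```r
# Hypothetical helper (not part of spgs): mimics the documented `case` options.
apply_case <- function(x, case = c("lower", "upper", "as is")) {
  case <- match.arg(case)      # "lower" is used when `case` is not supplied
  switch(case,
    "lower" = tolower(x),      # convert every symbol to lowercase
    "upper" = toupper(x),      # convert every symbol to uppercase
    "as is" = x                # leave the case of each symbol untouched
  )
}

apply_case(c("A", "c", "G"))            # default: lowercase
apply_case(c("A", "c", "G"), "upper")   # uppercase
apply_case(c("A", "c", "G"), "as is")   # unchanged
```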

Details

If x is a SeqFastadna object or a character vector in which each element is a single nucleobase, then it represents a single sequence. It will be made unambiguous and returned in the same form.

On the other hand, if x is a vector of character strings, each of which represents a nucleic sequence, then the result will be a character vector in which each element contains the unambiguous sequence corresponding to the element in x as a character string.
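
The two input forms described above can be illustrated with a minimal base R sketch. The function `disambiguate_sketch` is hypothetical and does not call spgs. It assumes that a vector whose elements are all single characters is treated as one sequence, and it leaves case unchanged (comparable to `case="as is"`), whereas the package default is lowercase.

```r
# Hypothetical illustration (not the spgs implementation).
# Assumption: a vector of single-character elements is one sequence;
# anything else is a vector of whole sequences stored as strings.
disambiguate_sketch <- function(x) {
  x <- as.character(x)
  if (all(nchar(x) == 1)) {
    # One symbol per element: drop the ambiguous elements, keep the form.
    x[tolower(x) %in% c("a", "c", "g", "t", "u")]
  } else {
    # One sequence per string: strip ambiguous symbols inside each string.
    gsub("[^acgtu]", "", x, ignore.case = TRUE)
  }
}

# A single sequence, one nucleobase per element
single <- disambiguate_sketch(c("a", "n", "c", "g", "r", "t"))

# Several sequences, one per character string
several <- disambiguate_sketch(c("acgtnn", "ryACGU"))
```

Here `single` holds the four elements "a", "c", "g" and "t", while `several` holds the two strings "acgt" and "ACGU", one per input sequence.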