comp: complements a nucleic acid sequence

View source: R/comp.R

compR Documentation

complements a nucleic acid sequence

Description

Complements a sequence, for instance if the sequence is "a","c","g","t" it returns "t","g","c","a". This is not the reverse complementary strand. This function can handle ambiguous bases if required.

Usage

comp(seq, forceToLower = TRUE, ambiguous = FALSE)

Arguments

seq

a DNA sequence as a vector of single chars

forceToLower

if TRUE characters in seq are forced to lower case

ambiguous

if TRUE ambiguous bases in seq are handled

Value

a vector of characters which is the complement of the sequence, not the reverse complementary strand. Undefined values are returned as NA.

Author(s)

D. Charif, J.R. Lobry

References

citation("seqinr")

See Also

Because ssDNA sequences are always written in the 5'->3' direction, use rev(comp(seq)) to get the reverse complementary strand (see rev).

Examples

##
## Show that comp() does *not* return the reverve complementary strand:
##

c2s(comp(s2c("aaaattttggggcccc")))

##
## Show how to get the reverse complementary strand:
##

c2s(rev(comp(s2c("aaaattttggggcccc"))))

##
## Show what happens with non allowed values:
##

c2s(rev(comp(s2c("aaaaXttttYggggZcccc"))))

##
## Show what happens with ambiguous bases:
##

allbases <- s2c("abcdghkmstvwn")
comp(allbases) # NA are produced
comp(allbases, ambiguous = TRUE) # No more NA

##
## Routine sanity checks:
##

stopifnot(identical(comp(allbases, ambiguous = TRUE), s2c("tvghcdmksabwn")))
stopifnot(identical(comp(c("A", "C", "G", "T"), forceToLower = FALSE), c("T", "G", "C", "A")))

seqinr documentation built on April 6, 2023, 1:10 a.m.