Masks ragged leading and trailing edges of aligned DNA sequences

Description

maskSeqEnds takes a character vector of aligned DNA sequences and replaces the leading and trailing characters of each sequence with "N", so that every sequence in the returned vector has the same masked outer segments.

Usage

maskSeqEnds(seq, max_mask = NULL, trim = FALSE)

Arguments

seq

a character vector of DNA sequence strings.

max_mask

the maximum number of characters to mask. If set to 0 then no masking will be performed. If set to NULL then the upper masking bound will be automatically determined from the maximum number of observed leading or trailing "N" characters amongst all strings in seq.

trim

if TRUE, leading and trailing characters will be cut rather than masked with "N" characters.

Value

A modified seq vector with masked (or optionally trimmed) sequences.

See Also

See maskSeqGaps for masking internal gaps.

Examples

# Default behavior uniformly masks ragged ends
seq <- c("CCCCTGGG", "NAACTGGN", "NNNCTGNN")
maskSeqEnds(seq)

# max_mask=0 disables masking, so seq is returned unchanged
maskSeqEnds(seq, max_mask=0)

# Cut ragged sequence ends
maskSeqEnds(seq, trim=TRUE)

# Set max_mask to limit extent of masking and trimming
maskSeqEnds(seq, max_mask=1)
maskSeqEnds(seq, max_mask=1, trim=TRUE)
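
The masking rule described under Arguments can be sketched in base R as follows. This is an illustrative reimplementation written from the documented behavior only, not the package's source: the name mask_ends_sketch is hypothetical, and the sketch assumes the sequences are aligned, of equal length, and longer than the combined masked region.

```r
# Hypothetical helper illustrating the documented behavior of maskSeqEnds.
# Assumes aligned, equal-length sequences that are longer than the masked ends.
mask_ends_sketch <- function(seq, max_mask = NULL, trim = FALSE) {
  n <- nchar(seq)

  # Count leading and trailing "N" characters in each sequence
  lead <- n - nchar(sub("^N+", "", seq))
  trail <- n - nchar(sub("N+$", "", seq))

  # The widest observed run on each side sets the masking bound
  left <- max(lead)
  right <- max(trail)

  # Optionally cap the bound with max_mask (0 disables masking)
  if (!is.null(max_mask)) {
    left <- min(left, max_mask)
    right <- min(right, max_mask)
  }

  # Keep the inner segment shared by all sequences
  inner <- substr(seq, left + 1, n - right)
  if (trim) {
    return(inner)
  }

  # Re-pad the removed ends with "N" so sequence lengths are preserved
  paste0(strrep("N", left), inner, strrep("N", right))
}

seq <- c("CCCCTGGG", "NAACTGGN", "NNNCTGNN")
mask_ends_sketch(seq)               # "NNNCTGNN" "NNNCTGNN" "NNNCTGNN"
mask_ends_sketch(seq, trim = TRUE)  # "CTG" "CTG" "CTG"
mask_ends_sketch(seq, max_mask = 1) # "NCCCTGGN" "NAACTGGN" "NNNCTGNN"
```

In this sketch the left and right bounds are computed independently, so an input with three leading and two trailing "N" characters masks three positions on the left and two on the right.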
