View Sequences in a Web Browser

Description

Opens an html file in a web browser to show the sequences in an XStringSet.

Usage

BrowseSeqs(myXStringSet,
           htmlFile = paste(tempdir(), "/myXStringSet.html", sep = ""),
           openURL = interactive(),
           colorPatterns = TRUE,
           highlight = NA,
           patterns = c("-", alphabet(myXStringSet, baseOnly=TRUE)),
           colors = substring(rainbow(length(patterns),
                              v=0.8, start=0.9, end=0.7), 1, 7),
           colWidth = Inf,
           ...)

Arguments

myXStringSet

An XStringSet object of sequences.

htmlFile

Character string giving the location where the html file should be written.

openURL

Logical indicating whether the htmlFile should be opened in a web browser.

colorPatterns

Logical specifying whether to color matched patterns, or an integer vector providing pairs of start and stop boundaries for coloring.

highlight

Numeric specifying which sequence in the set to use for comparison or NA to color all sequences (default). If highlight is 0 then positions differing from the consensus sequence are highlighted.

patterns

Either an AAStringSet, DNAStringSet, or RNAStringSet object, or a character vector containing regular expressions to be colored in the XStringSet. Regular expressions are searched sequentially with multiple matches allowed, even within other previously matched patterns. (See details section below.)

colors

Character vector providing the color for each of the matched patterns. Typically a character vector with elements of 7 characters: “#” followed by the red, green, blue values in hexadecimal (after rescaling to 0 ... 255).

colWidth

Integer giving the maximum number of nucleotides wide the display can be before starting a new page. Must be a multiple of 20 (e.g., 100), or Inf (the default) to display all the sequences in one set of rows.

...

Additional arguments to adjust the appearance of the consensus sequence at the base of the display. Passed directly to ConsensusSequence for an AAStringSet, DNAStringSet, or RNAStringSet, or to consensusString for a BStringSet.
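As an illustration of the colors format described above, the following base R sketch builds 7-character hexadecimal strings in two ways. The variable names (pats, default_cols, custom_cols) are hypothetical, and the pattern vector assumes a DNAStringSet whose base-only alphabet is A, C, G, T.

```r
# Patterns as they would typically look for a DNAStringSet (assumed here):
# the gap character followed by the four bases.
pats <- c("-", "A", "C", "G", "T")

# Same expression as the default for 'colors' in the Usage section:
# one rainbow color per pattern, truncated to "#RRGGBB" (any alpha suffix dropped).
default_cols <- substring(rainbow(length(pats), v = 0.8, start = 0.9, end = 0.7), 1, 7)

# Custom colors from red, green, blue values on a 0-255 scale.
custom_cols <- rgb(red   = c(30, 50, 148),
                   green = c(144, 205, 0),
                   blue  = c(255, 50, 211),
                   maxColorValue = 255)

print(default_cols)
print(custom_cols)
```

Either vector could be passed as colors, provided it has one element per pattern.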

Details

BrowseSeqs converts an XStringSet into html format for viewing in a web browser. If patterns are supplied then they are matched as regular expressions, and colored according to colors. Some web browsers cannot quickly display a large amount of colored text, so it is recommended to use colorPatterns = FALSE or to highlight a sequence when viewing a large XStringSet. Highlighting shows all of the characters only in the highlighted sequence, and converts all matching positions in the other sequences into dots without color.

Patterns are not matched across column breaks, so multi-character patterns should be carefully considered when colWidth is less than the maximum sequence length. Patterns are matched sequentially in the order provided, so it is feasible to use nested patterns such as c("ACCTG", "CC"). In this case the “CC” could be colored differently inside the previously colored “ACCTG”. Note that patterns overlapping the boundaries of a previously matched pattern will not be matched. For example, “ACCTG” would not be matched if patterns=c("CC", "ACCTG").
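The nested-pattern case described above might be tried as follows. This is a minimal sketch, assuming DECIPHER is installed; the two short sequences, their names, and the color choices are made up for illustration.

```r
library(DECIPHER)

# Two toy sequences (hypothetical data) that both contain "ACCTG".
seqs <- DNAStringSet(c(seq1 = "TTACCTGAAG", seq2 = "TTACCTGCCG"))

# "ACCTG" is matched first; "CC" is then matched, including inside the
# previously colored "ACCTG", and receives its own color.
nested <- BrowseSeqs(seqs,
                     patterns = c("ACCTG", "CC"),
                     colors = c("#1E90FF", "#EE3300"),
                     openURL = FALSE) # write the file without opening a browser

# Reversing the order: per the note above, "ACCTG" would not be matched
# because it overlaps the boundaries of the earlier "CC" match.
reversed <- BrowseSeqs(seqs,
                       htmlFile = file.path(tempdir(), "reversed.html"),
                       patterns = c("CC", "ACCTG"),
                       colors = c("#EE3300", "#1E90FF"),
                       openURL = FALSE)
```

Opening the two html files side by side should make the effect of pattern order visible.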

Value

Creates an html file containing sequence data and (if openURL is TRUE) opens it in a web browser for viewing. The layout has the sequence name on the left, position legend on the top, cumulative number of nucleotides on the right, and consensus sequence on the bottom.

Returns htmlFile if the html file was written successfully.
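Because the path is returned, the output can be written to a chosen location and reused by later code. The sketch below assumes DECIPHER is installed; the file name example_seqs.html and the toy sequences are hypothetical.

```r
library(DECIPHER)

# Toy input (hypothetical data).
seqs <- DNAStringSet(c(a = "ACGTACGT", b = "ACGTTCGT"))

# Write to an explicit location and skip opening a browser,
# which is convenient in scripts and non-interactive sessions.
out <- file.path(tempdir(), "example_seqs.html")
res <- BrowseSeqs(seqs, htmlFile = out, openURL = FALSE)

# The returned value is the path of the html file that was written.
print(res)
print(file.exists(out))

# The file can be opened later on demand, for example:
# browseURL(res)
```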

Author(s)

Erik Wright DECIPHER@cae.wisc.edu

References

ES Wright (2016) "Using DECIPHER v2.0 to Analyze Big Biological Sequence Data in R". The R Journal, 8(1), 352-359.

See Also

BrowseDB, ConsensusSequence

Examples

db <- system.file("extdata", "Bacteria_175seqs.sqlite", package="DECIPHER")
dna <- SearchDB(db)
BrowseSeqs(dna)
BrowseSeqs(dna, colWidth=100, highlight=1)

# color bases in alternating groups with a different color scheme
BrowseSeqs(dna[1:5],
	colorPatterns=seq(1, width(dna)[1], 10),
	patterns=c("A", "C", "G", "T", "-"),
	colors=c("#1E90FF", "#32CD32", "#9400D3", "#000000", "#EE3300"))

# color all restriction sites
data(RESTRICTION_ENZYMES)
sites <- RESTRICTION_ENZYMES
sites <- gsub("[^A-Z]", "", sites) # remove non-letters
sites <- DNAStringSet(sites)
rc_sites <- reverseComplement(sites) # reverse complement of each site
w <- which(sites != rc_sites)
sites <- c(sites, rc_sites[w])
sites <- sites[order(nchar(sites))] # match shorter sites first

dna <- SearchDB(db, remove="all") # unaligned sequences
BrowseSeqs(dna, patterns=sites)
