msaPrettyPrint: Pretty-Printing of Multiple Sequence Alignments


View source: R/msaPrettyPrint.R

Description

The msaPrettyPrint function provides an R interface to the powerful LaTeX package texshade.sty, which allows for highly customizable plots of multiple sequence alignments.

Usage

    msaPrettyPrint(x, y, output=c("pdf", "tex", "dvi", "asis"),
                   subset=NULL, file=NULL, alFile=NULL,
                   askForOverwrite=TRUE,  psFonts=FALSE, code=NA,
                   paperWidth=11, paperHeight=8.5, margins=c(0.1, 0.3),
                   shadingMode=c("identical", "similar", "functional"),
                   shadingModeArg=NA,
                   shadingColors=c("blues", "reds", "greens", "grays",
                                   "black"),
                   showConsensus=c("bottom", "top", "none"),
                   consensusColors=c("ColdHot", "HotCold", "BlueRed",
                                     "RedBlue", "GreenRed",
                                     "RedGreen", "Gray"),
                   consensusThreshold=50,
                   showLogo=c("top", "bottom", "none"),
                   logoColors=c("chemical", "rasmol", "hydropathy",
                                "structure", "standard area",
                                "accessible area"),
                   showLogoScale=c("none", "leftright", "left",
                                   "right"),
                   showNames=c("left", "right", "none"),
                   showNumbering=c("right", "left", "none"),
                   showLegend=TRUE, furtherCode=NA, verbose=FALSE)

Arguments

x

an object of class MultipleAlignment, which includes the classes MsaAAMultipleAlignment, MsaDNAMultipleAlignment, and MsaRNAMultipleAlignment.

y

argument for restricting the output to a subset of columns; can be a numeric vector of length 2 with a lower and an upper bound or an object of class IRanges. If missing, the entire multiple alignment is printed.

output

type of output to be generated (see details below)

subset

can be used to specify a subset of sequences in the multiple alignment x if not all sequences should be printed.

file

name of output file; if no name is given, the name of the output file defaults to the name of the object provided as argument x along with the proper suffix, which depends on the type of output specified with the output argument. Note that this might lead to invalid file names if an R expression, rather than the name of an object, is passed as argument x.

alFile

name of alignment file to be created; msaPrettyPrint first writes the multiple alignment x to a .fasta file. The name of this file can be set with the alFile argument. If no name is given, the name of the alignment file defaults to the name of the object provided as argument x along with the suffix .fasta. Note that this might lead to invalid file names if an R expression, rather than the name of an object, is passed as argument x.

askForOverwrite

if TRUE (default), msaPrettyPrint asks whether existing files should be overwritten or not. If askForOverwrite is set to FALSE, files are overwritten without further notice.

psFonts

if TRUE, msaPrettyPrint produces LaTeX code that includes the LaTeX package times.sty; if FALSE, msaPrettyPrint produces LaTeX code based on the standard LaTeX fonts (default). Ignored for output="asis".

code

this argument can be used to specify the entire LaTeX code in the texshade environment. This overrides all arguments that customize the appearance of the output. Instead, all customizations must be done directly as LaTeX commands provided by the package texshade.sty. This option should only be used by expert users and for special applications in which the customization possibilities of the msaPrettyPrint function turn out to be insufficient.

paperWidth,paperHeight

paper format to be used in the resulting document; defaults to 11in x 8.5in (US letter in landscape orientation). Ignored for output="asis".

margins

a numeric vector of length 2 with the horizontal and vertical margins, respectively; the default is 0.1in for the horizontal and 0.3in for the vertical margin.

shadingMode

shading mode; currently the shading modes "identical", "similar", and "functional" are supported (see documentation of texshade.sty for details).

shadingModeArg

for shading modes "identical" and "similar", shadingModeArg must be a single numeric threshold between 0 and 100 or two thresholds between 0 and 100 in increasing order. For shading mode "functional", valid shadingModeArg arguments are "charge", "hydropathy", "structure", "chemical", "rasmol", "standard area", and "accessible area" (see documentation of texshade.sty for details).

shadingColors

color scheme for shading; valid "shadingColors" arguments are "blues", "reds", "greens", "grays", and "black" (see documentation of texshade.sty for details).

showConsensus

where to show the consensus sequence; possible values are "bottom", "top", and "none" (the latter option suppresses printing of the consensus sequence).

consensusColors

color scheme for printing the consensus sequence; the following choices are possible: "ColdHot", "HotCold", "BlueRed", "RedBlue", "GreenRed", "RedGreen", and "Gray" (see documentation of texshade.sty for details).

consensusThreshold

one or two numbers between 0 and 100, where the second one is optional and must be larger than the first one (see documentation of texshade.sty for details)

showLogo

where to show a sequence logo; possible values are "top", "bottom", and "none" (the latter option suppresses printing of the sequence logo). If a sequence logo and a consensus sequence should be shown together, they can only be located at opposite sides.

logoColors

color scheme for printing the sequence logo; the following choices are possible: "chemical", "rasmol", "hydropathy", "structure", "standard area", and "accessible area" (see documentation of texshade.sty for details).

showLogoScale

where to plot the vertical axis of the sequence logo; possible values are "left", "right", "leftright", and "none" (the latter option suppresses display of the axis).

showNames

where to print sequence names; possible values are "left", "right", and "none" (the latter option suppresses display of the names).

showNumbering

where to print sequence numbers; possible values are "left", "right", and "none" (the latter option suppresses display of the numbers). If sequence names and numbers should be shown together, they can only be located at opposite sides.

showLegend

if TRUE (default), a legend is printed at the end of the alignment.

furtherCode

additional LaTeX code to be included in the texshade environment; all text passed as furtherCode is placed between the commands created by msaPrettyPrint and the end of the texshade environment. Note the difference to the code argument: while the code argument replaces all LaTeX code in the texshade environment, the code passed as furtherCode argument is added to the LaTeX code in the texshade environment.

verbose

if TRUE, progress messages are printed and the output of running (PDF)LaTeX (if applicable) is also printed to the R session; the default shown in the usage above is FALSE.
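
The following is a minimal sketch of how several of the arguments above might be combined; it is not taken from this help page. It assumes the example file exampleAA.fasta shipped with msa (also used in the Examples section) and an arbitrarily chosen column range of 164 to 213. The two texshade commands passed as furtherCode, \defconsensus and \showruler, are used here purely for illustration; see the texshade.sty documentation for their exact meaning.

```r
library(msa)

## read the example protein sequences shipped with msa and align them
filepath <- system.file("examples", "exampleAA.fasta", package="msa")
mySeqs <- readAAStringSet(filepath)
myAlignment <- msa(mySeqs)

## print only columns 164 to 213 (argument y) of the first six
## sequences (argument subset), shade similar residues, and append two
## texshade commands via furtherCode; note that backslashes must be
## doubled inside R strings
tex <- msaPrettyPrint(myAlignment, y=c(164, 213), subset=1:6,
                      output="asis", shadingMode="similar",
                      showNames="none", showLogo="top",
                      logoColors="rasmol", showLegend=FALSE,
                      furtherCode=c("\\defconsensus{.}{lower}{upper}",
                                    "\\showruler{1}{top}"),
                      askForOverwrite=FALSE)
```

Since the logo is placed at the top and the consensus sequence stays at its default position at the bottom, the two do not collide.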

Details

The msaPrettyPrint function writes a multiple alignment to a .fasta file and creates LaTeX code for pretty-printing the multiple alignment on the basis of the LaTeX package texshade.sty. If output="asis", msaPrettyPrint prints a LaTeX fragment consisting of the texshade environment to the console. The parameters described above can be used to customize the way the multiple alignment is formatted. If output="tex", a complete LaTeX file including preamble is created. For output="dvi" and output="pdf", the same kind of LaTeX file is created, but processed using (PDF)LaTeX to produce a final DVI or PDF file, respectively. The file argument can be used to set the file name of the final output file (except for output="asis", which does not create an output file).

The choice output="asis" is particularly useful for Sweave or knitr documents. If msaPrettyPrint is called with output="asis" in a code chunk with results="tex" (Sweave) or results="asis" (knitr), then the resulting LaTeX fragment consisting of the texshade environment is directly included in the LaTeX document that is created from the Sweave/knitr document.
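
As a rough sketch of the knitr case described above, the body of such a chunk could look as follows. The chunk label prettyPrintAlignment and the column range are made up for illustration, and the sketch assumes that the enclosing LaTeX document loads the texshade package in its preamble, since the emitted fragment is a texshade environment.

```r
## knitr chunk header (shown as a comment; the label is hypothetical):
##   <<prettyPrintAlignment, results="asis", echo=FALSE>>=
library(msa)

## align the example protein sequences shipped with msa
mySeqs <- readAAStringSet(system.file("examples", "exampleAA.fasta",
                                      package="msa"))
myAlignment <- msa(mySeqs)

## emit the texshade environment directly into the generated document
msaPrettyPrint(myAlignment, output="asis", y=c(164, 213),
               showLogo="none", askForOverwrite=FALSE)
## chunk end marker:
##   @
```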

As noted above, if they are not specified explicitly, output file names are determined automatically. It is important to point out that all file names need to be LaTeX-compliant, i.e. no special characters and spaces (!) are allowed. If a file name would be invalid, msaPrettyPrint makes a default choice.
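
To catch problematic file names before calling msaPrettyPrint, a check along the following lines may help. The helper isSafeTexFileName is hypothetical and not part of msa, and its notion of "special characters" (anything other than letters, digits, dots, underscores, and hyphens) is an assumption rather than the exact rule msaPrettyPrint applies.

```r
## Hypothetical helper (not part of msa): TRUE if the base name of a
## path consists only of letters, digits, dots, underscores, and
## hyphens. Directory components are not checked.
isSafeTexFileName <- function(path) {
    grepl("^[A-Za-z0-9._-]+$", basename(path))
}

## the second name contains a space, the third a special character
isSafeTexFileName(c("myAlignment.pdf", "my alignment.pdf", "aln#1.pdf"))
```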

Moreover, if sequence names are to be printed, there might be names that are not LaTeX-compliant and lead to LaTeX errors. In order to check that in advance, the function msaCheckNames is available.
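
A possible way to use msaCheckNames before pretty-printing is sketched below. It assumes, as described on the msaCheckNames help page, that the function returns a copy of the alignment whose sequence names have been made LaTeX-compliant; please consult that help page for its further arguments.

```r
library(msa)

## align the example protein sequences shipped with msa
mySeqs <- readAAStringSet(system.file("examples", "exampleAA.fasta",
                                      package="msa"))
myAlignment <- msa(mySeqs)

## assumption: msaCheckNames() returns a copy of the alignment with
## sequence names that are safe to pass on to LaTeX
checkedAlignment <- msaCheckNames(myAlignment)

## print the sanitized names to the left of the alignment
msaPrettyPrint(checkedAlignment, output="asis", showNames="left",
               askForOverwrite=FALSE)
```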

Note that texi2dvi and texi2pdf always save the resulting DVI/PDF files to the current working directory, even if the LaTeX source file is in a different directory. That is also the reason why the temporary file is created in the current working directory in the example below.

Value

msaPrettyPrint returns an invisible character vector consisting of the LaTeX fragment with the texshade environment.
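
Because the value is returned invisibly, it has to be assigned to be used further. The sketch below stores the fragment and writes it to a file that could later be included in a LaTeX document; the file name alignmentFragment.tex is made up for illustration.

```r
library(msa)

## align the example protein sequences shipped with msa
mySeqs <- readAAStringSet(system.file("examples", "exampleAA.fasta",
                                      package="msa"))
myAlignment <- msa(mySeqs)

## the fragment is printed to the console and also returned invisibly
tex <- msaPrettyPrint(myAlignment, output="asis", askForOverwrite=FALSE)

## write the texshade environment to a file in the session's
## temporary directory (hypothetical file name)
fragmentFile <- file.path(tempdir(), "alignmentFragment.tex")
writeLines(tex, fragmentFile)
```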

Author(s)

Ulrich Bodenhofer, Enrico Bonatesta, and Christoph Horejs-Kainrath <msa@bioinf.jku.at>

References

http://www.bioinf.jku.at/software/msa

U. Bodenhofer, E. Bonatesta, C. Horejs-Kainrath, and S. Hochreiter (2015). msa: an R package for multiple sequence alignment. Bioinformatics 31(24):3997-3999. DOI: 10.1093/bioinformatics/btv494.

https://www.ctan.org/pkg/texshade

E. Beitz (2000). TeXshade: shading and labeling of multiple sequence alignments using LaTeX2e. Bioinformatics 16(2):135-139. DOI: 10.1093/bioinformatics/16.2.135.

See Also

msaCheckNames

Examples

## read sequences
filepath <- system.file("examples", "exampleAA.fasta", package="msa")
mySeqs <- readAAStringSet(filepath)

## call unified interface msa() for default method (ClustalW) and
## default parameters
myAlignment <- msa(mySeqs)

## show resulting LaTeX code with default settings
msaPrettyPrint(myAlignment, output="asis", askForOverwrite=FALSE)

## create PDF file according to some custom settings
tmpFile <- tempfile(pattern="msa", tmpdir=".", fileext=".pdf")
tmpFile
msaPrettyPrint(myAlignment, file=tmpFile, output="pdf",
               showNames="left", showNumbering="none", showLogo="top",
               showConsensus="bottom", logoColors="rasmol",
               verbose=FALSE, askForOverwrite=FALSE)

## Not run: 
library(Biobase)
openPDF(tmpFile)
## End(Not run)

Example output

Loading required package: Biostrings
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

Loading required package: IRanges
Loading required package: XVector

Attaching package: ‘Biostrings’

The following object is masked from ‘package:base’:

    strsplit

use default substitution matrix
\begin{texshade}{/work/tmp/tmp/Rtmp4Xp47A/seq18cdcd3c5d0135.fasta}
\seqtype{P}
\shadingmode{identical}
\threshold{50}
\showconsensus[ColdHot]{bottom}
\shadingcolors{blues}
\showsequencelogo[chemical]{top}
\hidelogoscale
\shownames{left}
\nameseq{1}{PH4H Rattus norvegicus}
\nameseq{2}{PH4H Mus musculus}
\nameseq{3}{PH4H Homo sapiens}
\nameseq{4}{PH4H Bos taurus}
\nameseq{5}{PH4H Chromobacterium violaceum}
\nameseq{6}{PH4H Ralstonia solanacearum}
\nameseq{7}{PH4H Caulobacter crescentus}
\nameseq{8}{PH4H Pseudomonas aeruginosa}
\nameseq{9}{PH4H Rhizobium loti}
\shownumbering{right}
\showlegend
\end{texshade}
[1] "./msa18cdcd4944fb12.pdf"
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

[1] TRUE

msa documentation built on Nov. 8, 2020, 5:41 p.m.