RweaveRtf: An Sweave driver for rich text format (RTF) documents


RweaveRtf R Documentation


Description

An add-on driver for Sweave that translates R code chunks in RTF files to form an RTF output document.

Usage

RweaveRtf()
RtangleRtf()
SweaveSyntaxRtf


Details

By default Sweave uses the RweaveLatex driver to convert ‘*.Rnw’ input files to LaTeX output files. However, Sweave was designed to allow authors to write their own drivers. The rtfSweave drivers documented here are essentially copies of the RweaveLatex and Rtangle drivers with the changes necessary to make them work for rich text format (RTF) documents.

RTF documents are plain text documents that have some similarities to LaTeX. For one thing, RTF documents can be opened and edited using Emacs or another text editor. While the RTF files written by Microsoft Word are difficult to read, without loss of functionality one can write very human-readable RTF documents which can be opened in Microsoft Word and then saved as a native ‘*.docx’ or ‘*.doc’ file. One can write RTF documents with headings, subheadings, headers, footers, complex tables, figures, etc. It may be helpful to think of RTF as a text-based interface to nearly all of Microsoft Word's document formatting functionality.
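As a rough illustration of how readable hand-written RTF can be, the following sketch writes a tiny RTF document from R. The control words used (\rtf1, \ansi, \fonttbl, \pard, \b, \fs, \par) are standard RTF; the text and the temporary file are made up for the example.

```r
# A minimal hand-written RTF document, built as a character vector.
# Backslashes are doubled because these are R string literals.
rtf_lines <- c(
  "{\\rtf1\\ansi\\deff0",
  "{\\fonttbl{\\f0 Times New Roman;}}",
  "\\pard\\b\\fs32 A short report\\b0\\fs24\\par",
  "\\pard This paragraph was typed in a text editor.\\par",
  "}"
)

# Write to a temporary file; the result can be opened in Word or WordPad.
rtf_file <- tempfile(fileext = ".rtf")
writeLines(rtf_lines, rtf_file)

# The file on disk is ordinary text.
cat(readLines(rtf_file), sep = "\n")
```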

My motivation for creating RTF drivers for Sweave was that my non-statistician colleagues, collaborators, and clients almost exclusively use Microsoft Word. Rather than converting a LaTeX report created with Sweave to RTF, I prefer to write the RTF myself to have full control over the output format.

Options for RweaveRtf

RweaveRtf supports the following options for code chunks. If an option is set in the header of a code chunk (i.e., between << and >>=), its value does not need to be quoted. Options can also be passed as arguments to Sweave, in which case character string values should be quoted.

For options taking a logical value, Sweave understands upper or lowercase versions of TRUE and FALSE when the option is used in a code chunk header.
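The two ways of setting options can be sketched as follows. The chunk label, the chunk body, and the closing @ line are assumptions based on the usual noweb-style Sweave conventions. The call to Sweave is guarded so it only runs when rtfSweave is installed.

```r
# A chunk header as it might appear in the RTF source file.
# Values are unquoted and logicals may be written in lower case.
chunk <- c(
  "<<setup, echo=false, results=hide>>=",
  "x <- rnorm(10)",
  "@"
)
cat(chunk, sep = "\n")

# Similar options supplied as arguments to Sweave();
# here character values are quoted R strings.
if (requireNamespace("rtfSweave", quietly = TRUE)) {
  library(rtfSweave)
  infile <- system.file("examples", "rtfSweave-test-1.rtf",
                        package = "rtfSweave")
  Sweave(infile, driver = RweaveRtf, syntax = SweaveSyntaxRtf,
         output = tempfile(fileext = ".rtf"),
         echo = FALSE, results = "hide")
}
```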

Most often used options

These are the most often used options that most users need to understand.

eval

logical defaulting to TRUE. If FALSE, the code chunk is not evaluated, and hence no text or graphical output is produced. Code chunks that are not evaluated are still parsed, so the chunk needs to contain valid R code. To have a code chunk ignored entirely by Sweave, set the engine option (described below) to something other than "R" or "S". For example, <<engine = foo>>= will cause Sweave to skip the whole chunk.

echo

logical defaulting to TRUE which means R code in a code chunk is “echoed” to the output.

results

character string defaulting to "verbatim" which means show the results of the R commands as R code. This means output appears in a fixed-width font with whitespace and line breaks preserved. Using "rtf" means the output is taken to be proper RTF markup and included as is. (This is useful for having R output RTF-formatted tables which have complicated mark-up that is hard to write by hand.) Using "hide" means all output from the code chunk is completely suppressed (but the code is still executed by Sweave). Values can be abbreviated.
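As a sketch of what a chunk with results = rtf might emit, the hypothetical helper below formats one table row using standard RTF table control words (\trowd, \cellx, \intbl, \cell, \row). The helper name and the column widths are assumptions for illustration and are not part of rtfSweave.

```r
# Hypothetical helper: format one table row as RTF markup.
# `widths` are column widths in twips (1440 twips = 1 inch).
rtf_table_row <- function(cells, widths = rep(1440, length(cells))) {
  right_edges <- cumsum(widths)  # \cellxN gives each cell's right edge
  paste0(
    "\\trowd",
    paste0("\\cellx", right_edges, collapse = ""),
    " ",
    paste0("\\pard\\intbl ", cells, "\\cell", collapse = " "),
    " \\row"
  )
}

# Inside a chunk with results=rtf, cat() the markup so it is included as is.
tab <- head(cars, 3)
cat(rtf_table_row(names(tab)), "\n")
for (i in seq_len(nrow(tab))) {
  cat(rtf_table_row(format(unlist(tab[i, ]))), "\n")
}
```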

fig

logical defaulting to FALSE. Indicates whether any plots created in the code chunk produce graphical output. Note that only one figure per code chunk can be processed this way. The labels for figure chunks are used as part of the file names, so they should preferably be alphanumeric.

png

logical defaulting to TRUE which means a PNG figure should be generated when fig = TRUE. RTF allows embedding PNG, JPEG, and Windows enhanced metafiles (WMF).

jpeg

logical defaulting to FALSE. Set to TRUE to generate JPEGs when fig = TRUE. However, to use JPEGs rather than PNGs, one needs to also have png = FALSE because PNGs have priority.

tiff

logical defaulting to FALSE. The RTF format does not support TIFF figures but this option allows Sweave to write out a TIFF version of a figure. A possible use case is when using rtfSweave to output RTF files with embedded PNGs to form the basis of a manuscript. After the manuscript is revised and accepted by a journal, the journal often requires high-resolution TIFFs and these can be “turned on” with something like tiff = TRUE, tiff.resolution = 1200.

wmf

logical defaulting to FALSE. Set to TRUE for enhanced Windows metafiles generated by win.metafile (which is only available on Windows). To get WMF files embedded in the RTF file one needs to have png = FALSE and wmf = TRUE because PNGs have priority.

height

height of the figure in inches with a default of 6.

width

width of the figure in inches with a default of 6.

pointsize

pointsize for figures defaulting to 12.

resolution

resolution of the figure with a default of 300 dpi. Note the name of the option here is the full-length word resolution and does not match the argument in png, jpeg, etc. which use the shorter res.

tiff.resolution

resolution of the TIFF figures with a default of 300 dpi. The reason that TIFFs have their own resolution argument is to allow the author to have lower resolution images in the RTF output file while writing out higher-resolution TIFFs.
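The size and resolution options combine in the usual way: pixel dimensions are inches times dots per inch. The sketch below works through the defaults and a higher TIFF resolution, then makes a comparable direct call to the png device. How the driver itself calls the device is an assumption here.

```r
# Figure size options (inches) and resolutions (dots per inch).
width  <- 6
height <- 6
resolution      <- 300    # default for the embedded figure
tiff.resolution <- 1200   # e.g. a higher-resolution TIFF for a journal

# Pixel dimensions are inches times dpi.
embedded_px <- c(width, height) * resolution
tiff_px     <- c(width, height) * tiff.resolution
embedded_px
tiff_px

# A comparable direct call to the png device (note `res`, not `resolution`).
png_file <- tempfile(fileext = ".png")
if (capabilities("png")) {
  png(png_file, width = width, height = height, units = "in",
      res = resolution, pointsize = 12)
  plot(cars)
  dev.off()
}
```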

tiff.compression

type of compression for TIFF figures defaulting to "lzw". See tiff for the types of compression available. If one is using TIFFs for uploading to a journal website and the TIFF is not accepted by the uploader, it may be worth trying tiff.compression = "zip" which may be more “Adobe like”. (If this is specified as an option in a code chunk header, omit the quotes.)

hex

logical defaulting to TRUE. This controls whether the PNG, JPEG, or WMF figure is embedded into the RTF document as hexadecimal characters. The default of TRUE means the RTF file is “self-contained.” If FALSE then the output file “includes” the file with an “INCLUDEPICTURE” RTF field instruction. This produces small RTF files since the figures are not embedded but it makes the document harder to convert correctly to Word's ‘*.docx’ format using “Save As”.
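To give a sense of what embedding as hexadecimal characters means, the sketch below wraps raw image bytes in an RTF \pict group. The control words (\pict, \pngblip, \picwgoal, \pichgoal) are standard RTF, but the helper name is hypothetical and the exact markup written by the driver may differ.

```r
# Hypothetical helper: wrap raw PNG bytes in an RTF \pict group.
# Width and height are in inches; RTF sizes are in twips (1440 per inch).
png_to_rtf_pict <- function(bytes, width = 6, height = 6) {
  hex <- paste(as.character(bytes), collapse = "")  # two hex digits per byte
  paste0("{\\pict\\pngblip",
         "\\picwgoal", round(width * 1440),
         "\\pichgoal", round(height * 1440),
         " ", hex, "}")
}

# In practice the bytes would come from a figure file, e.g.
#   bytes <- readBin(path, what = "raw", n = file.size(path))
# Here the 8-byte PNG signature stands in for a real image.
png_signature <- as.raw(c(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a))
pict <- png_to_rtf_pict(png_signature)
cat(pict, "\n")
```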

prefix

logical defaulting to TRUE. If TRUE then generated filenames of figures and output all have the common prefix given by the prefix.string option; otherwise only chunks without a label use the prefix.

prefix.string

a character string defaulting to the name of the source file (without extension). Note that prefix.string is used as part of filenames, so needs to be “portable” which generally means avoiding spaces and punctuation other than - and _.
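The interaction of prefix, prefix.string, and the chunk label can be sketched with a small hypothetical helper. It follows the naming convention Sweave uses for RweaveLatex (label joined to the prefix with a hyphen, and a zero-padded chunk number for unlabelled chunks). That rtfSweave names files the same way is an assumption.

```r
# Hypothetical helper mirroring Sweave's usual figure-naming convention.
figure_stem <- function(prefix.string, label = NULL, chunknr = 1L,
                        prefix = TRUE) {
  if (is.null(label)) {
    # Unlabelled chunks always use the prefix plus a zero-padded chunk number.
    paste0(prefix.string, "-", formatC(chunknr, flag = "0", width = 3))
  } else if (prefix) {
    paste0(prefix.string, "-", label)
  } else {
    label
  }
}

figure_stem("report", label = "scatter")                  # "report-scatter"
figure_stem("report", label = "scatter", prefix = FALSE)  # "scatter"
figure_stem("report", chunknr = 4L)                       # "report-004"
```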

Options controlling formatting for code chunk input and output

rtf.Schunk

RTF paragraph formatting commands for the R input and output. The default is Sweave-like. Since the R input and output are part of the same RTF paragraph (with lines broken by \line, i.e., hard linebreaks), this option controls formatting for both input and output. A good resource for RTF mark-up is the RTF 1.5 specification document posted at http://www.biblioscape.com/rtf15_spec.htm. In particular, the section Paragraph Formatting Properties is useful for changing the look of the output.

rtf.Sinput

RTF text formatting commands for R input. A good resource is the Character Text section of the RTF 1.5 specification.

rtf.Soutput

RTF text formatting commands for R output.
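The values below are one possible set of formatting strings, shown only as a sketch. The control words are standard RTF (\ql, \li, \sb, \sa for paragraphs; \i and \b for characters), but the particular choices are made up and how the driver combines them with its own markup is not shown here. When passed as arguments to Sweave, for example rtf.Schunk = schunk, they are quoted R strings with doubled backslashes.

```r
# Example formatting strings (backslashes doubled in R source).
schunk  <- "\\ql\\li720\\sb120\\sa120"  # left-aligned, 0.5 in indent,
                                        # 6 pt space before and after
sinput  <- "\\i"                        # italic for R input
soutput <- "\\b"                        # bold for R output

# cat() shows the text as an RTF reader would see it (single backslashes).
cat(schunk, sinput, soutput, sep = "\n")
```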

Less often used options

print

logical defaulting to FALSE. If TRUE, this forces auto-printing of all expressions. For example, if print = TRUE then the output for a single line like this a <- runif(1); b <- runif(1); a + b would result in printing the value of a, the value of b and their sum. The default behavior just prints the sum.

grdevice

character defaulting to NULL. This option allows the user to use custom graphics devices for figures. See the section called ‘Custom Graphics Devices’ in RweaveLatex.

term

logical defaulting to TRUE. If TRUE, visibility of values emulates an interactive R session: values of assignments are not printed, values of single objects are printed. If FALSE, output comes only from explicit print or similar statements.
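The difference between the default behaviour (term = TRUE, print = FALSE) and print = TRUE can be illustrated in plain R with withVisible. This is only an emulation of the idea, not the driver's code, and it uses fixed numbers instead of runif so the result is reproducible.

```r
# Evaluate each top-level expression and record whether R would auto-print it.
code  <- "a <- 2; b <- 3; a + b"
exprs <- parse(text = code)

show_values <- function(exprs, print = FALSE) {
  env <- new.env()
  shown <- list()
  for (e in exprs) {
    res <- withVisible(eval(e, env))
    # Default behaviour: only visible values are shown.
    # print = TRUE forces every value to be shown.
    if (print || res$visible) shown[[length(shown) + 1L]] <- res$value
  }
  shown
}

show_values(exprs)                # only the sum, 5
show_values(exprs, print = TRUE)  # 2, 3, and 5
```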

keep.source

logical defaulting to TRUE which means when echo = TRUE the original source code is copied to the file with formatting unchanged. Otherwise, the code is read in by R and then output using R's default formatting (i.e., it is deparsed).
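The effect of deparsing can be seen directly in base R. In this sketch the source line is invented for illustration: the original text keeps its spacing and comment, while the parsed and deparsed version is normalised and loses the comment.

```r
src <- "x<-c(1,2,   3)   # keep this comment?"

# keep.source = TRUE: the chunk text is echoed exactly as typed.
cat(src, "\n")

# keep.source = FALSE: the code is parsed and then deparsed, so spacing is
# normalised and comments are lost.
deparsed <- deparse(parse(text = src, keep.source = FALSE)[[1]])
cat(deparsed, "\n")   # x <- c(1, 2, 3)
```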

split

logical defaulting to FALSE. If TRUE, text output is written to separate files for each code chunk.

strip.white

character string defaulting to "true" which means blank lines before and after output are removed. If "all", then all blank lines are removed from the output. If "false" then blank lines in the output are retained.
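The three settings can be sketched with a hypothetical helper that operates on a character vector of output lines. It illustrates the described behaviour only and is not taken from the driver.

```r
# Hypothetical helper: apply a strip.white-style rule to lines of output.
strip_white <- function(lines, mode = c("true", "all", "false")) {
  mode  <- match.arg(mode)
  blank <- grepl("^[[:space:]]*$", lines)
  if (mode == "false" || !any(blank)) return(lines)
  if (mode == "all") return(lines[!blank])
  # mode "true": drop only leading and trailing blank lines
  if (all(blank)) return(character(0))
  keep <- which(!blank)
  lines[min(keep):max(keep)]
}

out <- c("", "[1] 1", "", "[1] 2", "")
strip_white(out, "true")   # "[1] 1" ""      "[1] 2"
strip_white(out, "all")    # "[1] 1" "[1] 2"
strip_white(out, "false")  # unchanged
```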

include

logical defaulting to TRUE determining whether figures are included in the RTF file (in a way that depends on the hex option). Set to FALSE to have the figure file created but not included in the report. (This might be useful if you want a figure generated by Sweave but not included as part of a report.) The include option also has functionality outside the context of figures. If a code chunk has split = TRUE then the output is written to separate files and the option include determines whether the main RTF document has “include” statements to bring in the content of these files. At this time these “include” statements are not implemented.

concordance

logical defaulting to FALSE. A value of TRUE means a concordance file is created to link the input line numbers to the output line numbers. This is an experimental feature that is part of Sweave; see the source code for the output format, which is subject to change in future releases.

figs.only

logical defaulting to FALSE. By default each figure chunk is run once, then re-run for each selected type of graphics. That will open a default graphics device for the first figure chunk and use that device for the first evaluation of all subsequent chunks. If this option is TRUE, the figure chunk is run only for each selected type of graphics, for which a new graphics device is opened and then closed.

engine

character string defaulting to "R". Only chunks with engine equal to "R" or "S" are processed by Sweave. This option can be ignored if only R code chunks are used. (However, as noted above, setting engine to anything but "R" or "S" is a way to have Sweave ignore the code chunk.)

Weaving versus tangling and naming conventions

Documents written with RTF documentation chunks can be “weaved” to form an RTF report or “tangled” to extract the code chunks into an R file (or into multiple R files depending on the split option). Each of these two tasks needs its own driver, as demonstrated in the examples.

Since Sweave provides RweaveLatex and Rtangle, the respective drivers for RTF are named RweaveRtf and RtangleRtf.

Warning

This driver will not work with documents written directly in Microsoft Word and saved as RTF prior to processing with Sweave. The problem is that what might look like a valid code chunk on the screen will, once saved as RTF, end up with random linebreaks and a good bit of RTF markup in the actual text file.

The Microsoft WordPad program writes cleaner RTF and it may be relatively straightforward to update the rtfSweave drivers to remove the hidden RTF commands that get added to a code chunk when the file is saved and then send that cleaned-up code chunk to R. In other words, it may not be possible to “sanitize” a code chunk saved in Word so that R can understand it but it should be pretty easy to sanitize a code chunk saved in WordPad.

Author(s)

Stephen Weigand <Weigand.Stephen@mayo.edu> adapted the code of Friedrich Leisch and others.

See Also

Sweave

Examples

library(rtfSweave)

## Locate the example RTF source file shipped with the package
testfilepath <- system.file("examples", "rtfSweave-test-1.rtf",
                            package = "rtfSweave")

## 'weave' to create the RTF report
Sweave(testfilepath, driver = RweaveRtf, syntax = SweaveSyntaxRtf,
       output = tempfile(fileext = ".rtf"))

## 'tangle' to extract the R code chunks into an R file
Sweave(testfilepath, driver = RtangleRtf(), syntax = SweaveSyntaxRtf,
       output = tempfile(fileext = ".R"))

       

rtfSweave documentation built on June 21, 2022, 3:01 p.m.