library(ggplot2)
R Markdown provides an unified authoring framework for data science, combining your code, its results, and your prose commentary. R Markdown documents are fully reproducible and support dozens of static and dynamic output formats like PDFs, Word files, slideshows, and more.. For a comprehensive resource on R Markdown see R Markdown: The Definitive Guide
R Markdown integrates a number of R packages and external tools. This means that help is, by-and-large, not available through ?. Instead, as you work through this chapter, and use R Markdown in the future, keep these resources close to hand:
Both cheatsheets are also available at http://rstudio.com/cheatsheets.
The real point of R Markdown is that it lets you include your code, have the code run automatically when your document is rendered, and seamlessly include the results of that code in your document.
While there is an "R" in R Markdown this is not limited to just the R programming language. It is a generic markup language that can knit any code that can be run from a command line into a document. Some of the more popular languages are:
You can intermix languages within the same document, and in some cases pass data between languages.
```{block2, type='rmdimportant'} You do not have to include any programming language. There are many books, websites, and wikis written in R Markown which contain no code.
## Markdown Basics Format the text in your R Markdown file with [Pandoc’s Markdown](https://pandoc.org/MANUAL.html#pandocs-markdown), a set of markup annotations for plain text files. When you render your file, Pandoc transforms the marked up text into formatted text in your final file format. Notice that the file contains three types of content: 1. Text mixed with simple text formatting. 1. Code chunks and the corresponding output. 1. An (optional) YAML header controlling the "whole document" settings. ### Headers The character `#` at the beginning of a line means that the rest of the line is interpreted as a section header. The number of `#`s at the beginning of the line indicates whether it is treated as a section, sub-section, sub-sub-section, etc. of the document.
In this document the chapter title **R Markdown** is preceded by a single `#`, but **Markdown Basics** at the start of this paragraph was preceded by `##` and the current section **Headers** is preceded by `###` in the text file. ### Paragraph Breaks and Forced Line Breaks
To insert a break between paragraphs, include a single completely blank line.
To force a line break, put two blank
spaces at the end of a line.
To insert a break between paragraphs, include a single completely blank line. To force a line break, put two blank spaces at the end of a line.
If you don't put the two blank spaces at the end of a line they will run together.
If you don't put the two blank spaces at the end of a line they will run together. ### Italics and Boldface
Text to be italicized goes inside a single set of underscores or asterisks. Text to be boldfaced goes inside a double set of underscores or asterisks.
Text to be _italicized_ goes inside _a single set of underscores_ or *asterisks*. Text to be **boldfaced** goes inside a __double set of underscores__ or **asterisks**. ### Bullet Points
*
(asterisk) character, or a single dash (-
) and then have a space.+
for sub-bullets.* This is a list marked where items are marked with bullet points. * Each item in the list should start with a `*` (asterisk) character, or a single dash (`-`) and then have a space. * Each item should also be on a new line. + Indent lines with 4 spaces and begin them with `+` for sub-bullets. + Sub-sub-bullet are not’t really a thing in R Markdown. ### Numbered Lists
1. Lines which begin with a numeral (0–9), followed by a period, will usually be interpreted as items in a numbered list. 1. R Markdown handles the numbering in what it renders automatically, so the actual number doesn't matter. 1. This can be handy when you lose count or don’t update the numbers yourself when editing. (Look carefully at the .Rmd file for this item.) a. Sub-lists of numbered lists, with letters for sub-items, are a thing. b. They are however a fragile thing, which you’d better not push too hard. ## Math R Markdown uses standard LaTeX to render complex mathematical formulas and derivations, and have them displayed very nicely. Like code, the math can either be inline or set off (displays). Inline math is marked off with a pair of dollar signs (`$`), `$\frac{a+b}{b} = 1 + \frac{a}{b}$` $\frac{a+b}{b} = 1 + \frac{a}{b}$ Mathematical displays are marked off with `\[` and` \]`, as in
[ \frac{a+b}{b} = 1 + \frac{a}{b} ]
\[ \frac{a+b}{b} = 1 + \frac{a}{b} \] ## Code Chunks Think of a chunk like a function. A chunk should be relatively self-contained, and focused around a single task. You can quickly insert chunks like these into your file with * the keyboard shortcut **Ctrl + Alt + I** (OS X: **Cmd + Option + I**) * typing the chunk delimiters ` ```r ` and ` ``` `. * the *Add Chunk* command in the editor toolbar When you render your .Rmd file, R Markdown will run each code chunk and embed the results beneath the code chunk in your final report. There are three main sections to a the code chunk header: 1. the programming language engine to run the code 2. the code chunk name (optional but very useful) 3. the code chunk options. ### Chunk Names Chunks can be given an optional name: ```` ```r ````. This has three advantages: 1. You can more easily navigate to specific chunks using the drop-down code navigator in the bottom-left of the script editor: 1. Graphics produced by the chunks will have useful names that make them easier to use elsewhere. 1. You can set up networks of cached chunks to avoid re-performing expensive computations on every run. More on that below. It’s a good idea to name code chunks that produce figures, even if you don’t routinely label other chunks. The chunk label is used to generate the file name of the graphic on disk, so naming your chunks makes it much easier to pick out plots and reuse in other circumstances (i.e. if you want to quickly drop a single plot into an email) ```{block2, type='rmdimportant'} There is one chunk name wit special behaviour: `setup`. In notebook mode, the chunk named `setup` will be run automatically once, before any other code is run.
Chunk output can be customized with options, arguments supplied to chunk header. Knitr provides almost 60 options that you can use to customize your code chunks. Here we'll cover the most important chunk options that you'll use frequently. You can see the full list at http://yihui.name/knitr/options/.
The most important set of options controls if your code block is executed and what results are inserted in the finished report:
include = FALSE
prevents code and results from appearing in the finished file. Use this to run code and results used by other chunks (i.e. setup code).
echo = FALSE
prevents code, but not the results from appearing in the
finished file. Use this when writing reports.
message = FALSE
or warning = FALSE
prevents messages or warnings
from appearing in the finished file.
results = 'hide'
hides printed output; fig.show = 'hide'
hides
plots.
error = TRUE
causes the render to continue even if code returns an error. Use this for debugging.
To set global options that apply to every chunk in your file, call knitr::opts_chunk$set
in a code chunk. Knitr will treat each option that you pass to knitr::opts_chunk$set
as a global default that can be overwritten in individual chunk headers If you are doing a report where you would never show code include knitr::opts_chunk$set(echo = FALSE)
in the setup
code block.
\```r knitr::opts_chunk$set(echo = FALSE) \```
Code output can also be seamlessly incorporated into the text, using inline code. This is code not set off on a line by itself, but beginning with `r
and ending with `
. Using inline code is how this document knows that the mtcars data set contains r nrow(mtcars)
rows, and that the mean mpg of the 6 cylinder cars is r round(mean(mtcars[mtcars$cyl == 6, "mpg"]), 1)
.
R Markdown will always
As a result, inline output is indistinguishable from the surrounding text. Inline expressions do not take knitr options.
Inserting a graph into RMarkdown is easy, the more energy-demanding aspect might be adjusting the formatting.
\```r ggplot(data = mpg, mapping = aes(x = class, y = hwy)) + geom_boxplot() + coord_flip() \```
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) + geom_boxplot() + coord_flip()
By default, RMarkdown will place graphs by maximizing their height, while keeping them within the margins of the page and maintaining aspect ratio. If you have a particularly tall figure, this can mean a really huge graph. To manually set the figure dimensions in the code chunk options use fig.height
and fig.width
.
You can also set the alignment with fig.align
and either left
, right
or center
. Also, by default the file type of the plot is determined by the output file type. You can override these defaults by using the dev
option in the code chunk options. See R Markdown: The Definitive Guide for more details.
By default, R Markdown displays data frames and matrices as they would be in the R terminal (in a monospaced font). This generally is not what you want. If you prefer that data be displayed with additional formatting there are a number of packages that can make publication ready tables.
Which package you should use depends on what your output format is and which features you want. All these packages have vignettes which show how to use the major features.
library(kableExtra) dt <- mtcars[1:5, 1:6] kable(dt, caption = "kable table", booktabs = TRUE)
kable(mtcars, longtable = TRUE, booktabs = TRUE, caption = "Kable Longtable") %>% add_header_above(c(" ","Group 1" = 5, "Group 2" = 6)) %>% kable_styling(latex_options = c("repeat_header"))
At the top of any RMarkdown script is a YAML header section enclosed by ---
. By default this includes a title, author, date and the file type you want to output to. Many other options are available for different functions and formatting, see here for .html options and here for .pdf options. Rules in the header section will alter the whole document. Have a flick through quickly to familiarize yourself with the sorts of things you can alter by adding an option to the YAML header.
--- title: "My Report" output: pdf_document: toc: true toc_depth: 2 number_sections: TRUE ---
One of the many benefits of working with R Markdown is that you can reproduce analysis at the click of a button. This makes it very easy to update any work and alter any input parameters within the report. Parameterized reports extend this one step further, and allow users to specify one or more parameters to customize the analysis. This is useful if you want to create a report template that can be reused across multiple similar scenarios.
Declaring parameters
Parameters are specified using the params field within the YAML section. We can specify one or more parameters with each item on a new line. As an example:
--- title: "My Report" output: pdf_document params: year: 2018 region: Europe data: file.csv ---
All standard R types that can be included as parameters, including character
, numeric
, integer
, and logical
types. We can also use R objects by including !r
before R expressions. For example, we could include the current date with the following R code:
--- title: "My Report" output: html_document params: date: !r Sys.Date() ---
Using parameters
You can access the parameters within the knitting environment and the R console in RStudio. The values are contained within a read-only list called params
. In the previous example, the parameters can be accessed as follows:
params$year params$region mydata <- read_csv(params$data)
Knitting with parameters
render_report = function(file, region, year) { rmarkdown::render( "MyDocument.Rmd", params = list( region = region, year = year, data = file ), output_file = paste0("Report-", region, "-", year, ".pdf") ) }
There are many types of output with more being added every day. lists just the ones
pdf_document
makes a PDF with LaTeXword_document
for Microsoft Word documents (.docx
).odt_document
for OpenDocument Text documents (.odt
).rtf_document
for Rich Text Format (.rtf
) documents.md_document
for a Markdown document. github_document
: this is a tailored version of md_document
designed for sharing on GitHub.ioslides_presentation
- HTML presentation with ioslidesslidy_presentation
- HTML presentation with W3C Slidybeamer_presentation
- PDF presentation with LaTeX Beamer.revealjs::revealjs_presentation
- HTML presentation with reveal.js.powerpoint_presentation
- Create MS PowerPoint presentations.html_notebook
is a variation on a html_document
.flexdashboard::flex_dashboard
, http://rmarkdown.rstudio.com/flexdashboard/ provides simple tools for creating sidebars, tabsets, value boxes, and gauges.Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.