library("dplyr")
library("afex")
library("papaja")

knitr::opts_chunk$set(echo = TRUE)

load("../../tests/testthat/data/mixed_data.rdata")

What is papaja?

Reproducible data analysis is an easy-to-implement and important part of the effort to improve reproducibility in science. For R users, R Markdown has been suggested as one possible framework for reproducible analyses. papaja is an R package in the making that includes an R Markdown template, which can be used with (or without) RStudio to produce documents that conform to the American Psychological Association (APA) manuscript guidelines (6th Edition). The package uses the \LaTeX document class apa6 and a .docx reference file, so you can create PDF documents, or Word documents if you have to. Moreover, papaja supplies R functions that facilitate reporting the results of your analyses in accordance with APA guidelines.

Markdown is a simple formatting syntax that can be used to author HTML, PDF, and MS Word documents (among others). In the following, I assume you know how to use R Markdown to conduct and comment on your analyses. If this is not the case, I recommend you familiarize yourself with R Markdown first. I use RStudio to create my documents, but the general process works with any text editor.

How to use papaja

Once you have installed papaja and all other required software, you can select the APA template when creating a new R Markdown file through the RStudio menus (see Figure\ \@ref(fig:menu)). When you click RStudio's Knit button (see Figure\ \@ref(fig:knit)), papaja, bookdown, rmarkdown, and knitr work together to create an APA-conforming manuscript that includes both your text and the output of any embedded R code chunks.

(ref:menu-caption) papaja's APA6 template is available through the RStudio menus.

knitr::include_graphics("../images/template_selection.png", dpi = 108)

(ref:knit-caption) The Knit button in RStudio.

knitr::include_graphics("../images/knitting.png", dpi = 108)

Printing R output

Any output from R is included as you usually would using R Markdown. By default, the R code will not be displayed in the final documents. If you wish to show off your code, you need to set echo = TRUE in the chunk options. For example, to include summary statistics of your data you could use the following code:

summary(mixed_data[, -1])

But, surely, this is not what you want your submission to look like.

Print tables

For prettier tables, I suggest you try apa_table(), which builds on knitr's kable(), and apa_num(), which can be used to properly round and report numbers.

(ref:descriptives-caption) Descriptive statistics of correct recall by dosage.

(ref:descriptives-note) This table was created with apa_table().

descriptives <- mixed_data %>%
  group_by(Dosage) %>%
  summarize(
    Mean = mean(Recall)
    , Median = median(Recall)
    , SD = sd(Recall)
    , Min = min(Recall)
    , Max = max(Recall)
  )
descriptives[, -1] <- apa_num(descriptives[, -1])

apa_table(
  descriptives
  , caption = "(ref:descriptives-caption)"
  , note = "(ref:descriptives-note)"
)

Of course, popular packages like xtable[^xtable] or tables can also be used to create tables when knitting PDF documents. These packages, however, cannot be used when you want to create Microsoft Word documents because they rely on \LaTeX for typesetting. apa_table() creates tables that conform to APA guidelines and are correctly rendered in both PDF and Word documents. But don't get too excited; table formatting is somewhat limited for Word documents due to missing functionality in pandoc (e.g., it is not possible to have cells or headers span multiple columns).

[^xtable]: When you use xtable(), table captions are set to the left page margin.

As required by the APA guidelines, tables are deferred to the final pages of the manuscript when creating a PDF. Again, this is not the case in Word documents due to limited pandoc functionality. To place tables and figures in your text instead, set the figsintext parameter in the YAML header to yes or true, as I have done in this document.
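As a rough sketch, the relevant part of the YAML header might look like the fragment below. The title is a placeholder, the figsintext key is taken from the paragraph above, and all other fields of the template are omitted; treat it as an illustration and not as a complete header.

```yaml
---
title: "My manuscript"        # placeholder title
figsintext: yes               # place tables and figures in the text
output: papaja::apa6_pdf
---
```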

The bottom line is that Word documents will be less polished than PDF documents. The resulting documents should still suffice to enable collaboration with Wordy colleagues and to prepare a journal submission with limited manual labor.

Embed plots

As usual in R Markdown, you can embed R-generated plots into your document, see Figure\ \@ref(fig:beeplot).

(ref:beeplot-caption) Bee plot of the example data set. Small points represent individual observations, large points represent means, and error bars represent 95% confidence intervals.

apa_beeplot(
  mixed_data
  , id = "Subject"
  , dv = "Recall"
  , factors = c("Task", "Valence", "Dosage")
  , dispersion = conf_int
  , ylim = c(0, 30)
  , las = 1
  , args_points = list(cex = 1.5)
  , args_arrows = list(length = 0.025)
)

Again, as required by the APA guidelines, figures are deferred to the final pages of the document unless you set figsintext to yes.

Referencing figures and tables

papaja builds on the bookdown package, which provides limited cross-referencing capabilities within documents. By default, you can insert figure and table numbers into the text using \@ref(fig:chunk-name) for figures or \@ref(tab:chunk-name) for tables. Note that for this syntax to work, chunk names cannot include _. If you need to embed an external image that is not generated by R, use the knitr::include_graphics() function. See the great book on bookdown for details. Cross-referencing is currently not available for equations in bookdown. However, as anywhere in R Markdown documents, you can use \LaTeX commands if the functionality is not provided by rmarkdown/bookdown and you don't need to create Word documents.
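To illustrate the referencing syntax, here is a minimal sketch of a named chunk and a reference to it. The chunk name recall-plot, the caption text, and the plot itself are made up for this example; only the (ref:...) text reference and the \@ref(fig:...) syntax follow the conventions used elsewhere in this document.

````markdown
(ref:recall-plot-caption) A placeholder caption for the example figure.

```{r recall-plot, fig.cap = "(ref:recall-plot-caption)"}
# Hypothetical plot; any R-generated figure works here
plot(cars)
```

As Figure\ \@ref(fig:recall-plot) shows, ...
````

The chunk name uses a hyphen instead of an underscore, in line with the restriction on chunk names mentioned above.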

Report statistical analyses

apa_print() will help you report the results of your statistical analyses. The function will format the contents of R objects and produce readily reportable text.

recall_anova <- afex::aov_car(
  Recall ~ (Task * Valence * Dosage) + Error(Subject/(Task * Valence)) + Dosage
  , data = mixed_data
  , type = 3
)
recall_anova_results <- apa_print(recall_anova, es = "pes")
recall_anova_results_p <- apa_print(recall_anova, es = "pes", in_paren = TRUE)

Now, you can report the results of your analyses like so:

Item valence (`r recall_anova_results_p$full$Valence`) and the task
affected recall performance, `r recall_anova_results$full$Task`; the dosage,
however, had no effect on recall, `r recall_anova_results$full$Dosage`.
There was no significant interaction.

What's even more fun, you can easily create a complete ANOVA table by passing recall_anova_results$table to apa_table(), see Table\ \@ref(tab:anova).

(ref:anova-caption) ANOVA table for the analysis of the example data set.

(ref:anova-note) This is a table created using apa_print() and apa_table().

apa_table(
  recall_anova_results$table
  , align = c("l", "r", "c", "r", "r", "r")
  , caption = "(ref:anova-caption)"
  , note = "(ref:anova-note)"
)

Citations

No manuscript is complete without citations. In order for citations to work, you need to supply a .bib file to the bibliography parameter in the YAML front matter. Once this is done, `[e.g., @james_1890; @bem_2011]` produces a regular citation within parentheses [e.g., @bem_2011; @james_1890]. To cite a source in text, simply omit the brackets; for example, write `@james_1890` to cite @james_1890. For other options, see the overview of the R Markdown citation syntax.

The citation style is automatically set to APA style. If you need to use a different citation style, you can set it in the YAML front matter by providing the csl parameter. See the R Markdown documentation and Citation Style Language for further details.
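For orientation, the two parameters mentioned above might appear in the YAML front matter as sketched below. The file name r-references.bib matches the one used later in this document; the .csl file name is purely hypothetical and only needed if you want a style other than APA.

```yaml
bibliography: ["r-references.bib"]
csl: "my-journal-style.csl"   # hypothetical file name; omit to keep APA style
```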

If you use RStudio, I have created an easy-to-use add-in that facilitates inserting citations into a document. The relevant references will, of course, be added to the document's reference section automatically. Moreover, the add-in can directly access your Zotero database.

I think it is important to credit the software we use. A lot of R packages are developed by academics free of charge. As citations are the currency of science, it's easy to compensate volunteers for their work by citing the R packages we use. I suspect that, among other things, this is rarely done because it is tedious work. That's why papaja makes citing R and its packages easy:

r_refs(file = "r-references.bib")
my_citation <- cite_r(file = "r-references.bib")

r_refs() creates a BibTeX file containing citations for R and all currently loaded packages. cite_r() takes these citations and turns them into readily reportable text. my_citation now contains the following text that you can use in your document: `r my_citation`

Math

If you need to report formulas, you can use the flexible \LaTeX syntax (it will work in Word documents, too). Inline math must be enclosed in $ or \( and \), and the result will look like this: $d' = z(H) - z(FA)$. For larger formulas, displayed equations are more appropriate; they are enclosed in $$ or \[ and \],

$$ d' = \frac{\mu_{old} - \mu_{new}}{\sqrt{0.5(\sigma^2_{old} + \sigma^2_{new})}}. $$

Document options

This text is set as a manuscript. If you want a thesis-like document, you can change the class in the YAML front matter from man to doc. You can also preview a polished journal typesetting by changing the class to jou. Refer to the apa6 document class documentation for further class options, such as paper size or draft watermarks.

When creating PDF documents, line numbering can be activated by setting the linenumbers argument in the YAML front matter to yes. Moreover, you can create lists of figure or table captions at the end of the document by setting figurelist or tablelist to yes, respectively. These options have no effect on Word documents.
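Put together, the options discussed in this section could be set roughly as follows. The key names are taken from the wording of the text above (class, linenumbers, figurelist, tablelist) and should be read as assumptions; the exact names may differ in the template generated by your installed papaja version, so check that template first.

```yaml
class: doc          # man (manuscript), doc (thesis-like), or jou (journal preview)
linenumbers: yes    # PDF only
figurelist: yes     # list of figure captions at the end, PDF only
tablelist: yes      # list of table captions at the end, PDF only
```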

Last words

That's all I have; enjoy writing your manuscript. If you have any trouble or ideas for improvements, open an issue on GitHub or open a pull request. If you want to contribute and need inspiration, take a look at the open issues. Other than that, there are many output objects from analysis methods that we would like apa_print() to support. New S3/S4 methods for this function are always appreciated (e.g., for factanal, fa, lavaan, lmer, or glmer).

References

\setlength{\parindent}{-0.5in} \setlength{\leftskip}{0.5in}



crsh/papaja documentation built on Feb. 21, 2024, 6:11 p.m.