## load additional packages in this chunk
library(reportMD)
library(knitr)
library(pander)
library(ggplot2)
library(plotly)
## This chunk should contain global configuration commands.
## Use this to set knitr options and related things. Everything
## in this chunk will be included in an appendix to document the
## configuration used.

## Pander options
panderOptions("digits", 3)
panderOptions("table.split.table", 160)
## Custom functions used in the analysis should go into this chunk.
## They will be listed in their own section of the appendix.

Analysis

This is the second part of the multi-part data analysis document. Here we'll use the pre-processed dataset produced in [Part 1][part1] for a basic analysis.

box_fig <- ggplot(ext_cars, aes(x=origin, y=mpg)) + geom_boxplot() + theme_bw()

An initial look at the data suggests that there is a pronounced difference in the miles per gallon achieved by US and international cars (r figRef("boxplot")).

plotMD(box_fig)

A reasonable initial hypothesis for why this might be the case is that US built cars may tend to be larger (and therefore heavier) and have larger engines. As it turns out car weight and engine displacement are highly correlated ($\rho$ = r cor(mtcars$disp, mtcars$wt)), so we'll investigate this by fitting a linear regression model with weight as explanatory variable.

fit <- lm(mpg ~ wt, data=ext_cars)
ext_cars$fitted <- fit$fitted.values

This appears to explain the difference in MPG reasonably well, although it is clearly not the only relevant factor:

pander::set.caption("")
printMD(fit)

We'll also prepare a plot visualising the result for inclusion in the main document.

reg_fig <- ggplot(ext_cars, aes(x=wt, y=mpg, colour=origin, text=rownames(ext_cars))) + geom_point(size=2) +
  geom_abline(slope=fit$coefficients['wt'], intercept=fit$coefficients[1], linetype='dashed', colour='darkgrey') +
  geom_segment(aes(yend=fitted, xend=(fitted-fit$coefficients[1])/fit$coefficients['wt'])) +
  theme_bw()



humburg/reportmd documentation built on May 17, 2019, 9:13 p.m.