Course R package

Installing the course R package is straightforward. First install drat, a package that makes it easy to host and distribute packages.

install.packages("drat")

\noindent Then

drat::addRepo("jr-packages")
install.packages("jrPred")

\noindent This R package contains copies of the practicals, solutions and data sets that we require. It will also automatically install any packages that we use during the course. For example, we will need the caret, mlbench, pROC and splines packages, to name a few. To load the course package, use

library("jrPred")

\noindent During this practical we will mainly use the caret package, so we should load that package as well

library("caret")

The cars2010 data set

The cars2010 data set contains information about car models in $2010$. The aim is to model the FE variable, a fuel economy measure, using $13$ predictors. Further information can be found in the help page, help("cars2010", package = "AppliedPredictiveModeling").

The data is part of the AppliedPredictiveModeling package and can be loaded by

data(FuelEconomy, package = "AppliedPredictiveModeling")

\noindent There are a lot of questions below, marked out by bullet points. Don't worry if you can't finish them all; the intention is that there is material for different backgrounds and levels.

An Initial Model

m1 = train(FE ~ EngDispl, method = "lm", data = cars2010)
predict(m1, newdata = data.frame(EngDispl = 7))
sqrt(mean(resid(m1)^2))
# or 
RMSE(fitted.values(m1), cars2010$FE)
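
The two RMSE calculations above should give the same number. The following base-R sketch illustrates why, using a plain lm fit on the built-in mtcars data. The data set and the names fit, rmse_resid and rmse_fitted are assumptions chosen for illustration; they are not part of the practical.

```r
# Illustrative sketch only: mtcars stands in for cars2010 so that the
# example runs without the course packages.
fit = lm(mpg ~ disp, data = mtcars)

# RMSE computed directly from the residuals
rmse_resid = sqrt(mean(resid(fit)^2))

# RMSE computed from observed minus fitted values
rmse_fitted = sqrt(mean((mtcars$mpg - fitted(fit))^2))

# The residuals are observed minus fitted, so the two values agree
all.equal(rmse_resid, rmse_fitted)
```

Note that this is an in-sample RMSE: it measures how well the model fits the data it was trained on, not how well it predicts new data.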

Extending the model

m2 = train(FE ~ poly(EngDispl, 2, raw = TRUE), data = cars2010,
    method = "lm")
sqrt(mean(resid(m2)^2)) - sqrt(mean(resid(m1)^2))
# Yes
m3 = train(FE ~ EngDispl + NumCyl, data = cars2010, method = "lm")
sqrt(mean(resid(m3)^2))
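
As a rough illustration of what the quadratic term does, the sketch below repeats the m1 and m2 comparison with plain lm on the built-in mtcars data. The data set, the helper rmse and the model names fit1, fit2 and fit2b are assumptions made for this example. With raw = TRUE, poly(x, 2) is simply the columns x and x^2, so the fit should match one written with I(x^2).

```r
# Illustrative sketch only: mtcars stands in for cars2010.
rmse = function(model) sqrt(mean(resid(model)^2))

fit1 = lm(mpg ~ disp, data = mtcars)
fit2 = lm(mpg ~ poly(disp, 2, raw = TRUE), data = mtcars)

# The same quadratic model, written out explicitly
fit2b = lm(mpg ~ disp + I(disp^2), data = mtcars)

# The linear model is nested in the quadratic one, so the in-sample
# RMSE cannot increase when the squared term is added
rmse(fit2) - rmse(fit1)

# Both ways of writing the quadratic model give the same fitted values
all.equal(fitted(fit2), fitted(fit2b))
```

Because adding a term can never make the in-sample RMSE worse, a drop in this value on its own is weak evidence that the larger model will predict better on new data.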

Visualising the models

plot(cars2010$EngDispl, cars2010$FE)
abline(m1$finalModel, col = 2)
x_values = seq(1,8.4,0.1)
new_pred_values = predict(m2, newdata = data.frame(EngDispl = x_values))
lines(x = x_values, y = new_pred_values, col = 3)
# Yes, the line now curves with the data since we have added a quadratic term
## points = TRUE to also show the points
plot3d(m3, cars2010$EngDispl, cars2010$NumCyl, cars2010$FE,
    points = FALSE)

\noindent We can also examine just the data interactively, via

threejs::scatterplot3js(cars2010$EngDispl, cars2010$NumCyl,
    cars2010$FE, size = 0.5)


jr-packages/jrPred documentation built on May 6, 2019, 7:17 a.m.