library(LM2GLMM)
set.seed(1L)
knitr::opts_chunk$set(fig.align = "center", fig.width = 4, fig.height = 4, dev.args = list(pointsize = 8))
r .emo("info")
I use emojis at the top of each slide to indicate what it refers to:
r .emo("info") for things that are good to know to improve your understanding
r .emo("practice") for practical advice (i.e. things you need to learn by doing)
r .emo("nerd") for nerdy details (i.e. things you don't need to learn, but will deepen your understanding)
r .emo("proof") for demonstrations (i.e. to show you I am not always making things up and teach you how to test your own ideas)
r .emo("alien") for creation of data (i.e. the R objects we will use in other slides)
r .emo("goal") for goals (i.e. highlight the key things you need to remember)
I may also use a few others within the slides:
r .emo("party") for solutions or nice results
r .emo("broken") for things that are broken
r .emo("slow") for heavy computations
r .emo("warn") for warnings
r .emo("recap") recap something we covered previously
r .emo("info")
r .emo("practice")
r .emo("practice")
The course is an R package called LM2GLMM.
To download and install this package, visit: https://github.com/courtiol/LM2GLMM/
To load the package, do:
library(package = "LM2GLMM") ## !!! Always check the version with me because I will update it often !!!
#############################################
#                                           #
#    This is the package for the course     #
#                                           #
#    Advanced Statistical Applications:     #
#          from LM to GLMM using R          #
#                                           #
#       Version XX.XX.XX.X installed!       #
#                                           #
#     To access the slides, type either     #
#   browseVignettes(package = 'LM2GLMM')    #
#                    or                     #
#              get_vignettes()              #
#                                           #
#############################################
r .emo("info")
(It should be sufficient, but you will be able to stay longer if you need to!)
r .emo("info")
r .emo("practice")
foo <- c(1, 5, 5, 10, 23, NA, NA)
foo[2]
foo > 3
which(foo > 3) ## returns indexes
foo[which(foo > 3)] ## foo[foo > 3] would return the same + NAs
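To make the last comment concrete, here is a minimal sketch comparing the two ways of subsetting. The object names `with_which` and `without_which` are only used for this illustration.

```r
foo <- c(1, 5, 5, 10, 23, NA, NA)

with_which <- foo[which(foo > 3)]  ## which() drops the NAs before indexing
without_which <- foo[foo > 3]      ## a logical index containing NA yields NA

with_which
without_which
length(with_which)
length(without_which)
```

If you want to keep logical indexing but not the NAs, you need to handle them explicitly (e.g. with `!is.na(foo) & foo > 3`).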
r .emo("practice")
foo2 <- c(a = 1, b = 5, c = 10, d = 23)
foo2["b"]
foo2[c("b", "a")]
foo2[-3] ## remove one item from the vector (same result as foo2[names(foo2) != "c"])
r .emo("practice")
length(foo)
summary(foo)
any(is.na(foo))
table(foo, useNA = "always")
r .emo("practice")
foo1 <- c(1, 5, 5, NA)
foo2 <- c(a = 1, b = 3)
foo3 <- c(TRUE, FALSE)
foo4 <- c("a", "b", "c")
foo5 <- factor(foo3)
foo6 <- c(1L, 3L, 5L)
class(foo1)
names <- paste0("foo", 1:6)
classes <- unlist(lapply(names, function(x) class(get(x)))) ## class of each object, looked up by name
knitr::kable(rbind(names, classes))
r .emo("practice")
(foo <- factor(c("a", "b", "c"))) ## Tip: the brackets force the display
relevel(foo, ref = "c") ## change first level
factor(foo, levels = c("c", "b", "a")) ## control order of levels
droplevels(foo[foo != "b"]) ## drop empty levels
c(factor(c("a", "b")), factor(c("b", "c"))) ## R >= 4.1 combines factors with different levels into one factor
as.character(foo) ## you can turn factors into characters
r .emo("practice")
1:5
rep(1:5, each = 2)
rep(1:5, times = 2)
seq(from = 1, to = 5, by = 0.5) ## more arguments possible
r .emo("practice")
rnorm(5, mean = 2, sd = 0.1) ## Tip: check ?Distributions for other distributions
rnorm(5, mean = 2, sd = 0.1)
set.seed(14353L) ## fixing the seed makes the results reproducible
rnorm(5, mean = 2, sd = 0.1)
set.seed(14353L)
rnorm(5, mean = 2, sd = 0.1)
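If you want to check the effect of the seed rather than compare numbers by eye, a small sketch like the following can help. The seed value 123L and the object names are arbitrary choices made for this example.

```r
set.seed(123L)
first_draw <- rnorm(5, mean = 2, sd = 0.1)

set.seed(123L) ## same seed again
second_draw <- rnorm(5, mean = 2, sd = 0.1)

third_draw <- rnorm(5, mean = 2, sd = 0.1) ## seed not reset before this draw

identical(first_draw, second_draw) ## same seed, same numbers
identical(first_draw, third_draw)  ## the generator has moved on
```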
r .emo("practice")
Return TRUE or FALSE (or NA)
x == y ## is equal to
x != y ## is not equal to
x < y  ## less than
x <= y ## less than or equal to
x > y  ## greater than
x >= y ## greater than or equal to
## compare every item in vector to the same object
foo <- c(1, 2, 3, NA)
foo <= 2

## compare the two vectors element by element
foo2 <- c(0, 3, 2, NA)
foo < foo2

## combine multiple operators
foo > 0 & foo < 3

as.numeric(c(TRUE, FALSE)) ## TRUE = 1, FALSE = 0

foo3 <- c("a", "b", "a", "a")
sum(foo3 == "a") ## number of items that equal "a"
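Because comparisons can return NA, counting with `sum()` needs a little care when the data contain missing values. The following is a minimal sketch. The object names `n_naive` and `n_clean` are just for illustration.

```r
foo <- c(1, 2, 3, NA)

small <- foo <= 2 ## the comparison with NA gives NA
small

n_naive <- sum(small)               ## NA, because one element is NA
n_clean <- sum(small, na.rm = TRUE) ## counts only the TRUEs among non-missing values

n_naive
n_clean
```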
r .emo("practice")
foo <- c(a = 1, b = 5, c = 10, d = 23)
attributes(foo)
attr(foo, "names") ## Tip: here a shortcut would be 'names(foo)', but this is not general
foo <- factor(c("a", "b", "c"))
attributes(foo)
r .emo("practice")
foo <- data.frame(
  x = c(1, 3, 5),
  z = factor(c("a", "b", "c"))
)
foo
dim(foo) ## Tip: try nrow() and ncol()
dimnames(foo) ## Tip: try rownames() and colnames()
r .emo("practice")
foo[2, ]   ## get row 2
foo[, 2]   ## get column 2
foo$x      ## get column x
foo[, "x"] ## get column x
r .emo("practice")
foo2 <- data.frame(
  x = c(2, 4),
  z = factor(c("b", "d"))
)
rbind(foo, foo2)
r .emo("practice")
bar <- data.frame(
  x = c(1, 2, 3),
  w = c(2, 4, 3),
  s = factor(c("b", "d", "e"))
)
cbind(foo, bar) ## Note that columns are not merged
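If you do want rows to be matched on a shared column instead of simply placed side by side, base R offers `merge()`. The sketch below is only an illustration of that alternative (it is not part of the original slides), and it assumes you want to keep the rows whose `x` appears in both data frames.

```r
foo <- data.frame(x = c(1, 3, 5), z = factor(c("a", "b", "c")))
bar <- data.frame(x = c(1, 2, 3), w = c(2, 4, 3), s = factor(c("b", "d", "e")))

side_by_side <- cbind(foo, bar)     ## two columns named x, rows paired by position
merged <- merge(foo, bar, by = "x") ## one column x, rows paired by value of x

names(side_by_side)
merged
```

Check `?merge` (arguments `all`, `all.x`, `all.y`) if you also want to keep the rows without a match.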
r .emo("info")
library(dplyr)
foo_tbl <- tibble(x = c(1, 3, 5), z = factor(c("a", "b", "c")))
foo_tbl ## the display is different
select(foo_tbl, x)      ## same as foo_tbl[, "x", drop = FALSE]
slice(foo_tbl, 2)       ## same as foo_tbl[2, ]
filter(foo_tbl, x == 3) ## same as foo_tbl[which(foo_tbl$x == 3), ]
foo_tbl %>% select(x)      ## same as foo_tbl[, "x", drop = FALSE]
foo_tbl %>% slice(2)       ## same as foo_tbl[2, ]
foo_tbl %>% filter(x == 3) ## same as foo_tbl[which(foo_tbl$x == 3), ]
r .emo("practice")
(d <- data.frame(a = c(1, 2, 3, NA), b = c(NA, 3, 2, 1)))
apply(d, 1, function(x) any(is.na(x))) ## NA in each row
apply(d, 2, function(x) any(is.na(x))) ## NA in each column
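The same checks can be written without `apply()`. The sketch below is one possible alternative, using `is.na()` on the whole data frame together with `rowSums()` and `colSums()`. The object names are made up for the example.

```r
d <- data.frame(a = c(1, 2, 3, NA), b = c(NA, 3, 2, 1))

na_by_row_apply <- apply(d, 1, function(x) any(is.na(x)))
na_by_row_sums <- rowSums(is.na(d)) > 0 ## at least one NA in the row
na_by_col_sums <- colSums(is.na(d)) > 0 ## at least one NA in the column

na_by_row_sums
na_by_col_sums
all(unname(na_by_row_apply) == unname(na_by_row_sums))
```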
r .emo("practice")
foo <- matrix(data = 1:4, nrow = 2, ncol = 2) ## Tip: try with byrow = TRUE
colnames(foo) <- c("a", "b")
rownames(foo) <- c("A", "B")
foo
Indexing is similar to data frames (but you cannot use $):
foo[, 2]
foo[2, ]
foo[1, 2]
r .emo("practice")
foo
t(foo)     ## transpose of foo
solve(foo) ## inverse of foo
r .emo("practice")
foo %*% matrix(c(-1, 1)) ## = [1 * -1 + 3 * 1][2 * -1 + 4 * 1]
foo %*% solve(foo)       ## a matrix times its inverse equals the identity matrix
foo * solve(foo)         ## NOT MATRIX MULTIPLICATION! (element by element multiplication)
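Since the inverse is computed numerically, the product may differ from the identity matrix by tiny rounding errors. A minimal way to check the result is to compare with `diag(2)` using `all.equal()`, which tolerates such errors. The object names below are only for this illustration.

```r
foo <- matrix(data = 1:4, nrow = 2, ncol = 2)

matrix_product <- foo %*% solve(foo)     ## true matrix multiplication
elementwise_product <- foo * solve(foo)  ## element by element

isTRUE(all.equal(matrix_product, diag(2)))      ## identity, up to rounding
isTRUE(all.equal(elementwise_product, diag(2))) ## not the identity
```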
r .emo("practice")
foo <- list("foo1" = c(1:10), "foo2" = factor(c("a", "b")))
foo
foo[["foo2"]]
foo$foo2
r .emo("practice")
addA_B <- function(a = 0, b = 0) { ## values given to arguments here become their defaults
  total <- a + b ## avoid naming an object 'c', which is also a base R function
  return(total)
}
addA_B(a = 5, b = 7)
addA_B(a = 5)
addA_B_bis <- function(a, b) a + b addA_B_bis(2, 3)
r .emo("practice")
hello <- function(who = "alex") paste("hello", who)
replicate(10, hello()) ## returns a vector when it can
lapply(c("alex", "olivia"), function(i) hello(who = i)) ## returns a list
sapply(c("alex", "olivia"), function(i) hello(who = i)) ## returns a (named) vector when it can
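Because `sapply()` only simplifies "when it can", the type of its output depends on the input. If you prefer to state the expected output yourself, `vapply()` is a stricter relative. The sketch below is an addition to the slides and only shows the idea.

```r
hello <- function(who = "alex") paste("hello", who)

## FUN.VALUE declares that each call must return exactly one character string
greetings <- vapply(c("alex", "olivia"), hello, FUN.VALUE = character(1))
greetings

as_list <- lapply(c("alex", "olivia"), hello) ## always a list
is.list(as_list)
is.character(greetings)
```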
r .emo("info")
Plots are necessary to check the data (e.g. for outliers), to check model assumptions, and to communicate model results.
Let's use the Davis dataset, which contains both quantitative (e.g. weight, height) and qualitative (e.g. sex) variables:
str(Davis)
Note: check ?Davis for details on each variable.
r .emo("practice")
For plotting two quantitative variables (works best if continuous, messy if discrete):
plot(repht ~ height, data = Davis)
Note: if one variable is expected to affect the other it should be on the x-axis.
r .emo("practice")
For more advanced scatterplots, you may use the function scatterplot() from the package {car}:
library(car)
scatterplot(height ~ weight + sex, data = Davis[-12, ])
pairs()
r .emo("practice")
Plot all variables against each other using scatterplots:
pairs(Davis)
Note: qualitative variables are turned into quantitative ones.
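To see what "turned into quantitative ones" means in practice, here is a small sketch on a made-up factor (the values are hypothetical, not taken from Davis). Each level is replaced by its integer code, following the order of the levels.

```r
## Toy factor (hypothetical values)
sex <- factor(c("F", "M", "M", "F"))

levels(sex)                ## the order of the levels defines the codes
sex_codes <- as.numeric(sex)
sex_codes                  ## first level becomes 1, second level becomes 2
```

The numbers on the axes of such panels are therefore just codes, not measurements.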
r .emo("practice")
For plotting the distribution of a quantitative variable (best if continuous) against a qualitative variable:
## Remove outlier identified in scatterplot to see boxplots better (for demonstration only)
plot_data <- Davis[which(Davis$height > 100), ]
boxplot(height ~ sex, data = plot_data)
Note: boxplots show median, inter-quartile range, and potential outliers.
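If you are curious about the numbers behind a boxplot, `boxplot.stats()` returns them. The sketch below uses a small made-up vector (not the Davis data) and assumes the default settings, in which points further than 1.5 times the box length from the box are flagged.

```r
## Toy data (hypothetical values)
toy_values <- c(1, 2, 3, 4, 5, 100)

toy_stats <- boxplot.stats(toy_values)
toy_stats$stats ## lower whisker, lower hinge, median, upper hinge, upper whisker
toy_stats$out   ## values drawn as individual points (potential outliers)
```

Note that the hinges are close to, but not always identical to, the quartiles returned by `quantile()`.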
r .emo("practice")
Generate multiple scatterplots conditional on one other variable:
## Relationship between height and reported height for each sex
coplot(repht ~ height | sex, data = plot_data, panel = panel.smooth)
r .emo("practice")
Generate multiple scatterplots conditional on two other variables:
## Relationship between height and reported height for different weight groups and each sex
coplot(repht ~ height | weight * sex, data = plot_data, panel = panel.smooth)
Note: trellis plots in margins indicate how to match each scatterplot to the weight and the sex.