
Table of contents

1. General information

2. The exam

3. A catalogue of useful {base} R commands for this course

[Back to main menu](./Title.html#2)

General information

Symbols r .emo("info")

I use emojis at the top of each slide to indicate what it refers to:


I may also use a few others within the slides.

Course philosophy r .emo("info")

By the end of the course, my goal is for you to:

How to get there?

What I will not cover in this course:

What do you need for the course? r .emo("practice")

How to access the course? r .emo("practice")

The course is an R package called LM2GLMM.

To download and install this package, visit: https://github.com/courtiol/LM2GLMM/

To load the package, do:

library(package = "LM2GLMM") ## !!! Always check the version with me because I will update it often !!!
 #############################################
 #                                           #
 #    This is the package for the course     #
 #                                           #
 #    Advanced Statistical Applications:     #
 #          from LM to GLMM using R          #
 #                                           #
 #       Version XX.XX.XX.X installed!       #
 #                                           #
 #    To access the slides, type either      #
 #   browseVignettes(package = 'LM2GLMM')    #
 #                    or                     #
 #             get_vignettes()               #
 #                                           #
 #############################################

The exam

Procedure r .emo("info")

What?

When?

(It should be sufficient, but you will be able to stay longer if you need to!)

Condition?

Grading r .emo("info")

I will grade:

I will give bonuses for the quality/elegance of:

I will help you:

A catalogue of useful {base} R commands for this course

Vectors: basics r .emo("practice")

foo <- c(1, 5, 5, 10, 23, NA, NA)
foo[2]
foo > 3
which(foo > 3) ## returns indexes
foo[which(foo > 3)] ## foo[foo > 3] would return the same values followed by the NAs
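The difference mentioned in the last comment can be checked directly. The following is a minimal sketch (the names `v`, `with_na` and `without_na` are only illustrative) comparing logical indexing with `which()` on a vector that contains missing values:

```r
v <- c(1, 5, 5, 10, 23, NA, NA)

## Logical indexing: each NA in the condition yields an NA in the result
with_na <- v[v > 3]

## which() only returns the positions where the condition is TRUE
without_na <- v[which(v > 3)]

length(with_na)     ## 6 (four values + two NAs)
length(without_na)  ## 4
```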

Vectors: basics r .emo("practice")

foo2 <- c(a = 1, b = 5, c = 10, d = 23)
foo2["b"]
foo2[c("b", "a")]
foo2[-3] ## remove one item from the vector (same result as foo2[names(foo2) != "c"])

Vectors: some useful functions r .emo("practice")

length(foo)
summary(foo)
any(is.na(foo))
table(foo, useNA = "always")

Vectors: classes r .emo("practice")

foo1 <- c(1, 5, 5, NA)
foo2 <- c(a = 1, b = 3)
foo3 <- c(TRUE, FALSE)
foo4 <- c("a", "b", "c")
foo5 <- factor(foo3)
foo6 <- c(1L, 3L, 5L)
class(foo1)
obj_names <- paste0("foo", 1:6) ## a separate name avoids masking the function names()
classes <- unlist(lapply(obj_names, function(x) class(get(x))))
knitr::kable(rbind(names = obj_names, classes))

Vectors: factors r .emo("practice")

(foo <-  factor(c("a", "b", "c"))) ## Tip: the brackets force the display
relevel(foo, ref = "c")                      ## change first level
factor(foo, levels = c("c", "b", "a"))       ## control order of levels
droplevels(foo[foo != "b"])                  ## drop empty levels
c(factor(c("a", "b")), factor(c("b", "c")))  ## R >= 4.1 combines factors with different levels into one factor
as.character(foo)                            ## you can turn factors into characters

Vectors: sequences r .emo("practice")

1:5
rep(1:5, each = 2)
rep(1:5, times = 2)
seq(from = 1, to = 5, by = 0.5) ## more arguments possible
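As a small illustration of the other arguments hinted at above, here is a sketch using `length.out` together with the related base functions `seq_along()` and `seq_len()`. The object names are only illustrative:

```r
## Ask for a number of equally spaced values instead of a step size
by_length <- seq(from = 1, to = 5, length.out = 3)
by_length  ## 1 3 5

v <- c(10, 20, 30)
seq_along(v)  ## indexes of v: 1 2 3
seq_len(0)    ## empty sequence, whereas 1:0 would give 1 0
```

`seq_along()` and `seq_len()` are often safer than `1:n` inside loops, because they return an empty sequence when there is nothing to iterate over.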

Vectors: pseudo-random numbers r .emo("practice")

rnorm(5, mean = 2, sd = 0.1)  ## Tip: check ?Distributions for other distributions
rnorm(5, mean = 2, sd = 0.1)
set.seed(14353L) ## fixing the seed makes the results reproducible
rnorm(5, mean = 2, sd = 0.1)
set.seed(14353L)
rnorm(5, mean = 2, sd = 0.1)
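To verify the effect of the seed without comparing the printed numbers by eye, one option is to store the draws and compare them with `identical()`. This is only a sketch; the names `draw1` to `draw3` are illustrative:

```r
set.seed(14353L)
draw1 <- rnorm(5, mean = 2, sd = 0.1)
draw2 <- rnorm(5, mean = 2, sd = 0.1)  ## the generator has moved on

set.seed(14353L)                       ## reset the generator
draw3 <- rnorm(5, mean = 2, sd = 0.1)

identical(draw1, draw2)  ## FALSE
identical(draw1, draw3)  ## TRUE
```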

Logical operators r .emo("practice")

Return TRUE or FALSE (or NA)

x == y ## is equal to
x != y ## is not equal to
x < y  ## less than
x <= y ## less than or equal to
x > y  ## greater than
x >= y ## greater than or equal to
## compare every item in the vector to the same value
foo  <- c(1, 2, 3, NA)
foo <= 2

## compare the two vectors element by element
foo2 <- c(0, 3, 2, NA)
foo < foo2
## combine multiple operators
foo > 0 & foo < 3

as.numeric(c(TRUE, FALSE)) ## TRUE = 1, FALSE = 0
foo3 <- c("a", "b", "a", "a")
sum(foo3 == "a") ## number of items that equal "a"
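Since these operators can also return NA, counting with `sum()` needs some care when data are missing. A minimal sketch (the vector `v` is only illustrative):

```r
v <- c(1, 2, 3, NA)

v > 2                     ## FALSE FALSE TRUE NA
sum(v > 2)                ## NA: one comparison is unknown
sum(v > 2, na.rm = TRUE)  ## 1: NAs are dropped before summing

## NA does not always propagate: the result is known when one side decides it
NA & FALSE  ## FALSE
NA | TRUE   ## TRUE
NA & TRUE   ## NA
```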

Attributes r .emo("practice")

foo <- c(a = 1, b = 5, c = 10, d = 23)
attributes(foo)
attr(foo, "names")  ## Tip: here a shortcut would be 'names(foo)', but not every attribute has such a shortcut
foo <-  factor(c("a", "b", "c"))
attributes(foo)

Dataframes: basics r .emo("practice")

foo <- data.frame(
  x = c(1, 3, 5),
  z = factor(c("a", "b", "c"))
  )
foo
dim(foo) ## Tip: try nrow() and ncol()
dimnames(foo) ## Tip: try rownames() and colnames()

Dataframes: indexing r .emo("practice")

foo[2, ]    ## get row 2
foo[, 2]    ## get column 2
foo$x       ## get column x
foo[, "x"]  ## get column x

Dataframes: combining rows r .emo("practice")

foo2 <- data.frame(
  x = c(2, 4),
  z = factor(c("b", "d"))
  )
rbind(foo, foo2)

Dataframes: combining columns r .emo("practice")

bar <- data.frame(
  x = c(1, 2, 3),
  w = c(2, 4, 3),
  s = factor(c("b", "d", "e"))
  )
cbind(foo, bar) ## Note that columns are not merged: the result contains two columns named x
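If the goal is instead to match rows on a shared column, the base function `merge()` is one option. The sketch below rebuilds the two data frames under the illustrative names `left` and `right` so that it runs on its own:

```r
left <- data.frame(x = c(1, 3, 5), z = factor(c("a", "b", "c")))
right <- data.frame(x = c(1, 2, 3), w = c(2, 4, 3), s = factor(c("b", "d", "e")))

names(cbind(left, right))  ## "x" "z" "x" "w" "s": x appears twice

## merge() matches rows on the shared column x
## (by default only rows present in both data frames are kept)
merged <- merge(left, right, by = "x")
merged
```

Here only the rows with `x` equal to 1 and 3 survive, because these are the only values present in both data frames; see the arguments `all`, `all.x` and `all.y` in `?merge` to keep unmatched rows.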

Dataframes: {dplyr} <---> {base} translation r .emo("info")

library(dplyr)
foo_tbl <- tibble(x = c(1, 3, 5),
                  z = factor(c("a", "b", "c")))
foo_tbl ## the display is different
select(foo_tbl, x)          ## same as foo_tbl[, "x", drop = FALSE]
slice(foo_tbl, 2)           ## same as foo_tbl[2, ]
filter(foo_tbl, x == 3)     ## same as foo_tbl[which(foo_tbl$x == 3), ]
foo_tbl %>% select(x)       ## same as foo_tbl[, "x", drop = FALSE]
foo_tbl %>% slice(2)        ## same as foo_tbl[2, ]
foo_tbl %>% filter(x == 3)  ## same as foo_tbl[which(foo_tbl$x == 3), ]
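The `drop = FALSE` in the translations above matters mostly for plain data frames, where selecting a single column simplifies the result to a vector by default. A minimal sketch using only {base} (the name `df` is illustrative):

```r
df <- data.frame(x = c(1, 3, 5), z = factor(c("a", "b", "c")))

class(df[, "x"])                ## "numeric": a single column is simplified to a vector
class(df[, "x", drop = FALSE])  ## "data.frame": the structure is kept
```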

Dataframes: applying functions on rows or columns r .emo("practice")

(d <- data.frame(a = c(1, 2, 3, NA), b = c(NA, 3, 2, 1)))
apply(d, 1, function(x) any(is.na(x))) ## is there any NA in each row?
apply(d, 2, function(x) any(is.na(x))) ## is there any NA in each column?
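For this particular task, vectorised {base} functions can give the same information without `apply()`. This is a sketch of one alternative, not a replacement for the general pattern shown above; the names `by_row` and `by_col` are illustrative:

```r
d <- data.frame(a = c(1, 2, 3, NA), b = c(NA, 3, 2, 1))

by_row <- rowSums(is.na(d)) > 0  ## TRUE for rows 1 and 4
by_col <- colSums(is.na(d)) > 0  ## TRUE for both columns

by_row
by_col
d[complete.cases(d), ]  ## keep only the rows without any NA
```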

Matrices: basics r .emo("practice")

foo <- matrix(data = 1:4, nrow = 2, ncol = 2) ## Tip: try with byrow = TRUE
colnames(foo) <- c("a", "b"); rownames(foo) <- c("A", "B")
foo

Indexing is similar to that of dataframes (but you cannot use $):

foo[, 2]
foo[2, ]
foo[1, 2]

Matrices: transposition, inversion r .emo("practice")

foo
t(foo)  ## transpose of foo
solve(foo)  ## inverse of foo

Matrices: multiplication r .emo("practice")

foo %*% matrix(c(-1, 1)) ## = [1 * -1 + 3 * 1][2 * -1 + 4 * 1]
foo %*% solve(foo)       ## a matrix times its inverse equals the identity matrix
foo * solve(foo)         ## NOT MATRIX MULTIPLICATION! (element by element multiplication)
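Because the inverse is computed numerically, the product may differ from the identity matrix by tiny rounding errors. A sketch of how to check the result with `all.equal()`, which compares up to a small tolerance (the matrix `m` is an illustrative copy without row or column names):

```r
m <- matrix(1:4, nrow = 2, ncol = 2)

## Matrix product with the inverse: identity matrix, up to rounding error
all.equal(m %*% solve(m), diag(2))  ## TRUE

## Element-wise product: not the identity matrix here
isTRUE(all.equal(m * solve(m), diag(2)))  ## FALSE
```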

Lists r .emo("practice")

foo <- list("foo1" = c(1:10), "foo2" = factor(c("a", "b")))
foo
foo[["foo2"]]
foo$foo2
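Single and double square brackets behave differently on lists, which is a common source of confusion. A minimal sketch (the list `l` is an illustrative copy of the one above):

```r
l <- list(foo1 = 1:10, foo2 = factor(c("a", "b")))

class(l["foo2"])    ## "list": single brackets return a sub-list
class(l[["foo2"]])  ## "factor": double brackets return the element itself
identical(l[["foo2"]], l$foo2)  ## TRUE
```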

Functions r .emo("practice")

addA_B <- function(a = 0, b = 0) { ## values given in the definition become the defaults of the arguments
  c <- a + b
  return(c)
}
addA_B(a = 5, b = 7)
addA_B(a = 5)
addA_B_bis <- function(a, b) a + b
addA_B_bis(2, 3)

Repeated operations r .emo("practice")

hello <- function(who = "alex") paste("hello", who)
replicate(10, hello())  ## returns a vector when it can
lapply(c("alex", "olivia"), function(i) hello(who = i)) ## returns a list
sapply(c("alex", "olivia"), function(i) hello(who = i)) ## returns a (named) vector when it can
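The difference between the two return types can be inspected as follows. This is only a sketch; `greet`, `people`, `as_list` and `as_vector` are illustrative names:

```r
greet <- function(who = "alex") paste("hello", who)
people <- c("alex", "olivia")

as_list <- lapply(people, greet)
as_vector <- sapply(people, greet)

is.list(as_list)   ## TRUE
names(as_vector)   ## "alex" "olivia": sapply() uses the character input as names
unname(as_vector)  ## "hello alex" "hello olivia"
```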

Plots r .emo("info")

Plots are necessary to check the data (e.g. for outliers), to check model assumptions, and to communicate model results.

Let's use the Davis dataset, which contains both quantitative (e.g. weight, height) and qualitative (e.g. sex) variables:

str(Davis)

Note: check ?Davis for details on each variable.

Plots: the scatterplot r .emo("practice")

For plotting two quantitative variables (works best if they are continuous, messy if they are discrete):

plot(repht ~ height, data = Davis)

Note: if one variable is expected to affect the other, it should be on the x-axis.

Plots: the scatterplot r .emo("practice")

For more advanced scatterplots, you may use the function scatterplot() from the package {car}:

library(car)
scatterplot(height ~ weight + sex, data = Davis[-12, ]) ## row 12 (an outlier) is excluded

Plots: pairs() r .emo("practice")

Plot all variables against each other using scatterplots:

pairs(Davis)

Note: qualitative variables (factors) are converted into numbers so that they can be plotted.
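A small sketch of what such a conversion looks like, assuming it works like `as.numeric()` applied to a factor (the vector `sex` below is made up for the illustration and is not the Davis data):

```r
sex <- factor(c("F", "M", "M", "F"))

levels(sex)      ## "F" "M"
as.numeric(sex)  ## 1 2 2 1: each level is replaced by its position in levels()
```

The resulting numbers therefore only encode the order of the levels; distances between them have no meaning.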

Plots: the boxplot r .emo("practice")

For plotting the distribution of a quantitative variable (best if continuous) against a qualitative variable:

## Remove outlier identified in scatterplot to see boxplots better (for demonstration only)
plot_data <- Davis[which(Davis$height > 100), ]
boxplot(height ~ sex, data = plot_data)

Note: boxplots show the median, the interquartile range, and potential outliers.
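The numbers behind a boxplot can be obtained without drawing it, using `boxplot.stats()` from {grDevices}. The sketch below uses a made-up vector `x` with one extreme value rather than the Davis data:

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

stats <- boxplot.stats(x)
stats$stats  ## lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out    ## 100: the point drawn individually as a potential outlier
median(x)    ## 5.5
```

The hinges are close to the first and third quartiles, so the box spans roughly the interquartile range.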

Plots: the conditioning plot r .emo("practice")

Generate multiple scatterplots conditional on one other variable:

## Relationship between height and reported height for each sex
coplot(repht ~ height | sex, data = plot_data, panel = panel.smooth)

Plots: the conditioning plot r .emo("practice")

Generate multiple scatterplots conditional on two other variables:

## Relationship between height and reported height for each combination of weight group and sex
coplot(repht ~ height | weight * sex, data = plot_data, panel = panel.smooth)

Note: the panels in the margins indicate which weight range and which sex each scatterplot corresponds to.

[Exercises](./Introduction_exercises.html)


[Back to main menu](./Title.html#2)


courtiol/LM2GLMM documentation built on July 3, 2022, 7:42 a.m.