In jack-w-hill/jwh: J Hill's Personal Package

library(learnr)
library(gradethis)
library(knitr)
library(tidyverse)
knitr::opts_chunk$set(echo = FALSE)

best_tutor <- "jack"

# EVENT RECORDER TO RMD CONSOLE
# new_recorder <- function(tutorial_id, tutorial_version, user_id, event, data) {
#     cat(tutorial_id, " (", tutorial_version, "): ", user_id, ", ", event, ", ", data$label, ", ", data$answers, ", ", data$correct, "\n", sep = "")
# }
# 
# options(tutorial.event_recorder = new_recorder)

Why R?

Whenever you learn something new, you suck at it. Thankfully, that's only temporary.
You've got to commit to getting through the suckiness.

Prof. Hadley Wickham
Chief Scientist (RStudio)

R is a powerful language commonly used in scientific computing.

UQ's School of Biological Sciences uses it in third-level courses and it's a popular choice amongst researchers worldwide.

R provides a start-to-finish solution: you can import and 'tidy' your data, manipulate it to add and modify variables, and then analyse and plot.

Analyses in R are also reproducible, because every step is recorded as code that anyone can re-run. Journals are increasingly requiring reproducible data manipulation and analysis code alongside manuscripts.

R is the way of the future.

R also has a vibrant and supportive UQ and international community.

include_graphics("https://upload.wikimedia.org/wikipedia/commons/thumb/1/1b/R_logo.svg/724px-R_logo.svg.png")

Let's get into it!

Questions/concerns/feedback/musings to Jack Hill with the subject line:
BIOL2010 R - your_name

Back to basics

Now you're speaking my language

1+1 = 2

In R, we write code and run that code. R figures out what the code means and outputs a result.

I've given you an R "console" below. "Console" is just a fancy term for a place you can write and run code.

Let's try a simple example.

Calculate 1 plus 1 in the console below.

It's as simple as you think!

Don't forget to hit "Run Code" once you've written your code.

# I could calculate 2+2 with:

2+2

# Edit that code to add 1 and 1.

Assignment (not that kind)

When you did 1+1 in the first exercise, R showed you the output - 2.

That's okay when the output is just a number. But things are going to get more complex, so let's learn how to store the output and come back to it later.

To do this, we use <- to assign an output.

Run the code below to see what I mean.

best_tutor <- "jack"

You didn't get any output! What happened?

R has quietly stored "jack" into the best_tutor variable.

When you used <-, you told R to store the right-hand side ("jack") into the left-hand side (best_tutor).

Let's see if it worked.

Type best_tutor into the console below and run it.

Well, what do you know - it's me!

The code is correct and your opinion is correct.
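Stored values become really useful when you reuse them in later calculations. Here is a minimal sketch of that idea. The variable names (flowers_per_species, n_species, total_flowers) are made up for illustration and are not part of the exercises.

```r
# Store two numbers under hypothetical names
flowers_per_species <- 50
n_species <- 3

# Reuse the stored values on the right-hand side of a new assignment
total_flowers <- flowers_per_species * n_species

# Typing the name on its own shows what is stored
total_flowers
```

Nothing is printed until the last line, because the first three lines only store values.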

A combined challenge

You now know how to do basic maths in R. And you know how to assign output to variables with <-.

Finish the code below to store the output of five squared in five_sq and then call five_sq.

five_sq <-

# you need to calculate 5 squared on the right-hand side
# of the arrow.

# so you need to code for 5 times 5!

# you can calculate 5 squared like this:

5*5 # five times five

# or like this:

5^2 # five to the power of two

# to call something, just use its name.

# like we did before for best_tutor - but this time, it'll be 
# five_sq

Nice work!

Don't forget about how the <- lets us store output - we're going to come back to it after a little statistical diversion.

t-tests

t-t-t-t-time to get statistical

Check your understanding

question("What are the two types of t-test?",
  answer("Dependent and independent",
         message = "We are never able to definitively say two groups are 'dependent'. Try again."),
  answer("Paired and unpaired", correct = TRUE),
  answer("Non-independent and independent",
         message = "You're correct about the groups that match each type. But the word choice is not exactly right..."),
  answer("Mr T and Mrs T",
         message = "ahem: yikes."),
  allow_retry = TRUE,
  post_message = "Paired tests tolerate non-independent groups and need equal sample sizes; unpaired tests need independent groups and tolerate mixed sample sizes."
)
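To make the paired/unpaired distinction concrete, here is a rough sketch using R's t.test() function. It assumes the built-in sleep dataset, which is not otherwise used in this tutorial and is chosen only because the same 10 patients were measured under two drugs, so the two groups are not independent.

```r
# The sleep dataset ships with R: extra hours of sleep for the same
# 10 patients under two drugs (rows are ordered by patient within group)
drug1 <- sleep$extra[sleep$group == 1]
drug2 <- sleep$extra[sleep$group == 2]

# Paired test: each patient's two measurements are compared directly,
# so both vectors must be the same length
paired_result <- t.test(drug1, drug2, paired = TRUE)

# Unpaired test: treats the two vectors as independent groups
# (shown only for comparison; it ignores the pairing in these data)
unpaired_result <- t.test(drug1, drug2, paired = FALSE)

paired_result
unpaired_result
```

The only change between the two calls is the paired argument, but the two tests use different degrees of freedom and can give different p-values.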

question("Your hypothesis is: 'Is there a difference between group 1 and group 2'.
         <br><br>Which p-value should you interpret?",
  answer("One-tail p value",
         message = "You're asking if there is _any_ difference - you don't care about the direction of the difference. Try again."),
  answer("Either will work", message = "While they will usually tell you the same thing, it's important to be able to tell them apart. Try again."),
  answer("Two-tail p value", correct = TRUE),
  allow_retry = TRUE,
  post_message = "We are interested in any difference, so we are going to look at both possible directions for that difference - both 'tails'."
)
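In R, the number of tails is chosen with the alternative argument of t.test(). The sketch below assumes the built-in sleep dataset purely for illustration and treats its two groups as unpaired to keep the example short.

```r
# Two vectors of measurements from the built-in sleep dataset
drug1 <- sleep$extra[sleep$group == 1]
drug2 <- sleep$extra[sleep$group == 2]

# Two-tail: is there a difference in either direction? (the default)
two_tail <- t.test(drug1, drug2, alternative = "two.sided")

# One-tail: is drug1 lower than drug2? / is drug1 higher than drug2?
one_tail_less <- t.test(drug1, drug2, alternative = "less")
one_tail_greater <- t.test(drug1, drug2, alternative = "greater")

# Pull out just the p-values to compare them
two_tail$p.value
one_tail_less$p.value
one_tail_greater$p.value
```

For a t-test, the two-tail p-value is twice the smaller of the two one-tail p-values, which is why picking the wrong one can change your conclusion.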

Let's try it on some data

We're going to work with a dataset that comes included with R: iris.

iris is a famous set of measurements (cm) of the sepal and petal length and width for 50 flowers from each of 3 species of iris.

The species are Iris setosa, I. versicolor, and I. virginica. Here's what I. virginica looks like:

include_graphics("https://live.staticflickr.com/899/42908055162_6fc6c5ec15_b.jpg")

Gorgeous flower, right? It makes you wonder how the petals might be different between species.

Let's have a look at the data.

Run the code below to show the iris data.

iris

You can see the variables across the top and the values in rows.

We should always poke at data a bit to make sure it's in good shape before we analyse.

Check how many rows are in the iris dataset by completing nrow() below.

nrow(...)

# you need to tell R what you want to count the rows of.

# change the dots to iris and run it again.

So we have 150 rows, as expected - 50 flowers x 3 species.

We should also check that all three species' data made it in.

To do this, we can look at the levels() of the Species variable.

Run the code below to check the Species column.

levels(iris$Species)

dat <- iris %>%
  filter(Species != "setosa")

Nice - you should see three levels: "setosa" "versicolor" "virginica".

So we've got 150 observations, 50 for each species.
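levels() only tells us which species are present, not how many flowers each one has. One way to confirm the per-species counts, sketched here with base R's table() (the name species_counts is made up for this example):

```r
# Count how many rows belong to each species
species_counts <- table(iris$Species)
species_counts

# Check whether any values are missing anywhere in the dataset
anyNA(iris)
```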

Let's see if petal length is different between species.
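A t-test compares exactly two groups, so one possible way to set up this comparison is sketched below. It is an assumption about how the analysis could proceed, not necessarily the tutorial's own next step: it drops I. setosa with base R's subset() and runs an unpaired, two-tail test on the remaining two species.

```r
# Keep only versicolor and virginica, and drop the unused "setosa" level
dat <- droplevels(subset(iris, Species != "setosa"))

# Unpaired two-tail t-test: Petal.Length explained by Species
# (t.test() uses the Welch version, which does not assume equal variances)
petal_test <- t.test(Petal.Length ~ Species, data = dat)

petal_test
```

The formula Petal.Length ~ Species reads as "petal length, split by species".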

jack-w-hill/jwh documentation built on March 1, 2020, 12:20 a.m.
