knitr::opts_chunk$set(echo = TRUE, warning=FALSE)
suppressWarnings(suppressMessages(suppressPackageStartupMessages(library(ggplot2))))

An example application

This example is adapted from www.r-tutor.com.

The binomial distribution is a discrete probability distribution. It describes the number of successes in $n$ independent trials of an experiment. Each trial is assumed to have only two outcomes, either success or failure. If the probability of a successful trial is $p$, then the probability of having $x$ successful outcomes in an experiment of $n$ independent trials is given by

$$f(x) = \binom{n}{x} \, p^x (1-p)^{n-x}, \quad \text{where } x = 0, 1, \ldots, n$$
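As a quick sanity check on the formula, the sketch below evaluates it directly with `choose()` and compares the result with R's built-in `dbinom`. The helper name `binom_pmf` is hypothetical and only used for illustration; the values $n = 12$ and $p = 0.2$ are borrowed from the quiz problem that follows.

```r
# Hand-written binomial pmf; binom_pmf is a hypothetical helper name
binom_pmf <- function(x, n, p) {
  choose(n, x) * p^x * (1 - p)^(n - x)
}

n <- 12
p <- 0.2

# Evaluate over the full support x = 0, 1, ..., n
manual  <- binom_pmf(0:n, n, p)
builtin <- dbinom(0:n, size = n, prob = p)

print(all.equal(manual, builtin))  # the two should agree up to floating-point error
print(sum(manual))                 # probabilities over the full support sum to 1
```

Including $x = 0$ matters here: leaving it out would make the probabilities sum to less than one.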

Problem

Suppose there are twelve multiple-choice questions in an English class quiz. Each question has five possible answers, and only one of them is correct. Find the probability of having four or fewer correct answers if a student attempts to answer every question at random.

Solution

Since only one out of five possible answers is correct, the probability of answering a question correctly at random is $1/5 = 0.2$. We can use R to find the probability of having exactly 4 correct answers from random attempts as follows.

ans <- dbinom(4, size=12, prob=0.2)
print(ans)

To find the probability of having four or fewer correct answers from random attempts, we can apply the function dbinom for each of $x=0,1,\ldots,4$ and add up the results.

ans <-  dbinom(0, size=12, prob=0.2) + 
        dbinom(1, size=12, prob=0.2) + 
        dbinom(2, size=12, prob=0.2) + 
        dbinom(3, size=12, prob=0.2) + 
        dbinom(4, size=12, prob=0.2)
print(ans)
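Because `dbinom` is vectorized over its first argument, the same sum can be written more compactly. This is a minimal sketch of that idiom; the variable name `ans_vec` is introduced here only to avoid overwriting `ans`.

```r
# dbinom accepts a vector of counts, so the five terms collapse into one call
ans_vec <- sum(dbinom(0:4, size = 12, prob = 0.2))
print(ans_vec)
```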

Alternatively, we can use the cumulative probability function for the binomial distribution, pbinom.

ans <- pbinom(4, size=12, prob=0.2)
print(ans)
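One way to build confidence in this result is to simulate many quizzes and count how often four or fewer answers are correct. The sketch below assumes 100,000 simulated quizzes and an arbitrary seed; both are illustrative choices, not part of the original example.

```r
set.seed(42)      # arbitrary seed, used only so the run is reproducible
n_sim <- 100000   # number of simulated quizzes (an illustrative choice)

# Each draw is the number of correct answers in one randomly answered quiz
scores <- rbinom(n_sim, size = 12, prob = 0.2)

est   <- mean(scores <= 4)                 # simulated proportion with four or fewer correct
exact <- pbinom(4, size = 12, prob = 0.2)  # exact cumulative probability

print(c(simulated = est, exact = exact))
```

The simulated proportion should land close to the exact value, with the gap shrinking as `n_sim` grows.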

Answer

The probability of answering four or fewer questions correctly at random in a twelve-question multiple-choice quiz is `r sprintf("%.1f", round(100*ans, 1))`%.

Plotting the binomial distribution

This example is adapted from Stack Overflow.

The problem

What is the probability of having at least 16 successes out of 20 operations given that the probability of success is 0.8?

The solution

We will use ggplot2 to create the basic plot, shading the region that corresponds to 16 or more successes.

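library(ggplot2)

# Probability mass function over every possible number of successes (0 to 20)
df <- data.frame(x = 0:20, prob = dbinom(0:20, size = 20, prob = 0.8))

# Line plot of the distribution, with the region x >= 16 shaded in red
plt <- ggplot(data = df, aes(x = x, y = prob)) +
  geom_line() +
  geom_ribbon(data = subset(df, x >= 16),
              aes(ymax = prob), ymin = 0,
              fill = "red", colour = NA, alpha = 0.5)
print(plt)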
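Since the binomial distribution is discrete, a bar chart is arguably a more faithful picture than a connected line. The following is a sketch of that alternative, not the approach from the original answer; the names `df_bar` and `plt_bar` and the colours are illustrative assumptions.

```r
library(ggplot2)

# One bar per possible outcome, flagged when it falls in the region of interest
df_bar <- data.frame(x = 0:20, prob = dbinom(0:20, size = 20, prob = 0.8))
df_bar$in_tail <- df_bar$x >= 16

plt_bar <- ggplot(df_bar, aes(x = x, y = prob, fill = in_tail)) +
  geom_col() +
  scale_fill_manual(values = c("FALSE" = "grey70", "TRUE" = "red")) +
  theme(legend.position = "none")   # the fill legend adds nothing here
print(plt_bar)
```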

We can make the plot more readable with descriptive axis labels and a centered title.

plt <- plt +
  xlab("Number of successful operations") +
  ylab("Probability") +
  ggtitle("Probability of at least 16 of 20 successful operations") +
  theme(axis.text = element_text(size = 12),
        axis.title = element_text(size = 12),
        plot.title = element_text(hjust = 0.5))
print(plt)
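The shaded region shows where the answer lies, but the plot does not give a number. A minimal sketch of computing it: the probability of at least 16 successes is the upper tail $P(X \ge 16) = 1 - P(X \le 15)$, which `pbinom` returns directly with `lower.tail = FALSE`. The variable names `p_tail` and `p_sum` are illustrative.

```r
# Upper tail: P(X > 15) is the same event as P(X >= 16) for integer counts
p_tail <- pbinom(15, size = 20, prob = 0.8, lower.tail = FALSE)

# Cross-check by summing the individual probabilities for 16 through 20
p_sum <- sum(dbinom(16:20, size = 20, prob = 0.8))

print(p_tail)
print(all.equal(p_tail, p_sum))
```

The result is roughly 0.63, so 16 or more successful operations out of 20 is somewhat more likely than not.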


jrminter/statshelpR documentation built on May 2, 2020, 12:08 a.m.