library(learnr)
library(gradethis)

tutorial_options(exercise.timelimit = 5, exercise.checker = gradethis::grade_learnr)

knitr::opts_chunk$set(echo = FALSE)

Maximum Likelihood

In this activity, we'll estimate parameters through maximum likelihood using an optimization technique called gradient descent. To begin, let's take a look at the likelihood function for the Poisson distribution.

Poisson Distribution

\[ L(\lambda ; y_1,\ldots,y_n) = \prod_{i=1}^n \frac{\lambda^{y_i}\exp(-\lambda)}{y_i!}. \]
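As a quick illustration (a sketch for intuition, not one of the graded exercises; the data vector here is made up), the likelihood at a candidate value of $\lambda$ can be evaluated in R by multiplying Poisson probabilities from dpois():

# Poisson likelihood at a candidate lambda: the product of dpois() values over the sample.
y_demo <- c(2, 4, 3)                               # made-up data for illustration
lik <- function(lambda, y) prod(dpois(y, lambda))  # dpois() is the Poisson pmf
lik(3, y_demo)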

Begin by generating $n=100$ observations from a Poisson distribution with parameter $\lambda=3$. You can do this using the function rpois(n, lambda).

set.seed(07072020)
y = rpois(n=___,lambda=___)
y
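# Solution and grading code (instructor side):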
set.seed(07072020)
sol <- rpois(100,3)

set.seed(07072020)
wrong <- rpois(3,100)

grade_result(
  pass_if(~ identical(.result, sol), "Correct!"),
  fail_if(~ identical(.result, wrong ), "Incorrect.  Did you flip n and lambda?"),
  fail_if(~ TRUE, "Incorrect.")
)

Maximum Likelihood

Let's begin by reviewing maximum likelihood.

quiz(caption="Question 1",
  question("What is the goal of maximum likelihood? ",
    answer("To estimate a parameter with the most probable value based on the observed data.", correct=TRUE),
    answer("To generate likely data from a probability distribution"),
    answer("To estimate a parameter by approximating with the sample mean"),
    answer("To generate the most likely data from a probability distribution"),
    allow_retry = TRUE,
    correct = paste0("Good")
  )
)

Log Likelihood

Often, finding the maximum of the likelihood function directly can be difficult. As a result, we apply a monotone transformation to the function, which preserves the maximizer. The most common monotone transformation, and the one you were shown in class, is the log likelihood. Hence, instead of maximizing the likelihood function above, we will maximize the log transformation of that function:

\[ l(\lambda ; y_1,\ldots,y_n) = \log \Big(\prod_{i=1}^n \frac{\lambda^{y_i}\exp(-\lambda)}{y_i!}\Big). \]
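Since the log of a product is the sum of the logs, this simplifies to

\[ l(\lambda ; y_1,\ldots,y_n) = \sum_{i=1}^n y_i \log(\lambda) - n\lambda - \sum_{i=1}^n \log(y_i!). \]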

Gradient Descent

Gradient descent is an optimization technique that we can use to find the maximum likelihood estimate. (Since we are maximizing here, each update steps in the direction of the gradient; this variant is sometimes called gradient ascent.) To perform this technique, we need the derivative of the Poisson log-likelihood. Write a function called deriv_loglike(lambda, y) that computes this derivative (reference your class notes). Suppose $\lambda=1$, $y_1=3$, $y_2=5$. What value should your function return? Check that it does.
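For reference, differentiating the simplified log-likelihood above with respect to $\lambda$ (the factorial term drops out because it does not involve $\lambda$) gives

\[ l'(\lambda ; y_1,\ldots,y_n) = \sum_{i=1}^n \Big(\frac{y_i}{\lambda} - 1\Big) = \frac{\sum_{i=1}^n y_i}{\lambda} - n. \]

With $\lambda=1$, $y_1=3$, and $y_2=5$, this evaluates to $(3+5)/1 - 2 = 6$.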

deriv_loglike <- function(lambda, y){
  ##Write Code Here##
}
deriv_loglike(lambda=1,y=c(3,5))
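# Grading code (instructor side):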
grade_result(
  pass_if(~ identical(.result, 6), "Correct!"),
  fail_if(~ TRUE, "Incorrect.")
)
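One way to sanity-check your derivative (a sketch; num_deriv is a hypothetical helper, not part of the lab code) is to compare it against a centered finite difference of the log-likelihood, computed with dpois(..., log = TRUE):

# Centered finite-difference approximation to the derivative of the log-likelihood.
num_deriv <- function(lambda, y, h = 1e-6) {
  loglike <- function(l) sum(dpois(y, l, log = TRUE))  # Poisson log-likelihood
  (loglike(lambda + h) - loglike(lambda - h)) / (2 * h)
}
num_deriv(lambda = 1, y = c(3, 5))  # should be very close to 6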

Performing Gradient Descent

Using your data and my sample code from class, maximize the log-likelihood and report the maximizer. Use gam = 0.01, start at lam = 23, and set tol = 1e-5. Is this the value you would expect? What should the answer be?

set.seed(07072020)
y = rpois(100,3)
deriv_loglike <- function(lambda, y){
  sum(-1 + y/lambda)
}
set.seed(07072020)
maxiter = 1000
conv = FALSE
gam = ___
lam = ___
tol = ___
for(iter in 1:maxiter){
  lam.new = lam + gam * deriv_loglike(lam, y)
  conv = abs(lam - lam.new) < tol
  lam = lam.new
  if(conv) break
}
lam
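# Solution and grading code (instructor side):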
set.seed(07072020)
y = rpois(100,3)
deriv_loglike <- function(lambda, y){
  sum(-1 + y/lambda)
}
set.seed(07072020)
maxiter = 1000
conv = FALSE
gam = 0.01
lam = 23
tol = 1e-5
for(iter in 1:maxiter){
  lam.new = lam + gam * deriv_loglike(lam, y)
  conv = abs(lam - lam.new) < tol
  lam = lam.new
  if(conv) break
}

grade_result(
  pass_if(~ identical(.result, lam), "Correct!"),
  fail_if(~ TRUE, "Incorrect.")
)
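As a reminder of what to expect: setting the derivative above to zero shows the Poisson log-likelihood is maximized at $\lambda = \bar{y}$, so the value found by gradient descent should agree with the sample mean up to the tolerance. A quick check, assuming the same seed and data as above:

set.seed(07072020)
y <- rpois(100, 3)
mean(y)  # closed-form MLE; the gradient-descent answer should essentially match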

