library(knitr)
library(tidyverse)
library(cowplot)
opts_chunk$set(echo=FALSE, fig.align='center', fig.width=4, fig.height=3,
               cache=TRUE, autodep=TRUE, cache.comments=FALSE,
               message=FALSE, warning=FALSE)

GD for logistic regression

Suppose $Y=1$ with probability $p(x)$ and $Y=0$ with probability $1-p(x)$. I want to model $P(Y=1| X=x)$. I'll assume that $p(x)/(1-p(x)) = ax$ for some scalar $a$.

We're going to estimate $a$ given data.

First, we need data.

set.seed(20200405)
n = 100
a = 2
x = runif(n)*10-5
logit <- function(x) exp(x)/(1+exp(x))
p = logit(a*x)
df = tibble(x=x, y = rbinom(n, 1, p))
ggplot(df, aes(x,y)) + geom_point(color="red") +
  stat_function(fun=function(x) logit(a*x)) + theme_cowplot(14)

The likelihood is given by [ L(y | a, x) = \prod_{i=1}^n p(x_i)^{y_i}(1-p(x_i))^{1-y_i} ]

(Simple) gradient ascent (to maximize $L(a)$) is:

  1. Input $a_0, \gamma>0, j_\max, \epsilon>0, \nabla L(a)$.
  2. For $j=1,2,\ldots,j_\max$, [ a_j = a_{j-1} + \gamma \nabla L(a_{j-1}) ]
  3. Stop if $\epsilon > |a_j - a_{j-1}|$.

Write a function to find $a_{mle}$

Note that on the log scale, $\nabla L(a) = \sum (y_i - p_i) x_i$ where $p_i$ is as above.

amle <- function(x, y, a0, gam=0.5, jmax=100, eps=1e-6){




}

Run your function and report the result

amle(x, y, 5)
amle(x, y, .1)
amle(x, y, 5, .1)
amle(x, y, 5, 1)
glm(y~x-1, family=binomial)$coef #just to check


dajmcdon/ubc-stat406-labs documentation built on Aug. 18, 2020, 1:23 p.m.