The PDF of the Gumbel distribution with location parameter $\mu$ and scale parameter $\sigma$ is:

$$f(x | \mu, \sigma) = \frac{1}{\sigma} e^{-(z + e^{-z})}$$

where $z = \frac{x - \mu}{\sigma}$.
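
As a quick numerical sanity check, the expression above can be evaluated directly. The sketch below is plain Python, not the package's R code, and its function names are illustrative. It checks that the expression integrates to one and that it matches the derivative of $e^{-e^{-z}}$, so it behaves as a probability density.

```python
import math

def gumbel_pdf(x, mu, sigma):
    """Evaluate (1/sigma) * exp(-(z + exp(-z))) with z = (x - mu) / sigma."""
    z = (x - mu) / sigma
    return math.exp(-(z + math.exp(-z))) / sigma

def gumbel_cdf(x, mu, sigma):
    """Evaluate exp(-exp(-z)), the antiderivative of gumbel_pdf in x."""
    z = (x - mu) / sigma
    return math.exp(-math.exp(-z))

def integrate_pdf(mu, sigma, lo=-10.0, hi=30.0, steps=40000):
    """Trapezoidal integral of gumbel_pdf for z between lo and hi."""
    a, b = mu + lo * sigma, mu + hi * sigma
    h = (b - a) / steps
    total = 0.5 * (gumbel_pdf(a, mu, sigma) + gumbel_pdf(b, mu, sigma))
    for k in range(1, steps):
        total += gumbel_pdf(a + k * h, mu, sigma)
    return total * h

def cdf_slope(x, mu, sigma, h=1e-5):
    """Central finite difference of gumbel_cdf at x."""
    return (gumbel_cdf(x + h, mu, sigma) - gumbel_cdf(x - h, mu, sigma)) / (2 * h)

print(integrate_pdf(0.5, 2.0))                          # close to 1
print(gumbel_pdf(1.0, 0.5, 2.0), cdf_slope(1.0, 0.5, 2.0))  # nearly equal
```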

In addition, we set $s = \log(\sigma)$ so that $s \in \mathbb{R}$.

The implemented Gumbel regression model is:

$$y_i \sim \mathrm{Gumbel}(\mu = x_i^T \beta, \sigma = e^s)$$

Therefore, the loss function, which is the average negative log-likelihood, is:

$$\begin{eqnarray} L(s, \beta) &=& \frac{1}{n}\sum_{i=1}^n{\left(z_i + e^{-z_i} + s\right)} \\ &=& s + \frac{1}{n}\sum_{i=1}^n{\left(z_i + e^{-z_i}\right)} \\ &=& s + \frac{1}{n}\sum_{i=1}^n L'(z_i) \end{eqnarray}$$

where $z_i = \frac{y_i - x_i^T \beta}{e^s}$ and $L'(z_i) = z_i + e^{-z_i}$.

The functions returned by get.loss, get.gradient and get.Hv each take a parameter w, which is the vector $(s, \beta)$.
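
To make the parameter layout concrete, here is a minimal Python sketch of a loss closure over w = (s, β). It is an illustration only, not the package's R implementation: the name make_loss is hypothetical, and it assumes $z_i = (y_i - x_i^T \beta) / e^s$, which follows from the model above.

```python
import math

def make_loss(X, y):
    """Return loss(w) for rows X (list of feature lists) and responses y.

    w[0] is s = log(sigma); w[1:] is beta. Hypothetical helper, not the R API.
    """
    n = len(y)

    def loss(w):
        s, beta = w[0], w[1:]
        scale = math.exp(s)
        total = 0.0
        for x_i, y_i in zip(X, y):
            mu_i = sum(b * x for b, x in zip(beta, x_i))  # x_i^T beta
            z_i = (y_i - mu_i) / scale
            total += z_i + math.exp(-z_i)                 # L'(z_i)
        return s + total / n

    return loss

# One observation with y = 0 and beta = 0 gives z = 0, so the loss is
# s + (0 + e^0) = 1 when s = 0.
loss = make_loss([[1.0]], [0.0])
print(loss([0.0, 0.0]))  # 1.0
```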

Helper Derivatives

$$\begin{eqnarray} \frac{\partial}{\partial s}z_i = -z_i \end{eqnarray}$$

$$\begin{eqnarray} \frac{\partial}{\partial \beta}z_i = - \frac{x_i}{e^s} \end{eqnarray}$$

$$\begin{eqnarray} \frac{\partial}{\partial z_i}L'(z_i) = 1 - e^{-z_i} \end{eqnarray}$$
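
The first two identities follow from writing $z_i = (y_i - x_i^T \beta) e^{-s}$, assuming the residual is scaled by $\sigma = e^s$ as in the model above. For example, the derivative with respect to $s$ works out as:

$$\begin{eqnarray} \frac{\partial}{\partial s}z_i &=& (y_i - x_i^T \beta) \frac{\partial}{\partial s} e^{-s} \\ &=& -(y_i - x_i^T \beta) e^{-s} \\ &=& -z_i \end{eqnarray}$$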

Gradient

$$\begin{eqnarray} \frac{\partial}{\partial s}L &=& 1 + \frac{1}{n}\sum_{i=1}^n {\frac{\partial L'(z_i)}{\partial z_i} \times \frac{\partial z_i}{\partial s}} \\ &=& 1 + \frac{1}{n}\sum_{i=1}^n{(1 - e^{-z_i}) \times (-z_i)} \\ &=& 1 + \frac{1}{n}\sum_{i=1}^n{\left(z_i e^{-z_i} - z_i\right)} \end{eqnarray}$$

$$\begin{eqnarray} \frac{\partial}{\partial \beta}L &=& \frac{1}{n}\sum_{i=1}^n {\frac{\partial L'(z_i)}{\partial z_i} \times \frac{\partial z_i}{\partial \beta}} \\ &=& \frac{1}{n}\sum_{i=1}^n {(1 - e^{-z_i}) \times \frac{-x_i}{e^s}} \\ &=& \frac{1}{n}\sum_{i=1}^n {\frac{e^{-z_i} - 1}{e^s} x_i} \end{eqnarray}$$
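
The two gradient formulas can be checked numerically. The Python sketch below is an illustration with hypothetical names, not the package's R code. It assumes $z_i = (y_i - x_i^T \beta) / e^s$, implements the closed-form gradient, and compares it against central finite differences of the loss on a small made-up data set.

```python
import math

def make_loss_and_gradient(X, y):
    """Return (loss, gradient), both functions of w = [s, beta_1, ..., beta_p]."""
    n = len(y)

    def residuals(w):
        s, beta = w[0], w[1:]
        scale = math.exp(s)
        return [(y_i - sum(b * x for b, x in zip(beta, x_i))) / scale
                for x_i, y_i in zip(X, y)]

    def loss(w):
        return w[0] + sum(z + math.exp(-z) for z in residuals(w)) / n

    def gradient(w):
        scale = math.exp(w[0])
        zs = residuals(w)
        # dL/ds = 1 + mean(z * exp(-z) - z)
        grad_s = 1.0 + sum(z * math.exp(-z) - z for z in zs) / n
        # dL/dbeta = mean((exp(-z) - 1) / exp(s) * x_i)
        grad_beta = [0.0] * (len(w) - 1)
        for x_i, z in zip(X, zs):
            coef = (math.exp(-z) - 1.0) / scale
            for j, x_ij in enumerate(x_i):
                grad_beta[j] += coef * x_ij / n
        return [grad_s] + grad_beta

    return loss, gradient

def numeric_gradient(f, w, h=1e-6):
    """Central finite-difference approximation of the gradient of f at w."""
    out = []
    for j in range(len(w)):
        up, down = list(w), list(w)
        up[j] += h
        down[j] -= h
        out.append((f(up) - f(down)) / (2 * h))
    return out

# Made-up data purely for the check.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]]
y = [0.3, -0.7, 1.9]
loss, gradient = make_loss_and_gradient(X, y)
w = [0.1, 0.2, -0.3]
print(gradient(w))
print(numeric_gradient(loss, w))  # should agree to several decimal places
```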

Hessian

$$\begin{eqnarray} \frac{\partial^2}{\partial s^2}L &=& \frac{1}{n}\sum_{i=1}^n{\frac{\partial}{\partial s}\left(z_i e^{-z_i} - z_i\right)} \\ &=& \frac{1}{n}\sum_{i=1}^n{\left(z_i^2 e^{-z_i} - z_i e^{-z_i} + z_i\right)} \end{eqnarray}$$

$$\begin{eqnarray} \frac{\partial^2}{\partial s \partial \beta}L &=& \frac{1}{n}\sum_{i=1}^n{\frac{\partial}{\partial \beta}\left(z_i e^{-z_i} - z_i\right)} \\ &=& \frac{1}{n}\sum_{i=1}^n {\frac{z_i e^{-z_i} - e^{-z_i} + 1}{e^s} x_i} \end{eqnarray}$$

$$\begin{eqnarray} \frac{\partial^2}{\partial \beta^2}L &=& \frac{1}{n}\sum_{i=1}^n{\frac{\partial}{\partial \beta}\left(\frac{e^{-z_i} - 1}{e^s} x_i\right)} \\ &=& \frac{1}{n}\sum_{i=1}^n {\frac{e^{-z_i}}{e^{2s}} x_i x_i^T} \end{eqnarray}$$
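
The three blocks above can be combined into a Hessian-vector product without ever forming the full matrix. The name get.Hv suggests that this is what the package computes, but that reading is an assumption. The Python sketch below is an illustration with hypothetical names, not the package's R code. For a vector $v = (v_s, v_\beta)$ it accumulates $Hv$ one observation at a time, again assuming $z_i = (y_i - x_i^T \beta) / e^s$.

```python
import math

def make_hv(X, y):
    """Return hv(w, v), the Hessian of the loss at w applied to v.

    Both w and v are laid out as [s-component, beta-components...].
    """
    n = len(y)

    def hv(w, v):
        s, beta = w[0], w[1:]
        v_s, v_beta = v[0], v[1:]
        scale = math.exp(s)
        out_s = 0.0
        out_beta = [0.0] * len(beta)
        for x_i, y_i in zip(X, y):
            z = (y_i - sum(b * x for b, x in zip(beta, x_i))) / scale
            ez = math.exp(-z)
            h_ss = z * z * ez - z * ez + z      # summand of d2L/ds2
            c_sb = (z * ez - ez + 1.0) / scale  # multiplies x_i in d2L/(ds dbeta)
            c_bb = ez / (scale * scale)         # multiplies x_i x_i^T in d2L/dbeta2
            xv = sum(x * vb for x, vb in zip(x_i, v_beta))  # x_i^T v_beta
            out_s += h_ss * v_s + c_sb * xv
            for j, x_ij in enumerate(x_i):
                out_beta[j] += (c_sb * v_s + c_bb * xv) * x_ij
        return [out_s / n] + [b / n for b in out_beta]

    return hv

# Single observation with z = 1: the Hessian is [[1, 1], [1, 1/e]],
# so applying it to v = (1, 2) gives (3, 1 + 2/e).
hv = make_hv([[1.0]], [1.0])
print(hv([0.0, 0.0], [1.0, 2.0]))
```

Each call costs one pass over the data and O(p) memory, which is the usual reason to prefer Hessian-vector products over an explicit (p+1) × (p+1) Hessian in Newton-type solvers.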



wush978/gumbelRegression documentation built on May 9, 2019, 2:29 p.m.