The hierlm package allows researchers to fit a wide range of hierarchical models, in which parameters are related to each other or to 'hyper'-parameters. The technique is very similar to Bayesian hierarchical modeling (see Bayesian Data Analysis, chapter 5, by Gelman et al.), but because it uses a linear model (OLS), it can handle much larger datasets.

Some possible use cases are:

We will consider the case of a model of coffee growth, where data comes from several different countries. We want the model to be able to vary by country, particularly where the data suggests it. However, we also want to bring all of the data to bear on the general range for each of the parameters, and for those global results to have a large role where there is little country-specific data.

Model description

We would like to satisfy the following constraints:

The pooled model (that is, with a single parameter across all countries), looks like this:

$$ \log{y_{ivt}} = \alpha_i + \beta_v + \gamma g_{it} + \kappa k_{it} + \phi f_{it} + \pi p_{it} + \psi p_{it}^2 + \epsilon_{ivt} $$
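
For reference, the pooled model can be fit with base R's lm(). This is only a sketch: the column names (logyield, region, variety, gdd1000) are taken from the hierlm call below, and the remaining covariates would enter in the same way.

    # Pooled fit: one intercept per region and per variety, but a single
    # slope on growing degree-days shared across all countries.
    pooled <- lm(logyield ~ 0 + region + variety + gdd1000, data = data)
    summary(pooled)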

A fully unpooled model (with independent parameters for each country and variety) is:

$$ \log{y_{ivt}} = \alpha_i + \beta_v + \gamma_{iv} g_{it} + \kappa_{iv} k_{it} + \phi_{iv} f_{it} + \pi_{iv} p_{it} + \psi_{iv} p_{it}^2 + \epsilon_{ivt} $$
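
The unpooled model can likewise be sketched with lm(), assuming regionvariety is a factor identifying each country-variety pair (as in the hierlm call below):

    # Fully unpooled fit: interacting the slope with the region-variety factor
    # gives an independent gamma_{iv} for every country-variety pair.
    unpooled <- lm(logyield ~ 0 + region + variety + regionvariety:gdd1000,
                   data = data)
    summary(unpooled)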

We can augment the fully unpooled model with "partial pooling relationships", each with its own error term. By simultaneously minimizing these error terms, we get a partial pooling fit.

$$ \begin{aligned} \gamma_{iv} &= \gamma_v + \epsilon_{iv} \\ \gamma_a &= \gamma_c + \epsilon_a \\ \gamma_r &= \gamma_c + \epsilon_r \\ &\vdots \end{aligned} $$

The magic is in rewriting the original unpooled model so that it admits additional parameters, and then creating 'fictional observations'. The columns for these new parameters are set to 0 in all of the original observations; correspondingly, the dependent variable and the constant or fixed-effect terms are set to 0 in the partial pooling fictional observations.

$$ \begin{array}{rllll} \log{y_{ivt}} &= \alpha_i &+ \gamma_{iv} g_{it} & &+ \cdots \\ \hline \log{y_{ivt}} &= \sum_j \alpha_j \mathbf{1}_{j = i} &+ \sum_{ju} \gamma_{ju} g_{it} \mathbf{1}_{ju = iv} &+ \gamma_a 0 + \gamma_r 0 + \gamma_c 0 &+ \cdots \\ \hline 0 &= \sum_j \alpha_j 0 &+ \sum_{ju} \gamma_{ju} \mathbf{1}_{ju = 1a} &- \gamma_a 1 - \gamma_r 0 - \gamma_c 0 &+ \cdots \\ 0 &= \sum_j \alpha_j 0 &+ \sum_{ju} \gamma_{ju} \mathbf{1}_{ju = 2a} &- \gamma_a 1 - \gamma_r 0 - \gamma_c 0 &+ \cdots \\ & & \vdots & & \\ 0 &= \sum_j \alpha_j 0 &+ \sum_{ju} \gamma_{ju} \mathbf{1}_{ju = 1r} &- \gamma_a 0 - \gamma_r 1 - \gamma_c 0 &+ \cdots \\ & & \vdots & & \\ 0 &= \sum_j \alpha_j 0 &+ \sum_{ju} \gamma_{ju} \mathbf{1}_{ju = 1c} &- \gamma_a 0 - \gamma_r 0 - \gamma_c 1 &+ \cdots \\ & & \vdots & & \\ 0 &= \sum_j \alpha_j 0 &+ \sum_{ju} \gamma_{ju} 0 &+ \gamma_a 1 + \gamma_r 0 - \gamma_c 1 &+ \cdots \\ 0 &= \sum_j \alpha_j 0 &+ \sum_{ju} \gamma_{ju} 0 &+ \gamma_a 0 + \gamma_r 1 - \gamma_c 1 &+ \cdots \end{array} $$
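
To make the construction concrete, here is a hand-rolled sketch of the augmented regression using only base R. It is not hierlm's implementation: it assumes data has the columns used in the call below, that the regionvariety level names contain the variety name, and it omits the weighting of the fictional observations (cf. the ratios argument below) that controls how strongly parameters are pooled.

    # Design matrix of the fully unpooled model.
    X <- model.matrix(~ 0 + region + variety + regionvariety:gdd1000, data)
    slopes <- grep(":gdd1000$", colnames(X))   # columns holding the gamma_{iv}

    # Hyper-parameter columns (gamma_a, gamma_r, gamma_c), all 0 on real rows.
    hyper <- matrix(0, nrow(X), 3,
                    dimnames = list(NULL, c("gamma_a", "gamma_r", "gamma_c")))
    X <- cbind(X, hyper)
    y <- data$logyield

    # One fictional observation per gamma_{iv}: +1 on that slope, -1 on its
    # variety-level hyper-parameter, 0 everywhere else, dependent variable 0.
    # (Assumes the regionvariety level names contain the variety name.)
    for (j in slopes) {
        if (grepl("arabica", colnames(X)[j])) {
            hyp <- "gamma_a"
        } else if (grepl("robusta", colnames(X)[j])) {
            hyp <- "gamma_r"
        } else {
            hyp <- "gamma_c"
        }
        row <- rep(0, ncol(X))
        row[j] <- 1
        row[colnames(X) == hyp] <- -1
        X <- rbind(X, row)
        y <- c(y, 0)
    }

    # Two more fictional observations tie gamma_a and gamma_r to gamma_c.
    for (hyp in c("gamma_a", "gamma_r")) {
        row <- rep(0, ncol(X))
        row[colnames(X) == hyp] <- 1
        row[colnames(X) == "gamma_c"] <- -1
        X <- rbind(X, row)
        y <- c(y, 0)
    }

    # In practice the fictional rows would be weighted to set the pooling
    # strength; here they enter with the same weight as the real data.
    fit <- lm(y ~ 0 + X)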

The code to fit this partial pooling model with hierlm looks like this, with the clauses after each | encoding the partial pooling relationships above:

    model <- hierlm(logyield ~ 0 + region + variety + regionvariety:gdd1000 |
                    regionvariety:gdd1000 > variety:gdd1000 |
                    varietyarabica:gdd1000 == varietycombined:gdd1000 |
                    varietyrobusta:gdd1000 == varietycombined:gdd1000,
                    data, ratios=c(.1, .1, .1))
    summary(model)

