Hurdle Demand Models

knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>",
    dev = "png",
    dpi = 144,
    fig.width = 7,
    fig.height = 5,
    warning = FALSE,
    message = FALSE
)

library(beezdemand)

Introduction

The beezdemand package includes functionality for fitting two-part mixed effects hurdle demand models using Template Model Builder (TMB). This approach is particularly useful when:

When to Use Hurdle Models vs. Standard Models

| Scenario | Recommended Approach | |----------|---------------------| | Few zeros, zeros are measurement artifacts | fit_demand_fixed() or fit_demand_mixed() | | Many zeros, zeros represent true non-consumption | fit_demand_hurdle() | | Need to understand factors affecting whether someone consumes | fit_demand_hurdle() | | Need individual-level estimates of "quitting price" | fit_demand_hurdle() |

Model Specification

The hurdle model has two parts:

Part I: Binary Model (Probability of Zero Consumption)

$$\text{logit}(\pi_{ij}) = \beta_0 + \beta_1 \cdot \log(\text{price} + \epsilon) + a_i$$

Where:

Part II: Continuous Model (Consumption Given Positive)

With 3 random effects: $$\log(Q_{ij}) = (\log Q_0 + b_i) + k \cdot (\exp(-(\alpha + c_i) \cdot \text{price}) - 1) + \varepsilon_{ij}$$

With 2 random effects: $$\log(Q_{ij}) = (\log Q_0 + b_i) + k \cdot (\exp(-\alpha \cdot \text{price}) - 1) + \varepsilon_{ij}$$

Where:

Random Effects Structure

The random effects follow a multivariate normal distribution:

Getting Started

Data Requirements

Your data should be in long format with columns for:

library(beezdemand)

# Example data structure
knitr::kable(
    head(apt, 10),
    caption = "Example APT data structure (first 10 rows)"
)

Basic Model Fitting

# Fit 2-RE model (simpler, faster)
fit2 <- fit_demand_hurdle(
    data = apt,
    y_var = "y",
    x_var = "x",
    id_var = "id",
    random_effects = c("zeros", "q0"), # 2 random effects
    verbose = 0
)
# Fit 3-RE model (more flexible)
fit3 <- fit_demand_hurdle(
    data = apt,
    y_var = "y",
    x_var = "x",
    id_var = "id",
    random_effects = c("zeros", "q0", "alpha"), # 3 random effects
    verbose = 1
)

Interpreting Output

# View summary
summary(fit2)

# Extract coefficients
coef(fit2)

# Standardized tidy summaries
tidy(fit2) |> head()
glance(fit2)

# Get subject-specific parameters
head(get_subject_pars(fit2))

Diagnostics

Use check_demand_model() and the plotting helpers for quick post-fit checks.

check_demand_model(fit2)
plot_residuals(fit2)$fitted
plot_qq(fit2)

Understanding Results

Fixed Effects

| Parameter | Interpretation | |-----------|---------------| | beta0 | Part I intercept: baseline log-odds of zero consumption | | beta1 | Part I slope: change in log-odds per unit increase in log(price) | | logQ0 | Log of intensity parameter (population average) | | k | Scaling parameter for demand decay | | alpha | Elasticity parameter (population average for 2-RE, mean for 3-RE) |

Subject-Specific Parameters

The subject_pars data frame contains:

| Parameter | Description | |-----------|-------------| | a_i | Random effect for Part I (zeros probability) | | b_i | Random effect for Part II (intensity) | | c_i | Random effect for alpha (3-RE model only) | | Q0 | Individual intensity: $\exp(\log Q_0 + b_i)$ | | alpha | Individual elasticity: $\alpha + c_i$ (or just $\alpha$ for 2-RE) | | breakpoint | Price where P(zero) = 0.5: $\exp(-(\beta_0 + a_i) / \beta_1) - \epsilon$ | | Pmax | Price at maximum expenditure | | Omax | Maximum expenditure |

Model Selection: 2-RE vs 3-RE

When to Use Each

Likelihood Ratio Test

# Compare nested models
compare_hurdle_models(fit3, fit2)

# Unified model comparison (AIC/BIC + LRT when appropriate)
compare_models(fit3, fit2)

# Output:
# Likelihood Ratio Test
# =====================
#            Model n_RE    LogLik df     AIC     BIC
#    Full (3 RE)    3 -1234.56 12 2493.12 2543.21
# Reduced (2 RE)    2 -1245.78  9 2509.56 2547.89
#
# LR statistic: 22.44
# df: 3
# p-value: 5.2e-05

A significant p-value suggests the 3-RE model provides a better fit.

Visualization

# Population demand curve
plot(fit2, type = "demand")
# Probability of zero consumption
plot(fit2, type = "probability")
# Distribution of individual parameters
plot(fit2, type = "parameters")
plot(fit2, type = "parameters", parameters = c("Q0", "alpha", "Pmax"))

# Individual demand curves
plot(fit2, type = "individual", subjects = c("1", "2", "3", "4"))

# Single subject with observed data
plot_subject(fit2, subject_id = "1", show_data = TRUE, show_population = TRUE)

Simulation and Validation

Simulating Data

The simulate_hurdle_data() function generates data from the hurdle model:

# Simulate with default parameters
sim_data <- simulate_hurdle_data(
    n_subjects = 100,
    seed = 123
)

head(sim_data)
#   id    x         y delta       a_i        b_i
# 1  1 0.00 12.345678     0 -0.523456  0.1234567
# 2  1 0.50 11.234567     0 -0.523456  0.1234567
# ...

# Custom parameters
sim_custom <- simulate_hurdle_data(
    n_subjects = 100,
    logQ0 = log(15), # Q0 = 15
    alpha = 0.1, # Lower elasticity
    k = 3, # Higher k (ensures Pmax exists)
    stop_at_zero = FALSE, # All prices for all subjects
    seed = 456
)

Monte Carlo Simulation Studies

The run_hurdle_monte_carlo() function assesses model performance through simulation.

Note: Monte Carlo simulations are computationally intensive and not run during vignette building. The example below shows typical usage and expected output format.

# Run Monte Carlo study
mc_results <- run_hurdle_monte_carlo(
    n_sim = 100, # Number of simulations
    n_subjects = 100, # Subjects per simulation
    n_random_effects = 2, # 2-RE model
    verbose = TRUE,
    seed = 123
)

# View summary
print_mc_summary(mc_results)

# Monte Carlo Simulation Summary
# ==============================
#
# Simulations: 100 attempted, 95 converged (95.0%)
#
#    Parameter True Mean_Est   Bias Rel_Bias% Emp_SE Mean_SE SE_Ratio Coverage_95%  N
#        beta0 -2.0    -2.01  -0.01      0.5   0.12    0.11     0.92         94.7 95
#        beta1  1.0     1.02   0.02      2.0   0.08    0.08     1.00         95.8 95
#        logQ0  2.3     2.29  -0.01     -0.4   0.05    0.05     1.00         94.7 95
#            k  2.0     2.03   0.03      1.5   0.15    0.14     0.93         93.7 95
#        alpha  0.5     0.51   0.01      2.0   0.04    0.04     1.00         95.8 95
# ...

Interpreting Monte Carlo Results

| Metric | Target | Interpretation | |--------|--------|---------------| | Bias | ~0 | Estimates should be unbiased | | Relative Bias | < 5% | Acceptable bias relative to true value | | SE Ratio | ~1.0 | Model SEs match empirical variability | | Coverage 95% | ~95% | CIs should contain true value 95% of time |

Integration with beezdemand Workflow

Combining with Other Analyses

# Fit hurdle model
hurdle_fit <- fit_demand_hurdle(data,
    y_var = "y", x_var = "x", id_var = "id",
    random_effects = c("zeros", "q0"), verbose = 0
)

# Extract subject parameters
hurdle_pars <- get_subject_pars(hurdle_fit)

# These can be merged with other analyses
# e.g., correlate with individual characteristics
cor(hurdle_pars$Q0, subject_characteristics$age)
cor(hurdle_pars$breakpoint, subject_characteristics$dependence_score)

Exporting Results

# Subject parameters
write.csv(get_subject_pars(hurdle_fit), "hurdle_subject_parameters.csv")

# Summary statistics
summ <- get_hurdle_param_summary(hurdle_fit)
print(summ)

Technical Details

TMB Backend

The hurdle model is implemented using Template Model Builder (TMB), which provides:

Control Parameters

fit <- fit_demand_hurdle(
    data,
    y_var = "y",
    x_var = "x",
    id_var = "id",
    epsilon = 0.001, # Constant for log(price + epsilon)
    tmb_control = list(
        max_iter = 200, # Maximum iterations
        eval_max = 1000, # Maximum function evaluations
        trace = 0 # Optimization trace level
    ),
    verbose = 1 # 0 = silent, 1 = progress, 2 = detailed
)

Custom Starting Values

For difficult optimization problems:

custom_starts <- list(
    beta0 = -3.0,
    beta1 = 1.5,
    logQ0 = log(8),
    k = 2.5,
    alpha = 0.1,
    logsigma_a = 0.5,
    logsigma_b = -0.5,
    logsigma_e = -1.0,
    rho_ab_raw = 0
)

fit <- fit_demand_hurdle(data,
    y_var = "y", x_var = "x", id_var = "id",
    start_values = custom_starts, verbose = 1
)

References

Zhao, T., Luo, X., Chu, H., Le, C.T., Epstein, L.H. and Thomas, J.L. (2016), A two-part mixed effects model for cigarette purchase task data. Jrnl Exper Analysis Behavior, 106: 242-253. https://doi.org/10.1002/jeab.228

Hursh, S. R., & Silberberg, A. (2008). Economic demand and essential value. Psychological Review, 115(1), 186-198.

See Also



Try the beezdemand package in your browser

Any scripts or data that you put into this service are public.

beezdemand documentation built on March 3, 2026, 9:07 a.m.