hetprobit: Heteroscedastic Probit Regression Models

Description Usage Arguments Details Value References See Also Examples

Description

Fitting heteroscedastic probit models via maximum likelihood.

Usage

1
2
3
4
5
6
7
8
hetprobit(formula, data, subset, na.action, 
  model = TRUE, y = TRUE, x = FALSE, 
  control = hetprobit_control(...), ...)

hetprobit_fit(x, y, z = NULL, control)

hetprobit_control(maxit = 5000, start = NULL, 
  grad = TRUE, hessian = TRUE, ...)

Arguments

formula

a formula expression of the form y ~ x | z where y is the response and x and z are regressor variables for the mean and scale model.

data

a data frame containing the variables occurring in the formulas. If not found in the data, variables are taken from environment(formula).

subset

an optional vector specifying a subset of observations to be used for fitting.

na.action

a function which indicates what should happen when the data contain NAs.

model, y, x

logicals. If TRUE the corresponding model frame, response vector and model matrix are returned as components of the return value. For hetprobit_fit: x is a design matrix with regressors for the mean and y should be a binary response vector with zeros and ones.

z

design matrix with regressors for the scale model.

...

arguments to be used to form the default control argument if it is not supplied directly.

control, maxit, start

a list of control parameters passed to optim: maxit specifies the maximal number of iterations, default is set to 5000 iterations. start an optional vector with starting values.

grad

logical. Should gradients be used for optimization? If TRUE, the default method is "BFGS". Otherwise method = "Nelder-Mead" is used.

hessian

logical or character. Should a numeric approximation of the (negative) Hessian matrix be computed? Either FALSE (or equivalently "none") or TRUE. Alternatively, in the latter case, the default option is hessian = "numDeriv" so that hessian is used for approximating the Hessian. Another option is hessian = "optim" to signal that the Hessian should be computed by optim.

Details

In standard probit regression models the probability of a success, i.e. P(Y = 1), is modelled as

P(Y = 1) = pi = F(x'b)

where F() is the cdf of a standard normal distribution.
The general assumption that the variance of the error term is constant and due to identifiability set to one is relaxed in the heteroscedastic probit model. The variance can vary systematically and is now modelled as a multiplicative function of regressor variables, i.e. sigma = exp(z'g). The probability of success is now represented by

pi = F(x'b / exp(z'g)).

Note that the scale model is only identified without intercept.

A set of standard extractor functions for fitted model objects is available for objects of class "hetprobit", including methods to the generic functions print, summary, coef, vcov, logLik, residuals, predict, terms, model.frame, model.matrix, update, estfun and bread (from the sandwich package), and getSummary (from the memisc package, enabling mtable).

See predict.hetprobit and coef.hetprobit for more details on some methods with non-standard arguments.

This is a somewhat simpler reimplementation of hetglm from the glmx package (Zeileis, Koenker, Doebler 2015). Compared to hetglm, hetprobit does not offer: analytical Hessian and flexible link functions for the mean and scale submodel, among further features.

Value

An object of class "hetprobit", i.e. a list with the following components:

coefficients

a list with elements mean and scale that contain the estimated coefficients from the corresponding model,

loglik

the value of the log-likelihood of the fitted model,

counts

count of function and gradient evaluations from optim,

convergence

an integer. 0 indicates successful completion. For details see optim.

message

optional further information from optim,

vcov

covariance matrix of all parameters in the model,

fitted.values

a list with elements mean and scale containing the latent fitted means and standard deviations,

residuals

a vector of raw residuals (observed - fitted),

method

the method argument passed to the optim call,

nobs

number of observations,

df

degrees of freedom,

call

the original function call,

formula

the original model formula,

terms

a list with elements mean, scale and full containing the terms objects for the respective models,

levels

a list with elements mean, scale and full, that contain the levels of the categorical regressors,

contrasts

a list with elements mean and scale that contain the contrasts corresponding to levels from the respective model,

model

the full model frame if model = TRUE,

y

the numeric response vector if y = TRUE,

x

default is set to x = FALSE. If x = TRUE a list with elements mean and scale that contain the model matrices from the respective models is returned.

References

Alvarez R.M. and Brehm J. (1995) American Ambivalence Towards Abortion Policy: Development of a Heteroskedastic Probit Model of Competing Values. American Journal of Political Science, 39(4), 1055–1082.

Greene W.H. (2012) “Econometric Analysis”, Pearson, Prentice Hall, Seventh Edition.

Harvey A.C. (1976) Estimating Regression Models with Multiplicative Heteroscedasticity. Econometrica, 44(3), 461–465.

Keele L.J. and Park D.K. (2006) Ambivalent about Ambivalence: A Re-examination of Heteroskedastic Probit Models. Unpublished manuscript, Penn State University.

Zeileis A., Koenker R. and Doebler P. (2015) glmx: Generalized Linear Models Extended. R package version 0.1-1. https://cran.r-project.org/package=glmx

See Also

Formula, glmx, predict.hetprobit, coef.hetprobit

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
## packages
require("glmx")

## artificial example
## data-generating process
dgp <- function(n = 500, coef = c(0.5, -1.5, 0, 1, 0)) {
  d <- data.frame(
    x1 = runif(n, -1, 1),
    x2 = runif(n, -1, 1)
  )
  d$ystar <- rnorm(n,
    mean = coef[1] + coef[2] * d$x1 + coef[3] * d$x2,
    sd = exp(coef[4] * d$x1 + coef[5] * d$x2)
  )
  d$y <- ifelse(d$ystar > 0, 1, 0)
  return(d)
}

## data
set.seed(2017-05-20)
d <- dgp()

## model fitting:
## m0 with hetglm.fit from glmx package 
## m1 with function hetprobit()
m0 <- hetglm(y ~ x1 + x2, data = d)
m1 <- hetprobit(y ~ x1 + x2, data = d)

## comparison of coefficients
cbind("hetglm" = coef(m0), "hetprobit" = coef(m1))

## comparison of log-likelihoods
cbind(logLik(m0), logLik(m1))

## Not run: 
## data on voter turnout in U.S. presidential elections 1984
data("VoterTurnout", package = "hetprobit")

## heteroscedastic probit model 
## with years of education in the scale model 
m <- hetprobit(vote ~ education + age + south + govelection + closing | 
                education, data = VoterTurnout) 

summary(m)

## End(Not run)

hetprobit documentation built on May 2, 2019, 5:19 p.m.