Description Usage Arguments Details Value Author(s) References Examples
BiProbitPartial
estimates a bivariate probit with partial
observability model.
The bivariate probit with partial observability model is defined as follows. Let i denote the ith observation which takes values from 1 to N, X[1] be a covariate matrix of dimension N x k[1], X[2] be a covariate matrix of dimension N x k[2], X[1i] be the ith row of X[1], X[2i] be the ith row of X[2], β[1] be a coefficient vector of length k[1] and β[2] be a coefficient vector of length k[2]. Define the latent response for stage one to be
y[1i]* = X[1i] β[1] + ε[1i]
and stage two to be
y[2i]* = X[2i] β[2] + ε[2i].
Note the stages do not need to occur sequentially. Define the outcome of the first stage to be y[1i] = 1 if y[1i]* > 0 and y[1i] = 0 if y[1i]* <= 0. Define the outcome of the second stage to be y[2i] = 1 if y[2i]* > 0 and y[2i] = 0 if y[2i]* <= 0. The observed outcome is the product of the outcomes from the two stages
z[i] = y[1i] y[2i].
The pair (ε[1i],ε[2i]) is distributed independently and identically multivariate normal with means E(ε[1i]) = E(ε[2i]) = 0, variances Var(ε[1i]) = Var(ε[2i]) = 1, and correlation (or equivalently covariance) Corr(ε[1i],ε[2i]) = ρ. A more general structural representation is presented in Poirier (1980).
The model can be estimated by Bayesian Markov Chain Monte Carlo (MCMC) or frequentist maximum likelihood methods. The correlation parameter ρ can be estimated or fixed. The MCMC algorithm used is a block Gibbs sampler within Metropolis-Hastings scheme developed by Rajbhandari (2014). The default maximum likelihood method is based off the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm. A modification of the algorithm is used to include box constraints for when ρ is estimated. See optimr for details.
1 2 |
formula |
an object of class Formula: a symbolic description of the model to be fitted. The details of model specification are given under 'Details'. |
data |
an optional data frame, list or environment (or object coercible
by as.data.frame to a data frame) containing the variables in the
model. If not found in |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data
contain |
philosophy |
a character string indicating the philosophy to be used
for estimation. For Bayesian MCMC estimation |
control |
a list of control parameters. See 'Details'. |
Models for BiProbitPartial
are specified symbolically. A typical
model has the form response ~ terms1 | terms2
where response
is the name of the (numeric binary) response vector and terms1
and terms2
are each a series of terms which specifies a linear predictor for latent response
equations 1 and 2. A terms1
specification of the form first + second
indicates all the terms in first
together with all the terms
in second
with duplicates removed. A specification of the form
first:second
indicates the set of terms obtained by taking the
interactions of all terms in first
with all terms in second
.
The specification first*second
indicates the cross of
first
and second
. This is the same as
first + second + first:second
. Likewise for terms2
.
A Formula has an implied intercept term for both equations. To remove the
intercept from equation 1 use either response ~ terms1 - 1 | terms2
or response ~ 0 + terms1 | terms2
. It is analgous to remove the
intercept from the equation 2.
If philosophy = "bayesian"
is specified then the model is
estimated by MCMC methods based on Rajbhandari (2014). The prior for the
parameters in equations 1 and 2 is multivariate normal with mean beta0
and covariance B0
. The prior for ρ is truncated normal on the
interval [-1,1] with mean parameter rho0
and variance parameter
v0
and is assumed to be apriori independent of the parameters in
equations 1 and 2.
If philosophy = "frequentist"
then the model is
estimated by frequentist maximum likelihood using optimr
from the package
optimr.
The control
argument is a list that can supply the tuning parameters
of the Bayesian MCMC estimation and frequentist maximum likelihood
estimation algorithms. For frequentist maximum likelihood the control
argument is passed directly to control
in the function optimr
from the package optimr. If one wants to specify the method
for
the function optimr
then method
must be passed as an element
of control
. See optimr for further details.
The
control
argument can supply any of the following components for
Bayesian MCMC estimation.
Numeric vector or list of nchains
elements each a numeric vector supplying starting values for the coefficients
in equations 1 and 2. For each vector, the first k[1] values are for the
coefficients in the first equation. The second k[2] values are for
the coefficients in the second equation. Default is beta = numeric( k1 + k2 )
,
a vector of zeros.
Numeric or list of nchains
elements each a numeric starting
value for ρ. Default is rho = 0
.
Logical value to determine if ρ is estimated. If
fixrho = TRUE
then ρ is fixed at value rho
. Default
is fixrho = FALSE
.
Number of MCMC iterations. Default is S = 1000
. For
philosophy = "bayesian"
only.
Number of initial pre-thinning MCMC iterations to remove after
estimation. Default is burn = floor(S/2)
, the floor of the number
of MCMC iterations divided by 2. For philosophy = "bayesian"
only.
Positive integer to keep every thin
post-burn in MCMC draw
and drop all others. Default is thin = 1
, keep all post burn-in draws.
For philosophy = "bayesian"
only.
Positive integer for nchains = 1
or list of nchains
elements each a positive integer fixing the seed of the random number generator.
Typically used for replication. Default is seed = NULL
, no seed. For
philosophy = "bayesian"
only.
Positive integer specifying the number of MCMC chains. Default is
nchains = 1
. For philosophy = "bayesian"
only.
Numeric vector supplying the prior mean for the coefficients of
equations 1 and 2. The first k[1] components are for the coefficients
of equation 1. The second k[2] components are for the coefficients of
equation 2. Default is beta0 = numeric( k1 + k2 )
, a vector of zeros.
For philosophy = "bayesian"
only.
Numeric matrix supplying the prior covariace of the parameters of
equations 1 and 2. The first k[1] rows are for the parameters of
equation 1. The second k[2] rows are for the parameters of equation
2. Likewise for columns. If unspecified the default is set such that the
inverse of B0 is a zero matrix of dimension (k[1]+k[2]) x (k[1]+k[2]), a 'flat' prior. For
philosophy = "bayesian"
only.
Numeric value supplying a prior parameter for ρ which is
the mean of a normal distribution that is truncated to the interval
[-1,1]. Default is rho0 = 0
. For
philosophy = "bayesian"
only.
Numeric value supplying a prior parameter for ρ which is
the variance of a normal distribution that is truncated to the interval
[-1,1]. Default is v0 = 1
. For
philosophy = "bayesian"
only.
Numeric degrees of freedom parameter for setting the degrees of
freedom for ρ's proposal t-distribution. Default is nu = 10
.
Numeric scaling parameter for scaling ρ's proposal t-distribution.
Default is tauSq = 1
.
Determines how aggressive proposal draws for ρ are.
Set to P = 0
normal or P = -1
for aggresive. See Rajbhandari
(2014) and for details. Default is P = 0
. For philosophy =
"bayesian"
only.
Numeric value determining the value of intermediate reporting. A negative value is no reporting, larger positive values provide higher degrees of reporting.
Note: If the Bayesian MCMC chains appear to not be converging and/or frequentist
maximum likelihood produces errors with summary
, the model may be
unidentified. One possible solution is to add regressors to the first equation
that are exluded from the second equation or visa-versa. See Poirier (1980) for
more details.
BiProbitPartial
returns an S x (k[1]+k[2]+1) x nchains array of MCMC draws of primary
class mcmc.list
and secondary class BiProbitPartialb
, if
philosophy = "bayesian"
. Each element in the
first dimension represents a MCMC draw. The first k[1] elements
in the second dimension are draws for the coefficientss in the first
equation. The next k[2] elements of the second dimension are draws
for the coefficients in the second equation. The last element of the second
dimension are draws for the correlation parameter. The elements of the
third dimension are the chains. If ρ was fixed
(fixrho = TRUE
) then each draw for the last element in the second
dimension is returned as the value it was fixed at (the starting value,
rho
).
If philosophy = "frequentist"
a list equivalent to the
output optimr with primary class optimrml
and secondary
class BiProbitPartialf
.
BiProbitPartial
was written by Michael Guggisberg. The majority of
the MCMC estimation was written by Amrit Romana based on Rajbhandari
(2014). The development of this package was partially funded by the
Institude for Defense Analyses (IDA).
Poirier, Dale J. (1980). "Partial Observability in bivariate probit models" Journal of Econometrics 12, 209-217. (Identification)
Rajbhandari, Ashish (2014). "Identification and MCMC estimation of bivariate probit model with partial observability." Bayesian Inference in Social Sciences (eds I. Jeliazkov and X. Yang). (MCMC algorithm)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | data('Mroz87',package = 'sampleSelection')
Mroz87$Z = Mroz87$lfp*(Mroz87$wage >= 5)
f1 = BiProbitPartial(Z ~ educ + age + kids5 + kids618 + nwifeinc | educ + exper + city,
data = Mroz87, philosophy = "frequentist")
summary(f1)
# Use the estimates from the frequenist philosophy as starting values
b1 = BiProbitPartial(Z ~ educ + age + kids5 + kids618 + nwifeinc | educ + exper + city,
data = Mroz87, philosophy = "bayesian",
control = list(beta = f1$par[1:(length(f1$par)-1)], rho = tail(f1$par,1)))
summary(b1)
## Not run: #The example used in the package sampleSelection is likely unidentified for
this model
f2 = BiProbitPartial(Z ~ educ + age + kids5 + kids618 + nwifeinc | educ,
data = Mroz87, philosophy = "frequentist") #crashes
summary(f2) #crashes (f2 non-existent)
# Bayesian methods typically still work for unidentified models
b2 = BiProbitPartial(Z ~ educ + age + kids5 + kids618 + nwifeinc | educ,
data = Mroz87, philosophy = "bayesian",
control = list(beta = f1$par[1:(length(f1$par)-3)], rho = tail(f1$par,1)))
summary(b2)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.