This implements quasi maximum likelihood estimation for parameters in semiparametric binary response models.
x
: a numeric matrix of explanatory variables.
y
: an integer, numeric, or factor vector of binary response outcomes, taking the values 1 or 0 only.
r
: a number that controls the size of Silverman's rule-of-thumb bandwidth,
tau
: a number indicating the cut-off level for trimming in
...
: further arguments in "
formula
: a formula describing the model to be fitted.
data
: a data.frame containing variables in
This is the main function in the package; it estimates the parameters of
semiparametric binary response models. It accepts either a matrix x and a
vector y, or a formula and data (see the Examples below). The default settings
are reasonable, so supplying only x and y, or formula and data, is enough in
many cases.
Currently, only a single-index model for a binary outcome y is supported.
Importantly, the first explanatory variable should be one whose coefficient is
strongly believed to be different from zero, because the coefficients of the
other variables are rescaled by that of the first explanatory variable in
estimation. This rescaling is unavoidable in semiparametric approaches, but it
has no impact on the estimation of the conditional probability Pr{y=1|x}. The
parameter estimates are obtained by quasi maximum likelihood estimation using
maxLik::maxLik with the BFGS method.
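The rescaling described above can be illustrated in base R. This is a minimal sketch, not code from the package: the slope values mirror the data generating process in the Examples, and the names `slopes`, `slopes_norm`, and `free_params` are hypothetical. It shows that dividing all slopes by the first one leaves the ordering of the index unchanged when the first slope is positive, so the normalized index carries the same information about Pr{y=1|x}.

```r
set.seed(1)
N <- 200L
X <- cbind(X1 = rnorm(N), X2 = rnorm(N), X3 = rnorm(N))

# Illustrative slopes (assumption: same values as in the Examples' DGP)
slopes <- c(2, 2, -1)

# Normalize by the coefficient of the first explanatory variable
slopes_norm <- slopes / slopes[1]   # first element becomes 1
free_params <- slopes_norm[-1]      # the parameters left to estimate

index_raw  <- as.vector(X %*% slopes)
index_norm <- as.vector(X %*% slopes_norm)

# With a positive first slope, both indices rank the observations identically
same_order <- identical(order(index_raw), order(index_norm))
```

If the first slope were negative, the ordering would be reversed rather than preserved, but the index would still be a one-to-one transformation of the original.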
The theory in Klein and Spady (1993) requires a well-defined trimming indicator
on the index that trims out boundary points to ensure a compact support. To
construct it, the function runs the estimation twice, as recommended in the
paper: a pilot estimation followed by the primary one. In the pilot estimation,
the initial trimming indicator is generated from the original explanatory
variables x, dropping observations that are near the boundary in any of the
explanatory variables. Observations lying outside the
[trimming.level*100, (1-trimming.level)*100] percentiles are considered near the
boundary, with the default value trimming.level = 0.025 (trimming.level is the
tau function argument). The pilot coefficient estimates are then used to form
the index, that is, the linear combination of the explanatory variables weighted
by the pilot estimates. Finally, the trimming indicator for the primary
estimation is generated from this index at trimming.level and passed to the
log-likelihood function for parameter estimation.
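The two trimming stages can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's TrimmingIndicator: the helper `trim_indicator` is hypothetical, percentiles are computed with `stats::quantile` at its default settings, and `b_pilot` merely stands in for pilot estimates that would come from the first estimation run.

```r
# Hypothetical helper: TRUE for observations inside the
# [level, 1 - level] quantile range of every column of x
trim_indicator <- function(x, level = 0.025) {
  x <- as.matrix(x)
  keep <- rep(TRUE, nrow(x))
  for (j in seq_len(ncol(x))) {
    q <- quantile(x[, j], probs = c(level, 1 - level), names = FALSE)
    keep <- keep & x[, j] >= q[1] & x[, j] <= q[2]
  }
  keep
}

set.seed(1)
N <- 500L
X <- cbind(rnorm(N), rnorm(N), rnorm(N)^2)

# Pilot stage: trim on each explanatory variable separately
keep_pilot <- trim_indicator(X)

# Primary stage: trim on the index built from (stand-in) pilot estimates
b_pilot <- c(1, 1, -0.5)
index <- as.vector(X %*% b_pilot)
keep_primary <- trim_indicator(index)
```

Because the pilot stage trims on every column, it typically drops more observations than the primary stage, which trims on a single index.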
Silverman's rule-of-thumb bandwidth is used for the Nadaraya-Watson estimator,
which computes the conditional probabilities. The bandwidth size is controlled
by r, with default value r = 6.01, which satisfies the conditions for
consistency and asymptotic normality.
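A Gaussian-kernel Nadaraya-Watson estimator with a rule-of-thumb bandwidth can be sketched in base R. The exact way r enters the package's bandwidth is not stated here, so the form h = 1.06 * sd(v) * n^(-1/r) used below is an assumption, and the helpers `silverman_bw` and `nw_gaussian` are hypothetical names rather than the package's GaussianNadarayaWatsonEstimator.

```r
# Assumed bandwidth form: Silverman's constant with rate n^(-1/r)
silverman_bw <- function(v, r = 6.01) {
  1.06 * sd(v) * length(v)^(-1 / r)
}

# Nadaraya-Watson estimate of Pr{y = 1 | v} at each observed index value
nw_gaussian <- function(v, y, h = silverman_bw(v)) {
  K <- dnorm(outer(v, v, "-") / h)   # n x n matrix of kernel weights
  as.vector(K %*% y) / rowSums(K)    # weighted average of y
}

set.seed(1)
n <- 300L
v <- rnorm(n)                        # a scalar index
y <- as.integer(v >= rnorm(n))       # binary outcome, Pr{y = 1 | v} = pnorm(v)
p_hat <- nw_gaussian(v, y)
```

Under this assumed form, a larger r gives a larger bandwidth for the same sample, so r = 6.01 smooths more than the familiar n^(-1/5) rate would.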
The package uses OpenMP to parallelize computation of the Nadaraya-Watson
estimator over data points. The default number of threads is
parallel::detectCores()-1L. To change it manually, use set_num_threads(x). Note
that this affects all functions in the package that employ the Nadaraya-Watson
estimator. If it is set to 1L, multithreading is not used.
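A thread count can be chosen defensively before handing it to set_num_threads. The sketch below assumes the package is installed under the name semiBRM and that set_num_threads is exported from it; the guard with `requireNamespace` keeps the snippet runnable when that is not the case. The variable names are illustrative.

```r
# detectCores() can return NA on some platforms, so fall back to one thread
n_cores <- parallel::detectCores()
n_threads <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)

# Assumption: the package namespace is 'semiBRM' and exports set_num_threads
if (requireNamespace("semiBRM", quietly = TRUE)) {
  semiBRM::set_num_threads(n_threads)
}
```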
An object of class 'semiBRM', similar to that of 'maxLik', with elements:
estimate
: estimated parameter values.
log.likelihood
: log likelihood at the estimates.
gradient
: a gradient vector at the estimates.
hessian
: a hessian matrix at the estimates.
code
: return code as detailed in maxLik::maxLik.
message
: a short message describing return code.
iter
: the number of iterations performed for numerical optimization.
control
: the optimization control parameters as detailed in maxLik::maxLik.
model
: the model frame.
r
: the bandwidth parameter for Silverman's rule-of-thumb bandwidth.
trimming.level
: the trimming cutoff level, which is the tau function argument.
call
: the matched call.
formula
: the formula entered for estimation.
GaussianNadarayaWatsonEstimator, TrimmingIndicator
# data generating process
N <- 500L
X1 <- rnorm(N)
X2 <- (X1 + 2*rnorm(N))/sqrt(5) + 1
X3 <- rnorm(N)^2/sqrt(2)
X <- cbind(X1, X2, X3)
beta <- c(2, 2, -1, -1)
V <- as.vector(cbind(X, 1)%*%beta)
Y <- ifelse(V >= rnorm(N), 1L, 0L)
# identifiable set of parameters: slopes of X2 and X3 divided by that of X1
ests_true <- c(1, -.5)
# using matrix/vector
qmle0 <- semiBRM(x = X, y = Y, control = list(iterlim = 50))
# using formula and data
data <- data.frame(Y, X1, X2, X3)
qmle1 <- semiBRM(Y ~ X1 + X2 + X3, data = data, control = list(iterlim = 50))