m_estimate | R Documentation |
M-estimation theory provides a framework for the asymptotic properties of estimators that are solutions to estimating equations. Many R packages implement specific applications of estimating equations. geex aims to provide a more general framework that any modelling method can use to compute point and variance estimates for parameters that are solutions to estimating equations of the form:
∑_i ψ(O_i, θ) = 0
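As a minimal illustration of this form (base R only, not geex code): for the sample mean, ψ(O_i, θ) = Y_i - θ, and the root of the summed estimating equation is the arithmetic mean. The simulated data and the names y and G are assumptions made for this sketch.

```r
set.seed(42)
y <- rnorm(50, mean = 3)

# Summed estimating function G(theta) = sum_i psi(O_i, theta)
G <- function(theta) sum(y - theta)

# Solve G(theta) = 0 numerically on an interval that brackets the root
theta_hat <- uniroot(G, interval = range(y), tol = 1e-10)$root

# The numerical root agrees with the closed-form solution
all.equal(theta_hat, mean(y), tolerance = 1e-6)
```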
m_estimate(
  estFUN,
  data,
  units = character(0),
  weights = numeric(0),
  outer_args = list(),
  inner_args = list(),
  roots = NULL,
  compute_roots = TRUE,
  compute_vcov = TRUE,
  Asolver = solve,
  corrections,
  deriv_control,
  root_control,
  approx_control
)
estFUN |
a function that takes in group-level data and returns a function that takes parameters as its first argument |
data |
a data.frame |
units |
an optional character string identifying the grouping variable in data |
weights |
an optional vector of weights. See details. |
outer_args |
a list of arguments passed to the outer (data) function of estFUN |
inner_args |
a list of arguments passed to the inner (theta) function of estFUN |
roots |
a vector of parameter estimates; must be provided if compute_roots = FALSE |
compute_roots |
whether or not to find the roots of the estimating equations. Defaults to TRUE |
compute_vcov |
whether or not to compute the variance-covariance matrix. Defaults to TRUE |
Asolver |
a function passed to |
corrections |
an optional list of small sample corrections where each
list element is a |
deriv_control |
a |
root_control |
a |
approx_control |
a |
The basic idea of geex is for the analyst to provide at least two items:
- data
- estFUN (the ψ function): a function that takes unit-level data and returns a function in terms of parameters (θ)
With the estFUN, geex computes the roots of the estimating equations and/or the empirical sandwich variance estimator.
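The two quantities mentioned above can be sketched by hand in base R. This is an illustrative computation for the mean and divisor-n variance of simulated data, not geex's internal code; the names psi, A, B and Sigma are assumptions of the sketch, and the derivative matrix is written out analytically rather than computed numerically.

```r
set.seed(1)
y <- rnorm(200, mean = 5, sd = 2)
n <- length(y)

# psi for a single unit: equations for the mean and the (divisor-n) variance
psi <- function(y_i, theta) {
  c(y_i - theta[1], (y_i - theta[1])^2 - theta[2])
}

# Roots of the summed estimating equations (closed form for this psi)
theta_hat <- c(mean(y), mean((y - mean(y))^2))

# n x 2 matrix of psi evaluated at the roots, one row per unit
psi_mat <- t(sapply(y, psi, theta = theta_hat))

# "Bread": average of -d psi / d theta, written out analytically here
A <- matrix(c(1, 2 * mean(y - theta_hat[1]),
              0, 1), nrow = 2)

# "Meat": average outer product of psi
B <- crossprod(psi_mat) / n

# Empirical sandwich variance estimate of theta_hat
A_inv <- solve(A)
Sigma <- A_inv %*% B %*% t(A_inv) / n
Sigma
```

For this ψ the first diagonal element of Sigma reduces to the divisor-n variance estimate divided by n, the familiar variance of a sample mean.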
The root finding algorithm defaults to multiroot to estimate roots, though the solver algorithm can be specified in the rootFUN argument. Starting values for multiroot are passed via the root_control argument. See vignette("v03_root_solvers", package = "geex") for information on customizing the root solver function.
To compute only the covariance matrix, set compute_roots = FALSE and pass estimates of θ via the roots argument.
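A sketch of this variance-only workflow, assuming the geex package and its geexex dataset are installed (the call is skipped otherwise). The estimates are obtained in closed form and handed to m_estimate through roots; theta_known and fit are names chosen for this example.

```r
# psi for the mean and (divisor-n) variance of Y1
ex_eeFUN <- function(data) {
  function(theta) {
    with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2]))
  }
}

if (requireNamespace("geex", quietly = TRUE)) {
  library(geex)
  n <- nrow(geexex)

  # Estimates obtained elsewhere (closed form here)
  theta_known <- c(mean(geexex$Y1), var(geexex$Y1) * (n - 1) / n)

  fit <- m_estimate(
    estFUN        = ex_eeFUN,
    data          = geexex,
    roots         = theta_known,
    compute_roots = FALSE)

  vcov(fit)  # empirical sandwich variance at the supplied roots
}
```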
M-estimation is often used for clustered data, and a variable by which to split the data.frame into independent units is specified by the units argument. This argument defaults to character(0) (no grouping variable), in which case the number of units equals the number of rows in the data.frame.
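The effect of a grouping variable can be pictured with a base R sketch. It splits a hypothetical clustered data.frame into units and sums the unit-level estimating functions. The data, the column names id and Y, and the helper names are all assumptions made for illustration; this is not how geex is implemented internally.

```r
# Hypothetical clustered data: 3 clusters with 2 rows each
dat <- data.frame(
  id = rep(c("a", "b", "c"), each = 2),
  Y  = c(1, 3, 2, 6, 5, 7)
)

# estFUN receives the rows of ONE unit and returns a function of theta
mean_eeFUN <- function(data) {
  function(theta) sum(data$Y - theta[1])
}

# Split into independent units by the grouping variable
unit_data <- split(dat, dat$id)

# One psi function per unit, then the summed estimating function
psi_list <- lapply(unit_data, mean_eeFUN)
G <- function(theta) sum(vapply(psi_list, function(f) f(theta), numeric(1)))

length(unit_data)   # number of units: 3 clusters, not 6 rows
G(mean(dat$Y))      # the overall mean solves the summed equations
```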
For information on the finite-sample corrections, refer to the finite-sample correction API vignette: vignette("v05_finite_sample_corrections", package = "geex")
a geex object
An estFUN is a function representing ψ. geex works by breaking ψ into two parts: the "outer" part of the estFUN, which manipulates data and outer_args, and returns an "inner" function of theta and inner_args. Internally, this "inner" function is called psiFUN.
In pseudo-code this looks like:
function(data, <<outer_args>>){
  O <- manipulate(data, <<outer_args>>)
  function(theta, <<inner_args>>){
    map(O, to = theta, and = <<inner_args>>)
  }
}
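A concrete, runnable instance of this pattern in base R is sketched here. The estFUN shifted_eeFUN and its arguments shift and scale are hypothetical, and do.call is used only to mimic supplying named argument lists; it is not a claim about how geex forwards outer_args and inner_args.

```r
# Hypothetical estFUN with one outer argument (shift) and one inner argument (scale)
shifted_eeFUN <- function(data, shift = 0) {
  O <- data$Y - shift                 # outer part: manipulate the unit's data
  function(theta, scale = 1) {        # inner part: a function of theta
    (O - theta[1]) / scale
  }
}

unit <- data.frame(Y = 10)            # one unit (a single row)

# Outer and inner arguments supplied as named lists
outer_args <- list(shift = 2)
inner_args <- list(scale = 4)

# Outer call returns the inner function; inner call evaluates psi at theta
psiFUN <- do.call(shifted_eeFUN, c(list(data = unit), outer_args))
psi_value <- do.call(psiFUN, c(list(theta = 6), inner_args))
psi_value
```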
See the examples below or the package vignettes to see an estFUN in action.
Importantly, the data used in an estFUN is *unit* level data, which may be single rows in a data.frame or a block of rows for clustered data.
Additional arguments may be passed to both the inner and outer function of the estFUN. Elements in an outer_args list are passed to the outer function; any elements of the inner_args list are passed to the inner function. For an example, see the finite sample correction vignette [vignette("v05_finite_sample_corrections", package = "geex")].
To estimate roots of the estimating functions, geex uses the rootSolve multiroot function by default, which requires starting values. The root_control argument expects a root_control object, which the utility function setup_root_control aids in creating. For example, setup_root_control(start = 4) creates a root_control setting the starting value to 4. In general, the dimension of start must be the same as that of theta in the inner estFUN.
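The following sketch applies the one-dimensional case, assuming the geex package and its geexex dataset are installed (the call is skipped otherwise). The estFUN mean_eeFUN is a hypothetical one-parameter example, so start has length 1.

```r
# One-parameter estFUN: the mean of Y1
mean_eeFUN <- function(data) {
  function(theta) data$Y1 - theta[1]
}

if (requireNamespace("geex", quietly = TRUE)) {
  library(geex)
  fit <- m_estimate(
    estFUN       = mean_eeFUN,
    data         = geexex,
    # theta has length 1, so start has length 1
    root_control = setup_root_control(start = 4))

  print(coef(fit))         # point estimate from the root solver
  print(mean(geexex$Y1))   # closed-form value for comparison
}
```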
In some situations, using weights can substantially speed up computations. Refer to vignette("v04_weights", package = "geex") for an example.
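One way weights can save work, sketched in base R under the assumption that many units share identical data: ψ is evaluated once per distinct unit and multiplied by the number of copies. Whether this matches the exact usage in the weights vignette is an assumption of this sketch; all names below are illustrative.

```r
set.seed(7)
y <- rbinom(1000, size = 1, prob = 0.3)

psi <- function(y_i, theta) y_i - theta

# Unweighted: one psi evaluation per observation
G_full <- function(theta) sum(vapply(y, psi, numeric(1), theta = theta))

# Weighted: one psi evaluation per distinct value, weighted by its count
counts   <- table(y)
y_unique <- as.numeric(names(counts))
w        <- as.numeric(counts)
G_weighted <- function(theta) {
  sum(w * vapply(y_unique, psi, numeric(1), theta = theta))
}

# Same summed estimating function from far fewer psi evaluations
all.equal(G_full(0.25), G_weighted(0.25))
```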
Stefanski, L. A., & Boos, D. D. (2002). The calculus of M-estimation. The American Statistician, 56(1), 29-38.
library(geex)

# Estimate the mean and variance of Y1 in the geexex dataset
ex_eeFUN <- function(data){
  function(theta){
    with(data,
      c(Y1 - theta[1],
        (Y1 - theta[1])^2 - theta[2]))
  }
}

m_estimate(
  estFUN = ex_eeFUN,
  data = geexex,
  root_control = setup_root_control(start = c(1, 1)))

# Compare to mean() and var(); var() is rescaled to the divisor-n version
mean(geexex$Y1)
n <- nrow(geexex)
var(geexex$Y1) * (n - 1)/n

# A simple linear model regressing Y4 on X1 and X2
lm_eefun <- function(data){
  X <- cbind(1, data$X1, data$X2)
  Y <- data$Y4
  function(theta){
    t(X) %*% (Y - X %*% theta)
  }
}

m_estimate(
  estFUN = lm_eefun,
  data = geexex,
  root_control = setup_root_control(start = c(0, 0, 0)))

# Compare to lm() results
summary(lm(Y4 ~ X1 + X2, data = geexex))