predicted_covariates_aux_first: GMM estimator for models with (imperfectly) predicted...


View source: R/predictionErrorGMM_aux_first.R

Description

Optimally combines OLS and 2SLS on labeled and unlabeled data, given an exclusion restriction. See the associated paper for details.

Usage

predicted_covariates_aux_first(y, Xu, Xo, Zu, a, v, t, p,
  ER_test_signif_level = 0.05, confint_signif_level = 0.05,
  ER_test = TRUE, include_intercept = TRUE, min_iter = 2,
  max_iter = 25, tol = 0.01, verbose = TRUE)

Arguments

y

vector of n outcome values

Xu

matrix/vector of n possibly unobserved (i.e., NA) covariate values; must be observed when v or t == 1

Xo

matrix/vector of n fully observed covariate values; may be set to NULL, but should never contain a constant term; control the inclusion of a constant term via the include_intercept option

Zu

matrix/vector of n predicted Xu values, possibly unobserved (i.e., NA); must be observed when v or p == 1

a

vector of n 1/0 where a[i] == 1 if unit i is in the auxiliary sample, == 0 otherwise; sum(a) == 0 is allowed

v

vector of n 1/0 where v[i] == 1 if unit i is in the validation sample, == 0 otherwise; sum(v) > 0 is required

t

vector of n 1/0 where t[i] == 1 if unit i is in the training sample, == 0 otherwise; sum(t) == 0 is allowed

p

vector of n 1/0 where p[i] == 1 if unit i is in the primary sample, == 0 otherwise; sum(p) == 0 is allowed, but then it would not make sense to use this package

ER_test_signif_level

default is 0.05; significance level for the ER test warning message; note that the p-value itself is not suppressed

confint_signif_level

default is 0.05; 1 - confint_signif_level determines the confidence level for the provided GMM estimator confidence intervals

ER_test

default is TRUE; if TRUE, uses the validation sample to test the required exclusion restriction: ‘E(epsilon z_u) = 0’ via Sargan's J-test results from the gmm package; see our paper for construction of the test

include_intercept

default is TRUE; if TRUE, will append a column of ones to Xo, inducing an intercept term in the model y ~ X

min_iter

default is 2; min_iter - 1 is the number of iterations for which the GMM weighting matrix is based on both unlabeled and labeled data

max_iter

default is 25; determines how many iterations the GMM estimator will take before quitting

tol

default is 0.01; the algorithm runs between min_iter and max_iter iterations, stopping early once the percent change in every element of beta is < tol

verbose

default is TRUE; tells the function whether to print convergence and other warnings
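
The interaction of min_iter, max_iter and tol described above can be sketched as follows. This is only an illustration, not the package's implementation: has_converged, iterate_until_stable and toy_update are hypothetical names, the update step is a stand-in for one GMM update, and the sketch assumes that "percent change" means the relative change |new - old| / |old| compared directly against tol.

```r
# Hypothetical helper (not part of the package): TRUE when every element of
# beta changed by less than `tol` in relative terms between two iterations.
# Assumption: no element of beta_old is exactly zero.
has_converged <- function(beta_old, beta_new, tol = 0.01) {
  rel_change <- abs(beta_new - beta_old) / abs(beta_old)
  all(rel_change < tol)
}

# Hypothetical driver: `update_beta` stands in for one GMM update step.
# Runs at least `min_iter` and at most `max_iter` iterations.
iterate_until_stable <- function(beta_init, update_beta,
                                 min_iter = 2, max_iter = 25, tol = 0.01) {
  beta <- beta_init
  for (iter in seq_len(max_iter)) {
    beta_new <- update_beta(beta)
    # Convergence only counts once the minimum number of iterations is done
    done <- iter >= min_iter && has_converged(beta, beta_new, tol)
    beta <- beta_new
    if (done) break
  }
  list(beta = beta, iterations = iter, converged = done)
}

# Toy update that moves halfway toward a fixed point on each call
toy_update <- function(beta) beta + 0.5 * (c(0.2, 0.4, 0.3) - beta)
res <- iterate_until_stable(c(1, 1, 1), toy_update)
print(res)
```

If converged is FALSE in such a loop, max_iter was reached first; with verbose = TRUE the package function is documented to print convergence warnings in that situation.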

Value

A list of

Examples

set.seed(pi / 2)
n <- 2e5
n_a <- 1500
n_v <- 150
n_t <- 100
n_p <- n - n_a - n_v - n_t

v <- as.numeric((1:n) %in% (1:n_v))
t <- as.numeric((1:n) %in% (n_v + 1:n_t))
p <- as.numeric((1:n) %in% (n_v + n_t + 1:n_p))
a <- as.numeric((1:n) %in% (n_v + n_t + n_p + 1:n_a))

beta_true <- c(0.2, 0.4, 0.3)
sigma <- 1.0 * 10

Xu <- rnorm(n)
Xo <- rnorm(n)
epsilon <- sigma * rnorm(n)
y <- cbind(Xu, Xo, 1) %*% beta_true + epsilon
Zu <- Xu + rnorm(n) # Zu predicts Xu without being correlated with epsilon

Zu[t == 1] <- NA
Xu[p == 1] <- NA
y[a == 1] <- NA

predicted_covariates_aux_first(y, Xu, Xo, Zu, a, v, t, p)
predicted_covariates(y[a == 0], Xu[a == 0], Xo[a == 0],
  Zu[a == 0], v[a == 0], t[a == 0], p[a == 0])
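
Before calling either function, it can help to confirm that the sample indicators and the missingness pattern match the argument descriptions (Xu observed when v or t == 1, Zu observed when v or p == 1, sum(v) > 0). The following is a minimal sketch on a small simulated data set; check_samples is a hypothetical helper, not a function exported by the package.

```r
# Hypothetical helper (not part of the package): checks the documented
# requirements on the sample indicators and on where Xu and Zu are observed.
check_samples <- function(Xu, Zu, a, v, t, p) {
  Xu <- as.matrix(Xu)
  Zu <- as.matrix(Zu)
  c(
    indicators_binary  = all(c(a, v, t, p) %in% c(0, 1)),
    has_validation     = sum(v) > 0,
    Xu_observed_in_v_t = !anyNA(Xu[v == 1 | t == 1, ]),
    Zu_observed_in_v_p = !anyNA(Zu[v == 1 | p == 1, ])
  )
}

# Small data set laid out like the example above:
# validation, training, primary, then auxiliary units
set.seed(1)
n <- 20
v <- as.numeric(1:n <= 5)
t <- as.numeric(1:n > 5 & 1:n <= 8)
p <- as.numeric(1:n > 8 & 1:n <= 16)
a <- as.numeric(1:n > 16)

Xu <- rnorm(n)
Zu <- Xu + rnorm(n)
Zu[t == 1] <- NA  # predictions missing in the training sample
Xu[p == 1] <- NA  # true covariate missing in the primary sample

checks <- check_samples(Xu, Zu, a, v, t, p)
print(checks)
```

Any FALSE entry in checks points to the requirement that the inputs violate.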

matthewtyler/predictionError documentation built on Oct. 8, 2019, 7:47 p.m.