propagate: Propagate uncertainty through the method of composition


Description

This function propagates uncertainty from variables measured with error. It re-estimates a model on each of many samples from the measurement-error distribution of the data and then draws from the estimated distribution of the model parameters in each fit, a procedure known as the "method of composition" (Tanner 1996, 52; Treier and Jackman 2008).
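The procedure can be sketched by hand in base R. The following is a minimal illustration under stated assumptions, not the internals of propagate(): it assumes a linear model, a normal approximation to the sampling distribution of the coefficients, and one draw per sample. The names samp_df, draw_once, and draws are hypothetical.

```r
set.seed(1)
n_obs <- 50
n_samps <- 200

# Simulated data: x is observed with error, so each "iteration" holds
# one draw of x from its (assumed) measurement-error distribution.
x0 <- rnorm(n_obs)
y0 <- 0.5 * x0 + rnorm(n_obs)
samp_df <- data.frame(
  iteration = rep(seq_len(n_samps), each = n_obs),
  x = rep(x0, n_samps) + rnorm(n_obs * n_samps, sd = 0.5),
  y = rep(y0, n_samps)
)

# For one sample: refit the model, then take a single draw from the
# approximate sampling distribution N(beta_hat, v_hat).
draw_once <- function(df) {
  fit <- lm(y ~ x, data = df)
  beta_hat <- coef(fit)
  v_hat <- vcov(fit)
  # chol() returns an upper-triangular R with t(R) %*% R == v_hat, so
  # t(R) %*% z has covariance v_hat when z is standard normal.
  z <- rnorm(length(beta_hat))
  beta_hat + drop(t(chol(v_hat)) %*% z)
}

# One row per sample, one column per coefficient.
draws <- t(sapply(split(samp_df, samp_df$iteration), draw_once))

colMeans(draws)       # point estimates
apply(draws, 2, sd)   # standard errors reflecting both sources of uncertainty
```

The spread of the draws combines variation across samples (measurement error) with the estimation uncertainty within each fit.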

Usage

propagate(
  data,
  model,
  iter_var = "iteration",
  iter_values = NULL,
  vc_fun = stats::vcov,
  rsq = TRUE,
  prog_int = 50
)

Arguments

data

A data frame containing multiple samples from the measurement-error distribution of one or more variables.

model

A fitted model object, from a model previously estimated (e.g., on the whole dataset or one sample).

iter_var

The name of the variable in data that identifies different samples.

iter_values

Optionally, a vector of unique values of the iteration variable iter_var, to which data will be subset.

vc_fun

A function for extracting the variance-covariance matrix of the parameters estimated by model.

rsq

Logical: Calculate the R-squared?

prog_int

Progress bar interval
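As an illustration of the vc_fun argument, the sketch below defines a heteroskedasticity-consistent (HC0) covariance estimator for lm fits using only base R. It assumes that vc_fun is called with the fitted model as its first argument and returns a square matrix, as stats::vcov does. The helper name vc_hc0 is hypothetical and is not part of this package.

```r
# Hypothetical helper: HC0 (White) covariance matrix for an lm fit.
vc_hc0 <- function(model) {
  X <- model.matrix(model)
  e <- residuals(model)
  bread <- solve(crossprod(X))   # (X'X)^{-1}
  meat <- crossprod(X * e)       # sum over i of e_i^2 * x_i x_i'
  bread %*% meat %*% bread
}

fit <- lm(mpg ~ wt, data = mtcars)
sqrt(diag(vc_hc0(fit)))  # robust standard errors
sqrt(diag(vcov(fit)))    # classical standard errors, for comparison
```

Under the assumption above, such a function could be supplied as vc_fun = vc_hc0 in place of the default.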

Value

Samples from the distribution of the model parameters.
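The returned samples can be summarized column by column. The sketch below uses a simulated stand-in rather than real propagate() output; it assumes one row per draw and one column per parameter, which is consistent with the use of colMeans() in the Examples. The object names and values are illustrative only.

```r
set.seed(2)
# Stand-in for the returned samples (values are simulated, not real output).
moc <- cbind("(Intercept)" = rnorm(1000, mean = 0, sd = 0.1),
             x = rnorm(1000, mean = 0.5, sd = 0.2))

est <- colMeans(moc)                 # point estimates
se <- apply(moc, 2, sd)              # standard errors
# 95% interval from the empirical quantiles of the draws
ci <- t(apply(moc, 2, quantile, probs = c(0.025, 0.975)))

summary_tab <- cbind(estimate = est, se = se, ci)
summary_tab
```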

Author(s)

Devin Caughey

References

Tanner, Martin A. 1996. Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood Functions. 3rd ed. New York: Springer.

Treier, Shawn, and Simon Jackman. 2008. "Democracy as a Latent Variable." American Journal of Political Science 52 (1): 201–217.

Examples

### Randomly Generated Data
set.seed(1)
n_obs <- 100
n_samps <- 1000
y0 <- rnorm(n_obs)
x0 <- rnorm(n_obs)
samp.df <- data.frame(iteration=rep(seq_len(n_samps), each=n_obs),
                      x = rep(x0, n_samps) + rnorm(n_obs * n_samps),
                      y = rep(y0, n_samps))
mod_rand <- lm(y ~ x, data = subset(samp.df, iteration == 1))
summary(mod_rand)
moc_rand <- propagate(samp.df, mod_rand)
colMeans(moc_rand) ## point estimate
apply(moc_rand, 2, sd) ## estimated standard error

## Not run: 
if (require(sandwich, quietly = TRUE) && require(dgo, quietly = TRUE)) {
  ### DGO Output 
  dgirt_in_abortion <- shape(opinion, item_names = "abortion",
                             time_name = "year",
                             geo_name = "state", group_names = "race3", 
                             geo_filter = c("CA", "GA", "LA", "MA"),
                             id_vars = "source")
 
  dgmrp_out_abortion <- dgmrp(dgirt_in_abortion, iter = 1500, chains = 4, 
                              cores = 4, seed = 42)
  d <- as.data.frame(dgmrp_out_abortion)
 
  ## Poststratify each posterior draw separately, then stack the results
  samples <- lapply(unique(d$iteration), function(x) {
    poststratify(d[d$iteration == x, ],
                 annual_state_race_targets, 
                 strata_names = c("state", "year"),
                 aggregated_names = "race3")
  })
  samples <- data.table::rbindlist(samples, idcol = "iteration")
 
  ps <- poststratify(
    dgmrp_out_abortion, 
    annual_state_race_targets, 
    strata_names = c("state", "year"), 
    aggregated_names = "race3"
  )
 
  mod_dgmrp <- lm(value ~ 0 + state, ps)  
  summary(mod_dgmrp)
  sqrt(diag(sandwich::vcovHC(mod_dgmrp)))
 
  moc_dgmrp <- propagate(data = samples, model = mod_dgmrp,
                         vc_fun = sandwich::vcovHC)
  # point estimate
  colMeans(moc_dgmrp)
  # sd
  apply(moc_dgmrp, 2, sd)
}

## End(Not run)

devincaughey/CaugheyTools documentation built on May 9, 2021, 12:44 p.m.