whatif | R Documentation |
Implements the methods described in King and Zeng (2006, 2007) for evaluating counterfactuals.
whatif(formula = NULL, data, cfact, range = NULL, freq = NULL, nearby = 1,
distance = "gower", miss = "list", choice = "both", return.inputs = FALSE,
return.distance = FALSE, mc.cores = detectCores(), ...)
formula |
An optional formula without a dependent variable that
is of class "formula" and that follows standard |
data |
May take one of the following forms:
Missing data is allowed and will be dealt with
via the argument |
cfact |
A |
range |
An optional numeric vector of length |
freq |
An optional numeric vector of any positive length, the elements
of which comprise a set of distances. Used in calculating
cumulative frequency distributions for the distances of the data
points from each counterfactual. For each such distance and
counterfactual, the cumulative frequency is the fraction of observed
covariate data points with distance to the counterfactual less
than or equal to the supplied distance value. The default varies
with the distance measure used. When the Gower distance measure is employed,
frequencies are calculated for the sequence of Gower distances from
0 to 1 in increments of 0.05. When the Euclidean distance measure
is employed, frequencies are calculated for the sequence of Euclidean
distances from the minimum to the maximum observed distances in twenty
equal increments, all rounded to two decimal places. Default is |
nearby |
An optional scalar indicating
which observed data points are considered to be nearby (i.e., within ‘nearby’
geometric variances of) the counterfactuals. Used to calculate the summary statistic
returned by the function: the fraction of the observed data nearby
each counterfactual. By default, the geometric variance of the
covariate data is used. For example, setting |
distance |
An optional string indicating which of two distance measures
to employ. The choices are either |
miss |
An optional string indicating the strategy for dealing
with missing data in the observed covariate data set.
|
choice |
An optional string indicating which analyses to
undertake. The options are either |
return.inputs |
A Boolean; should the processed observed
covariate and counterfactual data matrices on which all
|
return.distance |
A Boolean; should the matrix of distances
between each counterfactual and data point be returned? If
|
mc.cores |
The number of cores to use for the convex hull test, i.e. at
most how many child processes will be run simultaneously. Must be at least
one, and parallelization requires at least two cores. The default is set by
detectCores().
... |
Further arguments passed to and from other methods. |
This function is the primary tool for evaluating your counterfactuals. Specifically, it:
1. Determines whether or not your counterfactuals are in the convex hull of the observed covariate data.
2. Computes the distance of your counterfactuals from each of the n observed covariate data points. The default distance function used is Gower's non-parametric measure.
3. Computes a summary statistic for each counterfactual based on the distances in (2): the fraction of observed covariate data points with distances to your counterfactual less than a value you supply. By default, this value is taken to be the geometric variability of the observed data.
4. Computes the cumulative frequency distribution of each counterfactual for the distances in (2) using values that you supply. By default, Gower distances from 0 to 1 in increments of 0.05 are used.
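The distance-based steps above can be illustrated with a small base R sketch. It is not the package's implementation: the helper name gower_dist, the threshold value, and the toy data are hypothetical, and the sketch assumes that the Gower distance is the average over covariates of the absolute difference divided by that covariate's range in the observed data. It also assumes every covariate has a non-zero range.

```r
# Sketch only: base R, no WhatIf dependency. All names here are hypothetical.
gower_dist <- function(data, cfact, rng = NULL) {
  # data:  n x k numeric matrix of observed covariates
  # cfact: m x k numeric matrix of counterfactuals
  if (is.null(rng)) {
    # Assumption: covariate ranges are taken from the observed data only
    rng <- apply(data, 2, max) - apply(data, 2, min)
  }
  # Result is m x n: row i holds the distances from counterfactual i
  # to each of the n observed data points
  t(apply(cfact, 1, function(cf) {
    colMeans(abs(t(data) - cf) / rng)
  }))
}

# Toy data: four observed points (corners of the unit square), two counterfactuals
data  <- rbind(c(0, 0), c(1, 0), c(0, 1), c(1, 1))
cfact <- rbind(c(0.5, 0.5), c(0, 0))

d <- gower_dist(data, cfact)

# Summary statistic: fraction of data points closer than a supplied value
# (0.6 is an arbitrary illustrative threshold, not the geometric variability)
threshold <- 0.6
sum_stat  <- rowMeans(d < threshold)

# Cumulative frequencies on the default Gower grid described in the text:
# fraction of data points with distance less than or equal to each grid value
freq     <- seq(0, 1, by = 0.05)
cum_freq <- sapply(freq, function(f) rowMeans(d <= f))
```

Each row of cum_freq corresponds to one counterfactual and each column to one value in freq, so a row that rises slowly indicates a counterfactual that is far from most of the observed data.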
An object of class "whatif", a list consisting of the following six or seven elements:
call |
The original call to |
inputs |
A list with two elements, |
in.hull |
A logical vector of length |
dist |
A |
geom.var |
A scalar. The geometric variability of the observed covariate data. |
sum.stat |
A numeric vector of length |
cum.freq |
A numeric matrix. By default, the matrix has
dimension |
This function requires the lpSolve package.
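To show why a linear programming solver is relevant here, the following sketch poses convex hull membership as a feasibility problem with lpSolve. The text above only states that lpSolve is required; the formulation below, the helper name in_hull, and the toy data are assumptions made for illustration. A point lies in the hull if non-negative weights summing to one reproduce it as a weighted average of the observed points.

```r
library(lpSolve)

# Sketch only: hypothetical helper, not the package's internal code.
in_hull <- function(data, cf) {
  # data: n x k numeric matrix of observed covariates
  # cf:   numeric vector of length k (one counterfactual)
  n <- nrow(data)
  res <- lp(direction    = "min",
            objective.in = rep(0, n),                 # pure feasibility problem
            const.mat    = rbind(t(data), rep(1, n)), # k rows for the point, 1 row for the weights
            const.dir    = rep("=", ncol(data) + 1),
            const.rhs    = c(cf, 1))                  # weights must sum to one
  # lpSolve treats all decision variables as non-negative;
  # status 0 means a feasible solution was found
  res$status == 0
}

# Toy data: the corners of the unit square
square <- rbind(c(0, 0), c(1, 0), c(0, 1), c(1, 1))

inside  <- in_hull(square, c(0.5, 0.5))  # interpolation
outside <- in_hull(square, c(2, 2))      # extrapolation
```

Posing the check this way avoids constructing the hull itself, which becomes impractical as the number of covariates grows.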
Heather Stoll <hstoll@polsci.ucsb.edu>, Gary King <king@harvard.edu>, and Langche Zeng <zeng@ucsd.edu>
King, Gary and Langche Zeng. 2006. "The Dangers of Extreme Counterfactuals." Political Analysis 14 (2). Available from https://gking.harvard.edu.
King, Gary and Langche Zeng. 2007. "When Can History Be Our Guide? The Pitfalls of Counterfactual Inference." International Studies Quarterly 51 (March). Available from https://gking.harvard.edu.
plot.whatif, summary.whatif, print.whatif, print.summary.whatif
## Load the package that provides whatif()
library(WhatIf)
## Create example data sets and counterfactuals
set.seed(1)  # make the random example reproducible
my.cfact <- matrix(rnorm(3*5), ncol = 5)
my.data <- matrix(rnorm(100*5), ncol = 5)
## Evaluate counterfactuals
my.result <- whatif(data = my.data, cfact = my.cfact, mc.cores = 1)
## Evaluate counterfactuals and supply own Gower distances for
## cumulative frequency distributions
my.result <- whatif(cfact = my.cfact, data = my.data,
                    freq = c(0, .25, .5, 1, 1.25, 1.5), mc.cores = 1)