Implements the methods described in King and Zeng (2006a, 2006b) for evaluating counterfactuals.
formula |
An optional formula without a dependent variable that
is of class "formula" and that follows standard |
data |
May take one of the following forms:
Missing data is allowed and will be dealt with
via the argument |
cfact |
A |
range |
An optional numeric vector of length k, where k is
the number of covariates. Each element represents the range of the corresponding
covariate for use in calculating Gower distances. Use this argument
when covariate data do not represent the population of interest,
such as selection by stratification or experimental manipulation.
By default, the range of each covariate is calculated
from the data (the difference of its maximum and minimum values in
the sample), which is appropriate when a simple random sampling
design was used. To supply your own range for the kth covariate,
set the kth element of the vector equal to the desired range
and all other elements equal to |
freq |
An optional numeric vector of any positive length, the elements
of which comprise a set of distances. Used in calculating
cumulative frequency distributions for the distances of the data
points from each counterfactual. For each such distance and
counterfactual, the cumulative frequency is the fraction of observed
covariate data points with distance to the counterfactual less
than or equal to the supplied distance value. The default varies
with the distance measure used. When the Gower distance measure is employed,
frequencies are calculated for the sequence of Gower distances from
0 to 1 in increments of 0.05. When the Euclidean distance measure
is employed, frequencies are calculated for the sequence of Euclidean
distances from the minimum to the maximum observed distances in twenty
equal increments, all rounded to two decimal places. Default is |
nearby |
An optional scalar indicating
which observed data points are considered to be nearby (i.e., within ‘nearby’
geometric variances of) the counterfactuals. Used to calculate the summary statistic
returned by the function: the fraction of the observed data nearby
each counterfactual. By default, the geometric variance of the
covariate data is used. For example, setting |
distance |
An optional string indicating which of two distance measures
to employ. The choices are either |
miss |
An optional string indicating the strategy for dealing
with missing data in the observed covariate data set.
|
choice |
An optional string indicating which analyses to
undertake. The options are either |
return.inputs |
A Boolean; should the processed observed
covariate and counterfactual data matrices on which all
|
return.distance |
A Boolean; should the matrix of distances
between each counterfactual and data point be returned? If
|
mc.cores |
The number of cores to use for the convex hull test, i.e. at
most how many child processes will be run simultaneously. Must be at least
one, and parallelization requires at least two cores. The default is set by
|
.
... |
Further arguments passed to and from other methods. |
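The defaults described for range and freq can be reproduced in a few lines of base R. This is a minimal sketch under stated assumptions, not the package's internal code: the data, the counterfactual `cf`, and all object names are made up for illustration.

```r
## Hypothetical covariate data: 100 observations, 3 covariates
set.seed(1)
x <- matrix(rnorm(100 * 3), ncol = 3)

## Default range of each covariate: sample maximum minus sample minimum
default_range <- apply(x, 2, function(col) max(col) - min(col))

## Default grid of Gower distances: 0 to 1 in steps of 0.05 (21 values)
gower_grid <- seq(0, 1, by = 0.05)

## Grid for Euclidean distances: twenty equal increments between the
## smallest and largest observed distance, rounded to two decimals.
## `d` holds the distances from the data to one made-up counterfactual.
cf <- c(0, 0, 0)
d <- sqrt(rowSums(sweep(x, 2, cf)^2))
euclid_grid <- round(seq(min(d), max(d), length.out = 21), 2)
```

Twenty equal increments give 21 grid points, which is the same length as the default Gower grid.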
This function is the primary tool for evaluating your counterfactuals. Specifically, it:
1. Determines whether or not your counterfactuals are in the convex hull of the observed covariate data.
2. Computes the distance of your counterfactuals from each of the n observed covariate data points. The default distance function used is Gower's non-parametric measure.
3. Computes a summary statistic for each counterfactual based on the distances in (2): the fraction of observed covariate data points with distances to your counterfactual less than a value you supply. By default, this value is taken to be the geometric variability of the observed data.
4. Computes the cumulative frequency distribution of each counterfactual for the distances in (2) using values that you supply. By default, Gower distances from 0 to 1 in increments of 0.05 are used.
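The convex hull check listed first above can be pictured with only two covariates. The sketch below uses chull() from base R's grDevices package. It is an illustration of what hull membership means, not the method this function uses, and the helper `in_hull_2d` and the data are hypothetical.

```r
## Illustration only: convex hull membership with two covariates.
in_hull_2d <- function(data, point) {
  stacked <- rbind(data, point)
  ## chull() returns the row indices of the hull's vertices. If the
  ## appended point is one of them, it lies outside the hull of `data`.
  !(nrow(stacked) %in% grDevices::chull(stacked))
}

## Hypothetical data: four corners of the unit square plus an interior point
obs <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0.3, 0.6))

inside  <- in_hull_2d(obs, c(0.5, 0.5))  # inside the hull: interpolation
outside <- in_hull_2d(obs, c(2, 2))      # outside the hull: extrapolation
```

Points lying exactly on the hull boundary are an edge case that this shortcut does not handle reliably, and the approach does not extend beyond two covariates.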
An object of class "whatif", a list consisting of the following six or seven elements:
call |
The original call to |
inputs |
A list with two elements, |
in.hull |
A logical vector of length m, where m is the number
of counterfactuals. Each element of the vector is |
dist |
An m-by-n numeric matrix, where m is
the number of counterfactuals and n is the number of data points
(units). Only present if |
geom.var |
A scalar. The geometric variability of the observed covariate data. |
sum.stat |
A numeric vector of length m, where m is the
number of counterfactuals. Each element contains the summary
statistic for the corresponding counterfactual. This summary statistic is
the fraction of data points with distances to the counterfactual
less than the argument |
cum.freq |
A numeric matrix. By default, the matrix has
dimension m-by-21, where m is the number of
counterfactuals; however, if you supplied your own frequencies via
the argument |
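The shapes of the dist, sum.stat and cum.freq elements can be mimicked in base R. This is a rough sketch under assumptions: `gower_dist` is a hypothetical helper whose normalization may differ from the package's own, and `threshold` is an arbitrary stand-in for the geometric variability, which is not computed here.

```r
## Gower-style distance from one counterfactual to every row of `data`:
## the mean over covariates of |difference| / range.
## Assumption: the package's internal normalization may differ.
gower_dist <- function(data, cf, rng) {
  rowMeans(sweep(abs(sweep(data, 2, cf)), 2, rng, "/"))
}

set.seed(42)
obs   <- matrix(rnorm(100 * 5), ncol = 5)  # n = 100 data points
cfact <- matrix(rnorm(3 * 5), ncol = 5)    # m = 3 counterfactuals
rng   <- apply(obs, 2, function(col) max(col) - min(col))

## m-by-n matrix of distances, shaped like the `dist` element
dist_mat <- t(apply(cfact, 1, function(cf) gower_dist(obs, cf, rng)))

## Summary statistic: fraction of data points closer than a threshold.
## `threshold` is a made-up value, not the geometric variability.
threshold <- 0.2
sum_stat <- rowMeans(dist_mat < threshold)

## Cumulative frequencies on the default Gower grid: m-by-21 matrix
grid <- seq(0, 1, by = 0.05)
cum_freq <- t(sapply(seq_len(nrow(dist_mat)), function(i) {
  sapply(grid, function(g) mean(dist_mat[i, ] <= g))
}))
```

Each row of `cum_freq` is non-decreasing, since a larger distance cutoff can only include more data points.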
This function requires the lpSolve package.
Heather Stoll <hstoll@polsci.ucsb.edu>, Gary King <king@harvard.edu>, and Langche Zeng <zeng@ucsd.edu>
King, Gary and Langche Zeng. 2006. "The Dangers of Extreme Counterfactuals." Political Analysis 14 (2). Available from https://gking.harvard.edu.
King, Gary and Langche Zeng. 2007. "When Can History Be Our Guide? The Pitfalls of Counterfactual Inference." International Studies Quarterly 51 (March). Available from https://gking.harvard.edu.
plot.whatif, summary.whatif, print.whatif, print.summary.whatif
## Load the package (assumed to be installed)
library(WhatIf)
## Create example data sets and counterfactuals
my.cfact <- matrix(rnorm(3*5), ncol = 5)
my.data <- matrix(rnorm(100*5), ncol = 5)
## Evaluate counterfactuals
my.result <- whatif(data = my.data, cfact = my.cfact, mc.cores = 1)
## Evaluate counterfactuals and supply own Gower distances for
## cumulative frequency distributions
my.result <- whatif(cfact = my.cfact, data = my.data,
                    freq = c(0, .25, .5, 1, 1.25, 1.5), mc.cores = 1)
Preprocessing data ...
Performing convex hull test ...
  |                                                                      |   0%
  |===================================                                   |  50%
  |======================================================================| 100%
Calculating distances ....
Calculating the geometric variance...
Calculating cumulative frequencies ...
Finishing up ...