hot.deck-package: Multiple Hot Deck Imputation

Description

This package contains all of the functions necessary to perform multiple hot deck imputation on an input data frame with missing observations using either the “best cell” method (default) or the “probabilistic draw” method as described in Cranmer and Gill (2013). This technique is best suited for missingness in discrete variables, though it also works well for continuous missing observations.

Details

Package: hot.deck
Type: Package
Version: 1.0
Date: 2014-09-03
License: What license is it under?

In multiple hot deck imputation, several observed values of the variable with missing observations are drawn conditional on the rest of the data and are used to impute each missing value. The advantage of this class of methods over parametric multiple imputation is that the imputed values are actual draws from the observed data. As a result, when discrete variables are imputed with a hot deck method, their discrete properties are maintained.
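The idea can be illustrated with a minimal, language-agnostic sketch in Python. It is not the package's implementation: the function `multiple_hot_deck`, the toy rows, and the rule of conditioning on exact agreement across all other variables are assumptions made purely for illustration. It shows that each of the `m` completed data sets fills a missing value with a value actually observed in a donor row, so a discrete variable stays discrete.

```python
import random

def multiple_hot_deck(rows, target, m=5, seed=0):
    """Return m completed copies of `rows` (hypothetical helper).

    Each missing `target` value is replaced by an observed value drawn from
    donor rows that agree with the recipient on every other variable. If no
    such donor exists, the draw falls back to all observed values.
    """
    rng = random.Random(seed)
    donors = [r for r in rows if r[target] is not None]
    completed = []
    for _ in range(m):
        filled = []
        for row in rows:
            row = dict(row)  # copy so the input rows are never modified
            if row[target] is None:
                pool = [d[target] for d in donors
                        if all(d[k] == row[k] for k in row if k != target)]
                if not pool:
                    pool = [d[target] for d in donors]
                row[target] = rng.choice(pool)
            filled.append(row)
        completed.append(filled)
    return completed

# Toy data: the last row is missing "vote".
rows = [
    {"region": "north", "vote": 1},
    {"region": "north", "vote": 0},
    {"region": "south", "vote": 3},
    {"region": "north", "vote": None},
]
imputations = multiple_hot_deck(rows, "vote", m=5)
imputed_values = {imp[3]["vote"] for imp in imputations}
```

Every value in `imputed_values` comes from the observed "north" rows, so no fractional or out-of-range category can appear, which is the property described above.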

Two methods for weighting the imputations are provided in this package. The “best cell” technique [called as “best.cell”], which is the default, draws randomly from the cell of best matches to the row with a missing observation. The “probabilistic draw” technique [called as “p.draw”] uses the degree of affinity between the row with missing data and each potential donor row to generate weights, such that rows more closely resembling the row with missingness are more likely to be drawn as donors.
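The difference between the two methods can be sketched as follows. This is a hedged Python illustration, not the package's code: the helpers `affinity` and `draw_donor` are hypothetical, and affinity is assumed here to be the share of the recipient's observed variables on which a donor matches exactly. The package's own affinity calculation may differ, for example in how it treats continuous variables.

```python
import random

def affinity(recipient, donor):
    """Share of the recipient's observed variables on which the donor matches
    (an assumed, simplified affinity score)."""
    observed = [k for k, v in recipient.items() if v is not None]
    if not observed:
        return 0.0
    return sum(donor[k] == recipient[k] for k in observed) / len(observed)

def draw_donor(recipient, donors, method="best.cell", rng=random):
    """Pick one donor row for `recipient` (hypothetical helper)."""
    scores = [affinity(recipient, d) for d in donors]
    if method == "best.cell":
        # Draw uniformly from the cell of donors sharing the top affinity.
        best = max(scores)
        cell = [d for d, s in zip(donors, scores) if s == best]
        return rng.choice(cell)
    if method == "p.draw":
        # Draw any donor with probability proportional to its affinity.
        if sum(scores) == 0:
            return rng.choice(donors)  # no donor resembles the recipient
        return rng.choices(donors, weights=scores, k=1)[0]
    raise ValueError("method must be 'best.cell' or 'p.draw'")

recipient = {"region": "north", "party": "A", "vote": None}
donors = [
    {"region": "north", "party": "A", "vote": 1},  # matches on both variables
    {"region": "north", "party": "B", "vote": 0},  # matches on one variable
    {"region": "south", "party": "B", "vote": 0},  # matches on none
]
scores = [affinity(recipient, d) for d in donors]
best = draw_donor(recipient, donors, method="best.cell")
prob = draw_donor(recipient, donors, method="p.draw", rng=random.Random(1))
```

In this toy example the best cell draw can only return the first donor, while the probabilistic draw may return either of the first two, with the closer match twice as likely to be chosen.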

Author(s)

Skyler Cranmer, Jeff Gill, Natalie Jackson, Andreas Murr and Dave Armstrong

Maintainer: Dave Armstrong <dave@quantoid.net>

References

Cranmer, S.J. and Gill, J.M. (2013) “We Have to Be Discrete About This: A Non-Parametric Imputation Technique for Missing Categorical Data.” British Journal of Political Science 43(2): 425-449.


davidaarmstrong/hot.deck documentation built on April 2, 2020, 4:52 a.m.