
View source: R/mice.impute.pmm.R

Imputation by predictive mean matching

```
mice.impute.pmm(
  y,
  ry,
  x,
  wy = NULL,
  donors = 5L,
  matchtype = 1L,
  ridge = 1e-05,
  use.matcher = FALSE,
  ...
)
```

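| Argument | Description |
|---|---|
| `y` | Vector to be imputed. |
| `ry` | Logical vector of length `length(y)` marking the elements of `y` to which the imputation model is fitted, typically the observed values (`TRUE`) as opposed to the missing ones (`FALSE`). |
| `x` | Numeric design matrix with `length(y)` rows, containing the predictors for `y`. |
| `wy` | Logical vector of length `length(y)`; `TRUE` marks the elements of `y` for which imputations are created. If `NULL` (the default), imputations are created for the elements where `ry` is `FALSE`. |
| `donors` | The size of the donor pool among which a draw is made. The default is `donors = 5L`. |
| `matchtype` | Type of matching distance. The default choice (`matchtype = 1L`) calculates the distance between the predicted value of the observed `y` and the drawn value of the missing `y` (type-1 matching). |
| `ridge` | The ridge penalty used in the regression step (the parameter κ in the procedure given in the details). The default is `ridge = 1e-05`. |
| `use.matcher` | Logical. The default is `use.matcher = FALSE`. |
| `...` | Other named arguments. |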
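The role of the `ridge` argument can be illustrated in base R. The sketch below is not taken from the package source; it only applies the formula V = (S + diag(S)κ)^{-1} from the procedure to a deliberately collinear toy matrix (the names `X`, `S`, `plain` and `V` are made up for this illustration). It suggests one plausible reason for the penalty: the plain cross-product matrix cannot be inverted when predictors are perfectly collinear, while the ridged version can.

```r
# Toy design matrix with two perfectly collinear predictors (illustrative data).
x1 <- c(1, 2, 3, 4, 5)
X <- cbind(x1 = x1, x2 = 2 * x1)

# Cross-product matrix S = X'X; it is exactly singular here.
S <- crossprod(X)

# Inverting S directly fails, so we catch the error and return NULL instead.
plain <- tryCatch(solve(S), error = function(e) NULL)
is.null(plain)

# Adding a small ridge proportional to diag(S) makes the matrix invertible.
ridge <- 1e-05
V <- solve(S + diag(diag(S)) * ridge)
V
```

A larger `ridge` stabilises the inversion further but also biases the regression weights more strongly, so small values such as the default are the usual starting point.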

Imputation of `y` by predictive mean matching, based on van Buuren (2012, p. 73). The procedure is as follows:

1. Calculate the cross-product matrix $S = X_{obs}'X_{obs}$.
2. Calculate $V = (S + \mathrm{diag}(S)\kappa)^{-1}$, with some small ridge parameter $\kappa$.
3. Calculate regression weights $\hat\beta = V X_{obs}' y_{obs}$.
4. Draw $q$ independent $N(0,1)$ variates in vector $\dot z_1$.
5. Calculate $V^{1/2}$ by Cholesky decomposition.
6. Calculate $\dot\beta = \hat\beta + \dot\sigma \dot z_1 V^{1/2}$.
7. Calculate $\dot\eta(i,j) = |X_{obs,[i]}\hat\beta - X_{mis,[j]}\dot\beta|$ with $i = 1, \dots, n_1$ and $j = 1, \dots, n_0$.
8. Construct $n_0$ sets $Z_j$, each containing $d$ candidate donors, from $Y_{obs}$ such that $\sum_d \dot\eta(i,j)$ is minimum for all $j = 1, \dots, n_0$. Break ties randomly.
9. Draw one donor $i_j$ from $Z_j$ randomly for $j = 1, \dots, n_0$.
10. Calculate imputations $\dot y_j = y_{i_j}$ for $j = 1, \dots, n_0$.
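The steps above can be sketched in base R as follows. This is a minimal illustration, not the implementation used by `mice`: `pmm_sketch()` is a hypothetical helper, the intercept column is added as an assumption, and because the procedure does not say how $\dot\sigma$ is obtained, the sketch assumes it is drawn from the residual sum of squares divided by a chi-square variate.

```r
# Hypothetical helper illustrating the procedure; not part of the mice package.
pmm_sketch <- function(y, ry, x, donors = 5L, ridge = 1e-05) {
  X <- cbind(1, as.matrix(x))              # assumption: add an intercept column
  Xobs <- X[ry, , drop = FALSE]
  Xmis <- X[!ry, , drop = FALSE]
  yobs <- y[ry]

  S <- crossprod(Xobs)                                # step 1
  V <- solve(S + diag(diag(S), ncol(S)) * ridge)      # step 2
  beta_hat <- V %*% crossprod(Xobs, yobs)             # step 3

  # Assumption: sigma-dot from the residual sum of squares over a chi-square draw.
  res <- yobs - Xobs %*% beta_hat
  df <- max(length(yobs) - ncol(Xobs), 1)
  sigma_dot <- sqrt(sum(res^2) / rchisq(1, df))

  z <- rnorm(ncol(Xobs))                              # step 4
  V_half <- t(chol(V))                                # step 5 (lower-triangular factor)
  beta_dot <- beta_hat + sigma_dot * (V_half %*% z)   # step 6

  yhat_obs <- as.vector(Xobs %*% beta_hat)   # predictions for observed cases
  yhat_mis <- as.vector(Xmis %*% beta_dot)   # drawn predictions for missing cases

  vapply(yhat_mis, function(target) {
    eta <- abs(yhat_obs - target)                     # step 7
    # step 8: the d closest donors; the random second key breaks ties
    d <- min(donors, length(eta))
    pool <- order(eta, runif(length(eta)))[seq_len(d)]
    # steps 9 and 10: draw one donor and copy its observed value
    yobs[pool[sample.int(length(pool), 1)]]
  }, numeric(1))
}

# Usage on simulated data (all values here are made up for the illustration).
set.seed(1)
n <- 50
x_demo <- data.frame(a = rnorm(n), b = rnorm(n))
y_demo <- 1 + 2 * x_demo$a - x_demo$b + rnorm(n)
ry_demo <- rep(TRUE, n)
ry_demo[1:10] <- FALSE
y_demo[!ry_demo] <- NA
imp_demo <- pmm_sketch(y_demo, ry_demo, x_demo)
length(imp_demo)
```

Because every imputation is copied from an observed donor, the imputed values always lie within the set of observed values of `y`, which is the main practical appeal of the method.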

The name *predictive mean matching* was proposed by Little (1988).

Vector with imputed data, same type as `y`, and of length `sum(wy)`

Stef van Buuren, Karin Groothuis-Oudshoorn

Little, R.J.A. (1988). Missing data adjustments in large surveys (with discussion). *Journal of Business & Economic Statistics*, 6, 287–301.

Morris, T.P., White, I.R., Royston, P. (2015). Tuning multiple imputation by predictive mean matching and local residual draws. *BMC Medical Research Methodology*, 14, 75.

Van Buuren, S. (2018). *Flexible Imputation of Missing Data. Second Edition.* Chapman & Hall/CRC. Boca Raton, FL.

Van Buuren, S., Groothuis-Oudshoorn, K. (2011). `mice`: Multivariate Imputation by Chained Equations in `R`. *Journal of Statistical Software*, **45**(3), 1-67. https://www.jstatsoft.org/v45/i03/

Other univariate imputation functions: `mice.impute.cart()`, `mice.impute.lda()`, `mice.impute.logreg.boot()`, `mice.impute.logreg()`, `mice.impute.mean()`, `mice.impute.midastouch()`, `mice.impute.mnar.logreg()`, `mice.impute.norm.boot()`, `mice.impute.norm.nob()`, `mice.impute.norm.predict()`, `mice.impute.norm()`, `mice.impute.polr()`, `mice.impute.polyreg()`, `mice.impute.quadratic()`, `mice.impute.rf()`, `mice.impute.ri()`

```
# The boys data and mice.impute.pmm() come from the mice package
library(mice)

# We normally call mice.impute.pmm() from within mice()
# But we may call it directly as follows (not recommended)
set.seed(53177)
xname <- c("age", "hgt", "wgt")
# Keep only the rows in which all predictors are observed
r <- stats::complete.cases(boys[, xname])
x <- boys[r, xname]
y <- boys[r, "tv"]
ry <- !is.na(y)
table(ry)
# Proportion of missing data in tv
sum(!ry) / length(ry)
# Impute missing tv data
yimp <- mice.impute.pmm(y, ry, x)
length(yimp)
hist(yimp, xlab = "Imputed missing tv")
# Impute all tv data
yimp <- mice.impute.pmm(y, ry, x, wy = rep(TRUE, length(y)))
length(yimp)
hist(yimp, xlab = "Imputed missing and observed tv")
plot(jitter(y), jitter(yimp),
  main = "Predictive mean matching on age, height and weight",
  xlab = "Observed tv (n = 224)",
  ylab = "Imputed tv (n = 224)"
)
abline(0, 1)
cor(y, yimp, use = "pair")
```
