View source: R/apply_imputation.R

Apply a function for imputation over rows, columns or combinations of both

```
apply_imputation(ds, FUN = mean, type = "columnwise", ...)
```

`ds`
A data frame or matrix with missing values.

`FUN`
The function to be applied for imputation.

`type`
A string specifying the values used for imputation (see details).

`...`
Further arguments passed to `FUN`.

The functionality of `apply_imputation` is inspired by the `apply` function. It applies a function `FUN` to impute the missing values in `ds`. `FUN` must be a function that takes a vector as input and returns exactly one value. The argument `type` is comparable to the `MARGIN` argument of `apply`: it specifies which values are used to calculate the imputation values. For example, `type = "columnwise"` with `FUN = mean` imputes the mean of the observed values in a column for all missing values in that column. In contrast, `type = "rowwise"` with `FUN = mean` imputes the mean of the observed values in a row for all missing values in that row.
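The difference between the two margins can be sketched in base R. This is only an illustration of the behavior described above, not the package's implementation; the helper `impute_by_margin` and the example matrix are assumptions made for this sketch, and the helper handles matrices only.

```r
# Small numeric matrix with two missing cells (values chosen for illustration)
ds <- matrix(c(1, 2, NA,
               4, NA, 6,
               7, 8, 9), nrow = 3, byrow = TRUE)

# Hypothetical helper: impute along rows (margin = 1) or columns (margin = 2),
# mirroring the meaning of MARGIN in apply()
impute_by_margin <- function(ds, FUN, margin) {
  for (i in seq_len(dim(ds)[margin])) {
    values <- if (margin == 1) ds[i, ] else ds[, i]
    mis <- is.na(values)
    if (any(mis) && !all(mis)) {
      # FUN only sees the observed values and must return a single value
      values[mis] <- FUN(values[!mis])
      if (margin == 1) ds[i, ] <- values else ds[, i] <- values
    }
  }
  ds
}

col_imp <- impute_by_margin(ds, mean, margin = 2)  # like type = "columnwise"
row_imp <- impute_by_margin(ds, mean, margin = 1)  # like type = "rowwise"

col_imp[1, 3]  # mean of 6 and 9
row_imp[1, 3]  # mean of 1 and 2
```

The same missing cell receives a different value depending on the margin, which is exactly the choice that `type` controls.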

List of all implemented values of `type`:

- `"columnwise"` (the default): imputes column by column; all observed values of a column are given to `FUN` and the returned value is used as the imputation value for all missing values of the column.
- `"rowwise"`: imputes row by row; all observed values of a row are given to `FUN` and the returned value is used as the imputation value for all missing values of the row.
- `"total"`: all observed values of `ds` are given to `FUN` and the returned value is used as the imputation value for all missing values of `ds`.
- `"Winer"`: the mean of the values from `"columnwise"` and `"rowwise"` is used as the imputation value.
- `"Two-way"`: the sum of the values from `"columnwise"` and `"rowwise"` minus the value from `"total"` is used as the imputation value.
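As a worked illustration of how the combined types relate to the simple ones, the following base R sketch computes all five imputation values for a single missing cell, assuming `FUN = mean`. The matrix and variable names are made up for this example, and the arithmetic follows the descriptions above rather than the package's source code.

```r
# One missing cell at row 1, column 3 (values chosen for illustration)
ds <- matrix(c(1, 2, NA,
               4, 5, 6,
               7, 8, 9), nrow = 3, byrow = TRUE)
i <- 1
j <- 3

col_val   <- mean(ds[, j], na.rm = TRUE)  # "columnwise": mean of 6 and 9
row_val   <- mean(ds[i, ], na.rm = TRUE)  # "rowwise": mean of 1 and 2
total_val <- mean(ds, na.rm = TRUE)       # "total": mean of all 8 observed values

winer_val   <- (col_val + row_val) / 2        # "Winer"
two_way_val <- col_val + row_val - total_val  # "Two-way"

c(columnwise = col_val, rowwise = row_val, total = total_val,
  Winer = winer_val, `Two-way` = two_way_val)
```

For this cell the values are 7.5, 1.5, 5.25, 4.5 and 3.75, so the two combined types both fall between the row and column estimates here.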

If no value can be given to `FUN` (for example, if no value in a column is observed and `type = "columnwise"`), then a warning will be issued and no value will be imputed in the corresponding column or row.
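The behavior for a fully missing column can be pictured with a small base R sketch. The helper `impute_column` and its warning text are hypothetical and only mimic the rule stated above; the actual warning message issued by `apply_imputation` may differ.

```r
# Hypothetical helper: impute one column, or warn and leave it untouched
# when there is no observed value to pass to FUN
impute_column <- function(x, FUN) {
  mis <- is.na(x)
  if (all(mis)) {
    warning("no observed values; nothing is imputed")
    return(x)
  }
  x[mis] <- FUN(x[!mis])
  x
}

partly_observed <- impute_column(c(1, NA, 3), mean)  # NA replaced by 2

# A completely missing column stays missing (warning silenced for the demo)
empty_col <- c(NA_real_, NA_real_)
still_missing <- suppressWarnings(impute_column(empty_col, mean))
```

Checking the result with `anyNA()` afterwards is a simple way to detect columns or rows that could not be imputed.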

An object of the same class as `ds` with imputed missing values.

If you use tibbles and an error like ‘Lossy cast from 'value' double to integer’ occurs, you will first need to convert all integer columns with missing values to double. Another solution is to convert the tibble to a data frame with `as.data.frame()`. The data frame will automatically convert integer columns to double columns, if needed.
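One way to do the integer-to-double conversion is sketched below in base R. A plain data frame stands in for the tibble so that the example has no dependencies; the column names are made up, and applying the same assignment to a tibble is an assumption of this sketch rather than something the documentation states.

```r
# Example data: X is an integer column with a missing value,
# Y is already double, Z is integer but complete
ds <- data.frame(X = c(1L, NA, 3L), Y = c(0.5, 1.5, NA), Z = 1:3)

# Find integer columns that contain missing values
needs_cast <- vapply(ds, function(col) is.integer(col) && anyNA(col),
                     logical(1))

# Convert only those columns to double; all other columns are left as-is
ds[needs_cast] <- lapply(ds[needs_cast], as.double)

vapply(ds, typeof, character(1))
```

Only `X` changes type here, so complete integer columns such as `Z` keep their storage mode.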

Beland, S., Pichette, F., & Jolani, S. (2016). Impact on
Cronbach's *alpha* of simple treatment methods for missing
data. *The Quantitative Methods for Psychology*, 12(1), 57-73.

A convenient interface exists for common cases like mean imputation: `impute_mean`, `impute_median`, `impute_mode`. All these functions call `apply_imputation`.

```
library(missMethods)  # provides delete_MCAR(), apply_imputation(), impute_mean()

ds <- data.frame(X = 1:20, Y = 101:120)
# delete 20 % of the values completely at random
ds_mis <- delete_MCAR(ds, 0.2)
ds_imp_app <- apply_imputation(ds_mis, FUN = mean, type = "total")
# the same result can be achieved via impute_mean():
ds_imp_mean <- impute_mean(ds_mis, type = "total")
all.equal(ds_imp_app, ds_imp_mean)
```

```
[1] TRUE
```

