impute: Impute: Filling Missing Values
In certe-medical-epidemiology/certestats: A Certe R Package for Statistical Modelling

impute

R Documentation

Impute: Filling Missing Values

Description

Imputation is the process of replacing missing data with substituted values. It addresses three main problems caused by missing data: missing data can introduce substantial bias, make handling and analysis of the data more arduous, and reduce statistical efficiency.

Usage

impute(
  .data,
  vars = everything(),
  algorithm = "mice",
  m = 10,
  method = NULL,
  FUN = median,
  info = TRUE,
  ...
)

is_imputed(.data)

get_mice(.data)

Arguments

`.data`	data set with missing values to impute
`vars`	variables of `.data` that must be imputed, defaults to `everything()` and supports the `tidyselect` language.
`algorithm`	algorithm to use for imputation, must be `"mice"` or `"single-point"`, see Details. For the latter, `FUN` must be given.
`m`	number of multiple imputations if using MICE, see `mice::mice()`. The results of all `m` imputations are combined into a single result, see `FUN`.
`method`	method to use if using MICE, see `mice::mice()`
`FUN`	function to use for single-point imputation (directly) or for MICE to summarise the results over all `m` iterations
`info`	a logical indicating whether to print info about the imputation
`...`	arguments to pass on to `mice::mice()`

Details

Imputation can be done with a single-point estimate, such as the mean or the median, or with Multivariate Imputation by Chained Equations (MICE). MICE is generally more reliable than single-point imputation, because it uses the relationships between variables to estimate missing values, but it is also considerably slower.
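To make the single-point idea concrete, the following base R sketch fills the missing values of each numeric column with one summary value of that column's observed values. It only illustrates the technique and makes no claim about the code `impute()` runs internally; the helper name `impute_single_point()` is hypothetical.

```r
# Hypothetical helper (not part of certestats): replace NAs in numeric
# columns with a single summary value computed from the observed values.
impute_single_point <- function(df, FUN = median) {
  for (col in names(df)) {
    x <- df[[col]]
    if (is.numeric(x) && anyNA(x)) {
      x[is.na(x)] <- FUN(x, na.rm = TRUE)
      df[[col]] <- x
    }
  }
  df
}

iris2 <- iris
iris2[2, 2] <- NA  # Sepal.Width, row 2
iris2[3, 3] <- NA  # Petal.Length, row 3

filled <- impute_single_point(iris2)
filled[2:3, ]
```

Every missing cell in a column receives the same value, which is why single-point imputation is fast but ignores any relationship between variables.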

The suggested and default method is MICE. The generated MICE object is stored as an attribute of the data and can be retrieved with `get_mice()`; it contains all specifics about the imputation. MICE is also known as fully conditional specification and sequential regression multiple imputation. It was designed for data that are missing at random (MAR), though there is simulation evidence to suggest that with a sufficient number of auxiliary variables it can also work on data that are missing not at random (MNAR).
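For readers unfamiliar with MICE, the sketch below calls the `mice` package directly: it creates `m` completed copies of a data set and then collapses one column into a single value per row using the median. How the copies are collapsed here is an assumption made for illustration, not a description of what `impute()` does internally. The `nhanes` example data ships with `mice`, and the names `bmi_matrix` and `bmi_pooled` are made up for this example.

```r
# Sketch only: requires the 'mice' package to be installed.
# Generate m = 5 imputed data sets from the nhanes example data.
imp <- mice::mice(mice::nhanes, m = 5, printFlag = FALSE, seed = 123)

# Extract all m completed data sets as a list of data.frames.
completed <- mice::complete(imp, action = "all")

# Collapse one column across the m completed data sets: each row gets
# the median of its m values. Observed values are identical in every
# copy, so they are returned unchanged.
bmi_matrix <- sapply(completed, function(d) d$bmi)
bmi_pooled <- apply(bmi_matrix, 1, median)

head(data.frame(original = mice::nhanes$bmi, pooled = bmi_pooled))
```

Setting `seed` makes the imputations reproducible; without it, each run draws different imputed values.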

Use `is_imputed()` to get a `data.frame` with `TRUE` for every value that was imputed.
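Conceptually, such a mask marks the cells that were missing before imputation. A minimal base R sketch of that idea follows; it does not show how the package function itself is implemented, and the object names `original` and `mask` are chosen for illustration.

```r
original <- iris
original[2, 2] <- NA  # Sepal.Width, row 2
original[4, 5] <- NA  # Species, row 4

# TRUE where the original value was missing, i.e. where an imputed
# value would have been filled in.
mask <- as.data.frame(is.na(original))

# Row and column positions of the flagged cells
which(as.matrix(mask), arr.ind = TRUE)
```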

Examples

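# create a copy of iris with a few missing values
iris2 <- dplyr::as_tibble(iris)
iris2[2, 2] <- NA  # Sepal.Width
iris2[3, 3] <- NA  # Petal.Length
iris2[4, 5] <- NA  # Species
iris
iris2

# default: impute all variables using MICE
result <- iris2 |> impute()
result

# single-point imputation on all variables
iris2 |> impute(algorithm = "single-point")

# single-point imputation on a selection of variables
iris2 |>
  impute(vars = starts_with("Sepal"),
         algorithm = "single-point")
iris2 |>
  impute(vars = where(is.double),
         algorithm = "single-point",
         FUN = median)

# inspect which values were imputed and retrieve the MICE object
result |> is_imputed()
result |> get_mice()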

certe-medical-epidemiology/certestats documentation built on Nov. 9, 2024, 8:15 p.m.
