impute_df: Impute missing values in a data frame by columns
In DrylandEcology/rSW2utils: Utility Tools for SOILWAT2 and STEPWAT2 Simulation Experiments

impute_df

R Documentation

Impute missing values in a data frame by columns

Description

Impute missing values in a data frame by columns

Usage

impute_df(
  x,
  imputation_type = c("none", "mean", "locf", "interp"),
  imputation_span = 5L,
  cyclic = FALSE,
  nmax_run = Inf
)

Arguments

`x`	A `data.frame` or `matrix` with numerical columns. Imputation works on each column separately.
`imputation_type`	A character string describing the imputation method; currently, one of four values:
  `"none"`: no imputation is carried out.
  `"mean"`: missing values are replaced by the average of `imputation_span` non-missing values before and `imputation_span` non-missing values after. Note: this may fail if there are fewer than `2 * imputation_span` non-missing values.
  `"locf"`: missing values are replaced using the "last-observation-carried-forward" approach.
  `"interp"`: missing values are replaced by linear interpolation (or extrapolation if at the start or end of a sequence) using the two closest neighbors, assuming that rows represent equidistant steps (for each run of missing values separately).
`imputation_span`	An integer value. The number of non-missing values considered if `imputation_type = "mean"`.
`cyclic`	A logical value. If `TRUE`, then the last row of `x` is considered to be a direct neighbor of the first row, e.g., when rows of `x` represent days of the year for an average year.
`nmax_run`	An integer value. Runs (sets of consecutive missing values) of length `nmax_run` or shorter are imputed; longer runs remain unchanged. Any non-finite value is treated as infinity.
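
The interplay of runs of missing values, `nmax_run`, and linear interpolation can be illustrated with a minimal base R sketch for a single numeric vector. This is not the package's implementation: the helper `interp_short_runs()` is hypothetical, it relies on `rle()` and `stats::approx()`, and, unlike the `"interp"` method described above, it does not extrapolate at the start or end of the sequence.

```r
# Hypothetical helper (not part of rSW2utils): linearly interpolate interior
# runs of NAs that are no longer than `nmax_run`; longer runs stay NA.
interp_short_runs <- function(v, nmax_run = Inf) {
  # Treat any non-finite `nmax_run` as infinity
  if (!is.finite(nmax_run)) nmax_run <- Inf

  # Nothing to interpolate from with fewer than two non-missing values
  ok <- !is.na(v)
  if (sum(ok) < 2L) return(v)

  # Run-length encoding of the missing-value indicator
  r <- rle(is.na(v))
  run_end <- cumsum(r$lengths)
  run_start <- run_end - r$lengths + 1L

  # Interpolate at every position, assuming equidistant steps;
  # positions outside the range of non-missing values remain NA
  filled <- stats::approx(x = which(ok), y = v[ok], xout = seq_along(v))$y

  # Copy interpolated values only into runs of NAs that are short enough
  for (i in which(r$values & r$lengths <= nmax_run)) {
    ids <- run_start[i]:run_end[i]
    v[ids] <- filled[ids]
  }

  v
}

v <- c(1, NA, 3, NA, NA, NA, NA, 8, NA, 10)
print(interp_short_runs(v, nmax_run = 3))
```

With `nmax_run = 3`, the two single-value gaps are filled, whereas the run of four consecutive missing values is left untouched.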

Value

An updated version of `x` where missing values have been imputed for each column separately.

Examples

n <- 30
ids_missing <- c(1:2, 10:13, 20:22, (n-1):n)
x0 <- x <- data.frame(
  linear = seq_len(n),
  all_missing = NA,
  all_same = 1,
  cyclic = cos(2 * pi * seq_len(n) / n)
)
x[ids_missing, ] <- NA

res <- list()
for (it in c("mean", "locf", "interp")) {
  res[[it]] <- impute_df(x, imputation_type = it, nmax_run = 3L)
  print(cbind(orig = x0[ids_missing, ], res[[it]][ids_missing, ]))
}

if (requireNamespace("graphics")) {
  par_prev <- graphics::par(mfrow = c(ncol(x) - 1L, 1L))
  for (k in seq_len(ncol(x))[-2L]) {
    graphics::plot(
      x[[k]],
      ylim = range(x0[[k]]),
      ylab = colnames(x)[[k]],
      type = "l"
    )
    graphics::points(
      ids_missing,
      x0[ids_missing, k],
      pch = 1L,
      col = 1L
    )
    for (it in seq_along(res)) {
      graphics::points(
        ids_missing,
        res[[it]][ids_missing, k],
        pch = 1L + it,
        col = 1L + it
      )
    }
  }
  graphics::par(par_prev)
}
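
The example above leaves `cyclic` at its default. A possible way to explore that argument is sketched below; the data frame `x_cyc` and the list `res_cyc` are made up for illustration, and the call is skipped unless the rSW2utils package is installed. The missing values sit in the first and last rows, so the two settings should differ in whether those rows are treated as neighbors.

```r
# Illustrative data (an assumption, not from the package): 12 steps of one
# cycle, with a gap that spans the end and the start of the cycle
step <- seq_len(12)
x_cyc <- data.frame(value = cos(2 * pi * step / 12))
x_cyc[c(1, 12), "value"] <- NA

if (requireNamespace("rSW2utils", quietly = TRUE)) {
  # Impute once without and once with cyclic neighbors
  res_cyc <- lapply(
    c(not_cyclic = FALSE, cyclic = TRUE),
    function(cyc) {
      rSW2utils::impute_df(x_cyc, imputation_type = "interp", cyclic = cyc)
    }
  )

  # Compare both results side by side
  print(cbind(res_cyc[["not_cyclic"]], res_cyc[["cyclic"]]))
}
```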

DrylandEcology/rSW2utils documentation built on Dec. 9, 2023, 10:44 p.m.
