coda_replacement: Replacement of Missing Values and Below-Detection Zeros in...

View source: R/zero_replacement_imputation.R

coda_replacement    R Documentation

Replacement of Missing Values and Below-Detection Zeros in Compositional Data

Description

Performs imputation (replacement) of missing values and/or values below the detection limit (BDL) in compositional datasets, using an EM algorithm that assumes normality on the simplex. This function is designed to prepare compositional data for subsequent log-ratio transformations.
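To illustrate why replacement is needed before a log-ratio transformation, the minimal sketch below applies a centred log-ratio (clr) transformation, written in base R, to a composition that contains a zero. The helper clr_sketch is hypothetical and not part of coda.base; it only shows that a zero part leaves no finite log-ratio coordinate.

```r
# Hypothetical helper (not part of coda.base): centred log-ratio of one composition
clr_sketch <- function(x) {
  lx <- log(x)
  lx - mean(lx)
}

x_pos  <- c(0.2, 0.3, 0.5)  # all parts strictly positive
x_zero <- c(0.0, 0.4, 0.6)  # one part recorded as zero

clr_pos  <- clr_sketch(x_pos)
clr_zero <- clr_sketch(x_zero)

all(is.finite(clr_pos))   # are all coordinates finite for the positive composition?
any(is.finite(clr_zero))  # does any coordinate stay finite once a zero is present?
```

Because log(0) is -Inf, the mean of the logs is also -Inf, so every coordinate of the second result is infinite or NaN.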

Usage

coda_replacement(
  X,
  DL = NULL,
  dl_prop = 0.65,
  eps = 1e-04,
  parameters = FALSE,
  debug = FALSE
)

Arguments

X

A compositional dataset: numeric matrix or data frame where rows represent observations and columns represent parts.

DL

An optional matrix or vector of detection limits. If NULL, the minimum non-zero value in each column of X is used.

dl_prop

A numeric value between 0 and 1 giving the proportion of the detection limit used to replace zeros when initializing the EM algorithm (default is 0.65).

eps

A small positive value controlling the convergence criterion for the EM algorithm (default is 1e-4).

parameters

Logical. If TRUE, returns additional output including estimated multivariate normal parameters (default is FALSE).

debug

Logical. If TRUE, shows the log-likelihood at every iteration (default is FALSE).
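The default for DL described above (the minimum non-zero value in each column of X) could be computed along the following lines. This is a base R sketch under the assumption that X is a numeric matrix; default_dl is a hypothetical name, and the package's internal code may differ.

```r
# Hypothetical helper: smallest strictly positive value in each column,
# ignoring zeros and missing values
default_dl <- function(X) {
  apply(X, 2, function(col) {
    pos <- col[!is.na(col) & col > 0]
    if (length(pos) == 0) NA_real_ else min(pos)
  })
}

# Small demonstration matrix (filled column by column)
X_demo <- matrix(c(0.0, 0.2, 0.5,
                   0.3, NA,  0.1,
                   0.4, 0.0, 0.6),
                 ncol = 3)
dl_demo <- default_dl(X_demo)
dl_demo
```

A column with no positive values has no such minimum, so the sketch returns NA for it; how coda_replacement handles that case is not stated here.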

Details

- Missing values are imputed based on a multivariate normal model on the simplex.
- Zeros are treated as censored values and replaced accordingly.
- The EM algorithm iteratively estimates the missing parts and the model parameters.
- To initialize the EM algorithm, zero values (considered below the detection limit) are replaced with a small positive value. Specifically, each zero is replaced by dl_prop times the detection limit of that part (column). This restriction is imposed on the geometric mean of the parts with zeros relative to the non-missing positive values, which helps preserve the compositional structure in the simplex.
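The initialization rule described above (each zero replaced by dl_prop times the detection limit of its column) can be sketched in base R as follows. The sketch assumes that X is a numeric matrix and that DL is a vector with one limit per column; init_zeros is a hypothetical name. The package's actual initialization, including the geometric-mean restriction, may differ.

```r
# Hypothetical helper: replace each zero by dl_prop times its column's detection limit
init_zeros <- function(X, DL, dl_prop = 0.65) {
  X_init <- X
  is_zero <- !is.na(X) & X == 0
  # col(X) holds the column index of every cell, used to look up the matching limit
  X_init[is_zero] <- dl_prop * DL[col(X)[is_zero]]
  X_init
}

# Small demonstration matrix (filled column by column) and assumed detection limits
X_demo <- matrix(c(0.0, 0.2, 0.5,
                   0.3, 0.0, 0.1),
                 ncol = 2)
DL_demo <- c(0.2, 0.1)
X_start <- init_zeros(X_demo, DL_demo)
X_start
```

Missing values (NA) are left untouched by this step, since only cells that are exactly zero are treated as below the detection limit.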

Value

If parameters = FALSE, returns a numeric matrix with imputed values. If parameters = TRUE, returns a list with two components:

X_imp

The imputed compositional data matrix.

info

A list containing information about the EM algorithm parameters and convergence diagnostics.

Examples

library(coda.base)
# Simulate compositional data with zeros and missing values
set.seed(123)
X <- abs(matrix(rnorm(100), ncol = 5))
X[sample(length(X), 10)] <- 0   # Introduce some zeros
X[sample(length(X), 10)] <- NA  # Introduce some NAs
# Summarize the row-normalized data before replacement
summary(X / rowSums(X, na.rm = TRUE))
# Apply replacement and summarize the imputed data
summary(coda_replacement(X))


coda.base documentation built on July 3, 2025, 1:09 a.m.