heaping-package: heaping: Correction of Heaping on Individual Level

heaping-package    R Documentation

heaping: Correction of Heaping on Individual Level

Description

Provides methods for correcting heaping (digit preference) in survey data at the individual record level. Age heaping, where respondents disproportionately report ages ending in 0 or 5, is a common phenomenon that can distort demographic analyses.

Main Functions

correctHeaps

Correct regular age heaping patterns (5-year or 10-year intervals)

correctSingleHeap

Correct a specific single age heap

Methodology

Unlike traditional smoothing methods, which correct only aggregated statistics, this package corrects individual values: a calculated proportion of the heaped observations is replaced with draws from fitted truncated distributions (log-normal, normal, or uniform).

The correction ratio is determined by comparing the count at each heap with the mean count at neighboring ages. Heaped observations in excess of this expected count are selected at random and replaced with values drawn from truncated distributions fitted to the original data.
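The idea described above can be illustrated with a minimal base-R sketch for a single heap. This is not the package's implementation: the helper name sketch_correct_heap, the window width of two years on each side, and the choice of a truncated log-normal sampled by inverse CDF are all assumptions made for illustration.

```r
# Hypothetical helper for illustration only; not part of the heaping package API.
sketch_correct_heap <- function(age, heap, width = 2) {
  stopifnot(is.numeric(age), all(age > 0), heap - width - 0.5 > 0)

  # Expected count at the heap: mean count over the neighboring ages.
  neighbors <- setdiff((heap - width):(heap + width), heap)
  expected <- mean(vapply(neighbors, function(a) sum(age == a), numeric(1)))

  # Number of heaped records in excess of the expected count.
  n_heap <- sum(age == heap)
  n_excess <- max(0, round(n_heap - expected))
  if (n_excess == 0) return(age)

  # Randomly select which heaped records to replace.
  idx <- which(age == heap)
  idx <- idx[sample.int(length(idx), n_excess)]

  # Log-normal fitted to the original data, truncated to the window
  # around the heap and sampled via the inverse CDF.
  meanlog <- mean(log(age))
  sdlog <- sd(log(age))
  lo <- heap - width - 0.5
  hi <- heap + width + 0.5
  u <- runif(n_excess, plnorm(lo, meanlog, sdlog), plnorm(hi, meanlog, sdlog))
  age[idx] <- round(qlnorm(u, meanlog, sdlog))
  age
}

# Simulated ages with an artificial heap at 40 (assumed data).
set.seed(1)
age <- c(sample(20:60, 2000, replace = TRUE), rep(40, 300))
corrected <- sketch_correct_heap(age, heap = 40)
```

Only records originally reported at the heap are changed, and the replacements stay within the window around the heap, so the rest of the age distribution is untouched.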

Model-Based Correction

An optional model-based adjustment using random forests can be applied to ensure that corrected values respect relationships with other variables in the dataset. This requires the ranger and VIM packages.

Multiple Imputation

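Because the replaced observations and their new values are drawn at random, each call to a correction function yields a different corrected dataset. Repeated calls can therefore be used to implement multiple imputation, so that estimates reflect the uncertainty introduced by the correction process.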
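A minimal sketch of this workflow follows, pooling the mean age over repeated corrections with Rubin's rules. The names toy_correct and pool_mean are hypothetical, and toy_correct is only a crude stand-in so that the example runs without the package; in practice one would presumably pass a wrapper around correctHeaps instead.

```r
# Hypothetical stand-in for a stochastic correction function: moves a random
# half of the values at multiples of 5 by up to two years. Illustration only.
toy_correct <- function(x) {
  heaped <- which(x %% 5 == 0)
  move <- heaped[runif(length(heaped)) < 0.5]
  x[move] <- x[move] + sample(-2:2, length(move), replace = TRUE)
  x
}

# Pool the mean over m corrected datasets using Rubin's rules.
pool_mean <- function(x, correct_fun, m = 10) {
  stopifnot(m >= 2)
  fits <- replicate(m, {
    xc <- correct_fun(x)
    c(est = mean(xc), var = var(xc) / length(xc))
  })
  q_bar <- mean(fits["est", ])   # pooled point estimate
  w <- mean(fits["var", ])       # within-imputation variance
  b <- var(fits["est", ])        # between-imputation variance
  c(estimate = q_bar, se = sqrt(w + (1 + 1 / m) * b))
}

# Simulated ages with artificial heaps at 30, 40 and 50 (assumed data).
set.seed(42)
age <- c(sample(20:60, 1000, replace = TRUE), rep(c(30, 40, 50), each = 100))
pooled <- pool_mean(age, toy_correct, m = 10)
```

The between-imputation term is what carries the extra uncertainty from the random correction; a single corrected dataset would report only the within-imputation variance.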

Author(s)

Matthias Templ <matthias.templ@fhnw.ch>

References

Templ, M. (2024). Correction of heaping on individual level. Journal TBD.

Templ, M., Meindl, B., Kowarik, A., Alfons, A., Dupriez, O. (2017). Simulation of Synthetic Populations for Survey Data Considering Auxiliary Information. Journal of Statistical Software, 79(10), 1-38. doi:10.18637/jss.v079.i10


heaping documentation built on Feb. 10, 2026, 1:08 a.m.