| heaping-package | R Documentation |
Provides methods for correcting heaping (digit preference) in survey data at the individual record level. Age heaping, where respondents disproportionately report ages ending in 0 or 5, is a common phenomenon that can distort demographic analyses.
correctHeaps: Correct regular age heaping patterns (5-year or 10-year intervals)
correctSingleHeap: Correct a specific single age heap
Unlike traditional smoothing methods that only correct aggregated statistics, this package corrects individual values by replacing a calculated proportion of heaped observations with draws from fitted truncated distributions (log-normal, normal, or uniform).
The correction ratio is determined by comparing the count at each heap with the mean count at the neighboring ages. The observations in excess of this expected count are randomly selected and replaced with values drawn from truncated distributions fitted to the original data.
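The idea can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the simulated data, the neighbor window of two ages on each side, the helper `rtrunc_lnorm`, and the rounding of the replacement draws are all hypothetical choices made for the example.

```r
# Illustrative sketch only; not the code used by correctHeaps()/correctSingleHeap().
set.seed(1)

# Simulated ages with an artificial heap at 40 (hypothetical data)
age <- round(rlnorm(5000, meanlog = log(35), sdlog = 0.4))
age <- age[age >= 1 & age <= 99]
heap <- 40
neighbours <- c(heap - 2, heap - 1, heap + 1, heap + 2)  # assumed window
near <- which(age %in% neighbours)
age[sample(near, floor(length(near) / 2))] <- heap

# Compare the heap count with the mean count at neighboring ages
counts <- tabulate(age + 1, nbins = 100)   # counts[a + 1] is the count at age a
n_heap <- counts[heap + 1]
expected <- mean(counts[neighbours + 1])
ratio <- n_heap / expected
n_excess <- max(0, round(n_heap - expected))

# Draw from a log-normal truncated to [lower, upper] by inverting the CDF
rtrunc_lnorm <- function(n, meanlog, sdlog, lower, upper) {
  p_lo <- plnorm(lower, meanlog, sdlog)
  p_hi <- plnorm(upper, meanlog, sdlog)
  qlnorm(runif(n, p_lo, p_hi), meanlog, sdlog)
}

# Fit the log-normal to the observed ages, then replace the excess observations
meanlog <- mean(log(age))
sdlog <- sd(log(age))
idx <- sample(which(age == heap), n_excess)
age_corrected <- age
age_corrected[idx] <- round(
  rtrunc_lnorm(n_excess, meanlog, sdlog, heap - 2.5, heap + 2.5)
)
```

Because the draws are rounded, some replaced records can land on the heap age again. The count at the heap can therefore only stay the same or decrease, and the replaced records spread over the neighboring ages.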
An optional model-based adjustment using random forests can be applied to ensure that corrected values respect relationships with other variables in the dataset. This requires the ranger and VIM packages.
Repeated calls to the correction functions can be used to implement multiple imputation, properly reflecting the uncertainty from the correction process.
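A minimal sketch of this multiple-imputation workflow follows. It assumes that each call to a stochastic correction function yields one completed dataset, and it pools a simple estimate (the mean age) with Rubin's rules. The function `toy_correct` is a hypothetical stand-in written for this example; in practice one of the package's correction functions would take its place.

```r
# Illustrative sketch; toy_correct() is a hypothetical stand-in for a
# stochastic correction function and is not part of the package.
set.seed(42)

# Simulated ages with an artificial heap at 40 (hypothetical data)
age <- sample(20:60, 2000, replace = TRUE)
near <- which(age %in% c(38, 39, 41, 42))
age[sample(near, min(100, length(near)))] <- 40

# Replace a random subset of heaped values with uniform draws on neighboring ages
toy_correct <- function(x, heap = 40, n_replace = 100) {
  at_heap <- which(x == heap)
  n_replace <- min(n_replace, length(at_heap))
  idx <- sample(at_heap, n_replace)
  x[idx] <- sample(c(heap - 2, heap - 1, heap + 1, heap + 2),
                   n_replace, replace = TRUE)
  x
}

# m repeated corrections give m completed datasets
m <- 5
imputed <- replicate(m, toy_correct(age), simplify = FALSE)

# Estimate and its sampling variance within each completed dataset
est <- sapply(imputed, mean)
within <- sapply(imputed, function(x) var(x) / length(x))

# Rubin's rules: pooled estimate and total variance
pooled <- mean(est)
W <- mean(within)                  # within-imputation variance
B <- var(est)                      # between-imputation variance
total_var <- W + (1 + 1 / m) * B
```

The between-imputation component `B` carries the extra uncertainty introduced by the random replacement step, which a single corrected dataset would not show.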
Matthias Templ <matthias.templ@fhnw.ch>
Templ, M. (2024). Correction of heaping on individual level. Journal TBD.
Templ, M., Meindl, B., Kowarik, A., Alfons, A., Dupriez, O. (2017). Simulation of Synthetic Populations for Survey Data Considering Auxiliary Information. Journal of Statistical Software, 79(10), 1-38. doi:10.18637/jss.v079.i10