| impute_date_ymd | R Documentation |
This function imputes missing **month** and/or **day** components in partial date strings where the **year** is known. It assumes input dates are provided in the *ymd* format (year-month-day) and does not process datetime values or strings containing time components or non-date characters.
impute_date_ymd(
data_frame,
column_name,
separator = "-",
year = "UNKN",
month = "UNK",
day = "UN",
min_max = "min",
suffix = "_DT"
)
| data_frame | data frame |
| column_name | name of the column that holds the dates to be imputed |
| separator | by default "-"; the year-month-day separator, for example "2024-10-21" uses the "-" separator |
| year | by default "UNKN"; the placeholder for an unknown year |
| month | by default "UNK"; the placeholder for an unknown month |
| day | by default "UN"; the placeholder for an unknown day |
| min_max | by default "min"; controls the imputation direction: "min" imputes the earliest possible date, "max" imputes the latest possible date |
| suffix | by default "_DT"; the new imputed date column is named as the source variable followed by this suffix |
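Imputing the latest possible date for a missing day requires the last calendar day of the month, which depends on the month and on leap years. The following is a minimal base R sketch of one way to compute it. `last_day_of_month` is a hypothetical helper introduced for illustration, not part of this package, and the package may compute this differently.

```r
# Hypothetical helper (not from the package): last calendar day of a month.
last_day_of_month <- function(year, month) {
  first <- as.Date(sprintf("%04d-%02d-01", as.integer(year), as.integer(month)))
  # Step to the first day of the following month, then go back one day
  next_first <- seq(first, by = "month", length.out = 2)[2]
  next_first - 1
}

format(last_day_of_month(2024, 2))   # leap-year February
format(last_day_of_month(2024, 12))  # December rolls over into the next year
```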
If the **year** is missing or explicitly marked as unknown (e.g., '"UNKN"'), the function returns 'NA'. With the default 'min_max = "min"', a missing **month** is imputed as **January (01)** and a missing **day** is imputed as the **first day of the month (01)**.
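The rules above can be sketched in base R as follows. This is an illustration of the documented "min" behaviour only, not the package's implementation. `impute_min_sketch` is a hypothetical name, and treating truncated strings such as "2024-10" as having a missing day is an assumption.

```r
# Hypothetical helper (not the package's code): applies the "min" rules
# to a single partial date string.
impute_min_sketch <- function(x, separator = "-", year = "UNKN",
                              month = "UNK", day = "UN") {
  parts <- strsplit(x, separator, fixed = TRUE)[[1]]
  # Pad to three components so absent trailing parts become NA (assumption)
  length(parts) <- 3
  y <- parts[1]
  m <- parts[2]
  d <- parts[3]
  # Unknown or missing year: nothing can be imputed
  if (is.na(y) || y == year) return(as.Date(NA))
  # Missing month -> January, missing day -> first day of the month
  if (is.na(m) || m == month) m <- "01"
  if (is.na(d) || d == day) d <- "01"
  as.Date(paste(y, m, d, sep = "-"), format = "%Y-%m-%d")
}

dates <- c("2024-10-21", "2024-10-UN", "2024-UNK-UN", "UNKN-UNK-UN")
imputed <- do.call(c, lapply(dates, impute_min_sketch))
format(imputed)
```

The helper works on one string at a time to keep the branching readable; `lapply` with `do.call(c, ...)` keeps the result a `Date` vector.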
Any datetime strings (e.g., '"2025-01-NAT11:10:00"') must be preprocessed to remove the time component before applying this function (e.g., convert to '"2025-01-NA"').
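One possible way to do this preprocessing in base R is a regular expression that drops the time part. `strip_time_sketch` is a hypothetical helper, and the pattern assumes the time component starts with "T" followed by a two-digit hour, so that a "T" inside a placeholder is left alone.

```r
# Hypothetical helper (not from the package): removes a trailing time
# component that starts with "T" followed by two digits.
strip_time_sketch <- function(x) {
  sub("T[0-9]{2}.*$", "", x)
}

strip_time_sketch(c("2025-01-NAT11:10:00", "2025-01-15T08:00", "2025-01-NA"))
```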
In addition to imputing the date, the function creates an accompanying **flag variable** named as: '"<source_variable>_<suffix>F"'. This flag variable indicates the type of imputation performed:
'NA': No imputation was performed (the original date was complete, or the year was missing).
'"D"': The **day** component was imputed.
'"M"': The **month** component was imputed.
'"D, M"': Both **month** and **day** components were imputed.
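The mapping from missing components to flag values can be sketched as below. This is a base R illustration under the same assumptions as the documented defaults, not the package's own code, and `flag_sketch` is a hypothetical name.

```r
# Hypothetical helper (not the package's code): derives the imputation flag
# for a single partial date string.
flag_sketch <- function(x, separator = "-", year = "UNKN",
                        month = "UNK", day = "UN") {
  parts <- strsplit(x, separator, fixed = TRUE)[[1]]
  length(parts) <- 3  # absent components become NA
  y_missing <- is.na(parts[1]) || parts[1] == year
  m_missing <- is.na(parts[2]) || parts[2] == month
  d_missing <- is.na(parts[3]) || parts[3] == day
  # No flag when the year is unknown or when nothing had to be imputed
  if (y_missing || (!m_missing && !d_missing)) return(NA_character_)
  if (m_missing && d_missing) return("D, M")
  if (m_missing) "M" else "D"
}

vapply(c("2024-10-21", "2024-10-UN", "2024-UNK-15", "2024-UNK-UN"),
       flag_sketch, character(1), USE.NAMES = FALSE)
```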
A data frame identical to the input, with additional columns holding the imputed date and its imputation flag. The imputed column name is constructed by appending the value of 'suffix' (by default "_DT") to the source variable name.
Lukasz Andrzejewski