View source: R/clean_numeric.R
clean_numeric | R Documentation |
Applies a dictionary of value-replacement pairs and a conversion function
(defaults to as.numeric) to clean and standardize values of numeric
variables. To use this approach, the numeric columns of the original dataset
should generally be imported as type "text" or "character" so that non-valid
values are not automatically coerced to missing values on import.
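The reason for importing as text can be illustrated with a small base-R sketch. The vector age_raw below is hypothetical example data, not part of the package; it only shows that direct coercion discards the original non-valid entries, whereas a character column keeps them available for inspection and correction.

```r
# Hypothetical raw values, as they might appear in a column imported as text
age_raw <- c("34", "27", "3o", "forty", NA)

# Direct coercion turns every non-valid entry into NA (normally with a warning)
age_num <- suppressWarnings(as.numeric(age_raw))

# Entries present in the raw data that could not be converted;
# these are the values a cleaning dictionary would need to address
non_valid <- age_raw[!is.na(age_raw) & is.na(age_num)]
print(non_valid)
```

Had the column been read as numeric on import, the strings "3o" and "forty" would already be lost and indistinguishable from genuinely missing values.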
clean_numeric(
  x,
  vars,
  vars_id = NULL,
  dict_clean = NULL,
  fn = as.numeric,
  na = ".na"
)
x |
A data frame with one or more columns to clean |
vars |
Names of columns within x to clean |
vars_id |
Optional vector of one or more ID columns within x. If not specified, the cleaning dictionary contains one entry for each unique combination of variable and non-valid value. If specified, the cleaning dictionary contains one entry for each unique combination of variable, non-valid value, and ID variable. |
dict_clean |
Optional dictionary of value-replacement pairs. If no dictionary is provided, the conversion function is simply applied to all columns specified in vars |
fn |
Function to convert values to numeric. Defaults to as.numeric |
na |
Keyword to use within column "replacement" for values that should be converted to NA |
Returns the original data frame x, but with cleaned versions of the columns specified in vars.
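The dictionary-then-convert behaviour described above can be approximated in base R. This is only a rough sketch under stated assumptions, not the package's implementation: the helper apply_dict and the dictionary column names "variable" and "value" are hypothetical, and only the "replacement" column name and the ".na" keyword are taken from the documentation. ID-conditional entries (vars_id) are not covered.

```r
# Hypothetical data; values are kept as character, as recommended above
x <- data.frame(
  id = c("A1", "A2", "A3"),
  age = c("34", "3o", "unknown"),
  stringsAsFactors = FALSE
)

# Hypothetical dictionary: "variable" and "value" are assumed column names
dict <- data.frame(
  variable = c("age", "age"),
  value = c("3o", "unknown"),
  replacement = c("30", ".na"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of the package)
apply_dict <- function(x, vars, dict, fn = as.numeric, na = ".na") {
  for (v in vars) {
    d <- dict[dict$variable == v, , drop = FALSE]
    idx <- match(x[[v]], d$value)           # dictionary row per value, NA if none
    hit <- !is.na(idx)
    x[[v]][hit] <- d$replacement[idx[hit]]  # substitute the replacements
    x[[v]][x[[v]] %in% na] <- NA            # keyword marks values to set to NA
    x[[v]] <- fn(x[[v]])                    # finally apply the conversion function
  }
  x
}

cleaned <- apply_dict(x, vars = "age", dict = dict)
print(cleaned)
```

In this sketch "3o" is replaced by 30 and "unknown" becomes NA through the keyword, so the conversion function only ever sees values it can parse.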
# load example dataset and dictionary of value-replacement pairs
data(ll1)
data(clean_num1)

# dictionary-based corrections to numeric vars 'age' and 'contacts'
clean_numeric(
  ll1,
  vars = c("age", "contacts"),
  dict_clean = clean_num1
)

# apply standardization with as.integer() rather than default as.numeric()
clean_numeric(
  ll1,
  vars = c("age", "contacts"),
  dict_clean = clean_num1,
  fn = as.integer
)

# apply standardization but no dictionary-based cleaning
clean_numeric(
  ll1,
  vars = c("age", "contacts")
)