View source: R/clean_numeric.R
| clean_numeric | R Documentation |
Applies a dictionary of value-replacement pairs and a conversion function
(as.numeric by default) to clean and standardize the values of numeric
variables. To use this approach, the numeric columns of the original dataset
should generally be imported as type "text" or "character", so that non-valid
values are not automatically coerced to missing values on import.
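The reason for importing as character can be sketched in base R. The values below are made up for illustration and are not taken from the package's example data.

```r
# Hypothetical raw values, as they might look when a numeric column
# is imported as character
age_raw <- c("34", "27", "forty", "5O", NA)

# Direct coercion turns every non-valid entry into NA (normally with a warning)
age_num <- suppressWarnings(as.numeric(age_raw))

# Entries that were present in the raw data but could not be converted;
# these are the values a cleaning dictionary would need to address
non_valid <- age_raw[!is.na(age_raw) & is.na(age_num)]
non_valid
```

Had the column been coerced to numeric at import, "forty" and "5O" would already be missing and could no longer be told apart from the genuinely missing entry.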
clean_numeric(
  x,
  vars,
  vars_id = NULL,
  dict_clean = NULL,
  fn = as.numeric,
  na = ".na"
)
| x | A data frame with one or more columns to clean |
| vars | Names of columns within x |
| vars_id | Optional vector of one or more ID columns within x. If not specified, the cleaning dictionary contains one entry for each unique combination of variable and non-valid value. If specified, the cleaning dictionary contains one entry for each unique combination of variable, non-valid value, and ID variable. |
| dict_clean | Optional dictionary of value-replacement pairs (e.g. produced by …). If no dictionary is provided, the conversion function is simply applied to all columns specified in vars. |
| fn | Function to convert values to numeric. Defaults to as.numeric |
| na | Keyword to use within column "replacement" for values that should be converted to NA |
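As a rough illustration of how dict_clean, fn, and na interact, the base R sketch below applies a small dictionary to one column. It is not the package's implementation. The dictionary layout is an assumption: only the column name "replacement" appears in the documentation above, while "variable" and "value" are hypothetical names. The helper apply_dict and the data are also invented for this example.

```r
# Hypothetical raw data, with the numeric column held as character
dat <- data.frame(
  id = c("A1", "A2", "A3", "A4"),
  age = c("34", "forty", "5O", "unknown"),
  stringsAsFactors = FALSE
)

# Hypothetical dictionary; "variable" and "value" are assumed column names
dict <- data.frame(
  variable = c("age", "age", "age"),
  value = c("forty", "5O", "unknown"),
  replacement = c("40", "50", ".na"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of the package): replace values found in
# the dictionary, map the na keyword to NA, then convert with fn
apply_dict <- function(x, var, dict, fn = as.numeric, na = ".na") {
  d <- dict[dict$variable == var, ]
  vals <- x[[var]]
  i <- match(vals, d$value)        # dictionary row for each value, NA if none
  hit <- !is.na(i)
  vals[hit] <- d$replacement[i[hit]]
  vals[vals %in% na] <- NA         # keyword marks values to set to missing
  fn(vals)
}

age_clean <- apply_dict(dat, "age", dict)
age_clean
```

Under these assumptions, "forty" and "5O" are replaced before conversion, and "unknown" becomes missing because its replacement is the na keyword. Passing fn = as.integer to the helper would return an integer vector instead.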
The original data frame x, but with cleaned versions of the columns in vars.
# load example dataset and dictionary of value-replacement pairs
data(ll1)
data(clean_num1)
# dictionary-based corrections to numeric vars 'age' and 'contacts'
clean_numeric(
  ll1,
  vars = c("age", "contacts"),
  dict_clean = clean_num1
)
# apply standardization with as.integer() rather than default as.numeric()
clean_numeric(
  ll1,
  vars = c("age", "contacts"),
  dict_clean = clean_num1,
  fn = as.integer
)
# apply standardization but no dictionary-based cleaning
clean_numeric(
  ll1,
  vars = c("age", "contacts")
)
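The choice of fn affects how valid but non-integer strings are standardized. The comparison below uses only base R on made-up values. The rounding helper to_rounded_int is a hypothetical example of a custom conversion function, not something the package provides.

```r
# Made-up character values that are all valid numbers
vals <- c("12", "12.7", "1e3")

# Default conversion keeps the fractional part
as_num <- as.numeric(vals)

# as.integer() truncates toward zero rather than rounding
as_int <- as.integer(vals)

# Hypothetical custom converter that rounds before converting to integer
to_rounded_int <- function(v) as.integer(round(as.numeric(v)))
as_rounded <- to_rounded_int(vals)

data.frame(vals, as_num, as_int, as_rounded)
```

Here "12.7" becomes 12.7, 12, and 13 under the three converters, so it may be worth checking which behavior suits a variable before swapping fn.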