replace.missing.df | R Documentation |
Simple replacement of missing data without changing column means. Criteria are applied to decide whether each column is numeric, so that illegal operations aren't performed on strings, etc. Adjusting the 'error' parameter adds variance to the replaced observations, which helps reduce the bias associated with inserting many copies of the same replacement value.
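The claim that mean replacement leaves column means unchanged can be checked with a few lines of base R. This is a minimal sketch of the idea only, not the package's implementation; the variable names are illustrative.

```r
# Base-R sketch: fill missing entries with the mean of the observed values.
x <- c(1, 2, NA, 4, 5)

# Mean of the non-missing observations
m <- mean(x, na.rm = TRUE)

# Copy the vector and overwrite each NA with that mean
x.filled <- x
x.filled[is.na(x.filled)] <- m

# The mean of the filled vector equals the mean of the observed values
all.equal(mean(x.filled), m)
```

The mean is preserved, but the variance of the filled vector shrinks, which is the bias that the 'error' parameter is intended to offset.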
replace.missing.df( X, repl.fun = mean, error = 0, thresh = 0.9, digits = 99, force = FALSE )
X |
a data.frame to replace missing values in |
repl.fun |
the function used to compute the replacement value. Default is 'mean'. A replacement function should take a vector 'x' and return a single scalar. |
error |
default value is 0, meaning that all replacements within a column of the data.frame X take the same value. If you give a positive value, that amount of Gaussian noise (in standard deviation units of the original variable) is added to the replacement values. |
thresh |
passed to function 'is.vec.numeric', see explanation there. |
digits |
trim replacement values to this many digits |
force |
TRUE means replace missing values in all columns, without testing whether each column is numeric |
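The effect described for 'error' can be sketched in base R. The helper below, fill.with.noise, is hypothetical and not part of the package; it assumes that "StDev units" means the noise standard deviation is 'error' times the standard deviation of the observed values, which may differ from what replace.missing.df actually does.

```r
# Hypothetical helper (not the package's code): replace NAs with the mean of
# the observed values plus Gaussian noise scaled by their standard deviation.
fill.with.noise <- function(x, error = 0) {
  miss <- is.na(x)
  centre <- mean(x, na.rm = TRUE)   # replacement value before noise
  spread <- sd(x, na.rm = TRUE)     # SD of the observed values
  # Assumption: noise SD = error * spread; error = 0 adds no noise
  x[miss] <- centre + rnorm(sum(miss), mean = 0, sd = error * spread)
  x
}

set.seed(1)  # only to make the sketch reproducible
x <- c(1:5, NA, NA, NA, 9, 10)
fill.with.noise(x)             # all three NAs receive the same value
fill.with.noise(x, error = 1)  # the three replacements now differ
```

With error = 0 every gap gets an identical value; a positive value spreads the replacements around that value so the filled column is less artificially concentrated.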
Returns a data.frame with the same dimensions as X, in which missing values in numeric columns have been imputed using repl.fun, optionally with noise added.
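Which columns count as numeric is decided by 'is.vec.numeric' using 'thresh'; its exact criteria are documented with that function, not here. As a rough illustration of how a threshold of this kind could work, the sketch below uses a hypothetical helper, looks.numeric, which treats a column as numeric when at least a proportion 'thresh' of its observed values can be coerced to numbers. This rule is an assumption for illustration only.

```r
# Hypothetical helper; 'is.vec.numeric' in the package may use other criteria.
looks.numeric <- function(x, thresh = 0.9) {
  x <- as.character(x)
  # Drop true NAs and the literal string "NA" produced by paste()
  x <- x[!is.na(x) & x != "NA"]
  if (length(x) == 0) return(FALSE)
  # TRUE for each value that converts cleanly to a number
  ok <- !is.na(suppressWarnings(as.numeric(x)))
  mean(ok) >= thresh
}

second <- paste(c(NA, NA, 6:10, "5|6", "7|8", 1))
looks.numeric(second)                # 6 of 8 observed values parse: 0.75 < 0.9
looks.numeric(second, thresh = 0.7)  # 0.75 >= 0.7
```

Under this assumed rule, lowering the threshold lets a mostly numeric character column with a few malformed entries be treated as numeric.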
Nicholas Cooper
df <- data.frame(first=c(1,2,NA,4,5), second=paste(c(6,7,8,NA,10)),
                 third=c("jake", "fred", "cathy", "sandra", "mike"))
df
replace.missing.df(df)
replace.missing.df(df, force=TRUE)
df2 <- data.frame(first=c(1:5, NA, NA, NA, 9, 10),
                  second=paste(c(NA, NA, 6:10, "5|6", "7|8", 1)),
                  third=rep(c("jake", "fred", "cathy", "sandra", "mike"), 2))
df2
replace.missing.df(df2)
replace.missing.df(df2, thresh=0.7)
replace.missing.df(df2, error=1, thresh=0.7, digits=4)