Runs LDE.Explore() and then LDE.UsefulVars() automatically, and returns the transformed dataset with the non-useful variables excluded, together with the $statistics, $var.status and $var.classif elements produced by LDE.UsefulVars().
LDE.AutoProcess(dat, maxNARate = NULL, keyNamesMatch = NULL)
dat | data.frame
maxNARate | numeric value between 0 and 1. Variables with a higher rate of NAs will be excluded. NULL to ignore
keyNamesMatch | character vector of substrings to search for at the start or end of each variable name in order to classify it as a key. NULL to ignore
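To make the two optional arguments concrete, here is a minimal base R sketch of what an NA-rate threshold and start/end name matching could look like. It is not the package's internal code: the toy data frame and the helper names (na.rate, too.many.NAs, is.key, key.vars) are hypothetical, and case-sensitive matching is an assumption.

```r
# Illustrative sketch only: base R, not the package's implementation.
# Toy data frame (hypothetical), standing in for `dat`.
dat <- data.frame(
  CLIENT_ID = 1:10,
  amount    = c(10.5, NA, 7.2, NA, NA, 3.1, NA, 8.8, 2.0, 4.4),    # 4 of 10 are NA
  region    = c("N", "S", "S", NA, "E", "W", "N", "S", "E", "W"),  # 1 of 10 is NA
  ORDERKEY  = paste0("k", 1:10),
  stringsAsFactors = FALSE
)

maxNARate     <- 0.2
keyNamesMatch <- c("ID", "KEY")

# Share of NAs in each variable.
na.rate <- colMeans(is.na(dat))

# Variables whose NA rate is above the threshold would be excluded.
too.many.NAs <- names(na.rate)[na.rate > maxNARate]

# A name is treated as a key if it starts or ends with any of the substrings
# (case-sensitive here, which is an assumption of this sketch).
is.key <- vapply(
  names(dat),
  function(nm) any(startsWith(nm, keyNamesMatch) | endsWith(nm, keyNamesMatch)),
  logical(1)
)
key.vars <- names(dat)[is.key]

print(too.many.NAs)
print(key.vars)
```

In this toy data only `amount` exceeds the 0.2 threshold, and `CLIENT_ID` and `ORDERKEY` match the key substrings at the end of their names.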
A list containing the filtered dataset ($df.filtered) with re-formatted variables, plus all the process information ($var.status and $var.classif) and the descriptive statistics ($statistics).
Daniel Nieto
df <- data.frame(secop1.full)
maxNARate <- 0.2
keyNamesMatch <- c("ID", "KEY")
#Basic AutoProcess of the data.frame
Auto.1.df <- LDE.AutoProcess(df)
#Using a max rate of NAs per variable (arguments are named so they cannot be mismatched by position)
Auto.2.df <- LDE.AutoProcess(df, maxNARate = maxNARate)
#Using substrings to identify variable names as keys, but without NA filtering
Auto.3.df <- LDE.AutoProcess(df, keyNamesMatch = keyNamesMatch)
#Using a max rate of NAs and substrings to identify variable names as keys
Auto.4.df <- LDE.AutoProcess(df, maxNARate = maxNARate, keyNamesMatch = keyNamesMatch)
#Extract the cleaned dataset
df.clean <- Auto.4.df$df.filtered
#See if variables were included or excluded
#View(Auto.4.df$var.status$included) #included vars
#View(Auto.4.df$var.status$excluded) #excluded vars
#See how the included variables were classified E.g.:
#View(Auto.4.df$var.classif$included.vars$df.num) #numeric vars
#View(Auto.4.df$var.classif$included.vars$df.bool) #boolean vars
#See how the excluded variables were classified E.g.:
#View(Auto.4.df$var.classif$removed.vars$not.useful) #excluded by type
#View(Auto.4.df$var.classif$removed.vars$filtered.byNAs) #excluded by NA rate
#View(Auto.4.df$var.classif$removed.vars$not.useful$df.NA) #excluded by type, empty
#View(Auto.4.df$var.classif$removed.vars$filtered.byNAs$df.num) #numeric excluded by NA rate
#See statistics of variables by exclusion reason
#View(Auto.4.df$statistics$useful.vars) #included
#View(Auto.4.df$statistics$filteredbyNAs.vars) #excluded by NAs
#View(Auto.4.df$statistics$unuseful.vars) #excluded by type
#See statistics of variables by exclusion reason and type E.g.:
#View(Auto.4.df$statistics$useful.vars$df.levels) #included that were levels
#View(Auto.4.df$statistics$filteredbyNAs.vars$df.num) #numeric, excluded by NA rate
#View(Auto.4.df$statistics$unuseful.vars$df.NA) #excluded by type, empty vars