modelEDA: Exploratory data analysis on a modeling dataset
In dnegrey/miscTools: R package with miscellaneous tools


modelEDA performs standard exploratory data analysis on a modeling dataset, including variable clustering, weight of evidence (binary y), numeric R-squared (continuous y), and univariate summaries and graphs. Parameters prefixed with a function name and a dot (e.g. bin.) apply only to that function.

modelEDA(x, yname, ytype,
  bin.numBins = 10,
  bin.equalBinSize = FALSE,
  bin.minPct = 0,
  bin.maxPct = 100,
  nrsq.na.rm = FALSE,
  variable_cluster.n = max(floor((length(x) - 1)/10), 2),
  variable_cluster.na.rm = TRUE,
  univariateSummary.FUN = mean,
  univariateGraph.yLabel = yname,
  univariateGraph.yType = ifelse(ytype == 1, "pct", "dlr"),
  univariateGraph.yDigits = ifelse(ytype == 1, 1, 0),
  univariateGraph.yRangeMode = "tozero",
  univariateGraph.barColor = "#BDDFF7",
  univariateGraph.lineColor = "#000000")
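
Several defaults in the usage above are expressions that depend on x and ytype. As a small worked example (a sketch that only re-evaluates those expressions in base R, without calling the package), here is what they resolve to for mtcars with a binary y. The variable names n_clusters, y_type and y_digits are illustrative and not part of the package.

```r
# Re-evaluate the data-dependent default expressions for mtcars.
x <- mtcars   # a data frame; length(x) is its number of columns (11)
ytype <- 1    # binary y

# Default for variable_cluster.n: roughly one cluster per ten
# non-dependent columns, but never fewer than 2
n_clusters <- max(floor((length(x) - 1) / 10), 2)

# Defaults for univariateGraph.yType and univariateGraph.yDigits:
# "pct" with 1 decimal place for binary y, "dlr" with 0 otherwise
y_type   <- ifelse(ytype == 1, "pct", "dlr")
y_digits <- ifelse(ytype == 1, 1, 0)

print(n_clusters)
print(y_type)
print(y_digits)
```

With only 10 non-dependent columns the floor term is 1, so the lower bound of 2 clusters applies.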

`x`	data frame; a modeling dataset
`yname`	character string; dependent variable column name
`ytype`	integer value of 1 (binary y) or 2 (continuous y)
`bin.numBins`	integer value >= 2; number of desired bins
`bin.equalBinSize`	logical value; return equally sized (TRUE) or equally spaced (FALSE) bins
`bin.minPct`	integer between 0 and 100; percentile to force as the upper endpoint of the low (first) bin
`bin.maxPct`	integer between 0 and 100; percentile to force as the lower endpoint of the high (last) bin (must be > bin.minPct)
`nrsq.na.rm`	logical value indicating whether missing values of x (and their corresponding y values) should be removed
`variable_cluster.n`	integer value >= 2; number of desired clusters
`variable_cluster.na.rm`	logical value; should records with missing values be removed?
`univariateSummary.FUN`	function to be applied to y
`univariateGraph.yLabel`	character string; y variable label
`univariateGraph.yType`	character string; y variable format type; valid values are "int", "dlr" and "pct"
`univariateGraph.yDigits`	non-negative integer value indicating the number of decimal places to show for values of the y variable
`univariateGraph.yRangeMode`	character string; "tozero" (y-axis starts at 0) or "auto" (y-axis extremes determined by data)
`univariateGraph.barColor`	character string; fill color for bars (valid color)
`univariateGraph.lineColor`	character string; line color (valid color)
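
The difference between equally sized and equally spaced bins (bin.equalBinSize) can be illustrated with base R alone. This is a minimal sketch using cut() and quantile(); it is not the package's own binning code, which may choose breakpoints, labels and tie handling differently. The names v, num_bins, spaced and sized are illustrative.

```r
# Illustrative base-R sketch; NOT the package's bin() implementation.
v <- mtcars$hp
num_bins <- 4

# Equally spaced: breakpoints divide the range of v into equal-width intervals
spaced_breaks <- seq(min(v), max(v), length.out = num_bins + 1)
spaced <- cut(v, breaks = spaced_breaks, include.lowest = TRUE)

# Equally sized: breakpoints are quantiles, so each bin holds a similar
# share of the records (ties in v can make the counts uneven)
sized_breaks <- quantile(v, probs = seq(0, 1, length.out = num_bins + 1))
sized <- cut(v, breaks = unique(sized_breaks), include.lowest = TRUE)

print(table(spaced))
print(table(sized))
```

Equally spaced bins keep interval widths constant but can leave sparse bins in the tails of a skewed variable, while equally sized bins trade constant width for more balanced counts.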

A named list with class mt_modelEDA containing the following objects:

dataSummaryDT: datatable() object summarizing data
y.relativeHistogram: relativeHistogram() object for y
variable_cluster: data frame with variable_cluster() results
woe (binary y): named list of woe() tables
infoValue (binary y): named list of infoValue() values
woeDT (binary y): named list of woeDT() objects
nrsq (continuous y): named list of nrsq() values
variable_cluster_plus: data frame with variable_cluster_plus() results
clusterDT: named list of clusterDT() objects
univariateSummary: named list of univariateSummary() tables
univariateSummaryDT: named list of univariateSummaryDT() objects
univariateGraph: named list of univariateGraph() objects
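
For readers unfamiliar with the quantities stored in woe and infoValue, the following base-R sketch computes weight of evidence and information value for one grouped predictor under one common definition. It is an assumption-laden illustration: the package's woe() and infoValue() may use the opposite sign convention, different binning, or special handling of empty cells. The names tab, woe_sketch and iv_sketch are illustrative.

```r
# Illustrative sketch of one common WoE / IV definition;
# NOT the package's woe() or infoValue() implementation.
y <- mtcars$vs            # binary dependent variable (0/1)
g <- factor(mtcars$am)    # a predictor that is already grouped

tab <- table(g, y)        # rows: groups, columns: y = 0 and y = 1
events     <- tab[, "1"]
non_events <- tab[, "0"]

# Share of all events / non-events that falls into each group
pct_events <- events / sum(events)
pct_non    <- non_events / sum(non_events)

# Weight of evidence per group and the resulting information value
woe_sketch <- log(pct_events / pct_non)
iv_sketch  <- sum((pct_events - pct_non) * woe_sketch)

print(data.frame(group = names(events),
                 events = as.vector(events),
                 non_events = as.vector(non_events),
                 woe = as.vector(woe_sketch)))
print(iv_sketch)
```

A group with zero events or zero non-events would yield an infinite value under this definition, which is one reason real implementations usually bin or smooth the counts first.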

woe, nrsq, variable_cluster, univariateGraph

# binary y 
x <- modelEDA(mtcars, "vs", 1)
names(x)
x$woeDT

# continuous y
x <- modelEDA(mtcars, "mpg", 2)
names(x)
x$clusterDT
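
A further sketch shows how the prefixed arguments might be overridden by name. The argument names and values come from the usage and argument descriptions on this page; the specific choices (5 bins, median, "auto") are arbitrary, and the call is guarded so the block still runs when the package is not installed.

```r
# Override a few prefixed defaults; each prefix names the helper
# function that the argument applies to.
eda_args <- list(
  x = mtcars,
  yname = "mpg",
  ytype = 2,                            # continuous y
  bin.numBins = 5,                      # fewer bins than the default 10
  bin.equalBinSize = TRUE,              # equally sized bins
  univariateSummary.FUN = median,       # summarize y by its median
  univariateGraph.yRangeMode = "auto"   # let the data set the y-axis range
)

if (requireNamespace("miscTools", quietly = TRUE)) {
  res <- do.call(miscTools::modelEDA, eda_args)
  print(class(res))
  print(names(res))
} else {
  message("miscTools is not installed; skipping the modelEDA() call")
}
```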

dnegrey/miscTools documentation built on May 3, 2019, 2:57 p.m.
