# smbinning.eda: Exploratory Data Analysis (EDA) In smbinning: Scoring Modeling and Optimal Binning

## Description

It shows basic statistics for each characteristic in a data frame. The report includes:

• Field: Field name.

• Type: Factor, numeric, integer, other.

• Recs: Number of records.

• Miss: Number of missing records.

• Min: Minimum value.

• Q25: First quartile. It splits off the lowest 25% of data from the highest 75%.

• Q50: Median or second quartile. It cuts the data set in half.

• Avg: Average value.

• Q75: Third quartile. It splits off the lowest 75% of data from the highest 25%.

• Max: Maximum value.

• StDv: Standard deviation of a sample.

• Neg: Number of negative values.

• Pos: Number of positive values.

• OutLo: Number of low outliers, that is, records below `Q25-1.5*IQR`, where `IQR=Q75-Q25`.

• OutHi: Number of high outliers, that is, records above `Q75+1.5*IQR`, where `IQR=Q75-Q25`.
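
The outlier rule and a few of the counts listed above can be reproduced by hand in base R. The following is a minimal sketch, not the package's own code: the vector `x` is made-up data, quantiles use base R's default method (the method used inside `smbinning.eda` is not stated here), and counting `Recs` over all rows including missing ones is an assumption.

```r
# Hypothetical sample values; not taken from the package's dataset
x <- c(-5, 1, 2, 2, 3, 3, 4, 4, 5, 30, NA)

v <- x[!is.na(x)]                                      # non-missing values only
q <- quantile(v, c(0.25, 0.50, 0.75), names = FALSE)   # Q25, Q50, Q75 (base R default method)
iqr <- q[3] - q[1]                                     # IQR = Q75 - Q25

stats <- list(
  Recs  = length(x),                   # all records, missing included (assumption)
  Miss  = sum(is.na(x)),               # missing records
  Q25   = q[1],
  Q50   = q[2],
  Q75   = q[3],
  StDv  = sd(v),                       # sample standard deviation
  Neg   = sum(v < 0),                  # negative values
  Pos   = sum(v > 0),                  # positive values
  OutLo = sum(v < q[1] - 1.5 * iqr),   # records below Q25 - 1.5 * IQR
  OutHi = sum(v > q[3] + 1.5 * iqr)    # records above Q75 + 1.5 * IQR
)
str(stats)
```

With this sample, one value falls below the lower fence and one above the upper fence, so `OutLo` and `OutHi` each count a single record.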

## Usage

```r
smbinning.eda(df, rounding = 3, pbar = 1)
```

## Arguments

• `df`: A data frame.

• `rounding`: Optional parameter that sets the number of decimal places shown in the output table. Default is 3.

• `pbar`: Optional parameter that turns a progress bar on or off. Default value is 1.

## Value

The command `smbinning.eda` returns two data frames: `eda`, which lists each characteristic with basic statistics such as extreme values and quartiles, and `edapct`, which reports percentages of missing values and outliers, among others.

## Examples

```r
# Load library and its dataset
library(smbinning) # Load package and its data

# Example: Exploratory data analysis of dataset
smbinning.eda(chileancredit,rounding=3)$eda # Table with basic statistics
smbinning.eda(chileancredit,rounding=3)$edapct # Table with basic percentages
```
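
The example above runs the analysis twice, once per table. A possible variation, sketched under a few assumptions, stores the result once and reads both tables from it. The variable name `result` is illustrative, `pbar = 0` is assumed to be the value that switches the progress bar off, and the guards simply skip the call when the package or the `chileancredit` dataset is not available in the installed version.

```r
# Sketch only: skipped when smbinning or its chileancredit data is unavailable
result <- NULL
if (requireNamespace("smbinning", quietly = TRUE)) {
  library(smbinning)                  # load package and its data
  if (exists("chileancredit")) {
    # pbar = 0 is assumed to turn the progress bar off
    result <- smbinning.eda(chileancredit, rounding = 3, pbar = 0)
    print(head(result$eda))           # table with basic statistics
    print(head(result$edapct))        # table with basic percentages
  }
}
```

Keeping the result in one object avoids recomputing the statistics for every table that is inspected.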

smbinning documentation built on Dec. 1, 2017, 9:02 a.m.