Exploratory Data Analysis (EDA)

Description

Shows basic statistics for each numeric, integer, and factor characteristic (column) in a data frame.

Usage

smbinning.eda(df, rounding = 3, pbar = 1)

Arguments

df

A data frame.

rounding

Optional parameter that sets the number of decimal places shown in the output tables. Default is 3.

pbar

Optional parameter that turns the progress bar on or off. Default is 1 (on).

Value

The command smbinning.eda returns a list with two data frames: eda, which lists each characteristic with basic statistics such as extreme values and quartiles, and edapct, which reports percentages such as the share of missing values and outliers, among others.
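To make the kind of statistics described above concrete, here is a minimal base R sketch that computes extremes, quartiles, and the shares of missing values and outliers for one numeric column. It is not the package's implementation: the toy data frame, the helper name num_summary, and the 1.5 * IQR outlier rule are all assumptions made for illustration.

```r
# Hypothetical toy data frame; column names are illustrative only.
df <- data.frame(
  income = c(10, 12, 11, 13, 12, 100, NA, 11),
  grade  = factor(c("a", "b", "a", NA, "c", "a", "b", "a"))
)

# Summarize one numeric column: extremes, quartiles, share of missing values.
# Outliers are flagged with the 1.5 * IQR rule (an assumption, not taken
# from the smbinning documentation).
num_summary <- function(x, rounding = 3) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[3] - q[1]
  is_out <- !is.na(x) & (x < q[1] - 1.5 * iqr | x > q[3] + 1.5 * iqr)
  round(c(min = min(x, na.rm = TRUE), q25 = q[1], median = q[2], q75 = q[3],
          max = max(x, na.rm = TRUE),
          pct_missing = mean(is.na(x)), pct_outlier = mean(is_out)), rounding)
}

stats <- num_summary(df$income)

# Share of missing values for every column, numeric or factor.
pct_missing_all <- round(sapply(df, function(col) mean(is.na(col))), 3)

print(stats)
print(pct_missing_all)
```

In this sketch the percentages are expressed as fractions of all rows, including rows with missing values; the package's tables may use a different convention.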

Examples

# Package loading and data exploration
library(smbinning) # Load package and its data
data(chileancredit) # Load smbinning sample dataset (Chilean Credit)
 
# Training and testing samples (Just some basic formality for Modeling) 
chileancredit.train=subset(chileancredit,FlagSample==1)
chileancredit.test=subset(chileancredit,FlagSample==0)
 
# EDA application
eda.result=smbinning.eda(chileancredit.train,rounding=3) # Run the EDA once and reuse the result
eda.result$eda # Table with basic statistics.
eda.result$edapct # Table with basic percentages.