Exploratory Data Analysis Report

library(rmarkdown)
library(SmartEDA)
library(knitr)
library(scales)
library(gridExtra)
library(ggplot2)

data <- params$data

Exploratory Data Analysis (EDA)

EDA summarizes the main characteristics of the variables in a data set, often with visual graphs, without fitting a statistical model.

1. Overview of the data

Understanding the dimensions of the data set, the variable names, the overall missing-value summary, and the data type of each variable.

# Overview of the data (dimensions, variable counts, missing-value summary)
ovw_tabl <- ExpData(data = data, type = 1)
# Structure of the data (per-variable type, missing values, distinct values)
ovw_tab2 <- ExpData(data = data, type = 2)

Overview of the data

paged_table(ovw_tabl)

Structure of the data

paged_table(ovw_tab2)
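
To make the two tables above more concrete, here is a minimal base R sketch of the kind of information they hold. It is not SmartEDA's implementation: the names `overview` and `structure_tab` are hypothetical, and the built-in `airquality` data set stands in for the report's `data`.

```r
# Base R sketch of a data overview (illustration only, not SmartEDA's code).
# `airquality` is a built-in data set that contains some missing values.
df <- airquality

# Data-set level overview: size, numeric columns, columns with missing values
overview <- data.frame(
  Description = c("Sample size (rows)", "Number of variables (columns)",
                  "Numeric variables", "Variables with missing values"),
  Value = c(nrow(df), ncol(df),
            sum(vapply(df, is.numeric, logical(1))),
            sum(colSums(is.na(df)) > 0))
)

# Variable level structure: type, share of missing values, distinct values
structure_tab <- data.frame(
  Variable = names(df),
  Type = vapply(df, function(x) class(x)[1], character(1)),
  Pct_missing = round(colMeans(is.na(df)), 3),
  Distinct_values = vapply(df, function(x) length(unique(x[!is.na(x)])),
                           integer(1)),
  row.names = NULL
)

print(overview)
print(structure_tab)
```

Checking the structure table first is useful because the later sections branch on it, for example on the number of distinct values per variable.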

2. Summary of numerical variables

Summary of all numeric variables

# Summary statistics for all numeric variables, computed over all observations (by = "A")
snv_2 <- ExpNumStat(data, by = "A", gp = NULL, Qnt = seq(0, 1, 0.1), MesofShape = 2, Outlier = TRUE, round = 2)
rownames(snv_2) <- NULL
paged_table(snv_2)
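
As a rough illustration of what a numeric summary with shape measures and outlier counts involves, the base R sketch below computes a few of these quantities by hand. It is an assumption-laden stand-in, not the code behind `ExpNumStat`: `num_summary` is a hypothetical helper, outliers are flagged with the common 1.5 × IQR rule, and the built-in `mtcars` data set replaces the report's `data`.

```r
# Base R sketch of a per-variable numeric summary (not SmartEDA's code).
# `num_summary` is a hypothetical helper; `mtcars` is a built-in data set.
num_summary <- function(x) {
  x <- x[!is.na(x)]
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  iqr <- q[3] - q[1]
  lower <- q[1] - 1.5 * iqr   # lower outlier fence
  upper <- q[3] + 1.5 * iqr   # upper outlier fence
  m <- mean(x)
  # Moment-based skewness: third central moment over the cubed population SD
  skew <- mean((x - m)^3) / mean((x - m)^2)^1.5
  c(n = length(x), mean = m, sd = sd(x), median = q[2],
    skewness = skew, n_outliers = sum(x < lower | x > upper))
}

num_cols <- vapply(mtcars, is.numeric, logical(1))
# One row per variable, one column per statistic
snv_sketch <- round(t(sapply(mtcars[num_cols], num_summary)), 2)
print(snv_sketch)
```

Other outlier rules and skewness estimators exist, so the numbers from this sketch need not match the package's table exactly.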

3. Distributions of numerical variables

Graphical representation of all numeric features

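# `sn` and `theme` are not defined in this excerpt; they are assumed to be set earlier in the report
# Quantile-quantile plots for numeric variables
ExpOutQQ(data, nlim = 4, fname = NULL, Page = c(2, 2), sample = sn)
# Univariate distribution plots for numeric variables
ExpNumViz(data, target = NULL, type = 1, nlim = 10, fname = NULL, col = NULL, Page = c(2, 2), theme = theme, sample = sn)
# Scatter plots between numeric variables
ExpNumViz(data, Page = c(2, 1), sample = sn, theme = theme, scatter = TRUE)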
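
For readers who want to see what a density plot and a normal Q-Q plot check for a single variable, here is a small base R sketch. It uses only base graphics rather than the SmartEDA plotting functions, and `mtcars$mpg` is an arbitrary example variable.

```r
# Base R sketch: density and normal Q-Q plot for one numeric variable.
x <- mtcars$mpg

# Theoretical normal quantiles (qq$x) against sample quantiles (qq$y)
qq <- qqnorm(x, plot.it = FALSE)
# A correlation close to 1 means the points lie near a straight line
qq_cor <- cor(qq$x, qq$y)
print(round(qq_cor, 3))

# Draw the two plots side by side
op <- par(mfrow = c(1, 2))
plot(density(x), main = "Density of mpg")
qqnorm(x, main = "Normal Q-Q plot of mpg")
qqline(x)
par(op)
```

Points that bend away from the reference line in the Q-Q plot suggest skewness or heavy tails, which the density plot usually shows as well.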

4. Summary of categorical variables

Summary of categorical variables

# Frequency table for categorical variables
et1 <- ExpCTable(data, Target = NULL, margin = 1, clim = 10, nlim = 5, round = 2, bin = NULL, per = TRUE)
rownames(et1) <- NULL
# Show the table only if it has the expected five columns
if (length(et1) == 5) {
  paged_table(et1)
} else {
  print("Input data doesn't have any categorical columns to generate custom tables")
}

NA is Not Applicable
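
A frequency table of this kind can be sketched in base R as follows. This is an illustration under stated assumptions, not SmartEDA's implementation: `freq_table` is a hypothetical helper, and two columns of the built-in `mtcars` data set are converted to factors to act as categorical variables.

```r
# Base R sketch of a long-format frequency table for categorical variables.
# `freq_table` is a hypothetical helper, not part of SmartEDA.
freq_table <- function(x, name) {
  counts <- table(x, useNA = "ifany")   # keep missing values as their own level
  data.frame(
    Variable = name,
    Value = names(counts),
    Frequency = as.integer(counts),
    Percent = round(100 * as.integer(counts) / sum(counts), 2),
    stringsAsFactors = FALSE
  )
}

# Treat two mtcars columns as categorical for the example
df <- data.frame(cyl = factor(mtcars$cyl), gear = factor(mtcars$gear))

# One block of rows per variable, stacked into a single table
cat_tab <- do.call(rbind, Map(freq_table, df, names(df)))
rownames(cat_tab) <- NULL
print(cat_tab)
```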

5. Distributions of categorical variables

Bar plots for all categorical variables

Bar plot with vertical or horizontal bars for all categorical variables

# Plot only if at least one variable has 10 or fewer distinct values
test <- nrow(ovw_tab2[ovw_tab2$No_of_distinct_values < 11, ])
if (test > 0) ExpCatViz(data, target = NULL, fname = NULL, clim = 10, margin = 2, theme = theme, Page = c(2, 2), sample = sc)
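
The guard above skips plotting when no variable has few enough distinct values. A minimal base R sketch of the same idea is shown below. The threshold of 10 mirrors the `clim = 10` argument, while the factor conversion, the variable names `n_distinct`, `is_cat` and `plot_cols`, and the use of `barplot` are assumptions made for the example.

```r
# Base R sketch: pick low-cardinality categorical columns, then plot them.
df <- transform(mtcars, cyl = factor(cyl), am = factor(am))

# Number of distinct non-missing values per column
n_distinct <- vapply(df, function(x) length(unique(x[!is.na(x)])), integer(1))
# Columns stored as factor or character are treated as categorical
is_cat <- vapply(df, function(x) is.factor(x) || is.character(x), logical(1))

plot_cols <- names(df)[is_cat & n_distinct <= 10]

if (length(plot_cols) > 0) {
  for (col in plot_cols) {
    barplot(table(df[[col]]), main = col, ylab = "Count")
  }
} else {
  message("No categorical columns with 10 or fewer distinct values")
}
```

Filtering on cardinality before plotting avoids unreadable bar charts for identifier-like columns with many levels.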



SmartEDA documentation built on Dec. 4, 2022, 1:15 a.m.