library(rmarkdown)
library(SmartEDA)
library(knitr)
library(scales)
library(gridExtra)
library(ggplot2)
data <- params$data
Analyze the data set to summarize the main characteristics of its variables, often with visual graphs, without using a statistical model.
Understand the dimensions of the data set, the variable names, the overall missing-value summary, and the data type of each variable.
# Overview of the data
ExpData(data=data,type=1)
# Structure of the data
ExpData(data=data,type=2)
ovw_tabl <- ExpData(data=data,type=1)
ovw_tab2 <- ExpData(data=data,type=2)
Overview of the data
paged_table(ovw_tabl)
Structure of the data
paged_table(ovw_tab2)
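As a rough illustration of what an overview and structure summary of this kind contains, the following base R sketch computes the dimensions, variable types, and missing-value counts by hand. It uses the built-in `mtcars` data set as a stand-in for `params$data` (an assumption), and the table layout is hypothetical; it is not the output format of `ExpData`.

```r
# Stand-in data set (assumption): the template itself reads `params$data`.
df <- mtcars
df$mpg[c(3, 7)] <- NA  # inject two missing values for illustration

# Dimensions and variable names
dims <- dim(df)
var_names <- names(df)

# Per-variable type and missing-value summary (hypothetical layout)
structure_tab <- data.frame(
  variable    = var_names,
  type        = vapply(df, function(x) class(x)[1], character(1)),
  n_missing   = vapply(df, function(x) sum(is.na(x)), integer(1)),
  pct_missing = round(vapply(df, function(x) mean(is.na(x)), numeric(1)) * 100, 2),
  row.names   = NULL
)

print(dims)
print(structure_tab)
```

Only the `mpg` row should report missing values here, since that is the only column that was modified.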
Target variable
Summary of continuous dependent variable
`r Target`
`r label`
summary(data[,Target])
snv_2 <- ExpNumStat(data,by="GA",gp=Target,Qnt=seq(0,1,0.1),MesofShape=2,Outlier=TRUE,round=2)
rownames(snv_2) <- NULL
Summary statistics when the dependent variable is continuous: `r Target`.
ExpNumStat(data,by="A",gp=Target,Qnt=seq(0,1,0.1),MesofShape=2,Outlier=TRUE,round=2)
paged_table(snv_2)
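The `Qnt=seq(0,1,0.1)` argument above requests deciles, and `Outlier=TRUE` requests outlier information. A minimal base R sketch of those two ideas follows, assuming `mtcars$mpg` as a stand-in target and the common 1.5 × IQR rule for outliers; the exact rule SmartEDA applies may differ.

```r
# Stand-in for the target column (assumption): mtcars$mpg
target <- mtcars$mpg

# Deciles, mirroring the probabilities in Qnt = seq(0, 1, 0.1)
deciles <- quantile(target, probs = seq(0, 1, 0.1), na.rm = TRUE)

# Flag outliers with the 1.5 * IQR rule (an assumption, not
# necessarily the rule used by ExpNumStat)
q <- quantile(target, probs = c(0.25, 0.75), na.rm = TRUE)
iqr <- q[[2]] - q[[1]]
lower <- q[[1]] - 1.5 * iqr
upper <- q[[2]] + 1.5 * iqr
n_outliers <- sum(target < lower | target > upper, na.rm = TRUE)

print(round(deciles, 2))
print(n_outliers)
```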
Graphical representation of all numeric features. The plot types below are used to explore the data.
Quantile-quantile plot for all numerical variables
ExpOutQQ(data,nlim=4,fname=NULL,Page=c(2,2),sample=sn)
Density plot for all numerical variables
ExpNumViz(data,target=NULL,nlim=10,fname=NULL,col=NULL,theme=theme,Page=c(2,2),sample=sn)
Scatter plot between all numeric variables and target variable `r Target`.
This plot helps examine how well the target variable is correlated with the independent variables in the data set.
ExpNumViz(data,target=NULL,nlim=5,Page=c(2,1),theme=theme,sample=sn,scatter=TRUE)
Dependent variable is `r Target` (continuous).
ExpNumViz(data,target=Target,nlim=5,fname=NULL,col=NULL,theme=theme,Page=c(2,2),sample=sn)
Correlation summary table
snv_22 <- ExpNumStat(data,by="GA",gp=Target,MesofShape=2,Outlier=FALSE,round=2,dcast=TRUE,val="cor")
rownames(snv_22) <- NULL
ExpNumStat(data,by="GA",gp=Target,MesofShape=2,Outlier=FALSE,round=2,dcast=TRUE,val="cor")
paged_table(snv_22)
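To make the idea of a correlation summary concrete, here is a small base R sketch that computes the Pearson correlation of each numeric variable with a continuous target and sorts by absolute strength. The `mtcars` data set, the `mpg` target, and the column names of `cor_tab` are assumptions for illustration; the table produced with `val="cor"` may be laid out differently.

```r
# Stand-in data and target name (assumptions)
df <- mtcars
target_name <- "mpg"

# All numeric columns except the target
num_cols <- names(df)[vapply(df, is.numeric, logical(1))]
predictors <- setdiff(num_cols, target_name)

# Pearson correlation of each predictor with the target
cor_tab <- data.frame(
  variable = predictors,
  cor_with_target = round(
    vapply(predictors,
           function(v) cor(df[[v]], df[[target_name]], use = "complete.obs"),
           numeric(1)),
    2
  ),
  row.names = NULL
)

# Strongest relationships first, regardless of sign
cor_tab <- cor_tab[order(-abs(cor_tab$cor_with_target)), ]
print(cor_tab)
```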
Summary of categorical variables
et1 <- ExpCTable(data,Target=NULL,margin=1,clim=10,nlim=5,round=2,per=TRUE)
rownames(et1) <- NULL
et11 <- ExpCTable(data,Target=Target,margin=1,clim=10,nlim=5,round=2,bin=4,per=TRUE)
rownames(et11) <- NULL
ExpCTable(data,margin=1,clim=10,nlim=5,round=2,per=TRUE)
paged_table(et1)
`r Target`
## bin=4 discretizes the continuous target into 4 categories based on quantiles
ExpCTable(data,Target=Target,margin=1,clim=10,nlim=5,round=2,bin=4,per=TRUE)
paged_table(et11)
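The quantile-based discretization described in the comment above can be sketched in base R with `quantile` and `cut`. This is only an approximation of the idea under stated assumptions: `mtcars$mpg` stands in for the target, `mtcars$cyl` for a categorical variable, and row-wise percentages are used to mirror `margin=1`. The binning and labels SmartEDA produces may differ.

```r
# Stand-in target and categorical variable (assumptions)
target <- mtcars$mpg
category <- mtcars$cyl

# Quartile break points give 4 bins; unique() guards against ties
breaks <- quantile(target, probs = seq(0, 1, 0.25), na.rm = TRUE)
target_bin <- cut(target, breaks = unique(breaks), include.lowest = TRUE)

# Cross-tabulate the category against the binned target
tab <- table(cyl = category, target_bin)

# Row-wise percentages
row_pct <- round(prop.table(tab, margin = 1) * 100, 2)

print(tab)
print(row_pct)
```

Each row of `row_pct` should sum to roughly 100, up to rounding.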
Graphical representation of all Categorical variables
Bar plot with vertical or horizontal bars for all categorical variables
ExpCatViz(data,clim=10,margin=2,theme=theme,Page = c(2,2),sample=sc)