View source: R/fn_exp_numeric.R
ExpNumStat | R Documentation |
Provides summary statistics for all numeric variables. The function automatically scans each variable and selects only the numeric/integer ones. If a target variable is specified, the function also summarises the relationship between the target variable and each independent variable.
ExpNumStat(
data,
by = "A",
gp = NULL,
Qnt = NULL,
Nlim = 10,
MesofShape = 2,
Outlier = FALSE,
round = 3,
weight = NULL,
dcast = FALSE,
val = NULL
)
data | dataframe or matrix |
by | grouping option: "A" (summary statistics for all observations), "G" (summary statistics by group), or "GA" (summary statistics by group and overall) |
gp | target variable, if any; default NULL |
Qnt | quantiles to compute; default NULL. For example, c(.25, 0.75) returns the 25th and 75th percentiles |
Nlim | numeric variable limit (default value is 10, which means only variables of type numeric/integer with more than 10 unique values are considered) |
MesofShape | measures of shape (skewness and kurtosis) |
Outlier | if TRUE, calculate the lower hinge, upper hinge and number of outliers |
round | number of decimal places used for rounding |
weight | a vector of weights; its length must be equal to the length of data |
dcast | if TRUE, cast the output using the fast dcast from data.table |
val | name of the column whose values will be filled to cast (see the Details section for the list of column names) |
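The variable screening described for Nlim can be approximated in base R. This is only a sketch of the documented rule, not the package's own code; it assumes "more than Nlim unique values" is a strict comparison, and the names `nlim`, `is_candidate` and `selected` are hypothetical.

```r
# Illustrative only: mimic the documented Nlim rule with base R.
# Assumption: "more than Nlim unique values" is a strict ">" comparison.
nlim <- 10

# Flag columns that are numeric/integer AND have more than nlim unique values
is_candidate <- vapply(
  mtcars,
  function(x) is.numeric(x) && length(unique(x)) > nlim,
  logical(1)
)

# Names of the columns that would be summarised under this rule
selected <- names(mtcars)[is_candidate]
print(selected)
```

Under this rule, low-cardinality columns of mtcars such as cyl or gear are dropped from the summary, which is why they are better passed as the gp target than treated as numeric inputs.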
column descriptions
Vname is the variable name
Group is the target variable
TN is the total sample size (including NA observations)
nNeg is the number of negative observations
nPos is the number of positive observations
nZero is the number of zero observations
NegInf is the count of negative infinite values
PosInf is the count of positive infinite values
NA_value is the count of NA (missing) values
Per_of_Missing is the percentage of missing values
Min is the minimum value
Max is the maximum value
Mean is the average value
Median is the median value
SD is the standard deviation
CV is the coefficient of variation, (SD/Mean)*100
IQR is the interquartile range
Qnt is the quantile values
MesofShape is the skewness and kurtosis
Outlier is the number of outliers
Cor is the correlation between the target and independent variables
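To make the column definitions concrete, the sketch below computes base-R analogues of several of them for a single toy vector. It is an illustration under stated assumptions, not the package's implementation: the vector `x` and the list `col_stats` are hypothetical, Per_of_Missing is assumed to be on a percentage scale, and the package may use different quantile or rounding conventions.

```r
# Illustrative only: base-R analogues of a few documented columns.
x <- c(-2, 0, 3, 5, NA, 8)  # toy data, not taken from the package

col_stats <- list(
  TN             = length(x),                   # total sample, NA included
  nNeg           = sum(x < 0, na.rm = TRUE),    # negative observations
  nZero          = sum(x == 0, na.rm = TRUE),   # zero observations
  nPos           = sum(x > 0, na.rm = TRUE),    # positive observations
  NA_value       = sum(is.na(x)),               # missing observations
  Per_of_Missing = 100 * mean(is.na(x)),        # assumption: percentage scale
  Mean           = mean(x, na.rm = TRUE),
  SD             = sd(x, na.rm = TRUE),
  IQR            = IQR(x, na.rm = TRUE)         # base R default quantile type
)

# Coefficient of variation as documented: (SD / Mean) * 100
col_stats$CV <- 100 * col_stats$SD / col_stats$Mean

str(col_stats)
```

Because CV divides by the mean, it becomes unstable for variables whose mean is close to zero, so it is worth reading alongside SD rather than on its own.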
Returns summary statistics for numeric independent variables.
Summary by:
Only overall level
Only group level
Both overall and group level
See also: describe.by
# Descriptive summary of numeric variables: summary by target variable
ExpNumStat(mtcars,by="G",gp="gear",Qnt=c(0.1,0.2),MesofShape=2,
Outlier=TRUE,round=3)
# Descriptive summary of numeric variables: overall summary
ExpNumStat(mtcars,by="A",gp="gear",Qnt=c(0.1,0.2),MesofShape=2,
Outlier=TRUE,round=3)
# Descriptive summary of numeric variables: summary by overall and group
ExpNumStat(mtcars,by="GA",gp="gear",Qnt=seq(0,1,.1),MesofShape=1,
Outlier=TRUE,round=2)
# Summary of one specific statistic (here IQR) for all numeric variables
ExpNumStat(mtcars,by="GA",gp="gear",Qnt=c(0.1,0.2),MesofShape=2,
Outlier=FALSE,round=2,dcast = TRUE,val = "IQR")
# Weighted summary statistics
ExpNumStat(mtcars,by="GA",gp="gear",Qnt=c(0.1,0.2),MesofShape=2,
Outlier=FALSE,round=2,dcast = TRUE,val = "IQR", weight = "wt")