Description Usage Arguments Value Author(s) Examples
The function returns a univariate analysis of the variables as a data frame. The univariate statistics are: minimum, maximum, mean, median, number of distinct values, variable type, count of null values, percentage of null values, maximum population percentage among all classes/values, and correlation with the target. It also returns the names of the character and numerical variables, along with the names of variables whose population concentration at a single class/value exceeds a threshold.
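The statistics listed above can be approximated in base R. The following is a minimal sketch, not the package's implementation: the helper summarise_column is hypothetical, the 0/1 target is randomly generated for illustration only, and reporting perc_missing as a fraction is an assumption.

```r
# Hypothetical helper (not part of the package): one row of statistics
# for a single column x, given a 0/1 target vector of the same length.
summarise_column <- function(x, target) {
  is_num <- is.numeric(x)
  n <- length(x)
  data.frame(
    var_min       = if (is_num) min(x, na.rm = TRUE) else NA_real_,
    var_max       = if (is_num) max(x, na.rm = TRUE) else NA_real_,
    mean          = if (is_num) mean(x, na.rm = TRUE) else NA_real_,
    median        = if (is_num) median(x, na.rm = TRUE) else NA_real_,
    var_vals      = length(unique(x)),          # distinct values (NA counts as one)
    type          = if (is_num) "numeric" else "character",
    count_missing = sum(is.na(x)),
    perc_missing  = sum(is.na(x)) / n,          # assumption: expressed as a fraction
    # share of rows held by the most frequent class/value
    max_pop_conc  = max(table(x, useNA = "ifany")) / n,
    corr          = if (is_num) cor(x, target, use = "complete.obs") else NA_real_,
    stringsAsFactors = FALSE
  )
}

base <- iris
base$Species <- as.character(base$Species)

# Assumption: an arbitrary random 0/1 target, used only to make the sketch run.
set.seed(1)
target <- rbinom(nrow(base), 1, 0.5)

# Stack one summary row per column into a single table.
univar_table <- do.call(rbind, lapply(names(base), function(v) {
  data.frame(var = v, summarise_column(base[[v]], target),
             stringsAsFactors = FALSE)
}))
print(univar_table)
```

Because the target here is random, the corr column will differ from any real analysis; the remaining columns depend only on the input data.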
univariate(base, target, threshold)
base: input data frame
target: column (field) name of the target variable, passed as a string; the target must be binary (0/1)
threshold: sparsity threshold, given as a decimal fraction
The function returns an object of class "univariate", which is a list containing the following components:
univar_table: univariate summary of the variables
num_var_name: array of column names of the numerical variables
char_var_name: array of column names of the categorical variables
sparse_var_name: array of column names whose population concentration at a single class or value is more than the sparsity threshold
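To make the three name vectors concrete, here is a rough base R sketch of how they could be derived. It is an illustration under assumptions, not the package's code: the helper sparse_names and the threshold values are hypothetical, and the comparison is taken to be strictly greater than the threshold.

```r
base <- iris
base$Species <- as.character(base$Species)

# Split column names by type.
is_num <- vapply(base, is.numeric, logical(1))
num_var_name  <- names(base)[is_num]
char_var_name <- names(base)[!is_num]

# Hypothetical helper: names of columns whose most frequent class/value
# holds more than `threshold` of the rows.
sparse_names <- function(df, threshold) {
  max_pop_conc <- vapply(
    df,
    function(x) max(table(x, useNA = "ifany")) / length(x),
    numeric(1)
  )
  names(df)[max_pop_conc > threshold]
}

print(num_var_name)
print(char_var_name)
print(sparse_names(base, 0.5))  # no column is this concentrated
print(sparse_names(base, 0.3))  # each Species class holds 1/3 of the rows
```

A lower threshold flags more columns: at 0.3 the Species column qualifies because each of its three classes covers one third of the rows, while at 0.5 nothing does.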
Arya Poddar <aryapoddar290990@gmail.com>
           var var_min var_max     mean median var_vals      type count_missing
1 Sepal.Length     4.3     7.9 5.843333   5.80       35   numeric             0
2  Sepal.Width     2.0     4.4 3.057333   3.00       23   numeric             0
3 Petal.Length     1.0     6.9 3.758000   4.35       43   numeric             0
4  Petal.Width     0.1     2.5 1.199333   1.30       22   numeric             0
5      Species      NA      NA       NA     NA        3 character             0
  perc_missing max_pop_conc        corr
1            0   0.06666667  0.00124037
2            0   0.17333333  0.02061405
3            0   0.08666667 -0.04615706
4            0   0.19333333 -0.06762063
5            0   0.33333333          NA
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
[1] "Species"
character(0)