desc_groups: Profiling categorical variable


View source: R/models_lib.R

Description

Calculates the mean (or another function) per group to analyze how each segment behaves. It scales each variable's mean into the 0 to 1 range, so the groups can be profiled easily by their means. This function is also useful for profiling cluster results in terms of their means. It automatically adds a row that summarizes each column regardless of the group_var categories, which is useful for comparing each segment with the whole population. All factor/character variables are excluded.
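The grouping-plus-overall-row idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the funModeling implementation: the helper name group_profile is hypothetical, it skips the 0 to 1 scaling step, and it relies on aggregate() instead of whatever the package uses internally.

```r
# Hypothetical helper (not part of funModeling): per-group summary of the
# numeric columns, plus one "All_Data" row computed over the whole data set.
group_profile <- function(data, group_var, group_func = mean) {
  # keep numeric columns only, leaving out the grouping variable itself
  num_cols <- setdiff(names(data)[sapply(data, is.numeric)], group_var)

  # apply group_func to every numeric column within each group
  groups <- setNames(list(as.character(data[[group_var]])), group_var)
  by_group <- aggregate(data[num_cols], by = groups, FUN = group_func)
  by_group[[group_var]] <- as.character(by_group[[group_var]])

  # the same function applied regardless of the grouping
  all_row <- data.frame(setNames(list("All_Data"), group_var),
                        lapply(data[num_cols], group_func),
                        stringsAsFactors = FALSE)

  rbind(by_group, all_row)
}

res <- group_profile(mtcars, "cyl")
```

Here res has one row per cyl value followed by the All_Data row, so each segment can be read against the overall figure in the last row.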

Usage

desc_groups(data, group_var, group_func = mean, add_all_data_row = T)

Arguments

data

input data source

group_var

variable to group by

group_func

the function to apply within each group. It must be passed as a function, not as a string. Default: mean

add_all_data_row

flag indicating whether the result contains the 'All_Data' row, which is the function applied to the whole data set regardless of the grouping. Useful as a baseline to compare against the per-group values.
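The distinction between a function and a string for group_func can be illustrated with a small base R sketch. The helper summarise_by is hypothetical and uses split() and sapply() as a stand-in for the grouped summary; it is not how desc_groups is implemented.

```r
# Hypothetical helper: apply a function object to a vector, group by group.
summarise_by <- function(x, groups, group_func = mean) {
  # a quoted name such as "median" is a character vector, so it fails here
  stopifnot(is.function(group_func))
  sapply(split(x, groups), group_func)
}

# pass the function itself, unquoted
med_mpg <- summarise_by(mtcars$mpg, mtcars$cyl, group_func = median)
max_mpg <- summarise_by(mtcars$mpg, mtcars$gear, group_func = max)

# passing a string raises an error instead of being looked up by name
bad_call <- try(summarise_by(mtcars$mpg, mtcars$cyl, group_func = "median"),
                silent = TRUE)
```

Any function that reduces a numeric vector to a single value (median, max, min, sd) fits the same slot.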

Value

grouped data frame

Examples

# default grouping function: mean
desc_groups(data=mtcars, group_var="cyl")

# using the median as the grouping function
desc_groups(data=mtcars, group_var="cyl", group_func=median)

# using the max as the grouping function
desc_groups(data=mtcars, group_var="gear", group_func=max)

Example output

Loading required package: Hmisc
Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Loading required package: ggplot2

Attaching package: 'Hmisc'

The following objects are masked from 'package:base':

    format.pval, round.POSIXt, trunc.POSIXt, units

funModeling v.1.6.5 :)
Examples and tutorials at livebook.datascienceheroes.com

`summarise_each()` is deprecated.
Use `summarise_all()`, `summarise_at()` or `summarise_if()` instead.
To map `funs` over a selection of variables, use `summarise_at()`
`mutate_each()` is deprecated.
Use `mutate_all()`, `mutate_at()` or `mutate_if()` instead.
To map `funs` over a selection of variables, use `mutate_at()`
`summarise_each()` is deprecated.
Use `summarise_all()`, `summarise_at()` or `summarise_if()` instead.
To map `funs` over all variables, use `summarise_all()`
       cyl      mpg     disp       hp     drat      wt     qsec     vs      am   gear   carb
1        4 26.66000 105.1400  82.6400 4.070000 2.29000 19.14000 0.9100 0.73000 4.0900 1.5500
2        6 19.74000 183.3100 122.2900 3.590000 3.12000 17.98000 0.5700 0.43000 3.8600 3.4300
3        8 15.10000 353.1000 209.2100 3.230000 4.00000 16.77000 0.0000 0.14000 3.2900 3.5000
4 All_Data 20.09062 230.7219 146.6875 3.596563 3.21725 17.84875 0.4375 0.40625 3.6875 2.8125
       cyl  mpg  disp    hp  drat    wt  qsec vs am gear carb
1        4 26.0 108.0  91.0 4.080 2.200 18.90  1  1    4  2.0
2        6 19.7 167.6 110.0 3.900 3.210 18.30  1  0    4  4.0
3        8 15.2 350.5 192.5 3.120 3.750 17.18  0  0    3  3.5
4 All_Data 19.2 196.3 123.0 3.695 3.325 17.71  0  0    4  2.0
      gear  mpg cyl  disp  hp drat    wt  qsec vs am carb
1        3 21.5   8 472.0 245 3.73 5.420 20.22  1  0    4
2        4 33.9   6 167.6 123 4.93 3.440 22.90  1  1    4
3        5 30.4   8 351.0 335 4.43 3.570 16.90  1  1    8
4 All_Data 33.9   8 472.0 335 4.93 5.424 22.90  1  1    8

funModeling documentation built on July 1, 2020, 5:40 p.m.