improve_kmeans_labels_variable: improve_kmeans_labels_variable

View source: R/stats.R

Description

Improve the dataset labels based on a balanced grouped sum of a particular variable.

Usage

improve_kmeans_labels_variable(
  df,
  id,
  label,
  var_model,
  split_type = c("mean_split", "range_split")
)

Arguments

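df

Dataset whose labels will be modified.

id

Identifier variable of the dataset.

label

Variable holding the k-means labels.

var_model

Reference variable to balance across the labels.

split_type

Type of label modification, either "mean_split" or "range_split".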
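
The default `split_type = c("mean_split", "range_split")` follows a common R idiom in which the function resolves the argument with `match.arg()`. Whether this package does so internally is an assumption; the sketch below uses a hypothetical helper, `pick_split`, only to show how that idiom behaves.

```r
# Hypothetical helper (not part of the package) illustrating the
# match.arg() idiom suggested by the default in the Usage section.
pick_split <- function(split_type = c("mean_split", "range_split")) {
  # When the argument is omitted, match.arg() returns the first choice;
  # any value outside the listed choices raises an error.
  match.arg(split_type)
}

pick_split()               # uses the first choice
pick_split("range_split")  # explicit selection
```

Under that assumption, omitting `split_type` would select "mean_split".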

Details

  1. Get the grouped sum of var_model by the k-means label variable.

  2. Calculate the population mean and standard deviation of the grouped sums from step 1; these are the parameters used to modify the labels.

  3. Create value_check, the constant values used as the comparison reference; which values are used depends on the split_type parameter.

  4. Use the function split_lower_upper_df to obtain the labels that fall below and above the constant values from step 3.

  5. Apply the subfunction modify_labels_variable_df.
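
Steps 1, 2 and 4 can be illustrated with a minimal base R sketch. The toy data, the column name `sales`, and the use of the mean alone as the comparison value are assumptions made for illustration; the package's own split_lower_upper_df and modify_labels_variable_df are not reproduced here, and their exact rules may differ.

```r
# Toy data: column names and values are illustrative assumptions.
df <- data.frame(
  id    = 1:6,
  label = c("a", "a", "b", "b", "c", "c"),
  sales = c(10, 20, 5, 5, 40, 40)
)

# Step 1: grouped sum of the reference variable by k-means label.
grouped <- aggregate(sales ~ label, data = df, FUN = sum)

# Step 2: population mean and standard deviation of the grouped sums.
# sd() divides by n - 1, so the population version is computed by hand.
n     <- nrow(grouped)
mu    <- mean(grouped$sales)
sigma <- sqrt(sum((grouped$sales - mu)^2) / n)

# Step 4 (simplified): labels whose grouped sum falls below or above
# the comparison value, taken here to be the mean (an assumption).
lower <- as.character(grouped$label[grouped$sales < mu])
upper <- as.character(grouped$label[grouped$sales > mu])
```

In this toy case the grouped sums are 30, 10 and 80, so labels "a" and "b" fall below the mean of 40 and "c" falls above it.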

Value

This function modifies the label variable of df according to split_type.

Note

This function is used to improve the k-means labels based on a particular variable.

Author(s)

Eduardo Trujillo


1Edtrujillo1/udeploy documentation built on July 13, 2021, 9:12 p.m.