improve_kmeans_labels: improve_kmeans_labels
In 1Edtrujillo1/udeploy: Back-end & Front-end functions

Optimize the generated k-means labels for a disaggregated dataset

improve_kmeans_labels(df, id, label, k)

`df`	dataset whose labels will be changed
`id`	id variable of the dataset, used as the balance reference
`label`	k-means label variable
`k`	number of desired clusters

1. Split the dataset by id, testing whether the elements of label are unique. The result is a list, called df_splited inside improve_kmeans_labels, with a unique sublist and a duplicated sublist.
2. Split the duplicated sublist by label, again testing for uniqueness on label, into unique elements and duplicated elements.

   The duplicated elements become the duplicated sublist from step 1.
   The unique elements are appended to the unique sublist from step 1, each to its matching sublist.
   The result is a correct partition into unique and duplicated elements.

3. From the duplicated sublist, take the first row of each sublist; call this to_modify. From the unique sublist, take a random sample of the same length as the duplicated one; call this uniq_modify. From uniq_modify, build a sublist of the k-means labels called uniq_labels.
4. Modify to_modify using the labels in uniq_labels: if the label in to_modify is in uniq_labels, draw a random number between 1 and k, excluding the labels in uniq_labels. Otherwise, take any label from the corresponding sublist of uniq_labels.
5. Append the modified to_modify to the samples in uniq_modify, sublist by sublist.
6. Update the original list df_splited: replace the elements of the unique sublist with uniq_modify, and delete the first row of each sublist of the duplicated sublist, since that row was used in step 3.
7. Rebuild the original dataset with the modified labels.
8. If the duplicated sublist still contains duplicated elements, apply the function recursively to change the labels of those that are repeated.
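
The split in step 1 and the relabelling rule in step 4 can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the column names `id` and `label`, the helper `pick_label`, and the fallback used when no candidate label remains are all assumptions made for illustration.

```r
# Toy data; the column names `id` and `label` are assumptions.
df <- data.frame(
  id    = c("a", "a", "b", "b", "b", "c"),
  label = c(1, 2, 1, 1, 3, 2)
)

# Step 1 (sketch): split the rows by id and test each group for repeated labels.
by_id   <- split(df, df$id)
has_dup <- vapply(by_id, function(g) anyDuplicated(g$label) > 0, logical(1))

# A two-element list loosely mirroring the df_splited structure described above.
df_splited <- list(
  unique     = by_id[!has_dup],
  duplicated = by_id[has_dup]
)

# Step 4 (sketch): hypothetical helper that chooses a new label for one row.
pick_label <- function(current, uniq_labels, k) {
  if (current %in% uniq_labels) {
    # Label already taken: choose among 1..k, excluding the labels in uniq_labels.
    candidates <- setdiff(seq_len(k), uniq_labels)
  } else {
    # Otherwise: choose any label from uniq_labels.
    candidates <- uniq_labels
  }
  # Assumption: keep the current label when no candidate is available.
  if (length(candidates) == 0) return(current)
  # Index with sample.int() so that a single candidate is returned as-is
  # (sample(x, 1) would draw from 1:x when x is a single number).
  candidates[sample.int(length(candidates), 1)]
}

names(df_splited$unique)      # groups "a" and "c" have no repeated label
names(df_splited$duplicated)  # group "b" repeats label 1
pick_label(1, uniq_labels = c(1, 2), k = 4)  # returns 3 or 4
```

Indexing with `sample.int()` rather than calling `sample()` directly avoids a well-known pitfall in R: when only one numeric candidate remains, `sample(x, 1)` samples from `1:x` instead of returning `x`.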