View source: R/discretize_jointly.R
Discretizes multivariate continuous data using a grid that captures the joint distribution by preserving clusters in the original data.
discretize.jointly(data, k = c(2:10), cluster_label = NULL, min_level = 2)

data
a matrix containing two or more continuous variables. Columns are variables; rows are observations.
k
either the number or range of clusters to be found on data.
cluster_label
a vector of user-specified cluster labels for each observation in data.
min_level
the minimum number of levels along each dimension.
The function implements the algorithms described in Wang et al. (2020).
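To make the idea of a cluster-preserving grid concrete, here is a minimal base-R sketch. It is not the package's algorithm: it swaps in a simple midpoint heuristic, placing one cut halfway between neighbouring k-means cluster centers on each dimension. The synthetic data and the helper midpoint_cuts are hypothetical and exist only for this illustration.

```r
set.seed(42)
# Two well-separated synthetic clusters in two dimensions (illustrative data).
data <- rbind(cbind(rnorm(50, 0, 0.3), rnorm(50, 0, 0.3)),
              cbind(rnorm(50, 3, 0.3), rnorm(50, 3, 0.3)))

# Step 1: cluster the observations jointly across all dimensions.
fit <- kmeans(data, centers = 2, nstart = 10)

# Step 2 (simplified heuristic, NOT the package's method): on each dimension,
# place one cut halfway between neighbouring cluster centers.
midpoint_cuts <- function(centers_1d) {
  s <- sort(unique(centers_1d))
  (head(s, -1) + tail(s, -1)) / 2
}
grid <- lapply(seq_len(ncol(data)), function(j) midpoint_cuts(fit$centers[, j]))

# Step 3: assign each value to the interval it falls in (levels start at 1).
discretized <- sapply(seq_len(ncol(data)), function(j) {
  cut(data[, j], breaks = c(-Inf, grid[[j]], Inf), labels = FALSE)
})

# Cross-tabulate grid cells against cluster labels to see how well they agree.
table(cell = paste(discretized[, 1], discretized[, 2]), cluster = fit$cluster)
```

With clusters this well separated, each cluster should occupy its own grid cell; real data generally needs the more careful boundary selection that the cited paper describes.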
A list that contains four items:

a matrix that contains the discretized version of the original data.

a list of vectors containing decision boundaries for each variable/dimension.

a vector containing cluster labels for each observation in data.

a similarity score between the joint discretization and the cluster labels.
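As a rough illustration of how a list of per-dimension decision boundaries relates to a discretized matrix, the base-R sketch below applies hand-picked boundaries to a tiny matrix. The data values and boundaries are made up, numbering levels from 1 is an assumption, and the code does not call discretize.jointly.

```r
# Hypothetical observations: four rows, two continuous variables.
data <- cbind(x = c(-1.5, -0.2, 0.4, 2.0),
              y = c(0.1, 0.9, 0.5, 0.3))

# Hypothetical decision boundaries, one vector per variable.
grid <- list(c(-0.5, 1.0),  # two cut points on x give three levels
             c(0.4))        # one cut point on y gives two levels

# findInterval() counts the boundaries at or below each value;
# adding 1 numbers the levels from 1 (an assumed convention).
discretized <- sapply(seq_len(ncol(data)), function(j) {
  findInterval(data[, j], grid[[j]]) + 1L
})
colnames(discretized) <- colnames(data)
discretized
```

Each column of the result holds the interval index of the corresponding variable, so a row of the matrix identifies one cell of the grid.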

Jiandong Wang, Sajal Kumar and Mingzhou Song
See Ckmeans.1d.dp for discretizing univariate continuous data.
# using a specified k
x = rnorm(100)
y = sin(x)
z = cos(x)
data = cbind(x, y, z)
discretized_data = discretize.jointly(data, k=5)$D
# using a range of k
x = rnorm(1000)
y = log1p(abs(x))
z = tan(x)
data = cbind(x, y, z)
discretized_data = discretize.jointly(data, k=c(3:10))$D
# using an alternate clustering method to kmeans
library(cluster)
x = rnorm(1000)
y = log1p(abs(x))
z = sin(x)
data = cbind(x, y, z)
# precluster the data using partition around medoids (PAM)
cluster_label = pam(x=data, diss = FALSE, metric = "euclidean", k = 5)$clustering
discretized_data = discretize.jointly(data, cluster_label = cluster_label)$D
