Clustering algorithm that imputes missing values on the go. The (local) imputation distribution is defined by the currently assigned cluster. The first draw is by random imputation.
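To make the "first draw" concrete, here is a minimal base R sketch of one possible random imputation. It is not the package's internal code: the helper name `random_impute` is hypothetical, and it assumes that random imputation means sampling, per column, from the observed (non-missing) values of that column, and that every column has at least one observed value.

```r
# Hypothetical helper (not part of ClustImpute): fill each NA with a value
# sampled with replacement from the observed entries of the same column.
# Assumes every column has at least one observed value.
random_impute <- function(X, seed = 1) {
  set.seed(seed)
  for (j in seq_along(X)) {
    miss <- is.na(X[[j]])
    if (any(miss)) {
      observed <- X[[j]][!miss]
      # Sample indices rather than values to avoid sample()'s special
      # handling of length-one numeric vectors.
      idx <- sample.int(length(observed), sum(miss), replace = TRUE)
      X[[j]][miss] <- observed[idx]
    }
  }
  X
}

toy <- data.frame(a = c(1, NA, 3, 4), b = c(NA, 2, 2, NA))
filled <- random_impute(toy)
anyNA(filled)  # FALSE: every NA has been replaced
```

Such a draw only provides a starting point; according to the description above, later imputations depend on the cluster each row is currently assigned to.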
ClustImpute(
X,
nr_cluster,
nr_iter = 10,
c_steps = 1,
wf = default_wf,
n_end = 10,
seed_nr = 150519,
assign_with_wf = TRUE,
shrink_towards_global_mean = TRUE
)
X | Data frame with only numeric values or NAs
nr_cluster | Number of clusters
nr_iter | Number of iterations of the procedure
c_steps | Number of clustering steps per iteration
wf | Weight function; linear up to n_end by default. Used to shrink X towards zero or towards the global mean (default). See shrink_towards_global_mean.
n_end | Number of steps until the weight function converges to 1
seed_nr | Number passed to set.seed()
assign_with_wf | Default is TRUE. If set to FALSE, the weight function is applied only in the centroid computation and ignored in the cluster assignment.
shrink_towards_global_mean | Default is TRUE. The weight matrix w is applied to the difference of X from the global mean m, i.e., (x-m)*w+m
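The interplay of wf, n_end and shrink_towards_global_mean can be illustrated with a small base R sketch. The functions `linear_wf` and `shrink` are hypothetical names introduced here for illustration; the exact form of the package's default_wf, and whether its step counter starts at 0 or 1, are assumptions. Only the formula (x-m)*w+m is taken from the argument description.

```r
# Hypothetical linear weight function: increases linearly with the step n
# and is capped at 1 once n reaches n_end.
linear_wf <- function(n, n_end = 10) pmin(1, n / n_end)

# Shrink values x towards a reference point m with weight w: (x - m) * w + m.
# m = 0 shrinks towards zero; m = mean(x) shrinks towards the global mean.
shrink <- function(x, w, m = 0) (x - m) * w + m

x <- c(2, 4, 9)
m <- mean(x)                            # global mean, here 5

half <- shrink(x, linear_wf(5), m)      # w = 0.5 -> 3.5 4.5 7.0
full <- shrink(x, linear_wf(10), m)     # w = 1   -> 2 4 9 (unchanged)
to_zero <- shrink(x, linear_wf(5))      # w = 0.5, m = 0 -> 1.0 2.0 4.5
```

With a weight below 1 the values are pulled towards m; once the weight reaches 1 (after n_end steps in this sketch) the values are used as they are.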
Completed data without NAs
For each row of complete_data, the associated cluster
For each cluster, the coordinates of the centroids in tidy format
For each cluster, the coordinates of the centroids in matrix format
Mean of the imputed variables per draw
Standard deviation of the imputed variables per draw
# Random Dataset
set.seed(739)
n <- 750 # number of points
nr_other_vars <- 2
mat <- matrix(rnorm(nr_other_vars*n),n,nr_other_vars)
me<-4 # mean
x <- c(rnorm(n/3,me/2,1),rnorm(2*n/3,-me/2,1))
y <- c(rnorm(n/3,0,1),rnorm(n/3,me,1),rnorm(n/3,-me,1))
dat <- cbind(mat,x,y)
dat<- as.data.frame(scale(dat)) # scaling
# Create NAs
dat_with_miss <- miss_sim(dat,p=.1,seed_nr=120)
# Run ClustImpute
res <- ClustImpute(dat_with_miss,nr_cluster=3)
# Plot complete data set and cluster assignment
ggplot2::ggplot(res$complete_data,ggplot2::aes(x,y,color=factor(res$clusters))) +
ggplot2::geom_point()
# View centroids
res$centroids
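As a quick sanity check on a result, one might confirm that the completed data contains no NAs and look at the cluster sizes. The sketch below is self-contained and uses only the calls and list elements that appear in the example above (miss_sim with p and seed_nr, ClustImpute with nr_cluster, res$complete_data, res$clusters). The small random data set is made up for illustration, and the code is skipped when the ClustImpute package is not installed.

```r
# Runs only if the ClustImpute package is available.
if (requireNamespace("ClustImpute", quietly = TRUE)) {
  set.seed(739)
  # Illustrative data: 100 rows, 3 scaled numeric columns
  dat <- as.data.frame(scale(matrix(rnorm(300), ncol = 3)))
  dat_with_miss <- ClustImpute::miss_sim(dat, p = 0.1, seed_nr = 120)

  res <- ClustImpute::ClustImpute(dat_with_miss, nr_cluster = 3)

  # The completed data should contain no missing values
  print(anyNA(res$complete_data))
  # Number of rows assigned to each cluster
  print(table(res$clusters))
}
```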