README.md

kmeans_R


Installation

Install this package directly from GitHub:

# install.packages("devtools")  # uncomment if devtools is not yet installed
devtools::install_github("UBC-MDS/kmeans_R")

Usage

Simple example demonstrating the functionality of this package:

# load package
library(kmeansR)

# generate synthetic data with three clusters
synth_data <- data.frame(
  x = c(rnorm(20, 1, 1), rnorm(30, 6, 3), rnorm(15, 10, 2)),
  y = c(rnorm(20, 5, 2), rnorm(30, 2, 2), rnorm(15, 8, 3))
)

# initialize the cluster centers
centers <- kmeans_init(data = synth_data, K = 3)
# cluster the data points
clustered <- kmeans_cluster(data = synth_data, centers = centers)
# generate summary results
report <- kmeans_report(clustered_data = clustered)

# plot the clustered data
report$plot

# show the number of points in each cluster
report$summary
#> # A tibble: 3 x 2
#>   cluster count
#>   <fct>   <int>
#> 1 1          31
#> 2 2          12
#> 3 3          22

Overview

kmeans_R is an R package that offers a user-friendly way to explore and implement k-means clustering.

The package offers simple, easy-to-use functions that perform k-means clustering. In particular, the stages of clustering are split into separate functions (initialization, clustering, and plotting). This lets the user investigate exactly what happens at each step, which promotes an understanding of the distinct aspects of the clustering procedure. Furthermore, plotting (perhaps the most rewarding part of the process) is easy for two-dimensional data and produces visually appealing images (thanks to ggplot2). This organization is useful, for example, as an aid to learning how k-means clustering works.

Related functions elsewhere in the R ecosystem include kmeans (in the stats package) and KMeans_rcpp (in the ClusterR package).
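To make the staged design concrete, here is a minimal base-R sketch of what each stage might do. It is not the kmeansR implementation: the function names (init_centers_sketch, cluster_sketch, report_sketch) are hypothetical, the initialization assumes K randomly chosen data points, and the clustering step assumes plain Lloyd iterations with squared Euclidean distance.

```r
# Illustrative sketch only; NOT the kmeansR source code.
# All function names below are hypothetical.

# Stage 1 (initialization): pick K distinct rows of the data as starting centers.
init_centers_sketch <- function(data, K) {
  data[sample(nrow(data), K), , drop = FALSE]
}

# Stage 2 (clustering): alternate point assignment and center updates
# until the assignments stop changing (Lloyd's algorithm).
cluster_sketch <- function(data, centers, max_iter = 100) {
  X <- as.matrix(data)
  C <- as.matrix(centers)
  K <- nrow(C)
  assignment <- rep(0L, nrow(X))
  for (iter in seq_len(max_iter)) {
    # squared Euclidean distance from every point to every center (n x K)
    d2 <- sapply(seq_len(K), function(k) rowSums(sweep(X, 2, C[k, ])^2))
    # nearest center = column with the smallest distance in each row
    new_assignment <- max.col(-d2, ties.method = "first")
    if (all(new_assignment == assignment)) break
    assignment <- new_assignment
    # move each center to the mean of its points; leave it alone if empty
    for (k in seq_len(K)) {
      members <- assignment == k
      if (any(members)) C[k, ] <- colMeans(X[members, , drop = FALSE])
    }
  }
  cbind(data, cluster = factor(assignment, levels = seq_len(K)))
}

# Stage 3 (reporting): count the points in each cluster.
report_sketch <- function(clustered) {
  as.data.frame(table(cluster = clustered$cluster), responseName = "count")
}

# Example run on two well-separated synthetic blobs
set.seed(1)  # arbitrary seed so the sketch is repeatable
toy <- data.frame(
  x = c(rnorm(20, 0), rnorm(20, 10)),
  y = c(rnorm(20, 0), rnorm(20, 10))
)
toy_centers <- init_centers_sketch(toy, K = 2)
toy_clustered <- cluster_sketch(toy, toy_centers)
report_sketch(toy_clustered)
```

Because each stage returns an ordinary R object, the intermediate results (the starting centers, the per-point assignments) can be printed or plotted between steps, which is the kind of inspection the staged design is meant to allow.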

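The package includes the following functions:

kmeans_init(data, K): initializes the cluster centers

kmeans_cluster(data, centers): clusters the data points

kmeans_report(clustered_data): generates summary results, available as plot and summary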

Contributors

Bradley Pick

Charley Carriero

Johannes Harmse



UBC-MDS/kmeans_R documentation built on May 22, 2019, 2:26 p.m.