get_kmeans: Get a set of k-means++ clusters for a collection of series

Description

Get a set of k-means++ clusters for a collection of series
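
For readers unfamiliar with the term, the following is a minimal base R sketch of k-means++ seeding followed by standard k-means. It assumes each series has been laid out as one row of a numeric matrix; `kmeanspp_sketch` is a hypothetical helper written for illustration and is not necessarily how this package implements the clustering.

```r
# Hypothetical helper: k-means++ seeding, then stats::kmeans from those seeds.
# `m` is a numeric matrix with one series per row (an assumption of this sketch).
kmeanspp_sketch <- function(m, k) {
  n <- nrow(m)
  # First centre: one row chosen uniformly at random.
  idx <- sample.int(n, 1)
  # Squared distance from every row to its nearest chosen centre so far.
  d2 <- rowSums(sweep(m, 2, m[idx, ])^2)
  while (length(idx) < k) {
    # Next centre: sampled with probability proportional to d2.
    nxt <- sample.int(n, 1, prob = d2)
    idx <- c(idx, nxt)
    d2 <- pmin(d2, rowSums(sweep(m, 2, m[nxt, ])^2))
  }
  stats::kmeans(m, centers = m[idx, , drop = FALSE])
}

# Toy data: two well-separated groups of 10 rows each.
set.seed(1)
m <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.1), ncol = 4),
  matrix(rnorm(40, mean = 5, sd = 0.1), ncol = 4)
)
fit <- kmeanspp_sketch(m, k = 2)
table(fit$cluster)
```

Sampling later seeds in proportion to squared distance tends to spread the starting centres across the data, which is the main difference from picking all starting centres uniformly at random.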

Usage

get_kmeans(dat, x, y, groups = NULL, k = 2:20)

Arguments

dat

data frame

x

string name of x variable

y

string name of y variable

groups

a character vector of grouping variable names; if not specified (NULL), all columns other than x and y are treated as grouping variables

k

vector of cluster counts to run through (default 2:20)
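
As an illustration of the default for groups, the sketch below derives the grouping columns the way the description above suggests: everything that is not x or y. The toy data frame and its column names are made up for the example, and the use of `setdiff()` is an assumption about how such a default could be computed, not a statement about the package internals.

```r
# Toy data frame; column names are illustrative only.
dat <- data.frame(
  symbol = rep(c("A", "B"), each = 3),
  sector = rep(c("tech", "health"), each = 3),
  month = rep(1:3, times = 2),
  close_scl = c(-1, 0, 1, 1, 0, -1)
)
x <- "month"
y <- "close_scl"

# With groups = NULL, every column other than x and y identifies a series.
groups <- setdiff(names(dat), c(x, y))
groups

# Each distinct combination of the grouping columns is one series.
n_series <- nrow(unique(dat[groups]))
n_series
```

Passing groups explicitly is useful when the data frame carries extra columns that should not define separate series.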

Examples

## Not run: 
library(dplyr)

# scale the monthly median close price so that we are clustering on general shape
d <- nasd16 %>%
  group_by(symbol) %>%
  mutate(close_scl = as.numeric(scale(med_close))) %>%
  select(-company, -med_close) %>%
  ungroup()

set.seed(1234)
# k-means clustering with 2, 5, 9, and 25 clusters
km <- get_kmeans(d, x = "month", y = "close_scl", k = c(2, 5, 9, 25))
plot_scree(km)
plot_heat(km, 9, col = "sector")
plot_heat(km, 9, col = "sector", interactive = FALSE,
  display_numbers = TRUE, cutree_cols = 3, cutree_rows = 3)
plot_heat(km, 9, col = "industry", cutoff = 20)

heat <- plot_heat(km, 9, col = "sector", interactive = FALSE,
  display_numbers = TRUE, cutree_cols = 3, cutree_rows = 3,
  annotation_labs = c("A", "B", "C"))
heat

cents <- get_centroid_data(km, 9)
plot_centroid_groups(cents, heat)

## End(Not run)
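
The scaling step in the example is what makes the clustering depend on shape rather than price level. The sketch below shows this on made-up data using base R only; the `toy` data frame and its symbols are hypothetical, and `ave()` stands in for the `group_by()` plus `mutate()` step used above.

```r
# Two hypothetical series with the same shape but different price levels.
toy <- data.frame(
  symbol = rep(c("LOW", "HIGH"), each = 4),
  month = rep(1:4, times = 2),
  med_close = c(1, 2, 3, 2, 100, 200, 300, 200)
)

# Centre and scale within each symbol.
toy$close_scl <- ave(
  toy$med_close, toy$symbol,
  FUN = function(v) as.numeric(scale(v))
)

# After scaling, the two series should match up to floating point error.
same_shape <- isTRUE(all.equal(
  toy$close_scl[toy$symbol == "LOW"],
  toy$close_scl[toy$symbol == "HIGH"]
))
same_shape
```

Without the per-series scaling, the distance between these two series would be dominated by the difference in price level, and they would likely land in different clusters.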

hafen/seriesclust documentation built on May 17, 2019, 2:24 p.m.