workpatterns_hclust: Create a hierarchical clustering of email or IMs by hour of...

View source: R/workpatterns_hclust.R

workpatterns_hclust    R Documentation

Create a hierarchical clustering of email or IMs by hour of day

Description

[Experimental]

Apply hierarchical clustering to collaboration signals (emails by default) sent by hour of day. The hierarchical clustering uses cosine distance and the ward.D method of agglomeration.
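The combination of cosine distance and ward.D agglomeration can be illustrated with base R alone. The following is a minimal sketch on a made-up person-by-hour matrix, not the package's internal code; the object names (hourly, cosine_dist) and the data are assumptions for illustration only.

```r
# Toy matrix: one row per person, one column per hour (hypothetical data)
hourly <- rbind(
  early_a = c(5, 4, 1, 0),
  early_b = c(10, 8, 2, 0),
  late_a  = c(0, 1, 4, 5),
  late_b  = c(0, 2, 8, 10)
)
colnames(hourly) <- c("h09", "h10", "h15", "h16")

# Cosine distance = 1 - cosine similarity between row vectors
# (illustrative helper, not a wpa function)
cosine_dist <- function(m) {
  norms <- sqrt(rowSums(m^2))
  sim <- (m %*% t(m)) / (norms %o% norms)
  as.dist(1 - sim)
}

d <- cosine_dist(hourly)

# Agglomerate with the ward.D method, then cut the tree into k clusters
hc <- hclust(d, method = "ward.D")
clusters <- cutree(hc, k = 2)
```

Because cosine distance compares the shape of each row rather than its magnitude, early_a and early_b are treated as near-identical even though one sends twice the volume of the other.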

Usage

workpatterns_hclust(
  data,
  k = 4,
  return = "plot",
  values = "percent",
  signals = "email",
  start_hour = "0900",
  end_hour = "1700"
)

Arguments

data

A data frame containing data from the Hourly Collaboration query.

k

Numeric value specifying the number of clusters (k) to cut the tree into.

return

String specifying what to return. This must be one of the following strings:

  • "plot"

  • "data"

  • "table"

  • "plot-area"

  • "hclust"

  • "dist"

See Value for more information.

values

Character vector to specify whether to return percentages or absolute values in "data" and "plot". Valid values are:

  • "percent": percentage of signals divided by total signals (default)

  • "abs": absolute count of signals

signals

Character vector to specify which collaboration metrics to use:

  • "email" (default) for emails only

  • "IM" for Teams messages only

  • "unscheduled_calls" for Unscheduled Calls only

  • "meetings" for Meetings only

  • or a combination of signals, such as c("email", "IM")

start_hour

A character vector specifying the starting hour, e.g. "0900"

end_hour

A character vector specifying the ending hour, e.g. "1700"

Details

The hierarchical clustering is applied on the person-average volume-based (pav) level. In other words, the clustering is applied on a dataset where the collaboration hours are averaged by person and calculated as % of total daily collaboration.
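One way to read the person-average volume-based step is: average each hour's volume per person across days, then express each hour as a share of that person's total. The sketch below follows that reading in base R on invented data; the column names (person, date, hour, emails) are assumptions and do not reflect the Hourly Collaboration query schema or the package's implementation.

```r
# Hypothetical long-format data: one row per person, date and hour
raw <- data.frame(
  person = c("a", "a", "a", "a", "b", "b", "b", "b"),
  date   = rep(c("d1", "d1", "d2", "d2"), times = 2),
  hour   = rep(c("09", "17"), times = 4),
  emails = c(3, 1, 5, 3, 0, 2, 2, 4)
)

# Step 1: average each hour's volume per person across days
avg <- aggregate(emails ~ person + hour, data = raw, FUN = mean)

# Step 2: express each hour as a share of the person's total volume
avg$share <- avg$emails / ave(avg$emails, avg$person, FUN = sum)

# Step 3: reshape to a person-by-hour table, the shape a clustering step would use
wide <- xtabs(share ~ person + hour, data = avg)
```

Each row of wide sums to 1, so people are compared on when they collaborate rather than on how much.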

Value

A different output is returned depending on the value passed to the return argument:

  • "plot": ggplot object of a bar plot (default)

  • "data": data frame containing raw data with the clusters

  • "table": data frame containing a summary table. Percentages of signals are shown, e.g. x% of signals are sent by y hour of the day.

  • "plot-area": ggplot object of an overlapping area plot

  • "hclust": hclust object for the hierarchical model

  • "dist": distance matrix used to build the clustering model
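The "hclust" and "dist" outputs are standard stats objects, so they can be passed to base R functions for further inspection. The sketch below uses stand-in objects built from random toy data with Euclidean distance (an assumption made only to keep the example self-contained), and shows how a fitted tree can be re-cut at several values of k without refitting.

```r
# Stand-ins for the "dist" and "hclust" return values (toy data)
set.seed(1)
m <- matrix(runif(40), nrow = 10, dimnames = list(paste0("p", 1:10), NULL))
d <- dist(m)
hc <- hclust(d, method = "ward.D")

# Re-cut the same tree at several values of k; one column per k
cuts <- cutree(hc, k = 2:5)

# Cluster sizes when k = 4
sizes_k4 <- table(cuts[, "4"])

# Convert to a dendrogram for custom plotting or traversal
dend <- as.dendrogram(hc)
```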

See Also

Other Clustering: personas_hclust(), workpatterns_classify()

Other Working Patterns: flex_index(), identify_shifts_wp(), identify_shifts(), plot_flex_index(), workpatterns_area(), workpatterns_classify_bw(), workpatterns_classify_pav(), workpatterns_classify(), workpatterns_rank(), workpatterns_report()

Examples


library(wpa)  # provides workpatterns_hclust() and the em_data sample dataset

# Run clusters, returning plot
workpatterns_hclust(em_data, k = 5, return = "plot")

# Run clusters, return raw data
workpatterns_hclust(em_data, k = 4, return = "data") %>% head()

# Run clusters for instant messages only, return hclust object
workpatterns_hclust(em_data, k = 4, return = "hclust", signals = c("IM"))



wpa documentation built on Aug. 21, 2023, 5:11 p.m.