subject_scan: Count top words in subject lines grouped by a custom...

View source: R/subject_scan.R

subject_scan    R Documentation

Count top words in subject lines grouped by a custom attribute

Description

[Experimental]

This function generates a matrix of the most frequently occurring words in meeting subject lines, grouped by a specified attribute such as an organisational attribute, day of the week, or hour of the day.

Usage

subject_scan(
  data,
  hrvar,
  mode = NULL,
  top_n = 10,
  token = "words",
  return = "plot",
  weight = NULL,
  stopwords = NULL,
  ...
)

tm_scan(
  data,
  hrvar,
  mode = NULL,
  top_n = 10,
  token = "words",
  return = "plot",
  weight = NULL,
  stopwords = NULL,
  ...
)

Arguments

data

A Meeting Query dataset in the form of a data frame.

hrvar

String containing the name of the HR Variable by which to split metrics. Note that the prefix 'Organizer_' or equivalent will be required.

mode

String specifying what variable to use for grouping subject words. Valid values include:

  • "hours"

  • "days"

  • NULL (defaults to hrvar)

When the value passed to mode is not NULL, the value passed to hrvar is discarded and overridden by the setting specified in mode.

top_n

Numeric value specifying the number of top words to show.

token

A character vector accepting either "words" or "ngrams", determining the type of tokenisation to use.
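
To illustrate the difference between the two token types, here is a minimal base R sketch. It is not the package's own tokeniser; the subject line is invented, and the bigram case assumes n = 2 as in the Examples section.

```r
# Hypothetical subject line, invented for illustration only.
subject <- "weekly project status review"

# "words": split the subject line on whitespace into single words.
words <- strsplit(subject, "\\s+")[[1]]

# "ngrams" with n = 2: pair each word with the word that follows it.
bigrams <- paste(head(words, -1), tail(words, -1))

print(words)
print(bigrams)
```

With word tokens each term is counted on its own, whereas with ngrams short phrases are counted as a unit.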

return

String specifying what to return. This must be one of the following strings:

  • "plot"

  • "table"

  • "data"

See Value for more information.

weight

String specifying the column name of a numeric variable for weighting data, such as "Invitees". The column must contain positive integers. Defaults to NULL, where no weighting is applied.
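
The sketch below shows one plausible reading of weighting, assuming that each row counts as many times as its weight value. This page does not describe how the package implements weighting internally, so the approach and the toy data are illustrative assumptions only.

```r
# Hypothetical tokenised data: one row per word per meeting (invented values).
tokens <- data.frame(
  word = c("budget", "review", "budget"),
  Invitees = c(3L, 2L, 5L),
  stringsAsFactors = FALSE
)

# Unweighted: every row counts once.
unweighted <- table(tokens$word)

# Weighted (assumed interpretation): sum the weight column for each word.
weighted <- tapply(tokens$Invitees, tokens$word, sum)

print(unweighted)
print(weighted)
```

Under this interpretation, words from meetings with many invitees contribute more to the ranking than words from small meetings.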

stopwords

A character vector OR a single-column data frame labelled 'word' containing custom stopwords to remove.
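
The two accepted forms can be built as follows. This is a minimal base R sketch; the stopwords and tokens are invented, and the filtering step only illustrates the effect of removing stopwords, not the package's internal code.

```r
# Form 1: a plain character vector of custom stopwords (hypothetical values).
stop_vec <- c("meeting", "call", "sync")

# Form 2: a single-column data frame whose column is labelled 'word'.
stop_df <- data.frame(word = stop_vec, stringsAsFactors = FALSE)

# Illustration of the effect: drop any token that appears in the stopword list.
tokens <- c("budget", "meeting", "review", "call")
kept <- tokens[!tokens %in% stop_df$word]

print(kept)
```

Either object could then be supplied through the stopwords argument, for example stopwords = stop_vec.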

...

Additional parameters to pass to tm_clean().

Value

A different output is returned depending on the value passed to the return argument:

  • "plot": 'ggplot' object. A heatmapped grid.

  • "table": data frame. A summary table for the metric.

  • "data": data frame.

Examples


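# assumed setup: load the package, which provides subject_scan() and the mt_data sample dataset
library(wpa)

# return a heatmap table for words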
mt_data %>% subject_scan(hrvar = "Organizer_Organization")

# return a heatmap table for ngrams
mt_data %>%
  subject_scan(
    hrvar = "Organizer_Organization",
    token = "ngrams",
    n = 2)

# return raw table format
mt_data %>% subject_scan(hrvar = "Organizer_Organization", return = "table")

# grouped by hours
mt_data %>% subject_scan(mode = "hours")

# grouped by days
mt_data %>% subject_scan(mode = "days")


wpa documentation built on Aug. 21, 2023, 5:11 p.m.