fst_freq_compare: Compare and plot top words

View source: R/04_comparison_functions.R

fst_freq_compare    R Documentation

Compare and plot top words

Description

Find top and unique top words for different groups of participants. Data is split based on different values in the 'field' column of formatted data. Results will be shown within the plots pane.

Usage

fst_freq_compare(
  data,
  field,
  number = 10,
  norm = NULL,
  pos_filter = NULL,
  strict = TRUE,
  use_svydesign_weights = FALSE,
  use_svydesign_field = FALSE,
  id = "",
  svydesign = NULL,
  use_column_weights = FALSE,
  exclude_nulls = FALSE,
  rename_nulls = "null_data",
  unique_colour = "indianred",
  title_size = 20,
  subtitle_size = 15
)

Arguments

data

A dataframe of text in CoNLL-U format with additional 'field' column for splitting data.

field

Column in 'data' used for splitting the data into groups.

number

The number of top words to return, default is '10'.

norm

The method for normalising the data. Valid settings are '"number_words"' (the number of words in the responses), '"number_resp"' (the number of responses), or 'NULL' (raw count returned, default, also used when weights are applied).

pos_filter

List of UPOS tags for inclusion, default is 'NULL', which means all word types are included.

strict

Whether to cut off strictly at 'number' (ties are ordered alphabetically), default is 'TRUE'.

use_svydesign_weights

Option to weight words using weights from a svydesign object containing the raw data, default is 'FALSE'.

use_svydesign_field

Option to get 'field' for splitting the data from the svydesign object, default is 'FALSE'.

id

ID column from raw data, required if 'use_svydesign_weights = TRUE' and must match the 'docid' in formatted 'data'.

svydesign

A svydesign object which contains the raw data and weights.

use_column_weights

Option to weight words using weights from formatted data which includes an additional 'weight' column, default is 'FALSE'.

exclude_nulls

Whether to exclude NULLs in the 'field' column, default is 'FALSE'.

rename_nulls

What to fill NULL values with if 'exclude_nulls = FALSE', default is '"null_data"'.

unique_colour

Colour to display unique words, default is '"indianred"'.

title_size

Size to display the plot title, default is '20'.

subtitle_size

Size to display the title of each individual top words plot, default is '15'.
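
The effect of 'norm' and 'strict' can be illustrated with a small base R sketch. This is not the package's implementation: 'top_words_sketch' and the toy data are hypothetical, and the handling of ties when 'strict = FALSE' (keeping every word tied with the last ranked word) is an assumption about the intended behaviour.

```r
# Toy lemma table: one row per word, 'doc_id' identifies the response.
toy <- data.frame(
  doc_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3),
  lemma  = c("koulu", "kaveri", "koulu",
             "kaveri", "perhe",
             "koulu", "perhe", "raha", "aika", "koulu"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of finnsurveytext.
top_words_sketch <- function(df, number = 10, norm = NULL, strict = TRUE) {
  counts <- as.data.frame(table(words = df$lemma), stringsAsFactors = FALSE)
  names(counts) <- c("words", "occurrences")

  # Choose the denominator that corresponds to the 'norm' setting.
  denom <- if (is.null(norm)) {
    1                           # raw counts
  } else if (norm == "number_words") {
    nrow(df)                    # total number of words
  } else if (norm == "number_resp") {
    length(unique(df$doc_id))   # total number of responses
  } else {
    stop("unknown 'norm' setting")
  }
  counts$occurrences <- counts$occurrences / denom

  # Sort by descending frequency; ties are ordered alphabetically.
  counts <- counts[order(-counts$occurrences, counts$words), ]

  if (strict) {
    # Cut off at exactly 'number' rows.
    head(counts, number)
  } else {
    # Assumption: keep every word tied with the word ranked 'number'.
    cutoff <- counts$occurrences[min(number, nrow(counts))]
    counts[counts$occurrences >= cutoff, ]
  }
}

top_words_sketch(toy, number = 2)                  # strict cut-off
top_words_sketch(toy, number = 2, strict = FALSE)  # ties retained
top_words_sketch(toy, number = 2, norm = "number_resp")
```

In this toy data two words tie for second place, so the strict call returns two rows and the non-strict call returns three.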

Value

Plots of the most frequent words for each group, shown in the plots pane, with unique words highlighted.
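
As a rough illustration of how groups and unique words might be derived, the base R sketch below splits toy data on a 'field' column, handles missing group labels, and marks top words that appear in only one group's list. The helper names are hypothetical, NULLs are assumed to be stored as NA, and "unique" is assumed to mean "in the top list of exactly one group"; the package itself may differ in detail.

```r
# Toy data: one row per word, with the group label from the 'field' column.
toy <- data.frame(
  lemma  = c("koulu", "kaveri", "koulu", "perhe",
             "raha", "koulu", "raha", "aika",
             "peli", "peli", "koulu"),
  gender = c("girl", "girl", "girl", "girl",
             "boy", "boy", "boy", "boy",
             NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop or relabel rows with a missing group, then split.
prepare_groups_sketch <- function(df, field, exclude_nulls = FALSE,
                                  rename_nulls = "null_data") {
  if (exclude_nulls) {
    df <- df[!is.na(df[[field]]), ]
  } else {
    df[[field]][is.na(df[[field]])] <- rename_nulls
  }
  split(df$lemma, df[[field]])
}

# Hypothetical helper: top 'number' words, ties ordered alphabetically.
top_n_sketch <- function(words, number) {
  counts <- table(words)
  ranked <- names(counts)[order(-as.integer(counts), names(counts))]
  head(ranked, number)
}

groups <- prepare_groups_sketch(toy, "gender")
top <- lapply(groups, top_n_sketch, number = 2)

# Words found in more than one group's top list are shared, the rest unique.
all_top <- unlist(top, use.names = FALSE)
shared <- unique(all_top[duplicated(all_top)])
unique_top <- lapply(top, setdiff, y = shared)

unique_top
```

Here "koulu" is in the top list of two groups, so it is treated as shared, while the remaining top words would be the ones drawn in 'unique_colour'.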

Examples

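# Compare top words by gender, normalised by the number of responses
fst_freq_compare(fst_child, 'gender', number = 10, norm = "number_resp")
# Same comparison with raw counts
fst_freq_compare(fst_child, 'gender', number = 10, norm = NULL)
# Take weights and 'field' from a svydesign object built on the raw data
s <- survey::svydesign(id = ~1, weights = ~paino, data = child)
c2 <- fst_child_2
c <- fst_child
g <- 'gender'
# Positional arguments: number, norm, pos_filter, strict,
# use_svydesign_weights, use_svydesign_field, id, svydesign
fst_freq_compare(c2, g, 10, NULL, NULL, TRUE, TRUE, TRUE, 'fsd_id', s)
# Use the 'weight' column of the formatted data, without a strict cut-off
fst_freq_compare(c, g, use_column_weights = TRUE, strict = FALSE)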

finnsurveytext documentation built on April 4, 2025, 5:07 a.m.