collocate_comments_fuzzy: Collocate Comments Fuzzy

View source: R/collocate_comments_fuzzy.R

collocate_comments_fuzzyR Documentation

Collocate Comments Fuzzy

Description

This function provides the frequency of collocations in comments that correspond to the provided transcript, using fuzzy matching.

Usage

collocate_comments_fuzzy(
  transcript_token,
  note_token,
  collocate_length = 5,
  n_bands = 50,
  threshold = 0.7
)

Arguments

transcript_token

transcript token to act as baseline for notes, resulting from token_transcript()

note_token

tokenized document of notes, resulting from token_comments()

collocate_length

the length of the collocation. Default is 5

n_bands

number of bands used in MinHash algorithm passed to zoomerjoin::jaccard_right_join(). Default is 50

threshold

considered a match in for Jaccard distance passed to zoomerjoin::jaccard_right_join(). Default is 0.7

Value

data frame of the transcript and corresponding note frequency

Examples

comment_example_rename <- dplyr::rename(comment_example[1:10,], page_notes=Notes)
toks_comment <- token_comments(comment_example_rename)
transcript_example_rename <- dplyr::rename(transcript_example, text=Text)
toks_transcript <- token_transcript(transcript_example_rename)
fuzzy_object <- collocate_comments_fuzzy(toks_transcript, toks_comment)

highlightr documentation built on June 27, 2025, 1:07 a.m.