View source: R/txt_recode_fast.R
| process_multiwords_fast | R Documentation |
Complete optimized workflow for detecting and processing multiword terms. Uses C++ functions and data.table for speed.
process_multiwords_fast(x2, stats, term = c("lemma", "token"))
| x2 | Data frame with token information |
| stats | Data frame with multiword statistics (keyword, ngram columns) |
| term | Type of term to process: "lemma" or "token" |
This function replaces the original switch block with an optimized version that uses:
C++ functions for text recoding
Vectorized operations instead of multiple mutate calls
Pre-computed lookups to avoid repeated joins
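The idea behind the last two points can be sketched in base R. This is only an illustration of a pre-computed, vectorized lookup, not the package's actual implementation: the toy data, the `tokens` data frame, and the `ngram_lookup` and `is_multiword` names are assumptions made for the example. Only the `keyword` and `ngram` columns of `stats` come from the argument description above.

```r
# Hypothetical toy inputs; everything except the keyword/ngram columns
# is assumed for illustration.
tokens <- data.frame(
  doc_id  = c("d1", "d1", "d1", "d2"),
  term_id = c(1L, 2L, 3L, 1L),
  lemma   = c("machine learning", "model", "neural network", "machine learning"),
  stringsAsFactors = FALSE
)
stats <- data.frame(
  keyword = c("machine learning", "neural network"),
  ngram   = c(2L, 2L),
  stringsAsFactors = FALSE
)

# Pre-computed lookup: build a named vector once from the stats table.
ngram_lookup <- setNames(stats$ngram, stats$keyword)

# Vectorized lookup: one indexing call over the whole column,
# instead of joining against stats in each processing step.
# Terms without a matching keyword get NA.
tokens$ngram <- unname(ngram_lookup[tokens$lemma])
tokens$is_multiword <- !is.na(tokens$ngram)
```

Building the lookup once and indexing it by name keeps the per-row work to a single vectorized operation, which is the general pattern the list above describes.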
Returns a data frame with columns: doc_id, term_id, multiword, upos_multiword, ngram
## Not run:
result <- process_multiwords_fast(dfTag, multiword_stats, term = "lemma")
## End(Not run)