View source: R/nlp_melt_tokens.R
nlp_melt_tokens | R Documentation |
This function takes a token-level data frame and collects the values of a specified token column into vectors, one per group defined by one or more parent columns.
nlp_melt_tokens(
df,
melt_col = "token",
parent_cols = c("doc_id", "sentence_id")
)
df | A data frame containing the data to be tokenized. |
melt_col | The name of the column in 'df' that contains the tokens. |
parent_cols | A character vector indicating the column(s) by which to group the data. |
A list of vectors, each containing the tokens of one group defined by the 'parent_cols' parameter.
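# Token-level data frame: one row per token, with document and sentence ids
dtm <- data.frame(doc_id = as.character(c(1, 1, 1, 1, 1, 1, 1, 1)),
                  sentence_id = as.character(c(1, 1, 1, 2, 2, 2, 2, 2)),
                  token = c("Hello", "world", ".", "This", "is", "an", "example", "."))

# Collect the tokens into one vector per doc_id/sentence_id combination
tokens <- nlp_melt_tokens(dtm, melt_col = "token", parent_cols = c("doc_id", "sentence_id"))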
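
To illustrate the grouping described above, here is a minimal base R sketch of the same idea. It is not the package's implementation: `melt_tokens_sketch` is a hypothetical helper, and the way it names the list elements (parent values joined with ".") is an assumption that may differ from what `nlp_melt_tokens` returns.

```r
# Token-level data frame mirroring the example data
dtm <- data.frame(
  doc_id = as.character(c(1, 1, 1, 1, 1, 1, 1, 1)),
  sentence_id = as.character(c(1, 1, 1, 2, 2, 2, 2, 2)),
  token = c("Hello", "world", ".", "This", "is", "an", "example", "."),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package): groups the values of
# `melt_col` by the combination of values in `parent_cols`.
melt_tokens_sketch <- function(df,
                               melt_col = "token",
                               parent_cols = c("doc_id", "sentence_id")) {
  # Build one grouping key per row by pasting the parent columns together
  key <- do.call(paste, c(df[parent_cols], sep = "."))
  # Fix the level order so groups appear in order of first occurrence
  key <- factor(key, levels = unique(key))
  # split() returns a named list with one token vector per key
  split(df[[melt_col]], key)
}

tokens_sketch <- melt_tokens_sketch(dtm)
str(tokens_sketch)
```

With the example data, the sketch yields two groups, one for each sentence of document 1. Passing a single column name to `parent_cols`, such as `"doc_id"`, groups at the document level instead.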