dfm | R Documentation |
Construct a sparse document-feature matrix from a tokens or dfm object.
dfm(
  x,
  tolower = TRUE,
  remove_padding = FALSE,
  verbose = quanteda_options("verbose"),
  ...
)
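x | a tokens or dfm object.
tolower | logical; if TRUE, convert all features to lowercase.
remove_padding | logical; if TRUE, remove the "pads" left as empty tokens after calling tokens_remove() with padding = TRUE.
verbose | display messages if TRUE.
... | not used.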
a dfm object
In quanteda v4, many convenience functions formerly available in dfm() were removed.
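In practice, steps that older code passed to dfm() as arguments (for example stopword removal or stemming) are now done on the tokens object before dfm() is called. The following is a sketch under that assumption. The input texts are made up for illustration, and the pipeline is one possible replacement rather than an official migration recipe.

```r
library(quanteda)

# Hypothetical input texts, for illustration only
txt <- c(d1 = "The cats are running.", d2 = "A cat ran!")

# Tokenise, then apply each processing step explicitly on the tokens
toks <- tokens(txt, remove_punct = TRUE) |>
  tokens_remove(stopwords("en")) |>  # drop English stopwords
  tokens_wordstem()                  # stem the remaining tokens

# dfm() now only builds the matrix (and lowercases by default)
dfmat <- dfm(toks)
print(dfmat)
```

Keeping each step as a separate tokens_*() call makes the order of operations explicit, which matters when, for instance, stopwords should be removed before stemming.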
as.dfm(), dfm_select(), dfm
## for a corpus
toks <- data_corpus_inaugural |>
  corpus_subset(Year > 1980) |>
  tokens()
dfm(toks)

# removal options
toks <- tokens(c("a b c", "A B C D")) |>
  tokens_remove("b", padding = TRUE)
toks
dfm(toks)
dfm(toks) |>
  dfm_remove(pattern = "") # remove "pads"

# preserving case
dfm(toks, tolower = FALSE)