dfm_tolower: Convert the case of the features of a dfm and combine
In quanteda: Quantitative Analysis of Textual Data

dfm_tolower

R Documentation

Description

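dfm_tolower() and dfm_toupper() convert the features of a dfm or fcm to lower and upper case, respectively, and then recombine the counts. Features that become identical after conversion (for example, "A" and "a") are merged into a single feature and their counts are summed.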

Usage

dfm_tolower(x, keep_acronyms = FALSE, verbose = quanteda_options("verbose"))

dfm_toupper(x, verbose = quanteda_options("verbose"))

fcm_tolower(x, keep_acronyms = FALSE, verbose = quanteda_options("verbose"))

fcm_toupper(x, verbose = quanteda_options("verbose"))

Arguments

`x`	the input dfm or fcm whose features will be case-converted
`keep_acronyms`	logical; if `TRUE`, do not lowercase any all-uppercase words (applies only to `*_tolower()` functions)
`verbose`	if `TRUE`, print the number of tokens and documents before and after the function is applied. The number of tokens does not include padding.
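
As a minimal sketch of what `keep_acronyms` changes, the following compares the feature names produced with and without it. The sample texts and the object names (`toks`, `dfmat_acr`, `lowered`, `kept`) are invented for illustration and are not part of the package examples.

```r
library(quanteda)

# hypothetical sample texts mixing an all-uppercase acronym with other casings
toks <- tokens(c("NATO and Nato met", "the UK and the uk"))
dfmat_acr <- dfm(toks, tolower = FALSE)

# default: every feature is lowercased, so "NATO" and "Nato" are merged
lowered <- dfm_tolower(dfmat_acr)
featnames(lowered)

# keep_acronyms = TRUE: all-uppercase features such as "NATO" and "UK" are left
# as they are, while mixed-case features such as "Nato" are still lowercased
kept <- dfm_tolower(dfmat_acr, keep_acronyms = TRUE)
featnames(kept)
```

With `keep_acronyms = TRUE`, an acronym and its lowercased variants remain separate features, so their counts are not combined.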

Details

fcm_tolower() and fcm_toupper() convert both dimensions of the fcm to lower and upper case, respectively, and then recombine the counts. This works only on fcm objects created with context = "document".
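
To illustrate that both dimensions are converted, the sketch below inspects the row and column names of an fcm before and after lowercasing. It reuses the texts from the Examples section; the object names `toks`, `fcmat`, and `fcmat_lower` are chosen here only for illustration.

```r
library(quanteda)

toks <- tokens(c("b A A d", "C C a b B e"))
fcmat <- fcm(toks, context = "document")

# before conversion, upper- and lowercase variants are separate features
rownames(fcmat)

# after conversion, rows and columns are both lowercased and the
# case variants are merged into one feature each
fcmat_lower <- fcm_tolower(fcmat)
rownames(fcmat_lower)
colnames(fcmat_lower)
```

Because case variants are merged on both dimensions, the converted fcm has fewer rows and columns than the original.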

Examples

# for a document-feature matrix
dfmat <- dfm(tokens(c("b A A", "C C a b B")), tolower = FALSE)
dfmat
dfm_tolower(dfmat)
dfm_toupper(dfmat)

# for a feature co-occurrence matrix
fcmat <- fcm(tokens(c("b A A d", "C C a b B e")),
             context = "document")
fcmat
fcm_tolower(fcmat)
fcm_toupper(fcmat)

quanteda documentation built on June 8, 2025, 9:41 p.m.