tidy_specificities: Returns a tibble with specificities according to two crossed...
In lvaudor/mixr: Functions to simplify text mining with multiple languages

tidy_specificities

R Documentation

Returns a tibble with specificities according to two crossed categories.

Description

Returns a tibble with specificities according to two crossed categories.

Usage

tidy_specificities(mydf, cat1, cat2, top_spec = NA, min_spec = NA)

Arguments

`mydf`	a tibble
`cat1`	a factor corresponding to words or lemmas
`cat2`	the category across which specificities are compared (for example, the document each word comes from)
`top_spec`	the number of items to keep per category, ranked by specificity. If not provided (the default, `NA`), all items are kept.
`min_spec`	the minimum specificity an item must have to be kept. If not provided (the default, `NA`), all items are kept.
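
The effect of the two filtering arguments can be pictured with a small base R sketch. It is not the package's implementation: `filter_specs` is a hypothetical helper, the specificity values are made up, and it assumes that `min_spec` is an inclusive threshold and that `top_spec` keeps the highest-specificity items within each `cat2` level.

```r
# Toy specificity table; the column names follow the Value section,
# the numbers are invented for illustration only.
specs <- data.frame(
  cat1 = c("elizabeth", "darcy", "bennet", "elinor", "marianne", "dashwood"),
  cat2 = rep(c("Pride and Prejudice", "Sense and Sensibility"), each = 3),
  spec = c(12.5, 9.1, 1.3, 11.8, 10.2, 0.7),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of mixr). Assumptions: min_spec is an
# inclusive lower bound, top_spec is applied per cat2 level.
filter_specs <- function(df, top_spec = NA, min_spec = NA) {
  if (!is.na(min_spec)) {
    # Drop items whose specificity is below the threshold
    df <- df[df$spec >= min_spec, , drop = FALSE]
  }
  if (!is.na(top_spec)) {
    # Rank items within each category by decreasing specificity
    rank_in_cat <- ave(-df$spec, df$cat2,
                       FUN = function(x) rank(x, ties.method = "first"))
    df <- df[rank_in_cat <= top_spec, , drop = FALSE]
  }
  df
}

# Keep at most 2 items per novel, each with specificity of at least 1
kept <- filter_specs(specs, top_spec = 2, min_spec = 1)
```

With both arguments left at `NA`, the sketch returns the table unchanged, which mirrors the documented default of keeping everything.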

Value

A tibble with additional columns cat1, cat2 and spec (the specificity).

Examples

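 library(magrittr)  # provides the %>% pipe used below
 # One row per word, labelled with the novel it comes from
 mydf <- dplyr::bind_rows(
          tibble::tibble(txt = janeaustenr::prideprejudice,
                         novel = "Pride and Prejudice"),
          tibble::tibble(txt = janeaustenr::sensesensibility,
                         novel = "Sense and Sensibility")) %>%
       tidytext::unnest_tokens(word, txt)
 # Specificity of each word (cat1) within each novel (cat2)
 tidy_specificities(mydf,
                    cat1 = word,
                    cat2 = novel)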

lvaudor/mixr documentation built on April 14, 2024, 2:17 p.m.