dictionary_meta: Assess Dictionary Categories Within a Latent Semantic Space

dictionary_meta    R Documentation

Description

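Assesses dictionary categories within a latent semantic space: expands fuzzy terms, summarizes each category, calculates similarities between matched terms, and optionally suggests additions to each category.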

Usage

dictionary_meta(dict, space = "auto", n_spaces = 5, suggest = FALSE,
  suggestion_terms = 10, suggest_stopwords = FALSE,
  suggest_discriminate = TRUE, expand_cutoff_freq = 0.98,
  expand_cutoff_spaces = 10, dimension_prop = 1, pairwise = TRUE,
  glob = TRUE, space_dir = getOption("lingmatch.lspace.dir"),
  verbose = TRUE)

Arguments

dict

A vector of terms, list of such vectors, or a matrix-like object to be categorized by read.dic.

space

A vector space used to calculate similarities between terms: names of spaces (see select.lspace), a matrix with terms as row names, or "auto" to select a space automatically based on matched terms. This can also be "multi" to use multiple spaces, which are combined after similarities are calculated.

n_spaces

Number of spaces to draw from if space is "multi".

suggest

Logical; if TRUE, will search space for other terms to suggest for possible inclusion in each dictionary category.

suggestion_terms

Number of terms to use when selecting suggested additions.

suggest_stopwords

Logical; if TRUE, will suggest function words.

suggest_discriminate

Logical; if TRUE, will adjust for similarity to other categories when finding suggestions.

expand_cutoff_freq

Proportion of mapped terms to include when expanding dictionary terms. Applies when space is a character vector (naming a space to be loaded).

expand_cutoff_spaces

Number of spaces in which a term has to appear to be considered for expansion. Applies when space is a character vector (naming a space to be loaded).

dimension_prop

Proportion of dimensions to use when searching for suggested additions, where less than 1 will calculate similarities to the category core using fewer dimensions of the space.

pairwise

Logical; if FALSE, will compare candidate suggestion terms with a single, averaged category vector rather than all category terms separately.
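The difference between the two comparison modes can be sketched in base R. This is only an illustration under stated assumptions: the toy `space` matrix and the helper `cosine_to_rows` are hypothetical, cosine similarity is assumed as the measure, and averaging the pairwise similarities is an assumption about how they might be summarized, not a description of what dictionary_meta does internally.

```r
# Cosine similarity between a vector `a` and each row of matrix `m`.
cosine_to_rows <- function(a, m) {
  as.numeric(m %*% a) / (sqrt(rowSums(m^2)) * sqrt(sum(a^2)))
}

# Hypothetical 3-dimensional space: rows are terms, columns are dimensions.
space <- rbind(
  table = c(1.0, 0.2, 0.0),
  chair = c(0.9, 0.3, 0.1),
  stool = c(0.8, 0.4, 0.0),
  happy = c(0.0, 0.1, 1.0)
)
category <- c("table", "chair")
candidates <- c("stool", "happy")

# Averaged approach: one similarity per candidate, to the category centroid.
centroid <- colMeans(space[category, , drop = FALSE])
sim_averaged <- cosine_to_rows(centroid, space[candidates, , drop = FALSE])
names(sim_averaged) <- candidates

# Pairwise approach: similarity to every category term separately,
# summarized here by the mean (an assumption made for illustration).
sim_pairwise <- vapply(candidates, function(term) {
  mean(cosine_to_rows(space[term, ], space[category, , drop = FALSE]))
}, numeric(1))

print(round(rbind(averaged = sim_averaged, pairwise = sim_pairwise), 3))
```

In this toy case both approaches rank "stool" well above "happy"; they can diverge more when a category's terms point in different directions, because the centroid then resembles none of them closely.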

glob

Logical; if TRUE, converts globs (asterisk wildcards) to regular expressions.
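As a rough illustration of what converting globs to regular expressions involves, the sketch below uses base R's utils::glob2rx and matches the resulting patterns against a small vocabulary. The `vocab` vector is hypothetical, and the conversion lingmatch applies internally may differ from glob2rx.

```r
# Hypothetical vocabulary standing in for the terms of a loaded space.
vocab <- c("desk", "desks", "desktop", "table", "couch", "couches", "sofa")

# Dictionary terms with asterisk wildcards (globs), as in the Examples.
globs <- c("desk*", "couch*", "table")

# utils::glob2rx converts each glob into an anchored regular expression.
patterns <- vapply(globs, utils::glob2rx, character(1))

# For each glob, collect the vocabulary terms it matches.
expanded <- lapply(patterns, function(p) vocab[grepl(p, vocab)])
print(expanded)
```

A term without a wildcard, such as "table", is anchored at both ends and so matches only itself.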

space_dir

Directory from which space should be loaded.

verbose

Logical; if FALSE, will not show status messages.

Value

A list:

  • expanded: A version of dict with fuzzy terms expanded.

  • summary: A summary of each dictionary category.

  • terms: Match (expanded term) similarities within terms and categories.

  • suggested: If suggest is TRUE, a list with suggested additions for each dictionary category. Each entry is a named numeric vector with similarities for each suggested term.
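Since each entry of suggested is described as a named numeric vector, ordinary vector tools apply to it. The sketch below ranks and flattens such a structure; the `suggested` object is a hand-made stand-in with invented values (not output from dictionary_meta), and `top_suggestions` is a hypothetical helper.

```r
# Hand-made stand-in for the `suggested` element; values are invented.
suggested <- list(
  furniture = c(stool = 0.71, bench = 0.64, lamp = 0.52),
  well_adjusted = c(cheerful = 0.68, kind = 0.60)
)

# Keep the top `n` suggestions per category, ordered by similarity.
top_suggestions <- function(suggested, n = 2) {
  lapply(suggested, function(sims) head(sort(sims, decreasing = TRUE), n))
}

top <- top_suggestions(suggested)
print(top)

# Flatten into a data frame for inspection or export.
flat <- data.frame(
  category = rep(names(top), lengths(top)),
  term = unlist(lapply(top, names), use.names = FALSE),
  similarity = unlist(top, use.names = FALSE)
)
print(flat)
```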

See Also

To just expand fuzzy terms, see report_term_matches().

Similar information is provided in the dictionary builder web tool.

Other Dictionary functions: download.dict(), lma_patcat(), lma_termcat(), read.dic(), report_term_matches(), select.dict()

Examples

if (dir.exists("~/Latent Semantic Spaces")) {
  dict <- list(
    furniture = c("table", "chair", "desk*", "couch*", "sofa*"),
    well_adjusted = c("happy", "bright*", "friend*", "she", "he", "they")
  )
  dictionary_meta(dict, space_dir = "~/Latent Semantic Spaces")
}

lingmatch documentation built on May 29, 2024, 11:48 a.m.