orm_autodim: Automatic dimension extraction and risk cross-matrix


orm_autodim    R Documentation

Automatic dimension extraction and risk cross-matrix

Description

orm_autodim() automatically discovers the most relevant contextual dimensions of a corpus using two complementary modes:

Mode 1: Dictionary blocks (default, method = "blocks"). Uses the normative blocks of the ORISMA dictionary (A-Safety, B-Hygiene, C-Ergonomics, D-Psychosociology, E-Biological, F-Emerging) as dimensions and computes a block-by-block co-occurrence matrix that counts how many studies address each pair of risk blocks together. Works for any corpus without configuration.
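A minimal base R sketch of what a block-by-block co-occurrence count looks like. It does not call orisma and makes no claim about how orm_autodim() computes its matrix internally; the toy incidence matrix and the names incidence and cooc are assumptions made for illustration.

```r
# Hypothetical toy data: one row per study, one 0/1 column per dictionary block.
# The real orisma_matrix layout is not described here; this is only an illustration.
blocks <- c("A-Safety", "B-Hygiene", "C-Ergonomics",
            "D-Psychosociology", "E-Biological", "F-Emerging")

incidence <- matrix(
  c(1, 1, 0, 0, 0, 0,   # study1: Safety + Hygiene
    1, 0, 1, 0, 0, 0,   # study2: Safety + Ergonomics
    1, 1, 0, 1, 0, 0,   # study3: Safety + Hygiene + Psychosociology
    0, 0, 0, 0, 1, 0),  # study4: Biological only
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("study", 1:4), blocks)
)

# crossprod(X) is t(X) %*% X: each off-diagonal cell counts the studies that
# mention both blocks; the diagonal counts the studies mentioning each block.
cooc <- crossprod(incidence)
print(cooc)
```

In this toy corpus the A-Safety / B-Hygiene cell counts the two studies that address both blocks, and the matrix is symmetric by construction.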

Mode 2: Free text (method = "text"). Extracts discriminant terms from the text field (abstracts by default) using TF-IDF-like filtering. Useful for discovering domain-specific dimensions that the dictionary does not cover (e.g. specific materials, sectors, or tasks).
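The filtering idea behind the text mode can be sketched in base R as a document-frequency band: keep terms that appear in enough documents to matter but not in so many that they are generic. This is an illustration under stated assumptions, not the package's implementation; the toy abstracts, the tokenisation rule, and the threshold values used here are hypothetical, and only the names min_freq and max_doc_pct mirror the documented arguments.

```r
# Hypothetical toy abstracts; in practice the text would come from the
# column named by text_col.
abstracts <- c(
  "silica dust exposure in quarry workers",
  "noise exposure in quarry workers",
  "silica dust control in construction workers",
  "manual handling in construction workers",
  "silica exposure and respiratory health"
)

min_freq    <- 2L    # illustrative values, not the function defaults
max_doc_pct <- 0.7

# Tokenise on non-letters, count each term once per document,
# and drop very short tokens (a crude stand-in for stopword removal).
tokens <- lapply(strsplit(tolower(abstracts), "[^a-z]+"), unique)
tokens <- lapply(tokens, function(x) x[nchar(x) > 2])

# Document frequency: number of abstracts containing each term.
all_terms <- unlist(tokens)
doc_freq  <- vapply(split(all_terms, all_terms), length, integer(1))
doc_pct   <- doc_freq / length(abstracts)

# Keep terms that are frequent enough but not near-universal.
keep <- doc_freq >= min_freq & doc_pct <= max_doc_pct
candidate_terms <- sort(names(doc_freq)[keep])
print(candidate_terms)
```

Here "workers" appears in four of five abstracts and is dropped as too generic, while terms seen only once fall below min_freq.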

Usage

orm_autodim(
  mx,
  method = "blocks",
  text_col = "abstract",
  n_dims = 12L,
  min_freq = 3L,
  max_doc_pct = 0.35,
  min_cooccur = 0.5,
  fuzzy_sim = 0.85,
  stopwords = NULL,
  lang = getOption("orisma.lang", "en"),
  verbose = getOption("orisma.verbose", TRUE)
)

Arguments

mx

An orisma_matrix object from orm_extract().

method

Character. "blocks" (default) or "text".

text_col

Character. Text field for method = "text". Default "abstract".

n_dims

Integer. Maximum number of dimensions to return when method = "text". Default 12.

min_freq

Integer. Minimum document frequency for a term when method = "text". Default 3.

max_doc_pct

Numeric (0-1). Maximum proportion of documents in which a term may appear when method = "text"; terms above this threshold are considered too generic. Default 0.35.

min_cooccur

Numeric (0-1). Minimum co-occurrence with a risk. Default 0.5.

fuzzy_sim

Numeric (0-1). Fuzzy grouping threshold. Default 0.85.

stopwords

Character vector. Extra stopwords for method = "text". Default NULL.

lang

Character. "en" or "es". Default getOption("orisma.lang", "en").

verbose

Logical. Default getOption("orisma.verbose", TRUE).

Value

A list (class orisma_dims) ready for orm_dim_matrix().

See Also

orm_dim_matrix()


orisma documentation built on May 19, 2026, 1:07 a.m.