CoCA: Performs Concept Class Analysis (CoCA)


CoCA R Documentation

Performs Concept Class Analysis (CoCA)

Description

CoCA outputs schematic classes derived from documents' engagement with multiple bi-polar concepts (in a Likert-style fashion). The function requires (1) a DTM of a corpus, which can be obtained from any popular text analysis package or from the dtm_builder() function, and (2) semantic directions as output by get_direction(). CMDist() works under the hood. Code modified from the corclass package.

Usage

CoCA(
  dtm,
  wv = NULL,
  directions = NULL,
  filter_sig = TRUE,
  filter_value = 0.05,
  zero_action = c("drop", "ownclass")
)

Arguments

dtm

Document-term matrix with words as columns. Works with DTMs produced by any popular text analysis package, or you can use the dtm_builder() function.

wv

Matrix of word embedding vectors (a.k.a. an embedding model) with rows as words.
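
The two inputs are oriented differently: the DTM has words as columns, while the embedding matrix has words as rows. The following base R sketch, using small hypothetical matrices rather than real data, shows one way to check how much vocabulary the two share before calling CoCA(). How the package itself treats DTM terms that are missing from the embeddings is not stated here, so this check is only a diagnostic.

```r
# Toy DTM: documents as rows, words as columns (hypothetical counts)
dtm <- matrix(
  c(1, 0, 2,
    0, 3, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc1", "doc2"), c("freedom", "peace", "zzz"))
)

# Toy embedding matrix: words as rows, dimensions as columns (hypothetical values)
wv <- matrix(
  c( 0.1,  0.4,
     0.3, -0.2,
    -0.5,  0.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("freedom", "peace", "war"), NULL)
)

# Words present in both the DTM columns and the embedding rows
shared <- intersect(colnames(dtm), rownames(wv))

# DTM words that have no embedding vector
missing_words <- setdiff(colnames(dtm), rownames(wv))

print(shared)
print(missing_words)
```

A large set of missing words suggests that the preprocessing used for the corpus differs from the preprocessing used for the embedding vocabulary.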

directions

Direction vectors output by get_direction().

filter_sig

Logical (default = TRUE). If TRUE, sets 'insignificant' ties to 0 to decrease noise and increase stability.

filter_value

Minimum significance cutoff. Absolute row correlations below this value will be set to 0

zero_action

If 'drop', CCA drops rows with 0 variance from the analyses (default). If 'ownclass', the correlations between 0-variance rows and all other rows are set to 0, and the correlations between all pairs of 0-variance rows are set to 1.
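
The two zero_action options can be illustrated with a small base R sketch. It is not the package's implementation: the score matrix is hypothetical, and plain Pearson correlation between rows is used as a stand-in for whatever row-similarity measure CoCA() computes internally. The point is only to show what "drop" and "ownclass" do to rows whose scores have zero variance.

```r
# Toy score matrix: rows are documents, columns are concept scores.
# Hypothetical data; doc2 and doc4 have zero variance across concepts.
scores <- rbind(
  doc1 = c( 0.2, -1.0,  0.7),
  doc2 = c( 0.5,  0.5,  0.5),
  doc3 = c(-0.3,  0.9, -0.6),
  doc4 = c(-1.0, -1.0, -1.0)
)

# Flag rows with zero variance
zero_var <- apply(scores, 1, var) == 0

# "drop": remove zero-variance rows, then correlate the remaining rows
cor_drop <- cor(t(scores[!zero_var, , drop = FALSE]))

# "ownclass": keep every row; correlations involving a zero-variance row
# are undefined (NA), so overwrite them as the argument description states
cor_own <- suppressWarnings(cor(t(scores)))
cor_own[zero_var, ] <- 0          # zero-variance row vs. any other row
cor_own[, zero_var] <- 0
cor_own[zero_var, zero_var] <- 1  # zero-variance rows vs. each other

print(cor_drop)
print(cor_own)
```

With "ownclass", the zero-variance documents are perfectly tied to one another and untied from everything else, so they end up grouped together rather than removed.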

Value

Returns a named list object of class CoCA. List elements include:

  • membership: document memberships

  • modules: schematic classes

  • cormat: correlation matrix

Author(s)

Dustin Stoltz and Marshall Taylor

References

Taylor, Marshall A., and Dustin S. Stoltz. (2020) 'Concept Class Analysis: A Method for Identifying Cultural Schemas in Texts.' Sociological Science 7:544-569. doi:10.15195/v7.a23.
Boutyline, Andrei. 'Improving the measurement of shared cultural schemas with correlational class analysis: Theory and method.' Sociological Science 4.15 (2017): 353-393. doi:10.15195/v4.a15.

See Also

CMDist, get_direction

Examples


# load example word embeddings
data(ft_wv_sample)

# load example text
data(jfk_speech)

# minimal preprocessing
jfk_speech$sentence <- tolower(jfk_speech$sentence)
jfk_speech$sentence <- gsub("[[:punct:]]+", " ", jfk_speech$sentence)

# create DTM
dtm <- dtm_builder(jfk_speech, sentence, sentence_id)

# create semantic directions
gen <- data.frame(
  add = c("woman"),
  subtract = c("man")
)

die <- data.frame(
  add = c("alive"),
  subtract = c("die")
)

gen.dir <- get_direction(anchors = gen, wv = ft_wv_sample)
die.dir <- get_direction(anchors = die, wv = ft_wv_sample)

sem_dirs <- rbind(gen.dir, die.dir)

classes <- CoCA(
  dtm = dtm,
  wv = ft_wv_sample,
  directions = sem_dirs,
  filter_sig = TRUE,
  filter_value = 0.05,
  zero_action = "drop"
)

print(classes)

text2map documentation built on July 9, 2023, 6:35 p.m.