get_count_vector: Get Vector with Counts for Positional Attribute.

View source: R/count.R

Description

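The return value is an integer vector with one count per unique token of the positional attribute. Its length is the number of unique tokens in the corpus, i.e. the number of unique ids. The counts are ordered by id: since ids are zero-based, element i of the vector holds the count for id i - 1.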
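To illustrate how positions in the returned vector relate to ids, here is a minimal base R sketch. It does not call RcppCWB; the vector `ids` is made-up toy data standing in for a token stream of zero-based ids, and `tabulate()` is only used here to mimic the shape of the result, not to describe how the package computes it.

```r
# Toy token stream as zero-based token ids (hypothetical data, not a CWB corpus).
ids <- c(0L, 1L, 2L, 1L, 0L, 1L, 3L)

# tabulate() counts positive integers, so shift the ids by one.
# Position i of the result then holds the count for id i - 1.
counts <- tabulate(ids + 1L, nbins = max(ids) + 1L)

counts         # 2 3 1 1
counts[1 + 1]  # count for id 1: 3
```

The length of `counts` equals the number of unique ids, and looking up the count for a given id means indexing at `id + 1`.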

Usage

get_count_vector(corpus, p_attribute, registry = Sys.getenv("CORPUS_REGISTRY"))

Arguments

corpus

Name of a CWB corpus, as a character string (e.g. "REUTERS").

p_attribute

Name of a positional attribute of the corpus, as a character string (e.g. "word").

registry

Path to the registry directory. Defaults to the value of the CORPUS_REGISTRY environment variable.

Value

An integer vector of counts, with one element per unique id of the positional attribute.

Examples

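library(RcppCWB)

# Count the occurrences of every token id of the p-attribute "word"
y <- get_count_vector(
  corpus = "REUTERS", p_attribute = "word",
  registry = get_tmp_registry()
)

# Token ids are zero-based: element i of y is the count for id i - 1
df <- data.frame(token_id = 0:(length(y) - 1), count = y)

# Look up the token string for each id
df[["token"]] <- cl_id2str(
  "REUTERS", p_attribute = "word",
  id = df[["token_id"]], registry = get_tmp_registry()
)
df <- df[, c("token", "token_id", "count")] # reorder columns
df <- df[order(df[["count"]], decreasing = TRUE), ] # most frequent tokens first
head(df)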

RcppCWB documentation built on Sept. 24, 2024, 1:08 a.m.