get_word_matrices: Get word matrices

View source: R/stories.R

Description

Get word matrices for stories.

Usage

get_word_matrices(
  text = NULL,
  title = NULL,
  media_id = NULL,
  stories_id = NULL,
  after_date = NULL,
  before_date = NULL,
  n = 20,
  stopword_length = NULL,
  key = NULL,
  tibble = TRUE
)

Arguments

text

Optional character vector for full text search passed to the Solr query. If the vector contains more than one element, the elements are combined with OR.

title

Optional character vector for title search passed to the Solr query. If the vector contains more than one element, the elements are combined with OR.

media_id

Optional media ids (see search_media) passed to the Solr query. If the vector contains more than one element, the elements are combined with OR.

stories_id

Optional stories ids passed to the Solr query. If the vector contains more than one element, the elements are combined with OR.

after_date

Limit results to stories published after this date. Should be a date string that can be interpreted as a POSIXct object, e.g., '2021-01-01' or '2021-12-24 09:00:00'. If only a date is passed, '00:00:00' is added as the time.

before_date

Limit results to stories published before this date. Should be a date string that can be interpreted as a POSIXct object, e.g., '2021-01-01' or '2021-12-24 09:00:00'. If only a date is passed, '00:00:00' is added as the time.

n

Number of stories to search. Must be <= 1000. Defaults to 20.

stopword_length

If set to 'tiny', 'short', or 'long', removes the words in the stop word list of that length. Defaults to NULL.

key

MediaCloud API key. Will be read from environment variable 'MEDIACLOUD_API_KEY' if set to NULL (default).

tibble

Logical indicating whether the result should be returned as a tibble. Defaults to TRUE. If set to FALSE, the unedited content of the HTTP response is returned instead.
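
The OR combination and the date handling described above can be pictured with a few lines of base R. This is only a sketch of the documented behaviour, not the package's internal code: the helpers combine_or and normalize_date are hypothetical names introduced for illustration, and the Solr query string that the package actually builds may look different.

```r
# Hypothetical helper: join several search terms with OR, mirroring the
# documented behaviour for vectors with more than one element.
combine_or <- function(x) {
  paste(x, collapse = " OR ")
}

# Hypothetical helper: parse a date string as POSIXct and print it with an
# explicit time, so that a bare date gains '00:00:00'.
# The UTC time zone is an assumption made for this sketch.
normalize_date <- function(x, tz = "UTC") {
  format(as.POSIXct(x, tz = tz), "%Y-%m-%d %H:%M:%S")
}

combine_or(c("football", "soccer"))
normalize_date("2021-01-01")
normalize_date("2021-12-24 09:00:00")
```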

Value

A tibble holding a tidytext-style word matrix: one word per row, with columns for the stories_id, the word_count, the word_stem, and the most common full_word associated with that stem. Use cast_dfm (from the tidytext package) to convert it into a quanteda DFM.
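
The conversion to a DFM might look roughly like the sketch below. The data frame is a made-up stand-in for real output, with column names taken from the description above. Using word_stem as the term column is one possible choice, not something the package prescribes. The call only runs if both tidytext and quanteda are installed.

```r
# Toy data standing in for get_word_matrices() output. The values are
# invented; only the column names follow the Value description.
word_matrix <- data.frame(
  stories_id = c(1, 1, 2),
  word_stem  = c("footbal", "match", "footbal"),
  full_word  = c("football", "match", "football"),
  word_count = c(3, 1, 2),
  stringsAsFactors = FALSE
)

# cast_dfm() is provided by tidytext and requires quanteda.
if (requireNamespace("tidytext", quietly = TRUE) &&
    requireNamespace("quanteda", quietly = TRUE)) {
  dfm <- tidytext::cast_dfm(
    word_matrix,
    document = stories_id,  # one document per story
    term     = word_stem,   # one feature per stem (assumed choice)
    value    = word_count   # cell values are the counts
  )
  print(dfm)
}
```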

Examples

## Not run: 
get_word_matrices(stories_id = c(1484325770, 24835747, 24840330))
get_word_matrices("football", after_date = "2020-01-01", before_date = "2020-01-02", media_id = 1)

## End(Not run)
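
Both calls above leave key at its default, so they rely on the 'MEDIACLOUD_API_KEY' environment variable. A minimal sketch of setting it for the current session follows; the key value is a placeholder, not a working key.

```r
# Placeholder value; replace with your own MediaCloud API key.
Sys.setenv(MEDIACLOUD_API_KEY = "your-api-key")

# With key = NULL (the default), the function is documented to read this
# variable. Check here that it is set and non-empty.
key <- Sys.getenv("MEDIACLOUD_API_KEY")
nzchar(key)
```

To keep the key across sessions, the same variable can instead be defined in a .Renviron file.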

joon-e/mediacloud documentation built on Jan. 8, 2022, 12:04 a.m.