covars_make_pos: calculate part-of-speech information from text snippet data

View source: R/covars_make_pos.R

Description

Add variables consisting of part-of-speech (POS) frequencies to snippets.

Usage

covars_make_pos(x, ...)

## S3 method for class 'snippet'
covars_make_pos(x, ...)

## S3 method for class 'corpus'
covars_make_pos(x, ...)

## S3 method for class 'data.frame'
covars_make_pos(x, text_field = "text", ...)

## S3 method for class 'character'
covars_make_pos(
  x,
  text_field = "text",
  dependency = TRUE,
  normalize = TRUE,
  ...
)

Arguments

x

snippet data from snippets_make() consisting of the fields text, docID, and snippetID

...

used to pass the tagset argument to spacyr::spacy_parse(); for example, tagset = "penn" selects the Penn Treebank tagset scheme instead of the default Google universal tagset. See spacyr::spacy_parse().

text_field

the name of the text field if x is a data.frame; the default is "text"

dependency

logical; if TRUE, parse dependencies

normalize

logical; if TRUE, convert POS tag counts to rates
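
The documentation does not say how rates are computed. As a rough illustration only, the sketch below assumes a rate is a tag's count divided by the total number of tagged tokens in the same text; the counts and the names pos_counts and pos_rates are hypothetical and are not taken from the package.

```r
# Hypothetical POS tag counts for two texts (rows) across four tags (columns).
pos_counts <- data.frame(
  NOUN = c(3, 1),
  VERB = c(1, 2),
  DET  = c(2, 1),
  ADJ  = c(0, 1),
  row.names = c("text1", "text2")
)

# Assumption: a "rate" is each count divided by the text's total tag count.
# Dividing a data.frame by a vector of length nrow() divides row-wise.
pos_rates <- pos_counts / rowSums(pos_counts)

print(pos_rates)
```

Under this assumed definition the rates in each row sum to 1, which makes texts of different lengths comparable.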

Details

Note that this requires spaCy to be installed (along with Python). See the installation instructions at http://github.com/kbenoit/spacyr.

Value

a data.frame of the added variables, consisting of the frequencies of parts of speech in each text

Examples

## Not run: 
# some examples here

## End(Not run)
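
As a minimal sketch of how the character method shown under Usage might be called, the code below assumes that the sophistication and spacyr packages are installed, that covars_make_pos() is exported, and that a working spaCy installation is available. The example texts are hypothetical, and the call is guarded so that it is skipped when any of these assumptions fails.

```r
# Hypothetical input texts.
texts <- c(
  "The quick brown fox jumps over the lazy dog.",
  "Parts of speech vary across sentences."
)

# Only attempt the call when both packages are installed.
have_pkgs <- requireNamespace("sophistication", quietly = TRUE) &&
  requireNamespace("spacyr", quietly = TRUE)

pos_vars <- NULL
if (have_pkgs) {
  pos_vars <- tryCatch({
    # Assumption: spaCy must be initialized before parsing.
    spacyr::spacy_initialize()
    # normalize = TRUE requests rates rather than raw counts.
    sophistication::covars_make_pos(texts, normalize = TRUE)
  }, error = function(e) {
    message("POS tagging skipped: ", conditionMessage(e))
    NULL
  })
}
```

If the call succeeds, pos_vars should hold the data.frame described under Value; otherwise it stays NULL.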

kbenoit/sophistication documentation built on May 12, 2021, 5:57 a.m.