as_universal: Convert Tags to Universal Tags

Description

Convert Penn Treebank tags to universal part-of-speech tags.

Usage

as_universal(x, tagset = "en-ptb", dictionary = tagger::universal_pos_map,
  ...)

Arguments

x

A tag_pos object or a named list of vectors.

tagset

The name of a tagset dictionary to use as a key. Use names(universal_pos_map) to see possible choices.

dictionary

A data frame that maps the current tagset to a second tagset.

...

Ignored.

Details

Petrov, Das, & McDonald (2011) state that the universal tagset includes:

VERB

verbs (all tenses and modes)

NOUN

nouns (common and proper)

PRON

pronouns

ADJ

adjectives

ADV

adverbs

ADP

adpositions (prepositions and postpositions)

CONJ

conjunctions

DET

determiners

NUM

cardinal numbers

PRT

particles or other function words

X

other: foreign words, typos, abbreviations

.

punctuation

For more see: https://github.com/slavpetrov/universal-pos-tags
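
To make the mapping concrete, here is a minimal base R sketch of the lookup idea. It does not use the package's own objects: the data frame `ptb_to_universal`, its column names, the helper `to_universal()`, and the choice to label unmatched tags as X are all assumptions made for illustration. The tag pairs shown are a small hand-picked excerpt of the English Penn Treebank mapping, and the input layout (words as values, tags as names) is assumed rather than taken from the package.

```r
# Hypothetical excerpt of a Penn Treebank -> universal mapping.
# This is NOT the package's `universal_pos_map`; column names are assumed.
ptb_to_universal <- data.frame(
  tag       = c("PRP", "VBP", "VB", "TO", "DT", "NN",
                "IN", "JJ", "RB", "CD", "CC", "."),
  universal = c("PRON", "VERB", "VERB", "PRT", "DET", "NOUN",
                "ADP", "ADJ", "ADV", "NUM", "CONJ", "."),
  stringsAsFactors = FALSE
)

# Hand-tagged sentence: words as values, Penn Treebank tags as names
# (an assumed layout for this sketch).
tagged <- c(PRP = "They", VBP = "refuse", TO = "to", VB = "permit",
            PRP = "us", TO = "to", VB = "obtain", DT = "the",
            NN = "refuse", NN = "permit")

# Hypothetical helper: swap each Penn tag for its universal counterpart.
# Tags missing from the map are labelled "X" (a choice made in this sketch).
to_universal <- function(x, map) {
  new_tags <- map$universal[match(names(x), map$tag)]
  new_tags[is.na(new_tags)] <- "X"
  stats::setNames(unname(x), new_tags)
}

result <- to_universal(tagged, ptb_to_universal)
names(result)
```

Note that both uses of "refuse" and "permit" keep distinct categories after conversion (VERB versus NOUN), since the lookup works on the tag and not on the word.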

Value

Returns a combined character vector of words and universal tags.

References

Slav Petrov, Dipanjan Das and Ryan McDonald. (2011). A Universal Part-of-Speech Tagset. http://arxiv.org/abs/1104.2086

Examples

(x <- tag_pos("They refuse to permit us to obtain the refuse permit"))
as_universal(x)

(out1 <- tag_pos(sam_i_am))
as_universal(out1)

presidential_debates_2012_pos
as_universal(presidential_debates_2012_pos)

trinker/tagger documentation built on May 31, 2019, 10:42 p.m.