Tag sets frequently used in Natural Language Processing.
Penn_Treebank_POS_tags and Brown_POS_tags provide, respectively, the Penn Treebank POS tags (https://catalog.ldc.upenn.edu/docs/LDC95T7/cl93.html, Table 2) and the POS tags used for the Brown corpus (http://www.hit.uib.no/icame/brown/bcm.html), both as data frames with the following variables:
entry: a character vector with the POS tags
description: a character vector with short descriptions of the tags
examples: a character vector with examples for the tags
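Because both tag sets are plain data frames, looking up the meaning of individual tags is ordinary data frame indexing. The following is a minimal sketch, assuming the NLP package is installed and that the columns are named entry, description, and examples (the field names shown in the example output); the object names tags, idx, and ptb_subset are illustrative only.

```r
library(NLP)

# Tags to look up; all three appear in the Penn Treebank table.
tags <- c("CC", "DT", "MD")

# match() gives the row index of each tag (NA if a tag is absent).
idx <- match(tags, Penn_Treebank_POS_tags$entry)

# Keep only the tag and its short description for the matched rows.
ptb_subset <- Penn_Treebank_POS_tags[idx, c("entry", "description")]
ptb_subset
```

Using match() rather than %in% keeps the result in the order of the requested tags, which is convenient when annotating a tagged sentence.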
Universal_POS_tags provides the universal POS tagset introduced by Slav Petrov, Dipanjan Das, and Ryan McDonald (https://arxiv.org/abs/1104.2086), as a data frame with character variables entry and description.
Universal_POS_tags_map is a named list of mappings from language- and treebank-specific POS tagsets to the universal POS tags, with elements named en-ptb and en-brown giving the mappings, respectively, for the Penn Treebank and Brown POS tags.
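The text does not spell out the internal layout of each mapping, so it may help to inspect one before relying on it. The sketch below, assuming the NLP package is installed, selects the Penn Treebank mapping by name and prints its structure; ptb_map and has_maps are illustrative names, and nothing here assumes a particular layout for the element.

```r
library(NLP)

# Both element names are documented for the list; check that they are present.
has_maps <- c("en-ptb", "en-brown") %in% names(Universal_POS_tags_map)
has_maps

# Extract the Penn Treebank mapping and inspect its layout before using it.
ptb_map <- Universal_POS_tags_map[["en-ptb"]]
str(ptb_map)
```

Selecting with [[ ]] and the full element name avoids partial matching, which matters because several list names share a language prefix (for example de-negra and de-tiger).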
https://catalog.ldc.upenn.edu/docs/LDC95T7/cl93.html, http://www.hit.uib.no/icame/brown/bcm.html, https://github.com/slavpetrov/universal-pos-tags.
## Penn Treebank POS tags
dim(Penn_Treebank_POS_tags)
## Inspect first 20 entries:
write.dcf(head(Penn_Treebank_POS_tags, 20L))
## Brown POS tags
dim(Brown_POS_tags)
## Inspect first 20 entries:
write.dcf(head(Brown_POS_tags, 20L))
## Universal POS tags
Universal_POS_tags
## Available mappings to universal POS tags
names(Universal_POS_tags_map)
[1] 45 3
entry: $
description: dollar
examples: $ -$ --$ A$ C$ HK$ M$ NZ$ S$ U.S.$ US$
entry: ``
description: opening quotation mark
examples: ` ``
entry: ''
description: closing quotation mark
examples: ' ''
entry: (
description: opening parenthesis
examples: ( [ {
entry: )
description: closing parenthesis
examples: ) ] }
entry: ,
description: comma
examples: ,
entry: -
description: dash
examples: -
entry: .
description: sentence terminator
examples: . ! ?
entry: :
description: colon or ellipsis
examples: : ; ...
entry: CC
description: conjunction, coordinating
examples: & 'n and both but either et for less minus neither nor or
plus so therefore times v. versus vs. whether yet
entry: CD
description: numeral, cardinal
examples: mid-1890 nine-thirty forty-two one-tenth ten million 0.5 one
forty-seven 1987 twenty '79 zero two 78-degrees eighty-four IX
'60s .025 fifteen 271,124 dozen quintillion DM2,000 ...
entry: DT
description: determiner
examples: all an another any both del each either every half la many
much nary neither no some such that the them these this those
entry: EX
description: existential there
examples: there
entry: FW
description: foreign word
examples: gemeinschaft hund ich jeux habeas Haementeria Herr K'ang-si
vous lutihaw alai je jour objets salutaris fille quibusdam pas
trop Monte terram fiche oui corporis ...
entry: IN
description: preposition or conjunction, subordinating
examples: astride among uppon whether out inside pro despite on by
throughout below within for towards near behind atop around if
like until below next into if beside ...
entry: JJ
description: adjective or numeral, ordinal
examples: third ill-mannered pre-war regrettable oiled calamitous first
separable ectoplasmic battery-powered participatory fourth
still-to-be-named multilingual multi-disciplinary ...
entry: JJR
description: adjective, comparative
examples: bleaker braver breezier briefer brighter brisker broader
bumper busier calmer cheaper choosier cleaner clearer closer
colder commoner costlier cozier creamier crunchier cuter ...
entry: JJS
description: adjective, superlative
examples: calmest cheapest choicest classiest cleanest clearest closest
commonest corniest costliest crassest creepiest crudest cutest
darkest deadliest dearest deepest densest dinkiest ...
entry: LS
description: list item marker
examples: A A. B B. C C. D E F First G H I J K One SP-44001 SP-44002
SP-44005 SP-44007 Second Third Three Two * a b c d first five
four one six three two
entry: MD
description: modal auxiliary
examples: can cannot could couldn't dare may might must need ought
shall should shouldn't will would
[1] 226 3
entry: (
description: opening parenthesis
examples: (
entry: )
description: closing parenthesis
examples: )
entry: *
description: negator
examples: not n't
entry: ,
description: comma
examples: ,
entry: --
description: dash
examples: --
entry: .
description: sentence terminator
examples: . ? ; ! :
entry: :
description: colon
examples: :
entry: ABL
description: determiner/pronoun, pre-qualifier
examples: quite such rather
entry: ABN
description: determiner/pronoun, pre-quantifier
examples: all half many nary
entry: ABX
description: determiner/pronoun, double conjunction or pre-quantifier
examples: both
entry: AP
description: determiner/pronoun, post-determiner
examples: many other next more last former little several enough most
least only very few fewer past same Last latter less single
plenty 'nough lesser certain various manye next-to-last
particular final previous present nuf
entry: AP$
description: determiner/pronoun, post-determiner, genitive
examples: other's
entry: AP+AP
description: determiner/pronoun, post-determiner, hyphenated pair
examples: many-much
entry: AT
description: article
examples: the an no a every th' ever' ye
entry: BE
description: verb "to be", infinitive or imperative
examples: be
entry: BED
description: verb "to be", past tense, 2nd person singular or all
persons plural
examples: were
entry: BED*
description: verb "to be", past tense, 2nd person singular or all
persons plural, negated
examples: weren't
entry: BEDZ
description: verb "to be", past tense, 1st and 3rd person singular
examples: was
entry: BEDZ*
description: verb "to be", past tense, 1st and 3rd person singular,
negated
examples: wasn't
entry: BEG
description: verb "to be", present participle or gerund
examples: being
entry description
1 VERB verbs (all tenses and modes)
2 NOUN nouns (common and proper)
3 PRON pronouns
4 ADJ adjectives
5 ADV adverbs
6 ADP adpositions (prepositions and postpositions)
7 CONJ conjunctions
8 DET determiners
9 NUM cardinal numbers
10 PRT particles or other function words
11 X other: foreign words, typos, abbreviations
12 . punctuation
[1] "ar-padt" "bg-btb" "ca-cat3lb" "cs-pdt"
[5] "da-ddt" "de-negra" "de-tiger" "el-gdt"
[9] "en-brown" "en-ptb" "en-tweet" "es-cast3lb"
[13] "eu-eus3lb" "fi-tdt" "fr-paris" "hu-szeged"
[17] "it-isst" "iw-mila" "ja-kyoto" "ja-verbmobil"
[21] "ko-sejong" "nl-alpino" "pl-ipipan" "pt-bosque"
[25] "ru-rnc" "sl-sdt" "sv-talbanken" "tu-metusbanci"
[29] "zh-ctb6" "zh-sinica"