features: Extract Annotation Features


View source: R/annotation.R

Description

Conveniently extract features from annotations and annotated plain text documents.

Usage

features(x, type = NULL, simplify = TRUE)

Arguments

x

an object inheriting from class "Annotation" or "AnnotatedPlainTextDocument".

type

a character vector of annotation types used to select annotations, or NULL (default) to use all annotations. When selecting, the elements of type are partially matched against the annotation types, so an unambiguous abbreviation of a type name is sufficient.

simplify

a logical indicating whether to simplify feature values to a vector.
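The partial matching described for type can be illustrated with base R's pmatch(). This is only a sketch of how partial matching behaves in general; that features() relies on pmatch() internally is an assumption, and the types vector below is a hypothetical stand-in for the annotation types present in a document.

```r
## Hypothetical annotation types present in a document.
types <- c("sentence", "word")

## pmatch() gives the index of the unique partial match.
i <- pmatch("w", types)
types[i]

## An abbreviation that matches no type gives NA.
pmatch("x", types)
```

Under this reading, "w" selects the "word" annotations because no other type starts with "w".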

Details

features() gathers all feature tag-value pairs in the selected annotations into a data frame with one variable for each tag found, holding the values of that tag (NULL where an annotation has no value for the tag). In general, the variables are lists of extracted values. By default, variables whose elements are all length-one atomic vectors are simplified into an atomic vector of values. The values for specific tags can be extracted by suitably subscripting the obtained data frame.
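The gathering and simplification described above can be sketched in base R. This is a minimal illustration, not the package's implementation: gather_features() and the feats list are hypothetical, and the sketch returns a named list of columns rather than a data frame.

```r
## Hypothetical feature lists of three annotations; the third has no lemma.
feats <- list(list(POS = "NNP", lemma = "Stanford"),
              list(POS = "VBZ", lemma = "be"),
              list(POS = "."))

## Hypothetical helper (not part of NLP): one column per tag found.
gather_features <- function(feats, simplify = TRUE) {
    tags <- unique(unlist(lapply(feats, names)))
    ## `[[` yields NULL when an annotation lacks the tag.
    cols <- lapply(tags, function(tag) lapply(feats, `[[`, tag))
    names(cols) <- tags
    if (simplify) {
        cols <- lapply(cols, function(col) {
            ## Simplify only if every element is a length-one atomic vector.
            ok <- all(vapply(col,
                             function(v) is.atomic(v) && length(v) == 1L,
                             logical(1)))
            if (ok) unlist(col) else col
        })
    }
    cols
}

res <- gather_features(feats)
res$POS    # simplified to a character vector
res$lemma  # stays a list because one value is NULL
```

In this sketch the POS column simplifies because every annotation has a single POS value, while the lemma column remains a list since the third annotation contributes NULL.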

Examples

## Use a pre-built annotated plain text document,
## see ? AnnotatedPlainTextDocument.
doc <- readRDS(system.file("texts", "stanford.rds", package = "NLP"))
## Extract features of all *word* annotations in doc:
x <- features(doc, "word")
## Could also have abbreviated "word" to "w".
x
## Only lemmas:
x$lemma
## Words together with lemmas:
paste(words(doc), x$lemma, sep = "/")

Example output

   POS      lemma
1  NNP   Stanford
2  NNP University
3  VBZ         be
4   JJ    located
5   IN         in
6  NNP California
7    .          .
8  PRP         it
9  VBZ         be
10  DT          a
11  JJ      great
12  NN university
13   .          .
 [1] "Stanford"   "University" "be"         "located"    "in"        
 [6] "California" "."          "it"         "be"         "a"         
[11] "great"      "university" "."         
 [1] "Stanford/Stanford"     "University/University" "is/be"                
 [4] "located/located"       "in/in"                 "California/California"
 [7] "./."                   "It/it"                 "is/be"                
[10] "a/a"                   "great/great"           "university/university"
[13] "./."                  

NLP documentation built on Oct. 23, 2020, 6:18 p.m.
