parse.pos.tags: Extract POS-tags or Words from Annotated Corpora


View source: R/parse.pos.tags.R

Description

Function for extracting textual data from annotated corpora. It understands the output formats of Stanford Tagger, TreeTagger, TaKIPI (a tagger for Polish), and Alpino (a tagger for Dutch). Part-of-speech tags, words, or lemmata can be extracted.

Usage

parse.pos.tags(input.text, tagger = "stanford", feature = "pos")

Arguments

input.text

any string of characters (e.g. a character vector) containing text annotated by one of the supported taggers.

tagger

choose the input format: "stanford" for Stanford Tagger, "treetagger" for TreeTagger, "takipi" for TaKIPI.

feature

choose which feature to extract: "pos" (default), "word", or "lemma" (the last is not available for Stanford-formatted input).
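
To make the "word" and "pos" features concrete, the following is a minimal base-R sketch of how a Stanford-style `word_TAG` token can be split into its two parts. It is an illustration only, not the code used inside parse.pos.tags; the helper name split_stanford is hypothetical, and the sketch assumes that the tag is whatever follows the last underscore of each token.

```r
# Illustrative sketch only; NOT the stylo implementation.
# split_stanford() is a hypothetical helper name.
split_stanford <- function(tagged) {
  # Split on runs of whitespace and drop empty tokens
  tokens <- unlist(strsplit(tagged, "[[:space:]]+"))
  tokens <- tokens[nzchar(tokens)]
  # Assumption: the tag is everything after the LAST underscore
  words <- sub("_[^_]*$", "", tokens)
  tags  <- sub("^.*_", "", tokens)
  data.frame(word = words, pos = tags, stringsAsFactors = FALSE)
}

res <- split_stanford("I_PRP have_VBP just_RB returned_VBN ._.")
print(res)
```

Splitting on the last underscore rather than the first keeps words that themselves contain an underscore intact.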

Value

If the function is applied to a single text, a vector of extracted features is returned. If it is applied to a corpus (a list, preferably of class "stylo.corpus"), a list of preprocessed texts is returned.

Author(s)

Maciej Eder

See Also

load.corpus, txt.to.words, txt.to.words.ext, txt.to.features

Examples

text = "I_PRP have_VBP just_RB returned_VBN from_IN a_DT visit_NN 
  to_TO my_PRP$ landlord_NN -_: the_DT solitary_JJ neighbor_NN  that_IN 
  I_PRP shall_MD be_VB troubled_VBN with_IN ._. This_DT is_VBZ certainly_RB 
  a_DT beautiful_JJ country_NN !_. In_IN all_DT England_NNP ,_, I_PRP do_VBP 
  not_RB believe_VB that_IN I_PRP could_MD have_VB fixed_VBN on_IN a_DT 
  situation_NN so_RB completely_RB removed_VBN from_IN the_DT stir_VB of_IN 
  society_NN ._."

# extract the words, dropping the POS tags
parse.pos.tags(text, tagger = "stanford", feature = "word")
# extract the POS tags, dropping the words
parse.pos.tags(text, tagger = "stanford", feature = "pos")
  

Example output

stylo version: 0.6.4
Warning message:
no DISPLAY variable so Tk is not available 
[1] "I have just returned from a visit \n  to my landlord - the solitary neighbor  that \n  I shall be troubled with . This is certainly \n  a beautiful country ! In all England , I do \n  not believe that I could have fixed on a \n  situation so completely removed from the stir of \n  society . "
[1] " PRP VBP RB VBN IN DT NN TO PRP$ NN : DT JJ NN IN PRP MD VB VBN IN . DT VBZ RB DT JJ NN . IN DT NNP , PRP VBP RB VB IN PRP MD VB VBN IN DT NN RB RB VBN IN DT VB IN NN ."
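
In the output shown above, each call returns a single space-separated string rather than one element per token. If individual tags are needed, for instance to count them, the string can be split with base R. The sketch below uses a shortened prefix of the tag string printed above; the variable names are illustrative.

```r
# First ten tags copied from the example output above (shortened)
tag_string <- " PRP VBP RB VBN IN DT NN TO PRP$ NN"

# Trim the leading space, then split on runs of whitespace
tags <- unlist(strsplit(trimws(tag_string), "[[:space:]]+"))

# Frequency of each tag, most frequent first
tag_freq <- sort(table(tags), decreasing = TRUE)
print(tag_freq)
```

Within stylo, txt.to.words (listed under See Also) may be the more natural way to tokenize such output; the base-R route is shown here only because it needs no further assumptions.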

stylo documentation built on Dec. 6, 2020, 5:06 p.m.
