View source: R/parse.pos.tags.R
Function for extracting textual data from annotated corpora. It understands the output formats of Stanford Tagger, TreeTagger, TaKIPI (a tagger for Polish), and Alpino (a tagger for Dutch). Part-of-speech tags, words, or lemmata can be extracted.
parse.pos.tags(input.text, tagger = "stanford", feature = "pos")
input.text
any string of characters (e.g. vector) containing markup tags that have to be deleted.
tagger
choose the input format: "stanford" for Stanford Tagger, "treetagger" for TreeTagger, "takipi" for TaKIPI.
feature
choose "pos" (default), "word", or "lemma" (this one is not available for the Stanford-formatted input).
If the function is applied to a single text, a vector of extracted features is returned. If it is applied to a corpus (a list, preferably of class "stylo.corpus"), a list of preprocessed texts is returned.
Maciej Eder
load.corpus, txt.to.words, txt.to.words.ext, txt.to.features
text = "I_PRP have_VBP just_RB returned_VBN from_IN a_DT visit_NN
to_TO my_PRP$ landlord_NN -_: the_DT solitary_JJ neighbor_NN that_IN
I_PRP shall_MD be_VB troubled_VBN with_IN ._. This_DT is_VBZ certainly_RB
a_DT beautiful_JJ country_NN !_. In_IN all_DT England_NNP ,_, I_PRP do_VBP
not_RB believe_VB that_IN I_PRP could_MD have_VB fixed_VBN on_IN a_DT
situation_NN so_RB completely_RB removed_VBN from_IN the_DT stir_VB of_IN
society_NN ._."
parse.pos.tags(text, tagger = "stanford", feature = "word")
parse.pos.tags(text, tagger = "stanford", feature = "pos")
stylo version: 0.6.4
[1] "I have just returned from a visit \n to my landlord - the solitary neighbor that \n I shall be troubled with . This is certainly \n a beautiful country ! In all England , I do \n not believe that I could have fixed on a \n situation so completely removed from the stir of \n society . "
[1] " PRP VBP RB VBN IN DT NN TO PRP$ NN : DT JJ NN IN PRP MD VB VBN IN . DT VBZ RB DT JJ NN . IN DT NNP , PRP VBP RB VB IN PRP MD VB VBN IN DT NN RB RB VBN IN DT VB IN NN ."
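
To make the Stanford word_TAG convention used in the example more concrete, the following base-R sketch splits each token at its last underscore. It is only an illustration under stated assumptions: extract_feature is a hypothetical helper, not part of stylo and not the package's actual implementation, and it handles only the Stanford-style format. Unlike the output shown above, it returns one element per token rather than a single string.

```r
# Hypothetical helper for illustration only; not part of stylo.
# Assumes Stanford-style input in which every token has the form word_TAG.
extract_feature <- function(x, feature = c("pos", "word")) {
  feature <- match.arg(feature)
  # A list is treated as a corpus: process each text and return a list.
  if (is.list(x)) {
    return(lapply(x, extract_feature, feature = feature))
  }
  # Split a single string into word_TAG tokens on any run of whitespace
  # (only the first element of a longer character vector is used).
  tokens <- strsplit(trimws(x), "[[:space:]]+")[[1]]
  if (feature == "word") {
    sub("_[^_]*$", "", tokens)   # drop the last underscore and the tag
  } else {
    sub("^.*_", "", tokens)      # keep only what follows the last underscore
  }
}

sample_text <- "I_PRP have_VBP just_RB returned_VBN from_IN a_DT visit_NN ._."
extract_feature(sample_text, "word")
extract_feature(sample_text, "pos")

# Applied to a list, the helper returns a list with one vector per text.
extract_feature(list(a = "This_DT is_VBZ", b = "my_PRP$ landlord_NN"), "pos")
```

Splitting at the last underscore, rather than the first, keeps tokens such as ._. and tags such as PRP$ intact.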