README.md

RDRPOSTagger

R package to perform Parts of Speech tagging and morphological tagging based on the Ripple Down Rules-based Part-Of-Speech Tagger (RDRPOS) available at https://github.com/datquocnguyen/RDRPOSTagger. RDRPOSTagger supports pre-trained POS tagging models for 45 languages.

The R package allows you to perform 3 types of tagging.

This is based on corpora collected and made available at http://universaldependencies.org.

Examples on Parts of Speech tagging

The following shows how to use the package

```() library(RDRPOSTagger) models <- rdr_available_models() models$MORPH$language models$POS$language models$UniversalPOS$language

x <- c("Oleg Borisovich Kulik is a Ukrainian-born Russian performance artist") tagger <- rdr_model(language = "English", annotation = "POS") rdr_pos(tagger, x = x)

x <- c("Dus godvermehoeren met pus in alle puisten, zei die schele van Van Bukburg.", "Er was toen dat liedje van tietenkonttieten kont tieten kontkontkont", " ", "", NA) tagger <- rdr_model(language = "Dutch", annotation = "MORPH") rdr_pos(tagger, x = x)

tagger <- rdr_model(language = "Dutch", annotation = "UniversalPOS") rdr_pos(tagger, x = x)


The output of the POS tagging shows the following elements:

doc_id token_id token pos d1 1 Dus ADV d1 2 godvermehoeren VERB d1 3 met ADP d1 4 pus NOUN d1 5 in ADP d1 6 alle PRON d1 7 puisten NOUN d1 8 , PUNCT d1 9 zei VERB d1 10 die PRON d1 11 schele ADJ d1 12 van ADP d1 13 Van PROPN d1 14 Bukburg PROPN d1 15 . PUNCT d2 1 Er ADV d2 2 was AUX d2 3 toen SCONJ d2 4 dat SCONJ d2 5 liedje NOUN d2 6 van ADP d2 7 tietenkonttieten VERB d2 8 kont PROPN d2 9 tieten VERB d2 10 kontkontkont PROPN d3 0 d4 0 d5 0


More information about the model and the tagging can be found at https://github.com/datquocnguyen/RDRPOSTagger

The general architecture and experimental results of RDRPOSTagger can be found in the following papers:

- Dat Quoc Nguyen, Dai Quoc Nguyen, Dang Duc Pham and Son Bao Pham. [RDRPOSTagger: A Ripple Down Rules-based Part-Of-Speech Tagger](http://www.aclweb.org/anthology/E14-2005). In *Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics*, EACL 2014, pp. 17-20, 2014. [[.PDF]](http://www.aclweb.org/anthology/E14-2005) [[.bib]](http://www.aclweb.org/anthology/E14-2005.bib)

- Dat Quoc Nguyen, Dai Quoc Nguyen, Dang Duc Pham and Son Bao Pham. [A Robust Transformation-Based Learning Approach Using Ripple Down Rules for Part-Of-Speech Tagging](http://content.iospress.com/articles/ai-communications/aic698). *AI Communications* (AICom), vol. 29, no. 3, pp. 409-422, 2016. [[.PDF]](http://arxiv.org/pdf/1412.4021.pdf) [[.bib]](http://rdrpostagger.sourceforge.net/AICom.bib)


## Installation

Installation can easily be done as follows.

install.packages("rJava") install.packages("data.table") install.packages("RDRPOSTagger", repos = "http://www.datatailor.be/rcube", type = "source")


Or with devtools

devtools::install_github("bnosac/RDRPOSTagger", build_vignettes = TRUE)


More details in the package documentation and package vignette

vignette("rdrpostagger-overview", package = "RDRPOSTagger") ```

License

The package is licensed under the GPL-3 license as described at http://www.gnu.org/licenses/gpl-3.0.html.

Support in text mining

Need support in text mining. Contact BNOSAC: http://www.bnosac.be



bnosac/RDRPOSTagger documentation built on May 8, 2019, 3:43 p.m.