rmorphodita

The goal of rmorphodita is to enable morphological analysis, tagging and generation using MorphoDiTa's Python bindings (contained in the ufal.morphodita Python package).

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("skvrnami/rmorphodita")

Example

First, install MorphoDiTa by running install_morphodita().

library(rmorphodita)
install_morphodita()

Next, download a language model to use for tagging and the other tasks. Three languages are available: Czech (CZ), Slovak (SK), and English (EN). The download_models function downloads a .zip file with models from the LINDAT/CLARIAH-CZ repository to a specified directory, unzips it, and returns a list of the files containing morphological taggers and dictionaries.

cz_models <- download_models(lang = "CZ", dest_folder = "tmp")
cz_models

Then load a tagger and use it to tag a text:

cz_tagger <- load_tagger(cz_models[8])
tagged_text <- morpho_tag(cz_tagger, "Já bych všechny ty počítače zakázala.", NULL)
tagged_text
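
Selecting the tagger by position, as in cz_models[8], depends on the order of the files in the downloaded archive. The sketch below shows one way to select by file extension instead. It assumes that download_models returns a character vector of paths and that tagger files end in .tagger; the file names are made up for illustration and are not the real model names.

```r
# Hypothetical paths standing in for the value returned by download_models();
# the real file names depend on the downloaded archive.
model_files <- c(
  "tmp/models/example.dict",
  "tmp/models/example.tagger",
  "tmp/models/example-no_dia.tagger"
)

# Keep only the files whose names end in ".tagger".
tagger_files <- grep("\\.tagger$", model_files, value = TRUE)
tagger_files
```

One of the selected paths could then be passed to load_tagger in place of the indexed element.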

The morpho_analyze function returns all possible morphological analyses (lemmas and tags) of a word form.

morpho_analyze(cz_tagger, "kout")

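The morpho_generate function returns all forms of a given lemma whose tags match the specified wildcard. In the example below, it returns all noun forms in the genitive (the second case): the wildcard requires N in the first tag position (part of speech) and 2 in the fifth (case), while each ? matches any character.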

morpho_generate(cz_tagger, "kout", tag_wildcard = "N???2?")
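
To make the wildcard easier to read, here is a rough base R approximation of the matching. It is not MorphoDiTa's own matcher: the helper wildcard_to_regex is hypothetical, it only translates ? into a regular-expression dot, and it assumes the wildcard is compared against the beginning of the tag. The three tags are illustrative positional tags.

```r
# Hypothetical helper: approximate a tag wildcard with a regular expression.
# "?" (any single character) becomes ".", anchored at the start of the tag.
wildcard_to_regex <- function(wildcard) {
  paste0("^", gsub("?", ".", wildcard, fixed = TRUE))
}

# Illustrative positional tags: a genitive noun, a nominative noun, a verb.
tags <- c("NNIS2-----A----", "NNIS1-----A----", "VB-S---3P-AA---")

# TRUE where the tag starts with N and has 2 in the fifth position.
matches <- grepl(wildcard_to_regex("N???2?"), tags)
matches
```

Only the first tag matches, because it is the only one with N in the first position and 2 in the fifth.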

Because the raw tags are hard to read, you can extract and recode them. The extract_hm_tags function splits each tag into columns for the individual grammatical categories, such as part of speech (pos), gender, number, and case. The recode_tags function then recodes the tag values into factors with a full description of their meaning, using the TAGS list, which stores the meaning of each tag value.

tagged_text %>%
    extract_hm_tags() %>%
    recode_tags(., tags_df = TAGS)

# Remove the downloaded models
unlink("tmp", recursive = TRUE)
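
The idea behind splitting and recoding a positional tag can be sketched in base R. This is only an illustration and not how extract_hm_tags or recode_tags are implemented: the column names, the small case_labels lookup, and the two example tags are assumptions, and only the first five tag positions are handled.

```r
# Illustrative positional tags (a genitive noun and a verb).
tags <- c("NNIS2-----A----", "VB-S---3P-AA---")

# Names assumed for the first five tag positions.
positions <- c("pos", "subpos", "gender", "number", "case")

# Split the first five characters of each tag into one column per position.
chars <- strsplit(substr(tags, 1, 5), "")
tag_df <- as.data.frame(do.call(rbind, chars), stringsAsFactors = FALSE)
names(tag_df) <- positions
tag_df$tag <- tags

# Hypothetical lookup table for a few case values.
case_labels <- c("1" = "nominative", "2" = "genitive", "-" = "not applicable")

# Recode the one-character case value into a descriptive factor.
tag_df$case_label <- factor(unname(case_labels[tag_df$case]))
tag_df
```

A lookup table per category, applied column by column, is the same general approach as recoding with a list of tag meanings such as TAGS.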


skvrnami/rmorphodita documentation built on Dec. 23, 2021, 3:24 a.m.