unimorphR

unimorphR provides an R wrapper around the Python package unimorph, a command-line interface to the UniMorph project. It offers three functions: downloading morphological paradigm data from the UniMorph project, retrieving the lemma and features of an inflected word, and returning the inflected forms of a given lemma.

Installation

You can install the development version of unimorphR from GitHub with:

devtools::install_github("b05102139/unimorphR")

Manually Downloading Data

To download the UniMorph data for a language, pass its language code to download_unimorph (here fra, for French):

library(unimorphR)
download_unimorph("fra")

This data can then be loaded into R:

french_paradigms <- load_dataset("fra")
french_paradigms[87:89,]
#>     lemma  form        features
#> 87 abader abadé      V.PTCP;PST
#> 88 abader abade V;SBJV;PRS;1;SG
#> 89 abader abade V;SBJV;PRS;3;SG
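
The features column packs several tags into one semicolon-separated string, so it is often handy to split it before filtering. The following is a minimal sketch in base R, not part of unimorphR: it rebuilds the three rows shown above by hand as a stand-in for the result of load_dataset, and assumes the columns are plain character vectors named lemma, form and features. The helper has_tag is hypothetical.

```r
# Stand-in for load_dataset("fra"): the three rows printed above, typed by hand.
# Assumption: the real table uses these column names and character columns.
paradigms <- data.frame(
  lemma    = c("abader", "abader", "abader"),
  form     = c("abad\u00e9", "abade", "abade"),
  features = c("V.PTCP;PST", "V;SBJV;PRS;1;SG", "V;SBJV;PRS;3;SG"),
  stringsAsFactors = FALSE
)

# Split each feature bundle on ";" into a character vector of tags.
tags <- strsplit(paradigms$features, ";", fixed = TRUE)

# Hypothetical helper: TRUE for every row whose bundle contains the tag.
has_tag <- function(tag) {
  vapply(tags, function(x) tag %in% x, logical(1))
}

# Keep only the subjunctive forms.
subjunctive <- paradigms[has_tag("SBJV"), ]
```

Matching on whole tags rather than on substrings avoids false hits: in the first row V.PTCP is a single tag, so has_tag("V") does not match it.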

Analyzing Inflected Words

To analyze an inflected word, call analyze_word with the word and its language:

library(unimorphR)
analyze_word("fought", lang="eng")
#>   lemma inflected     features
#> 1 fight    fought        V;PST
#> 2 fight    fought V;V.PTCP;PST
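
As the output shows, a single surface form can have more than one analysis. The sketch below shows one way to work with such a result in base R; it uses a hand-typed copy of the two rows above rather than calling analyze_word, and assumes the columns are character vectors named lemma, inflected and features.

```r
# Hand-typed copy of the analyze_word output shown above (an assumption
# about column names and types, not a call into unimorphR).
analyses <- data.frame(
  lemma     = c("fight", "fight"),
  inflected = c("fought", "fought"),
  features  = c("V;PST", "V;V.PTCP;PST"),
  stringsAsFactors = FALSE
)

# Group the feature bundles by lemma: one list entry per candidate lemma.
by_lemma <- split(analyses$features, analyses$lemma)

# Check whether any analysis marks the word as a participle.
# fixed = TRUE treats "V.PTCP" literally, so "." is not a regex wildcard.
is_participle <- any(grepl("V.PTCP", analyses$features, fixed = TRUE))
```

Here by_lemma has a single entry, fight, holding both bundles, and is_participle is TRUE because the second analysis carries the V.PTCP tag.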

Inflecting Lemmas

To return the inflected forms of a given lemma, use inflect_word:

inflect_word("attack", lang="eng")
#>    lemma inflected     features
#> 1 attack  attacked        V;PST
#> 2 attack  attacked V;V.PTCP;PST
#> 3 attack attacking V;V.PTCP;PRS
#> 4 attack   attacks   V;3;SG;PRS
#> 5 attack    attack       V;NFIN

To return only a particular form, pass the desired features:

inflect_word("attack", lang="eng", features = "V;V.PTCP;PRS")
#>    lemma inflected     features
#> 1 attack attacking V;V.PTCP;PRS
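
When several forms of the same lemma are needed, it can be convenient to fetch the full paradigm once and look forms up locally. This is a minimal sketch under stated assumptions: the table is a hand-typed copy of the full inflect_word output above, the columns are character vectors, and form_for is a hypothetical helper that matches the feature bundle exactly. Whether inflect_word itself matches features this way is not stated in this README.

```r
# Hand-typed copy of the full paradigm printed for "attack" above.
forms <- data.frame(
  lemma     = rep("attack", 5),
  inflected = c("attacked", "attacked", "attacking", "attacks", "attack"),
  features  = c("V;PST", "V;V.PTCP;PST", "V;V.PTCP;PRS", "V;3;SG;PRS", "V;NFIN"),
  stringsAsFactors = FALSE
)

# Named character vector: feature bundle -> inflected form.
lookup <- setNames(forms$inflected, forms$features)

# Hypothetical helper: exact-match lookup, NA when the bundle is absent.
form_for <- function(bundle) {
  if (bundle %in% names(lookup)) lookup[[bundle]] else NA_character_
}

third_singular <- form_for("V;3;SG;PRS")
missing_form   <- form_for("V;FUT")
```

An exact-match lookup is the simplest choice but is sensitive to tag order, so "V;PRS;3;SG" would not match the bundle stored as "V;3;SG;PRS".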

Please refer to the original UniMorph project for more details, and please cite it if you use this code in a paper.



b05102139/unimorphR documentation built on Dec. 19, 2021, 6:38 a.m.