Importing Inception’s export into R

Introduction

Setup

Installation

remotes::install_github("nicoblokker/inception-to-r")

Load package and unzip XMI-files

library(inception2r)

[skip this step if XMI-files are already unzipped]

unzip_export(folder = "export", overwrite = FALSE, recursive = FALSE)     # CREATES LOCAL FILES; USE WITH CAUTION
xmi_file <- list.files(".", pattern = "\\.xmi$", recursive = TRUE)        # select only XMI-files
xmi_file
## [1] "tests/testthat/annotation/test_document.txt/extracted/demo.xmi"
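To preview which files the `"\\.xmi$"` pattern picks up before working in a real project folder, the same `list.files()` call can be tried on a throwaway directory. The following is a minimal base-R sketch; the directory layout and file names are made up for illustration and are not part of the package.

```r
# Build a small, hypothetical folder structure in the session's temp directory
tmp <- file.path(tempdir(), "xmi_demo")
dir.create(file.path(tmp, "extracted"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(tmp, "extracted", c("demo.xmi", "TypeSystem.xml")))
file.create(file.path(tmp, "notes.xmi.bak"))

# The pattern is anchored at the end of the name, so only true ".xmi" files match
found <- list.files(tmp, pattern = "\\.xmi$", recursive = TRUE)
found
```

Because the pattern ends in `$`, files such as `notes.xmi.bak` or `TypeSystem.xml` are not selected.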

Extract annotations

# extract custom annotations from file
df_custom <- xmi2df(xmi_file, key = "custom")
print(df_custom, n = 3)
## # A tibble: 3 × 9
##   id    sofa  begin end   label   layer         text  xmi_file_name        quote
##   <chr> <chr> <chr> <chr> <chr>   <chr>         <chr> <chr>                <chr>
## 1 4075  1     213   295   Label 1 SentenceLabel ""    tests/testthat/anno… " St…
## 2 4085  1     748   804   Label 2 SentenceLabel ""    tests/testthat/anno… " At…
## 3 4080  1     805   887   Label 1 SentenceLabel ""    tests/testthat/anno… " St…
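In the output above, `begin` and `end` are character columns. For sorting or computing span lengths it may help to convert them to integers first. The sketch below rebuilds a few columns of the printed rows by hand (so it runs without the package) and uses base R only; `df_demo` and `span` are illustrative names, not something `xmi2df()` returns.

```r
# Hand-copied subset of the rows printed above (illustrative only)
df_demo <- data.frame(
  id    = c("4075", "4085", "4080"),
  begin = c("213", "748", "805"),
  end   = c("295", "804", "887"),
  label = c("Label 1", "Label 2", "Label 1"),
  stringsAsFactors = FALSE
)

# Convert the character offsets to integers
df_demo$begin <- as.integer(df_demo$begin)
df_demo$end   <- as.integer(df_demo$end)

# Difference between the end and begin offsets of each annotation
df_demo$span <- df_demo$end - df_demo$begin

# Order annotations by their position in the document
df_demo <- df_demo[order(df_demo$begin), ]

# Count annotations per label
label_counts <- table(df_demo$label)
```

The same conversion should carry over to a real `df_custom`, assuming its `begin` and `end` columns contain only digits.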

Extract annotations from multiple documents (and namespaces)

# extract multiple layers
df_mult_layers <- xmi2df(xmi_file, key = c("custom", "Sentence")) 

# extract multiple files (and layers)
df_mult_files <- purrr::map_df(c(xmi_file, xmi_file), xmi2df, key = c("custom", "Sentence"), .id = "file")
select_ns(xmi_file)
##  [1] "cas"         "chunk"       "constituent" "custom"      "dependency" 
##  [6] "morph"       "pos"         "tcas"        "tweet"       "type"       
## [11] "type10"      "type11"      "type2"       "type3"       "type4"      
## [16] "type5"       "type6"       "type7"       "type8"       "type9"      
## [21] "xmi"
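For readers curious what namespaces and layers look like inside an XMI file, the sketch below uses `xml2` directly on a tiny inline document. It is not the package's implementation: the XML string, the namespace URIs, and the attribute values are invented for illustration, and only the general shape (a namespace prefix such as `custom` with layer elements carrying `begin`, `end` and `label` attributes) mirrors the output shown above.

```r
library(xml2)

# A made-up, minimal XMI-like document with two namespaces (URIs are placeholders)
xmi_text <- '<xmi:XMI xmlns:xmi="http://example.org/xmi" xmlns:custom="http://example.org/custom">
  <custom:SentenceLabel xmi:id="1" sofa="1" begin="0" end="5" label="Label 1"/>
  <custom:SentenceLabel xmi:id="2" sofa="1" begin="6" end="11" label="Label 2"/>
</xmi:XMI>'

doc <- read_xml(xmi_text)

# Namespace prefixes declared in the document
ns <- xml_ns(doc)
prefixes <- sort(names(ns))

# Select all elements of one layer within one namespace and read their attributes
nodes  <- xml_find_all(doc, "//custom:SentenceLabel", ns)
labels <- xml_attr(nodes, "label")
begins <- as.integer(xml_attr(nodes, "begin"))
```

Here `prefixes` plays a role comparable to the namespace names listed above, and the XPath step shows one possible way to pick a single layer out of a namespace.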

References

Klie, Jan-Christoph, Michael Bugert, Beto Boullosa, Richard Eckart de Castilho, and Iryna Gurevych. 2018. “The INCEpTION Platform: Machine-Assisted and Knowledge-Oriented Interactive Annotation.” In *Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations*, 5–9. Santa Fe, USA: Association for Computational Linguistics.
Wickham, Hadley, Jim Hester, and Jeroen Ooms. 2021. *xml2: Parse XML*.

[^1]: Demo project used in this repository created by: https://morbo.ukp.informatik.tu-darmstadt.de/demo



nicoblokker/inception-to-r documentation built on June 4, 2023, 12:20 a.m.