- `inception2r` lets the user filter the exported files to extract only the desired variables
- uses the `xml2` package (Wickham, Hester, and Ooms 2021) to extract XML elements from the exported files
- `inception2r` lets the user define namespaces (features and layers) via a simple XML path language
- returns a `data.frame` containing only the queried variables and the corresponding text spans

```r
remotes::install_github("nicoblokker/inception-to-r")
library(inception2r)
```
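To illustrate the kind of XPath-based extraction the feature list describes, here is a minimal sketch that uses `xml2` directly on a tiny, hand-written XMI-like document. The XML content, the namespace URIs, and the column selection are assumptions made for illustration. This is not the package's internal implementation and not real INCEpTION output.

```r
library(xml2)

# Hypothetical, hand-written XMI-like document (not real INCEpTION output)
xmi_text <- '<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
  xmlns:cas="http:///uima/cas.ecore"
  xmlns:custom="http:///webanno/custom.ecore">
  <cas:Sofa xmi:id="1" sofaString="Hello world. Bye."/>
  <custom:SentenceLabel xmi:id="10" sofa="1" begin="0" end="12" label="Label 1"/>
  <custom:SentenceLabel xmi:id="11" sofa="1" begin="13" end="17" label="Label 2"/>
</xmi:XMI>'

doc <- read_xml(xmi_text)
ns  <- xml_ns(doc)  # namespace prefixes declared in the document

# Select every element in the "custom" namespace via XPath
nodes <- xml_find_all(doc, "//custom:*", ns)

# The annotated text is stored once, in the Sofa element
sofa <- xml_attr(xml_find_first(doc, "//cas:Sofa", ns), "sofaString")

df <- data.frame(
  layer = xml_name(nodes),
  begin = as.integer(xml_attr(nodes, "begin")),
  end   = as.integer(xml_attr(nodes, "end")),
  label = xml_attr(nodes, "label"),
  stringsAsFactors = FALSE
)

# Offsets are 0-based with an exclusive end, so shift begin by one for substring()
df$quote <- substring(sofa, df$begin + 1, df$end)
df
```

The offset handling assumes 0-based, end-exclusive character positions; check this against your own export before relying on the extracted spans.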
[skip this step if XMI-files are already unzipped]

- use `unzip_export` to unzip all (.zip) files in the specified directory labeled as “annotation…” or “curation…” and place them in a new folder
- (set `recursive = TRUE` to extend to sub-directories)

```r
unzip_export(folder = "export", overwrite = FALSE, recursive = FALSE) # CREATES LOCAL FILES; USE WITH CAUTION
xmi_file <- list.files(".", pattern = "\\.xmi$", recursive = TRUE) # select only XMI-files
xmi_file
## [1] "tests/testthat/annotation/test_document.txt/extracted/demo.xmi"
```
- use the `xmi2df` function to extract annotations specified by the `key` argument (defaults to “custom”)
- `key` queries the string contained in namespaces (e.g., “custom” matches “custom” AND “custom2”)

```r
# extract custom annotations from file
df_custom <- xmi2df(xmi_file, key = "custom")
print(df_custom, n = 3)
## # A tibble: 3 × 9
##   id    sofa  begin end   label   layer         text  xmi_file_name        quote
##   <chr> <chr> <chr> <chr> <chr>   <chr>         <chr> <chr>                <chr>
## 1 4075  1     213   295   Label 1 SentenceLabel ""    tests/testthat/anno… " St…
## 2 4085  1     748   804   Label 2 SentenceLabel ""    tests/testthat/anno… " At…
## 3 4080  1     805   887   Label 1 SentenceLabel ""    tests/testthat/anno… " St…
```
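The note that “custom” matches both “custom” and “custom2” suggests that `key` is compared as a substring rather than as an exact name. The sketch below shows that behaviour with base R on a made-up vector of namespace prefixes. How `xmi2df` performs the match internally is an assumption here; it may, for example, use a regular expression instead of a fixed string.

```r
# Hypothetical namespace prefixes, for illustration only
namespaces <- c("cas", "custom", "custom2", "pos", "type")
key <- "custom"

# Substring match: every prefix that contains the key
matched <- namespaces[grepl(key, namespaces, fixed = TRUE)]
matched
# → "custom" "custom2"

# Exact match, for comparison: only the identical prefix
exact <- namespaces[namespaces == key]
exact
# → "custom"
```

If only one of several similarly named layers is wanted, filtering the resulting `layer` column afterwards is one way to narrow the result.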
- extract multiple layers, or multiple files (identified by the `file` column), using the `purrr` package

```r
# extract multiple layers
df_mult_layers <- xmi2df(xmi_file, key = c("custom", "Sentence"))
# extract multiple files (and layers)
df_mult_files <- purrr::map_df(c(xmi_file, xmi_file), xmi2df, key = c("custom", "Sentence"), .id = "file")
```
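To make the role of `.id = "file"` concrete, the following sketch runs `purrr::map_df` over two made-up file names with a stand-in extractor. `fake_extract` is hypothetical and only mimics the call shape of `xmi2df`; it reads no files. Note that `map_df` needs the `dplyr` package to be installed and is superseded in recent `purrr` releases, though it remains available.

```r
library(purrr)

# Hypothetical stand-in for xmi2df(): returns one row describing its inputs
fake_extract <- function(path, key) {
  data.frame(
    path = path,
    keys = paste(key, collapse = "+"),
    stringsAsFactors = FALSE
  )
}

# Named input: the names end up in the column given by .id
files <- c(a = "a.xmi", b = "b.xmi")

out <- map_df(files, fake_extract, key = c("custom", "Sentence"), .id = "file")
out
```

With an unnamed vector, as in the call above this sketch, the `file` column holds the position of each element instead of a name.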
- use `select_ns` to list the namespaces available in a file

```r
select_ns(xmi_file)
##  [1] "cas"         "chunk"       "constituent" "custom"      "dependency"
##  [6] "morph"       "pos"         "tcas"        "tweet"       "type"
## [11] "type10"      "type11"      "type2"       "type3"       "type4"
## [16] "type5"       "type6"       "type7"       "type8"       "type9"
## [21] "xmi"
```
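A comparable list of prefixes can be obtained with `xml2` alone, which may help when inspecting a file before choosing a `key`. This is a sketch on a hand-written document with assumed namespace URIs; whether `select_ns` works this way internally is not confirmed by this README.

```r
library(xml2)

# Hypothetical, hand-written XMI-like document (not real INCEpTION output)
xmi_text <- '<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
  xmlns:cas="http:///uima/cas.ecore"
  xmlns:custom="http:///webanno/custom.ecore">
  <cas:Sofa xmi:id="1" sofaString="Hello world."/>
</xmi:XMI>'

doc <- read_xml(xmi_text)

# xml_ns() returns a named vector: names are prefixes, values are URIs
prefixes <- sort(names(xml_ns(doc)))
prefixes
```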
[^1]: Demo project used in this repository created by: https://morbo.ukp.informatik.tu-darmstadt.de/demo