readSPIEGEL | R Documentation |
Reads the XML files from the SPIEGEL corpus and separates the text and meta data.
readSPIEGEL(path = getwd(), file = list.files(path = path, pattern = "*.xml$", full.names = FALSE, recursive = TRUE), do.meta = TRUE, do.text = TRUE)
path |
Character string with the path to the directory that contains the data files. |
file |
Character string(s) with the names of the XML files. |
do.meta |
Logical: Should the algorithm collect meta data? |
do.text |
Logical: Should the algorithm collect text data? |
meta |
id, date, title, year, number, page_start, page_stop, pagetitle, shorttitle, rubrik, ressort, dokumentmerkmal, dachzeile, abstract |
text |
Text (paragraph by paragraph) |
metamult |
signature, person, koerperschaft, company, including category (or categories) |
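A minimal usage sketch, assuming that the package providing readSPIEGEL is already attached and that the result is a list with the components meta, text and metamult described above. The directory name "SPIEGEL" is a hypothetical placeholder, and the pattern used here is a stricter regular expression than the default shown in the usage line.

```r
# Hypothetical location of the SPIEGEL XML files (placeholder, adjust as needed).
corpus_dir <- "SPIEGEL"

# Collect the XML file names relative to corpus_dir, searching subdirectories.
xml_files <- list.files(path = corpus_dir, pattern = "\\.xml$",
                        full.names = FALSE, recursive = TRUE)

# Call readSPIEGEL() only if it is available and files were found,
# so that the sketch also runs on machines without the corpus.
if (exists("readSPIEGEL") && length(xml_files) > 0) {
  corpus <- readSPIEGEL(path = corpus_dir, file = xml_files,
                        do.meta = TRUE, do.text = TRUE)

  # Assumption: the result is a list with the components listed above.
  str(corpus$meta)       # id, date, title, ...
  length(corpus$text)    # paragraph-wise text
  names(corpus$metamult) # signature, person, koerperschaft, company
}
```

Setting do.meta = FALSE or do.text = FALSE skips collecting the corresponding part, which may help when only one of the two is needed.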