Load the package
library(xmlAnnotate)
and load up some test data.
folder <- system.file("extdata", "fomc", package = "xmlAnnotate")
dir(folder)
Extract the 'hedge' tags from the first file in that folder
f <- file.path(folder, "2004_03_2-1.xml")
f
ftags <- get_tagset(f)
and take a look
knitr::kable(ftags)
By default this function gets hedge tags only, so the call above is equivalent to
ftags <- get_tagset(f, nodes=c('hedge'))
We can extract the note tags too by adding 'note' to the nodes argument
ftags2 <- get_tagset(f, nodes=c('hedge', 'note'))
which looks like
knitr::kable(ftags2)
And if we want these tags extracted from all the XML files in a folder
fftags <- get_tagsets(folder, nodes=c('hedge', 'note'))
This row-binds the results from all the files it finds.
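As a rough illustration of what row-binding per-file results means, here is a minimal base R sketch. It does not use the package at all: toy_extract is a hypothetical stand-in for a per-file extractor, the file names are made up and never opened, and the node and text columns are assumptions rather than the package's actual output format. Only the file column is suggested by the text, since later calls group on it.

```r
# Hypothetical stand-in for a per-file extractor: returns one row per tag.
# The column names (node, text) are assumptions, not the package's schema.
toy_extract <- function(path) {
  data.frame(
    node = c("hedge", "note"),
    text = c("may", "see above"),
    stringsAsFactors = FALSE
  )
}

# Hypothetical file names; they are never read from disk.
paths <- c("a.xml", "b.xml")

# Extract per file, record the source file, then stack the pieces.
pieces <- lapply(paths, function(p) {
  out <- toy_extract(p)
  out$file <- basename(p)
  out
})
combined <- do.call(rbind, pieces)
```

The combined data frame has one block of rows per file, with the file column identifying where each row came from.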
If we want to extract all three tag types and then match the word and note tags to the hedge tags based on their positions in the text
fftag <- get_tagset(f, nodes=c('hedge', 'word', 'note'))
fftag2 <- match_nodes(fftag, match_x = "hedge", match_y = c("word", "note"))
which gives you all word and note tags that fall within the span of the respective hedge tags.
knitr::kable(fftag2)
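The idea of matching tags by position can be sketched in base R on toy data. This is not how match_nodes is implemented; it is an illustration under stated assumptions: the start and end columns are hypothetical character offsets, and "falls within the span" is read here as full containment.

```r
# Toy tag table; `start` and `end` are hypothetical offset columns.
tags <- data.frame(
  node  = c("hedge", "word", "word", "note", "hedge", "word"),
  start = c(1, 1, 5, 3, 20, 30),
  end   = c(10, 4, 10, 8, 25, 33),
  stringsAsFactors = FALSE
)

x <- tags[tags$node == "hedge", ]
y <- tags[tags$node %in% c("word", "note"), ]

# For each hedge, keep the word/note tags whose span lies inside the hedge span.
matched <- do.call(rbind, lapply(seq_len(nrow(x)), function(i) {
  inside <- y$start >= x$start[i] & y$end <= x$end[i]
  if (!any(inside)) return(NULL)  # hedge with no contained tags
  data.frame(
    hedge_start = x$start[i],
    hedge_end   = x$end[i],
    y[inside, ],
    row.names   = NULL
  )
}))
```

In this toy table the first hedge contains two word tags and one note tag, while the second hedge contains nothing and contributes no rows.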
This only works for get_tagset output, that is, for data extracted from a single .xml file. For get_tagsets output generated from multiple files, you have to apply match_nodes separately to the rows belonging to each filename
fftags2 <- plyr::ddply(fftags,~file,match_nodes,match_x="hedge",match_y=c("word","note"))
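If you would rather avoid the plyr dependency, the same split-apply-combine pattern can be written in base R. The sketch below uses toy data and a hypothetical per-file function, count_nodes, in place of match_nodes, so it runs without the package; only the file column is taken from the call above.

```r
# Toy stand-in for multi-file output; only the `file` column is grounded.
tags <- data.frame(
  file = c("a.xml", "a.xml", "b.xml"),
  node = c("hedge", "word", "hedge"),
  stringsAsFactors = FALSE
)

# Hypothetical per-file function standing in for match_nodes.
count_nodes <- function(d) {
  data.frame(file = d$file[1], n_tags = nrow(d), stringsAsFactors = FALSE)
}

# Split by file, apply the function to each piece, and stack the results.
by_file <- split(tags, tags$file)
result <- do.call(rbind, lapply(by_file, count_nodes))
rownames(result) <- NULL
```

Unlike plyr::ddply, this version does not add the grouping column automatically, so the per-file function has to carry the file name through itself.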