This function takes .xml documents from a corpus of forum posts and returns a vector of strings. It can perhaps be used for other forum corpora that have a similar structure.
processXMLstring(pathToFolder, minMaxWordCount = 300)
pathToFolder | the path to the folder containing the corpus
minMaxWordCount | documents with fewer tokens than this count are not accepted, and documents longer than this count are cropped to it. Defaults to 300.
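The effect of minMaxWordCount can be illustrated with a minimal sketch in base R. This is not the package's implementation: the helper name filterAndCrop is hypothetical, and splitting on whitespace is an assumed tokenisation that may differ from what processXMLstring actually does.

```r
# Hypothetical helper, not part of the package: shows the filter-and-crop rule only.
filterAndCrop <- function(texts, minMaxWordCount = 300) {
  # Assumed tokenisation: split each text on runs of whitespace.
  tokens <- strsplit(trimws(texts), "\\s+")
  # Reject texts with fewer tokens than minMaxWordCount.
  keep <- lengths(tokens) >= minMaxWordCount
  # Crop each remaining text to its first minMaxWordCount tokens.
  vapply(tokens[keep],
         function(tok) paste(tok[seq_len(minMaxWordCount)], collapse = " "),
         character(1))
}

posts <- c("one two three four five", "one two", "a b c")
filterAndCrop(posts, minMaxWordCount = 3)
# "one two three" "a b c"
```

Under this rule every returned string has exactly minMaxWordCount tokens, which would explain why a single argument serves as both the minimum and the maximum.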