doc2keywords: Document to Resolved keywords

In dsidavis/SpilloverDA: Munging and Exploring the Viral Spillover Data Set


Run the term extractor on a document

doc2keywords(doc.file, ecoextract = getEcoExtractPyScript(),
  results.dir = character(), results.file = file.path(results.dir,
  gsub("xml$", "rds", basename(doc.file))), cache.dir = character(),
  cache.file = file.path(cache.dir, gsub("xml$", "rds", basename(doc.file))),
  section.text = load_text(doc.file, cache.file, cache.dir))

`doc.file`	a file to parse, either XML or PDF
`ecoextract`	file path to the ecoextract.py script
`results.dir`	optional directory in which to store the results as an RDS file. If not specified, no results are saved. If the directory does not exist, it is created.
`results.file`	optional file name for the results; defaults to the basename of `doc.file` with the `xml` extension replaced by `rds`, placed inside `results.dir`
`cache.dir`	optional directory in which to cache the intermediate text results from `ReadPDF::getSectionText`. If not specified, no caching is performed.
`cache.file`	optional file name for the cached section text; by default built from `cache.dir` and the basename of `doc.file` in the same way as `results.file`
`section.text`	a list with one element per section to be processed; by default loaded from `doc.file` with `load_text`
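
The default values of `results.file` and `cache.file` in the usage above are plain base R expressions, so their behaviour can be checked without the package. The sketch below wraps that expression in `default_rds_path`, a hypothetical helper that is not part of SpilloverDA, and uses made-up file names. It suggests that an unspecified directory yields a zero-length path, which is presumably how the function knows to skip saving; that reading is an assumption, not something the documentation states.

```r
# Sketch of how the default output path is built from doc.file.
# default_rds_path is a hypothetical helper, not part of SpilloverDA;
# the file names below are invented for illustration.
default_rds_path <- function(doc.file, dir = character()) {
  file.path(dir, gsub("xml$", "rds", basename(doc.file)))
}

with_dir <- default_rds_path("papers/smith2010.xml", "results")
no_dir   <- default_rds_path("papers/smith2010.xml")
pdf_name <- default_rds_path("papers/smith2010.pdf", "results")

print(with_dir)        # path inside "results" with an rds extension
print(length(no_dir))  # file.path() returns an empty vector for empty input
print(pdf_name)        # "xml$" does not match, so the pdf extension is kept
```

Because the pattern `xml$` only matches names ending in `xml`, it may be worth passing `results.file` or `cache.file` explicitly when `doc.file` is a PDF.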

This function runs the term extractor (based on EpiTator, https://github.com/ecohealthalliance/EpiTator) on a document. The document can be either XML generated by pdftohtml or a PDF, which is converted internally to an XML document. Alternatively, the raw text can be supplied directly through `section.text`. The results and the intermediate text, split by section, can optionally be saved.
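
The optional caching of intermediate section text can be pictured as a read-or-compute pattern around an RDS file. The following is a minimal sketch under stated assumptions: `cached_text` and `slow_extract` are hypothetical names, the `Abstract` section is invented, and the package's own `load_text` may work differently.

```r
# Hypothetical helper illustrating optional RDS caching; this is not the
# package's load_text(). extract_fun stands in for the slow extraction step.
cached_text <- function(cache.file, extract_fun) {
  # A zero-length cache.file means caching is switched off
  if (length(cache.file) == 0) return(extract_fun())
  # Reuse the cached text when it is already on disk
  if (file.exists(cache.file)) return(readRDS(cache.file))
  # Create the cache directory on first use, then store the result
  dir.create(dirname(cache.file), recursive = TRUE, showWarnings = FALSE)
  txt <- extract_fun()
  saveRDS(txt, cache.file)
  txt
}

calls <- 0
slow_extract <- function() {
  calls <<- calls + 1                      # count how often extraction runs
  list(Abstract = "This mentions China")   # invented section name and text
}

cache.file <- file.path(tempfile("cache"), "doc.rds")
first  <- cached_text(cache.file, slow_extract)  # runs the extractor
second <- cached_text(cache.file, slow_extract)  # read back from the cache
```

In this sketch the second call never reaches `slow_extract`, which is the point of caching text that is expensive to extract from a PDF.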

Returns a list with one element per section; the resolved keywords are arranged in a nested list.

Authors: Matt Espe and Duncan Temple Lang


# Pass raw text directly instead of a document file
txt = "This mentions China"
ans = doc2keywords(section.text = txt)
getLocation(ans)

dsidavis/SpilloverDA documentation built on June 1, 2019, 2:55 p.m.
