trickypdf-package: Package 'trickypdf'


Description

Turns PDF documents into XML for further processing in a corpus preparation pipeline. The package focuses on cleanly extracting text from PDF documents with complex layouts (multi-column layouts etc.).

Author(s)

Andreas Blaette (andreas.blaette@uni-due.de)

References

http://polmine.sowi.uni-due.de


PolMine/trickypdf documentation built on Nov. 20, 2019, 8:01 p.m.