docxtractr: Extract Data Tables and Comments from 'Microsoft' 'Word' Documents

'Microsoft Word' 'docx' files use an 'XML' structure that is fairly straightforward to navigate, especially for 'Word' tables and comments. Tools are provided to determine the number and structure of tables, to count comments, and to extract and clean tables and comments from 'Microsoft Word' 'docx' documents. There is also nascent support for '.doc' and '.pptx' files.
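
The workflow described above might look roughly like the following in R. This is a minimal sketch, not a definitive recipe: it assumes the package is installed and that the sample file `examples/data.docx` ships with it and contains at least one table; substitute the path to your own document if that assumption does not hold.

```r
library(docxtractr)

# Assumption: a sample document is bundled under the package's examples directory.
# Replace doc_path with the path to your own .docx file if needed.
doc_path <- system.file("examples/data.docx", package = "docxtractr")
doc <- read_docx(doc_path)

# Determine how many tables the document has and print a summary of their structure.
n_tables <- docx_tbl_count(doc)
docx_describe_tbls(doc)

# Extract the first table as a data frame, treating its first row as the header.
first_tbl <- docx_extract_tbl(doc, tbl_number = 1, header = TRUE)

# Count comments, and extract them only if the document has any.
n_comments <- docx_cmnt_count(doc)
if (n_comments > 0) {
  comments <- docx_extract_all_cmnts(doc)
}
```

Calling `docx_describe_tbls()` before extracting is useful because its summary indicates whether a table appears to have a header row, which informs the `header` argument.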

Package details

Author: Bob Rudis [aut, cre] (<https://orcid.org/0000-0001-5670-2640>), Mark Dulhunty [ctb], Karlo Guidoni-Martins [ctb], Chris Muir [aut, ctb], John Muschelli [ctb]
Maintainer: Bob Rudis <bob@rud.is>
License: MIT + file LICENSE
Version: 0.6.5
URL: http://gitlab.com/hrbrmstr/docxtractr
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:
install.packages("docxtractr")

docxtractr documentation built on July 8, 2020, 6:23 p.m.