getSupp: Get supplementary tables

View source: R/getSupp.R

Description

Get supplementary tables from PMC

Usage

getSupp(pmcid, file, type, opts = "-raw -nopgbrk", rm = TRUE, header = TRUE, pmc=TRUE, ...)

Arguments

pmcid

PMC id, or an XML document with a PMC id attribute

file

supplementary file name

type

type of file; defaults to the file extension

opts

pdftotext options

rm

remove downloaded files

header

whether tables in the Word document have column labels

pmc

download from NCBI PMC

...

other options passed to read commands

Details

This function is used by pmcSupp to read supplementary tables in a variety of formats, including Excel, Word, HTML, PDF, text, and zip. If pmc=TRUE, the URL is generated from the PMC id and the file name as http://www.ncbi.nlm.nih.gov/pmc/articles/<pmcid>/bin/<supplementary file>. If pmc=FALSE, supply the full URL as the file name.
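
The URL rule described above can be sketched in a few lines of R. This is only an illustration of the pattern given in Details, not the package's source code; the helper name supp_url is hypothetical.

```r
# Hypothetical helper, not part of pmcXML: builds the download URL
# following the pattern stated in Details.
supp_url <- function(pmcid, file, pmc = TRUE) {
  if (!pmc) {
    # With pmc = FALSE, `file` is assumed to already hold the full URL
    return(file)
  }
  paste0("http://www.ncbi.nlm.nih.gov/pmc/articles/", pmcid, "/bin/", file)
}

supp_url("PMC2231364", "1471-2180-7-96-S3.pdf")
```

With the PMC id and file name from the Examples section, this returns http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2231364/bin/1471-2180-7-96-S3.pdf.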

Value

A data.frame, or a character vector for PDF files.

Note

May not work on all systems, since several formats depend on external tools. Zip files are uncompressed using the Unix unzip command. Excel files are read using the read.xls function in the gdata package. Microsoft Word documents are converted to HTML files using the Universal Office Converter, unoconv, and the tables within those HTML files are then read using readHTMLTable in the XML package. Tables within HTML files are also loaded using readHTMLTable. PDF files are converted to text using the pdftotext command-line tool, and the resulting file is read into R using readLines.
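
The mapping from file type to reader can be summarized as a small lookup. The sketch below is an assumption-laden illustration, not the package's implementation: the function name supp_reader and the exact set of extensions are hypothetical, and text files are left out because this Note does not say how they are read.

```r
# Hypothetical sketch: names the tool or function this Note associates
# with each file type. The extension list is an assumption.
supp_reader <- function(file, type = tools::file_ext(file)) {
  switch(tolower(type),
    zip  = "unzip",
    xls  = ,
    xlsx = "gdata::read.xls",
    doc  = ,
    docx = "unoconv, then XML::readHTMLTable",
    htm  = ,
    html = "XML::readHTMLTable",
    pdf  = "pdftotext, then readLines",
    stop("unsupported file type: ", type)
  )
}

supp_reader("1471-2180-7-96-S3.pdf")
```

As with the type argument, the type here defaults to the file extension, so passing type explicitly only matters when the extension is missing or misleading.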

Author(s)

Chris Stubben

See Also

pmcSupp

Examples

## Not run: 
doc <- pmcOAI("PMC2231364")
pmcSupp(doc)  # list files
# pmcSupp(doc, 3)  # OR
s2 <- getSupp(doc, "1471-2180-7-96-S3.pdf")
s2

## End(Not run)

cstubben/pmcXML documentation built on May 14, 2019, 12:25 p.m.