epm_parse | R Documentation |
Read a raw PubMed record, identify XML tags, extract information, and cast it into a structured data.frame. The expected input is an XML-tag-decorated string corresponding to a single PubMed record. Information about article title, authors, affiliations, journal name and abbreviation, publication date, references, and keywords is returned.
epm_parse(
  x,
  max_authors = 10,
  autofill_address = TRUE,
  compact_output = TRUE,
  include_abstract = TRUE,
  max_references = 150,
  ref_id_type = "doi",
  verbose = TRUE
)
x |
An 'easyPubMed' object. The object must include raw records (n>0) downloaded in the 'xml' format. |
max_authors |
Numeric, maximum number of authors to retrieve. If set to -1, only the last author is extracted. If set to 1, only the first author is returned. If set to 2, the first and the last authors are extracted. If set to any other positive number (i), up to the leading (i-1) authors are retrieved together with the last author. If set to a number larger than the number of authors in a record, all authors are returned. At least 1 author has to be retrieved, so a value of 0 is not accepted (it is coerced to -1). |
autofill_address |
Logical, shall author affiliations be propagated within each record to fill missing values. |
compact_output |
Logical, shall record data be returned in a compact format where each row is a single record and author names are collapsed together. If 'FALSE', each row corresponds to a single author of the publication and the record-specific data are recycled for all included authors (legacy approach). |
include_abstract |
Logical, shall abstract text be included in the output data.frame. If 'FALSE', the abstract text column is populated with a missing value. |
max_references |
Numeric, maximum number of references to return (for each PubMed record). |
ref_id_type |
String, must be one of the following values: c('pmid', 'doi'). Type of identifier used to describe citation references. |
verbose |
Logical, shall details about the progress of the operation be printed to console. |
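The author-selection rule described for max_authors can be illustrated with a small base-R sketch. The helper select_author_idx() below is hypothetical and is not part of easyPubMed; it only mirrors the documented behavior by returning the positions of the authors that would be kept for a record with n_authors authors. How the package treats negative values other than -1 is not documented, so the sketch assumes they behave like -1.

```r
# Hypothetical helper (not an easyPubMed function): positions of the authors
# that would be retained under the documented max_authors rule.
select_author_idx <- function(n_authors, max_authors = 10) {
  stopifnot(n_authors >= 1)
  # -1 means "last author only"; 0 is documented as coerced to -1.
  # Other negative values are undocumented and handled the same way here
  # (an assumption of this sketch).
  if (max_authors <= 0) return(n_authors)
  # 1 means "first author only".
  if (max_authors == 1) return(1)
  # A limit at or above the author count keeps every author.
  if (max_authors >= n_authors) return(seq_len(n_authors))
  # Otherwise keep the leading (i - 1) authors plus the last author.
  c(seq_len(max_authors - 1), n_authors)
}

select_author_idx(n_authors = 5, max_authors = 3)   # positions 1, 2 and 5
select_author_idx(n_authors = 5, max_authors = 2)   # positions 1 and 5
select_author_idx(n_authors = 5, max_authors = -1)  # position 5
select_author_idx(n_authors = 4, max_authors = 10)  # positions 1 to 4
```

Under this reading, the default of 10 only truncates records that list more than 10 authors, and the last author is always retained except when max_authors is 1.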
An easyPubMed object including a data.frame ('data' slot) that stores information extracted from its raw XML PubMed records.
Damiano Fantini, damiano.fantini@gmail.com
https://www.data-pulse.com/dev_site/easypubmed/
# Note: a time limit can be set in order to kill the operation when/if
# the NCBI/Entrez server becomes unresponsive.
setTimeLimit(elapsed = 4.9)
try({
  x <- epm_query(query_string = 'Damiano Fantini[AU] AND "2018"[PDAT]')
  x <- epm_fetch(x = x, format = 'xml')
  x <- epm_parse(x, include_abstract = FALSE, max_authors = 1)
  get_epm_data(x)
}, silent = TRUE)
setTimeLimit(elapsed = Inf)