table_articles_byAuth: Extract Publication and Affiliation Data from PubMed Records


Description

Extract publication information from PubMed records and cast it into a data.frame in which each row corresponds to one author of one record. Extraction can be limited to first authors only or last authors only, or it can cover all authors of each PubMed record.

Usage

table_articles_byAuth(pubmed_data, 
                             included_authors = "all", 
                             max_chars = 500, 
                             autofill = TRUE, 
                             dest_file = NULL, 
                             getKeywords = TRUE, 
                             encoding = "UTF8")

Arguments

pubmed_data

PubMed data in XML format: typically an XML file produced by a batch_pubmed_download() call, or an XML object returned by a fetch_pubmed_data() call.

included_authors

Character: one of "first", "last" or "all". Restricts the output to the first author, the last author, or all authors of each PubMed record.

max_chars

Numeric: maximum number of characters to extract from the AbstractText field.

autofill

Logical. If TRUE, missing affiliations are imputed from the affiliations available for other authors of the same article.

dest_file

String (character of length 1). Name of the file that will be written for storing the output. If NULL, no file will be saved.

getKeywords

Logical. If TRUE, the function attempts to extract the keywords of each PubMed record (MeSH topics, keywords).

encoding

Character. Name of the encoding of the input/output connection (for example, "ASCII" or "UTF-8"), given in the same way as it would be given to base::iconv(). See the iconv() help page to find out which encodings can be used on your platform. Here, we recommend using "UTF-8".
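Since encoding names follow base::iconv(), the sketch below shows what a conversion to the recommended "UTF-8" looks like in base R. It does not call any easyPubMed function; the strings and variable names are made up for illustration.

```r
# Base R only; no easyPubMed call is made here.
# A Latin-1 encoded string: "\xe9" is the single byte for e-acute.
latin1_name <- "Caf\xe9"
Encoding(latin1_name) <- "latin1"

# Re-encode to UTF-8, the encoding recommended above.
utf8_name <- iconv(latin1_name, from = "latin1", to = "UTF-8")

# Check that the result is valid UTF-8 and see how R marks it.
is_valid  <- validUTF8(utf8_name)
marked_as <- Encoding(utf8_name)

# Pure ASCII text is left unchanged by the conversion.
ascii_title <- iconv("PubMed record", from = "ASCII", to = "UTF-8")
```

Calling iconvlist() lists the encoding names known to the current platform.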

Details

Retrieve publication and author information from PubMed data, and cast them as a data.frame.
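To make the effect of included_authors, max_chars and autofill more concrete, here is a minimal base R sketch on toy data. The records list, the cast_by_author() helper and the fill rule used for missing affiliations are all assumptions made for illustration; they are not the structures or the algorithm that easyPubMed uses internally.

```r
# Hypothetical, simplified records: NOT the structure easyPubMed uses internally.
records <- list(
  list(pmid = "111",
       abstract = "A long abstract about bladder cancer genomics.",
       authors = data.frame(lastname = c("Rossi", "Smith", "Lee"),
                            address  = c("Univ. A", NA, NA),
                            stringsAsFactors = FALSE)),
  list(pmid = "222",
       abstract = "Short text.",
       authors = data.frame(lastname = c("Kim", "Garcia"),
                            address  = c(NA, "Inst. B"),
                            stringsAsFactors = FALSE))
)

# Hypothetical helper that mimics the arguments described above.
cast_by_author <- function(records, included_authors = "all",
                           max_chars = 500, autofill = TRUE) {
  included_authors <- match.arg(included_authors, c("all", "first", "last"))
  rows <- lapply(records, function(rec) {
    au <- rec$authors
    if (autofill) {
      # Assumed rule: carry the last known affiliation forward, and give
      # authors listed before the first known affiliation that first value.
      known <- which(!is.na(au$address))
      if (length(known) > 0) {
        idx <- findInterval(seq_len(nrow(au)), known)
        idx[idx == 0] <- 1
        au$address <- au$address[known[idx]]
      }
    }
    # Select which author rows to keep for this record.
    keep <- switch(included_authors,
                   all   = seq_len(nrow(au)),
                   first = 1,
                   last  = nrow(au))
    data.frame(pmid = rec$pmid,
               abstract = substr(rec$abstract, 1, max_chars),
               au[keep, , drop = FALSE],
               stringsAsFactors = FALSE, row.names = NULL)
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

# One row per author, abstracts cut to 20 characters, affiliations filled in.
all_authors <- cast_by_author(records, max_chars = 20)

# First authors only, with missing affiliations left as NA.
first_only <- cast_by_author(records, included_authors = "first",
                             autofill = FALSE)
```

In this toy case all_authors has one row per author across both records, while first_only has one row per record.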

Value

Data frame including the following fields: c("pmid", "doi", "title", "abstract", "year", "month", "day", "jabbrv", "journal", "keywords", "lastname", "firstname", "address", "email"). Each row corresponds to one author of one PubMed record.
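Because the result has one row per author, a common follow-up is to summarise it per record. The sketch below does this in base R on a small hand-made data frame; the column names "pmid", "lastname" and "jabbrv" are taken from the Examples section, and the values are invented.

```r
# Toy stand-in for the returned data frame; values are invented and the
# column names follow those used in the Examples section.
xx <- data.frame(
  pmid     = c("111", "111", "222"),
  lastname = c("Rossi", "Smith", "Kim"),
  jabbrv   = c("J Toy Sci", "J Toy Sci", "Toy Lett"),
  stringsAsFactors = FALSE
)

# Number of author rows per PubMed record.
authors_per_record <- table(xx$pmid)

# One row per record, with author last names collapsed into a single string.
by_record <- aggregate(lastname ~ pmid + jabbrv, data = xx,
                       FUN = paste, collapse = "; ")
```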

Author(s)

Damiano Fantini damiano.fantini@gmail.com

References

https://www.data-pulse.com/dev_site/easypubmed/

Examples

## Not run: 
## Cast PubMed record info into a data.frame

dami_query <- "Damiano Fantini[AU]"
dami_on_pubmed <- get_pubmed_ids(dami_query)
dami_abstracts_xml <- fetch_pubmed_data(dami_on_pubmed, encoding = "ASCII")
xx <- table_articles_byAuth(pubmed_data = dami_abstracts_xml, 
                            included_authors = "first", 
                            max_chars = 100, 
                            autofill = TRUE)

print(xx[1:5, c("pmid", "lastname", "jabbrv")])
#
## Download records first
## Also, auto-fill disabled
dami_query <- "Damiano Fantini[AU]"
curr.file <- batch_pubmed_download(dami_query, dest_file_prefix = "test_bpd_", encoding = "ASCII")
xx <- table_articles_byAuth(pubmed_data = curr.file[1], 
                            included_authors = "all", 
                            max_chars = 20, 
                            autofill = FALSE)
print(xx[1:5, c("pmid", "lastname", "jabbrv")])


## End(Not run)

easyPubMed documentation built on May 2, 2019, 3:47 p.m.