getPubtatorMatches: Find Pubtator information on Pubmed abstracts


Description

getPubtatorMatches finds the selected PubTator annotations (symbols) in PubMed titles and abstracts, and reports the title and the abstract sentences that contain at least two matches.

Usage

getPubtatorMatches(abstracts, info = c("Genes", "Diseases", "Mutations",
  "Chemicals", "Species"))

Arguments

abstracts

A data frame of PubMed abstracts with the columns PMID, Title and Abstract.

info

A character vector selecting which PubTator annotation types to use when building the list of symbols: genes, diseases, mutations, chemicals and species. All five types are used by default.
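
As a minimal sketch of the input described above, the data frame can be built by hand and checked for the three documented columns before calling the function. The PMIDs and text here are made up for illustration, and storing PMID as a character column is an assumption, since the documentation does not state the expected type.

```r
# Hypothetical input: the PMIDs, titles and abstracts are invented for illustration.
abstracts <- data.frame(
  PMID = c("10000001", "10000002"),   # assumption: PMIDs stored as character
  Title = c("BRCA1 and TP53 in breast cancer", "A placeholder title"),
  Abstract = c("BRCA1 interacts with TP53. Both genes are studied here.",
               "A placeholder abstract."),
  stringsAsFactors = FALSE
)

# Verify the columns the documentation lists are present.
required <- c("PMID", "Title", "Abstract")
missing_cols <- setdiff(required, names(abstracts))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```

Checking the column names up front gives a clearer error than a failure inside the function if a column is misspelled.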

Details

This function requires the packages "pubmed.mineR" and "plyr".
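
One way to confirm that both dependencies are available before calling the function is sketched below. It uses only base R, and the variable names are illustrative rather than part of the package.

```r
# Packages named in the Details section.
required_pkgs <- c("pubmed.mineR", "plyr")

# requireNamespace() returns FALSE instead of raising an error when a package is absent.
installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Install before use: ", paste(missing_pkgs, collapse = ", "))
}
```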

Value

The function "getPubtatorMatches" returns a data frame with four columns: PMID, symbols, title, and sentences. If the regex encounters an error, it will return the string "*** REGEX ERROR ***". The column "symbols" contains the PubTator annotations found in the abstract or title, separated by a pipe "|". The column "title" shows the title only if it contains at least two symbols. The column "sentences" shows all sentences from the abstract that have at least two occurrences of any symbol listed in the column "symbols". The sentences are separated by a pipe "|".
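
Because the symbols and sentences are pipe-separated strings, they usually need splitting before further analysis. The sketch below works on a hand-made row shaped like the documented return value, so its contents are hypothetical. Checking both the "title" and "sentences" columns for the error marker is an assumption, since the documentation does not say which column holds it.

```r
# Hypothetical result row shaped like the documented return value.
pubtator <- data.frame(
  PMID = "10000001",
  symbols = "BRCA1|TP53",
  title = "BRCA1 and TP53 in breast cancer",
  sentences = "BRCA1 interacts with TP53.|TP53 loss alters BRCA1 function.",
  stringsAsFactors = FALSE
)

# fixed = TRUE treats "|" as a literal character, not as regex alternation.
symbols <- strsplit(pubtator$symbols[1], "|", fixed = TRUE)[[1]]
sentences <- strsplit(pubtator$sentences[1], "|", fixed = TRUE)[[1]]

# Flag rows where a regex failure was reported.
# Assumption: the marker may appear in the title or the sentences column.
error_marker <- "*** REGEX ERROR ***"
has_error <- grepl(error_marker, pubtator$title, fixed = TRUE) |
  grepl(error_marker, pubtator$sentences, fixed = TRUE)
```

Without `fixed = TRUE`, `strsplit` would interpret "|" as a regular expression and split the string into single characters.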

Examples

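## Getting PubTator matches from abstracts extracted from PubMed.
## `abstracts` must already exist: a data frame with the columns
## PMID, Title and Abstract.
pubtator <- getPubtatorMatches(abstracts,
                               info = c("genes"))
## Inspect the first row of the result
pubtator[1, ]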

andreysoares/nlpUtilityBelt documentation built on May 6, 2019, 8:57 p.m.