findTags: Find locus tags

View source: R/findTags.R

Description

Find and extract locus tags in PMC text or tables

Usage

findTags(txt, tags, prefix, suffix, notStartingWith, expand = TRUE, digits = 4, ...)

Arguments

txt

output from pmcText or pmcTable

tags

an ordered list of locus tags, used for expanding locus tag pairs

prefix

locus tag prefix, can be a regular expression such as "BPS[SL]" or "VCA?"

suffix

locus tag suffix; should be a single letter such as "a", a character class such as "[ac]", or a group such as "(a|c|\.1)"

notStartingWith

optional single letter used to exclude matches, e.g., use "J" to match "HP" but not "JHP" tags in Helicobacter

expand

expand locus tag pairs marking the start and end of an operon, island or other region

digits

number of digits in locus tags, use NA for 1 or more

...

other options passed to searchPMC

Details

Searches text and tables for locus tags using searchPMC, extracts the tags using parseTags, and expands tag pairs using seqIds. The prefix, digits and suffix options are combined into a pattern string such as "YPO[0-9]{4}a?", where prefix="YPO", digits=4 and the optional suffix is "a". The notStartingWith option adds a negative lookbehind, as in "(?<!J)HP[0-9]{4}", to skip tags that are preceded by the given letter.
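
The pattern construction described above can be sketched in base R. This is only an illustration under stated assumptions: buildTagPattern is a hypothetical helper, not a function in pmcXML, and the package may assemble its pattern differently. The sample sentence is made up.

```r
# Hypothetical helper (not part of pmcXML): assembles a locus tag pattern
# from prefix, digits, suffix and notStartingWith as described in Details.
buildTagPattern <- function(prefix, suffix = NULL, notStartingWith = NULL, digits = 4) {
  # digits = NA means "one or more digits"
  quant <- if (is.na(digits)) "+" else paste0("{", digits, "}")
  pattern <- paste0(prefix, "[0-9]", quant)
  # the suffix is optional, so it is followed by "?"
  if (!is.null(suffix)) pattern <- paste0(pattern, suffix, "?")
  # a negative lookbehind rejects tags preceded by the given letter
  if (!is.null(notStartingWith)) {
    pattern <- paste0("(?<!", notStartingWith, ")", pattern)
  }
  pattern
}

p1 <- buildTagPattern("YPO", "a")
p2 <- buildTagPattern("HP", notStartingWith = "J")

# Made-up sentence for illustration only
txt <- "Genes YPO1774 and YPO0988a were induced; JHP0001 differs from HP0002."

# perl = TRUE is needed for lookbehind assertions
tags1 <- regmatches(txt, gregexpr(p1, txt, perl = TRUE))[[1]]
tags2 <- regmatches(txt, gregexpr(p2, txt, perl = TRUE))[[1]]
```

Here p1 matches both plain and "a"-suffixed YPO tags, while p2 skips "JHP0001" because the lookbehind sees the preceding "J".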

Value

A data.frame with the locus tag, the section title or table name, and the sentence or table row containing the mention

Note

Matches tag pairs written as YPO1774-YPO1779, YPO1774 to YPO1779, YPO1774-1779 or YPO1774-9. Some matches may be interaction pairs or other non-ranges, so range expansions should be checked (or set expand=FALSE to skip expansion).
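
As a rough sketch of what expanding these pair forms against an ordered tag list could look like, the following uses a hypothetical expandPair helper and a toy tag vector. Neither is taken from pmcXML: the package uses seqIds and the yplocus data, whose internals are not shown here. The sketch also assumes tags without a suffix.

```r
# Hypothetical helper (not the package's seqIds): expands a tag pair
# using an ordered character vector of known locus tags.
expandPair <- function(pair, tags, prefix = "YPO") {
  # split on a hyphen or the word "to"
  ends <- strsplit(pair, "\\s*(-|\\bto\\b)\\s*", perl = TRUE)[[1]]
  if (length(ends) != 2) return(NA_character_)
  startNum <- sub(paste0("^", prefix), "", ends[1])
  endNum <- sub(paste0("^", prefix), "", ends[2])
  # a shortened end such as "9" borrows the leading digits of the start
  if (nchar(endNum) < nchar(startNum)) {
    keep <- nchar(startNum) - nchar(endNum)
    endNum <- paste0(substr(startNum, 1, keep), endNum)
  }
  i <- match(paste0(prefix, startNum), tags)
  j <- match(paste0(prefix, endNum), tags)
  # unknown tags or a reversed pair are not treated as a range
  if (is.na(i) || is.na(j) || j < i) return(NA_character_)
  tags[i:j]
}

# Toy ordered tag list (not yplocus); note that not every number exists
tags <- c("YPO1773", "YPO1774", "YPO1775", "YPO1777", "YPO1779", "YPO1780")

forms <- c("YPO1774-YPO1779", "YPO1774 to YPO1779", "YPO1774-1779", "YPO1774-9")
expanded <- lapply(forms, expandPair, tags = tags)
```

Looking the endpoints up in an ordered list, rather than counting from 1774 to 1779, avoids inventing tags that do not exist in the genome annotation. A pair whose endpoints are missing or reversed returns NA, which is one way to flag the non-range matches mentioned above.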

Author(s)

Chris Stubben

See Also

parseTags

Examples

## Not run: 
doc <- pmcOAI("PMC2231364")
data(yplocus)
# search the full text - 33 tags
txt <- pmcText(doc)
y <- findTags(txt, yplocus, "YPO", "a")
head(y)
table2(y$range)    # within range
table2(y$mention)  # check range expansions
subset(y, locus == "YPO0988")
# or search a table
x <- pmcTable(doc)
y <- findTags(x[[2]], yplocus, "YPO", "a")

## End(Not run)

cstubben/pmcXML documentation built on May 14, 2019, 12:25 p.m.