separate_genes: Separate genes and operons into multiple rows

In ropensci/tidypmc: Parse Full Text XML Documents from PubMed Central


View source: R/separate_genes.R

Separate genes and operons mentioned in full text into multiple rows

separate_genes(txt, pattern = "\\b[A-Za-z][a-z]{2}[A-Z0-9]+\\b", genes,
  operon = 6, column = "text")

`txt`	a table
`pattern`	regular expression to match genes, default is to match microbial genes like AbcD using [A-Za-z][a-z]{2}[A-Z0-9]+
`genes`	an optional vector of genes, set pattern to NA to only match this list.
`operon`	operon length, default 6. Split genes with 6 or more letters into separate genes; for example, AbcDEF is split into abcD, abcE and abcF.
`column`	column name to search, default "text"
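
To make the `pattern` and `operon` arguments concrete, here is a minimal base R sketch of what the default regular expression matches and how a match of 6 or more characters could be expanded. It is an illustration only, not tidypmc's internal code: `split_operon` is a hypothetical helper, and it assumes the first letter is lowercased for every match, as in the AbcDEF example.

```r
# Illustration only: base R, not the code used inside separate_genes()
txt <- "Genes like YacK, hmu and sufABC"
pattern <- "\\b[A-Za-z][a-z]{2}[A-Z0-9]+\\b"

# Extract every substring that matches the default pattern
hits <- regmatches(txt, gregexpr(pattern, txt, perl = TRUE))[[1]]

# Hypothetical helper: keep the three-letter stem (first letter lowercased,
# an assumption) and emit one gene per trailing character when the match
# has `operon` or more characters
split_operon <- function(match, operon = 6) {
  stem <- paste0(tolower(substr(match, 1, 1)), substr(match, 2, 3))
  if (nchar(match) < operon) {
    return(paste0(stem, substring(match, 4)))
  }
  suffixes <- strsplit(substring(match, 4), "")[[1]]
  paste0(stem, suffixes)
}

genes <- unlist(lapply(hits, split_operon))
print(hits)
print(genes)
```

In this sketch hmu is not matched because it has no trailing capital letter or digit, which is the case the `genes` argument is meant to cover.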

a tibble with gene name, matching text and rows.

Check for genes in italics using xml_text(xml_find_all(doc, "//sec//p//italic")), then update the pattern or add additional genes as an optional vector if needed.
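
The check described in this note might look like the sketch below. It assumes the xml2 package is installed and uses a small hypothetical XML fragment in place of a real PubMed Central document; `unmatched` is an illustrative name, not part of tidypmc.

```r
library(xml2)

# Hypothetical PMC-style fragment, inlined so the sketch is self-contained
doc <- read_xml(
  "<article><body><sec><p>The <italic>hmu</italic> locus and the <italic>sufABC</italic> operon.</p></sec></body></article>"
)

# Collect all italicized text inside section paragraphs
italics <- xml_text(xml_find_all(doc, "//sec//p//italic"))

# Find italicized terms that the default pattern would miss
pattern <- "\\b[A-Za-z][a-z]{2}[A-Z0-9]+\\b"
unmatched <- unique(italics[!grepl(pattern, italics, perl = TRUE)])
print(unmatched)

# The missed terms could then be supplied through the genes argument, e.g.
# separate_genes(x, genes = unmatched)
```

Italicized text in an article also includes species names and other terms, so the resulting vector should be reviewed by hand before it is passed to `genes`.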

Chris Stubben


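library(tidypmc)
x <- data.frame(row = 1, text = "Genes like YacK, hmu and sufABC")
# Default pattern: hmu has no trailing capital letter or digit, so it is not matched
separate_genes(x)
# Supply hmu through the optional genes vector so it is matched as well
separate_genes(x, genes = "hmu")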

ropensci/tidypmc documentation built on Dec. 14, 2019, 11:42 p.m.
