extract_mir_df: Extract miRNA names from abstracts in a data frame


Description

Extract miRNA names from abstracts in a data frame.

Usage

extract_mir_df(
  df,
  threshold = 1,
  col.abstract = Abstract,
  extract_letters = FALSE
)

Arguments

df

Data frame containing abstracts.

threshold

Integer. Minimum number of times a miRNA must be mentioned in an abstract to be extracted.

col.abstract

Symbol. Column of df containing the abstracts.

extract_letters

Logical. If extract_letters = FALSE, only the miRNA stem is extracted (e.g. miR-23). If extract_letters = TRUE, the miRNA stem is extracted together with its trailing letter (e.g. miR-23a).

Details

Extract miRNA names from abstracts in a data frame. miRNA names can be extracted either as the stem only, e.g. miR-23, or with their trailing letter, e.g. miR-23a. Extracted names are adapted to the most recent miRBase version (e.g. miR-97, miR-102, and miR-180(a/b) become miR-30a, miR-29a, and miR-172(a/b), respectively). The threshold argument sets how often a miRNA must be mentioned in an abstract to be extracted. Abstracts that do not contain any miRNA names are silently dropped. As many abstracts do not adhere to the miRNA nomenclature, it is recommended to extract only the miRNA stem with extract_letters = FALSE.
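
The interplay of threshold and extract_letters can be illustrated with a small base R sketch. This is not miRetrieve's implementation: the helper sketch_extract_mir() and its regular expression are hypothetical, they only recognise names written exactly as "miR-<number>", and they do not perform the miRBase name adaptation described above.

```r
# Illustrative sketch only, NOT the miRetrieve implementation.
# sketch_extract_mir() and its pattern are hypothetical names/assumptions.
sketch_extract_mir <- function(abstract, threshold = 1, extract_letters = FALSE) {
  # Stem = "miR-" plus digits; optionally keep one trailing lowercase letter.
  pattern <- if (extract_letters) "miR-[0-9]+[a-z]?" else "miR-[0-9]+"
  hits <- regmatches(abstract, gregexpr(pattern, abstract))[[1]]
  # Count how often each name occurs in this abstract.
  counts <- table(hits)
  # Keep only names mentioned at least `threshold` times.
  keep <- as.vector(counts) >= threshold
  as.character(names(counts)[keep])
}

abstract <- "miR-23a was upregulated. miR-23a targets X, whereas miR-21 did not."

sketch_extract_mir(abstract)                          # "miR-21" "miR-23"
sketch_extract_mir(abstract, threshold = 2)           # "miR-23"
sketch_extract_mir(abstract, extract_letters = TRUE)  # "miR-21" "miR-23a"
```

With threshold = 2, miR-21 is discarded because it appears only once, which mirrors how a higher threshold filters out miRNAs mentioned in passing.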

Value

Data frame with miRNA names extracted from abstracts.
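
A minimal usage sketch, assuming miRetrieve is installed, might look as follows. The toy data frame, its PMID column, and the abstract texts are made up for illustration; only the argument names come from the usage section above, and no particular output is shown because the exact columns of the returned data frame are not specified here.

```r
# Toy input; PMID values and abstract texts are invented for illustration.
abstracts <- data.frame(
  PMID = c(1L, 2L, 3L),
  Abstract = c(
    "miR-23a is upregulated and miR-23a targets gene X.",
    "miR-21 is mentioned once in this abstract.",
    "This abstract does not mention any microRNA."
  ),
  stringsAsFactors = FALSE
)

# Run only if the miRetrieve package is available.
if (requireNamespace("miRetrieve", quietly = TRUE)) {
  # Defaults: stems only, a single mention is enough.
  stems <- miRetrieve::extract_mir_df(abstracts)

  # Stricter: a miRNA must be mentioned at least twice in an abstract.
  repeated <- miRetrieve::extract_mir_df(abstracts, threshold = 2)

  print(stems)
  print(repeated)
}
```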

See Also

extract_mir_string()

Other extract functions: extract_mir_string(), extract_snp()


JulFriedrich/miRetrieve documentation built on Sept. 20, 2021, 11:37 p.m.