compare_mir_terms_unique: Compare terms uniquely associated with a miRNA name


View source: R/compare_mir_terms.R

Description

Compare terms uniquely associated with a miRNA name over topics.

Usage

compare_mir_terms_unique(
  df,
  mir,
  top = 20,
  token = "words",
  ...,
  topic = NULL,
  stopwords = stopwords_miretrieve,
  stopwords_ngram = TRUE,
  normalize = TRUE,
  colour = "steelblue3",
  col.mir = miRNA,
  col.abstract = Abstract,
  col.topic = Topic,
  col.pmid = PMID,
  title = NULL
)
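
A minimal call might look like the sketch below. The toy data frame is invented for illustration and only mirrors the default column names (miRNA, Abstract, Topic, PMID); whether such a small table yields any unique terms to plot is not guaranteed. The call is guarded so the snippet also runs where miRetrieve is not installed.

```r
# Hypothetical toy data; the column names match the defaults of
# col.mir, col.abstract, col.topic and col.pmid.
df <- data.frame(
  miRNA    = c("miR-21", "miR-21", "miR-21"),
  Abstract = c("miR-21 promotes tumour invasion",
               "miR-21 promotes tumour growth",
               "miR-21 promotes cardiac fibrosis"),
  Topic    = c("Cancer", "Cancer", "Heart"),
  PMID     = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Only call the function if the package is available.
if (requireNamespace("miRetrieve", quietly = TRUE)) {
  p <- miRetrieve::compare_mir_terms_unique(
    df,
    mir = "miR-21",
    top = 5,
    normalize = FALSE  # plot absolute rather than relative frequency
  )
}
```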

Arguments

df

Data frame containing miRNA names, abstracts, topics, and PubMed-IDs.

mir

String. miRNA name of interest.

top

Integer. Number of top terms to plot.

token

String. Specifies how abstracts shall be split up. Taken from unnest_tokens() in the tidytext package: "Unit for tokenizing, or a custom tokenizing function. Built-in options are "words" (default), "characters", "character_shingles", "ngrams", "skip_ngrams", "sentences", "lines", "paragraphs", "regex", (...), and "ptb" (Penn Treebank). If a function, should take a character vector and return a list of character vectors of the same length."

...

Additional arguments for tokenization, if necessary.

topic

Character vector. Optional. Specifies which topics to plot. If topic = NULL, all topics in df are plotted.

stopwords

Data frame containing stop words.

stopwords_ngram

Boolean. Specifies if stop words shall be removed from abstracts when using ngrams. Only applied when token = 'ngrams'.

normalize

Boolean. If normalize = TRUE, relative term frequency is plotted, i.e. the number of papers with mir that mention the term relative to all papers with mir. If normalize = FALSE, absolute term frequency is plotted, i.e. the number of papers with mir in which the term is mentioned.

colour

String. Colour of bar plot.

col.mir

Symbol. Column containing miRNAs.

col.abstract

Symbol. Column containing abstracts.

col.topic

Symbol. Column containing topic names.

col.pmid

Symbol. Column containing PubMed-IDs.

title

String. Plot title.
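
To illustrate the token and ... arguments, the sketch below calls tidytext::unnest_tokens() directly, once with the default "words" tokenizer and once with "ngrams", passing n = 2 as an additional tokenization argument. The one-row data frame and the output column name term are made up for this example; how compare_mir_terms_unique() forwards these arguments internally is not shown here.

```r
# Hypothetical one-row input, for illustration only.
abstracts <- data.frame(
  PMID     = 1,
  Abstract = "miR-21 promotes tumour invasion",
  stringsAsFactors = FALSE
)

if (requireNamespace("tidytext", quietly = TRUE)) {
  # token = "words": one row per word.
  words <- tidytext::unnest_tokens(abstracts, term, Abstract,
                                   token = "words")

  # token = "ngrams": n is an additional tokenization argument,
  # here requesting 2-grams.
  bigrams <- tidytext::unnest_tokens(abstracts, term, Abstract,
                                     token = "ngrams", n = 2)
}
```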

Details

Compare terms uniquely associated with a miRNA name over topics. miRNA names and topics must be contained in the data frame df; terms are taken from the abstracts in df. The number of top terms to plot is set by top. Terms are evaluated either by absolute frequency, i.e. the number of times they are mentioned in all abstracts with the miRNA name of interest, or by relative frequency, i.e. that number relative to all abstracts with the miRNA name of interest (see normalize).

compare_mir_terms_unique() is based on the tools available in the tidytext package.
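
The idea can be sketched in base R. This is not the package's implementation: the toy data, the whitespace tokenizer, counting each term once per paper (following the description of normalize), and using the number of papers with the miRNA in each topic as the denominator are all assumptions made for illustration.

```r
# Hypothetical toy data with the default column names.
df <- data.frame(
  miRNA    = c("miR-21", "miR-21", "miR-21", "miR-155"),
  Abstract = c("miR-21 promotes tumour invasion",
               "miR-21 promotes tumour growth",
               "miR-21 promotes cardiac fibrosis",
               "miR-155 regulates inflammation"),
  Topic    = c("Cancer", "Cancer", "Heart", "Heart"),
  PMID     = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

mir <- "miR-21"
sub <- df[df$miRNA == mir, ]  # papers mentioning the miRNA of interest

# One row per (paper, term): lower-case, split on whitespace,
# and count each term at most once per paper.
tokens <- do.call(rbind, lapply(seq_len(nrow(sub)), function(i) {
  words <- unique(strsplit(tolower(sub$Abstract[i]), "\\s+")[[1]])
  data.frame(PMID = sub$PMID[i], Topic = sub$Topic[i], term = words,
             stringsAsFactors = FALSE)
}))

# Keep only terms that occur in exactly one topic ("unique" terms).
topics_per_term <- tapply(tokens$Topic, tokens$term,
                          function(x) length(unique(x)))
unique_terms <- names(topics_per_term)[topics_per_term == 1]
tokens <- tokens[tokens$term %in% unique_terms, ]

# Absolute frequency: number of papers per topic mentioning the term.
counts <- as.data.frame(table(Topic = tokens$Topic, term = tokens$term),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]

# Relative frequency (assumed denominator: papers with mir per topic).
papers_per_topic <- table(sub$Topic)
counts$rel <- counts$Freq / as.numeric(papers_per_topic[counts$Topic])
```

In this toy example "promotes" and the miRNA name itself appear in both topics and are therefore dropped, while "tumour" remains as a term unique to the Cancer topic.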

Value

Bar plot containing unique miRNA-term associations per topic.

See Also

compare_mir_terms(), compare_mir_terms_log2(), compare_mir_terms_scatter()

Other compare functions: compare_mir_count_log2(), compare_mir_count_unique(), compare_mir_count(), compare_mir_terms_log2(), compare_mir_terms_scatter(), compare_mir_terms()


JulFriedrich/miRetrieve documentation built on Sept. 20, 2021, 11:37 p.m.