calculate_score_topic: Calculate scores of a self-chosen topic


View source: R/self_made_topics.R

Description

Calculate the score of a self-chosen topic for each abstract to identify abstracts that possibly correspond to the topic of interest.

Usage

calculate_score_topic(
  df,
  keywords,
  case = FALSE,
  col.score = "topic_score",
  col.indicate = NULL,
  threshold = NULL,
  discard = FALSE,
  col.abstract = Abstract
)

Arguments

df

Data frame containing abstracts.

keywords

Character vector containing the keywords on which the score is based. The weight of a keyword is determined by how often it appears in keywords, e.g. if a keyword is listed twice in keywords and is mentioned only once in an abstract, it adds 2 points to the score.
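The weighting rule can be illustrated with a small base-R sketch. This is not the miRetrieve implementation: sketch_topic_score() is a hypothetical helper, and it assumes that keywords are matched as regular expressions and that every occurrence in an abstract counts once per keyword entry.

```r
# Hypothetical helper, not part of miRetrieve: a base-R sketch of the
# weighting rule described for the keywords argument.
sketch_topic_score <- function(abstracts, keywords, case = FALSE) {
  vapply(abstracts, function(abstract) {
    total <- 0
    # A keyword listed twice in `keywords` is visited twice,
    # so each of its matches is counted twice.
    for (kw in keywords) {
      hits <- gregexpr(kw, abstract, ignore.case = !case)[[1]]
      # gregexpr() returns -1 when there is no match.
      total <- total + sum(hits > 0)
    }
    total
  }, numeric(1), USE.NAMES = FALSE)
}

abstracts <- c(
  "miR-21 is upregulated in colorectal cancer.",
  "Dietary fibre and gut health."
)
keywords <- c("cancer", "cancer", "tumor")

scores <- sketch_topic_score(abstracts, keywords)
print(scores)
```

Here "cancer" is listed twice and occurs once in the first abstract, so that abstract scores 2, while the second abstract matches no keyword and scores 0.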

case

Boolean. If case = TRUE, terms contained in keywords are case sensitive. If case = FALSE, terms contained in keywords are case insensitive.

col.score

String. Name of the column that stores the topic score.

col.indicate

String. Optional. Name of indicating column. If a string is provided, an extra column is added to df, indicating if the abstract corresponds to the topic of interest by "Yes" or "No".

threshold

Integer. Optional. Threshold to decide if an abstract corresponds to the topic of interest. If col.indicate is specified or discard = TRUE without threshold being specified, threshold is automatically set to 1.

discard

Boolean. If discard = TRUE, only abstracts are kept that correspond to the topic of interest.
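How threshold, the indicating column, and discard interact can be sketched in base R as follows. This is an illustration only, not the package's code: the data frame, the column name topic_present, and the assumption that an abstract corresponds to the topic when its score is greater than or equal to the threshold are all made up for the example.

```r
# Toy data with a precomputed score column (values are illustrative).
df <- data.frame(
  Abstract = c("abstract one", "abstract two", "abstract three"),
  topic_score = c(0, 1, 3),
  stringsAsFactors = FALSE
)

threshold <- NULL
col.indicate <- "topic_present"  # hypothetical column name
discard <- FALSE

# Default described in the documentation: threshold falls back to 1
# when an indicating column or discard = TRUE is requested.
if (is.null(threshold) && (!is.null(col.indicate) || discard)) {
  threshold <- 1
}

# Assumption: a score >= threshold means the abstract matches the topic.
if (!is.null(col.indicate)) {
  df[[col.indicate]] <- ifelse(df$topic_score >= threshold, "Yes", "No")
}

# What discard = TRUE would keep under the same assumption.
kept <- df[df$topic_score >= threshold, , drop = FALSE]
print(df)
print(kept)
```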

col.abstract

Symbol. Column containing abstracts.

Details

Calculate the score of a self-chosen topic for each abstract to identify abstracts that possibly correspond to the topic of interest. The score is added to the data frame as an additional column, named topic_score by default. If there is more than one topic of interest, give each score column a distinct name via col.score. To decide which abstracts are considered to correspond to the topic of interest, a threshold can be set via the threshold argument. Furthermore, an additional column can be added via col.indicate, indicating with "Yes" or "No" whether the abstract corresponds to the topic. plot_score_topic() can help in choosing a suitable threshold.

Value

Data frame with calculated topic scores. If discard = FALSE, the original data frame is returned with extra columns containing the calculated topic scores. If discard = TRUE, only abstracts corresponding to the topic of interest are kept.

See Also

assign_topic(), plot_score_topic()

Other score functions: assign_topic(), calculate_score_animals(), calculate_score_biomarker(), calculate_score_patients(), plot_score_animals(), plot_score_biomarker(), plot_score_patients(), plot_score_topic()


JulFriedrich/miRetrieve documentation built on Sept. 20, 2021, 11:37 p.m.