R/detect_racialethnic_terms.R

Defines functions detect_racialethnic_terms

Documented in detect_racialethnic_terms

#' Detect racial/ethnic terms in unstructured text data 
#'
#' Detects racial and ethnic terms in unstructured text data. The input is a
#' column of text such as biomedical abstracts, Twitter bios, or chapters from
#' a novel. The output column gives the number of racial or ethnic terms
#' detected in each entry.
#'
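#' @param data A data frame or data frame extension (e.g. a tibble).
#' @param id Unquoted name of the column in `data` holding a numeric or
#'   character identifier unique to each entry.
#' @param input Unquoted name of the character column in `data` in which
#'   racial/ethnic terms are to be detected.
#'
#' @return A data frame with an output column giving the number of racial or
#'   ethnic terms detected in each entry.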
#'
#' @examples
#'
#' library(tidyverse)
#' library(diverstidy)
#' data(pubmed_data)
#'
#' detected_terms <- pubmed_data %>%
#'   detect_racialethnic_terms(fk_pmid, abstract)
#'
#' @export
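detect_racialethnic_terms <- function(data, id, input) {
  # Capture the column arguments as quosures so callers can pass them unquoted
  id <- dplyr::enquo(id)
  input <- dplyr::enquo(input)
  # Match the `racial_ethnic` terms against the text column. funnel_match() is
  # called directly so the function does not depend on %>% being imported.
  diverstidy::funnel_match(data, !!id, !!input, racial_ethnic, "race_ethnicity")
}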
brandonleekramer/diverstidy documentation built on Dec. 19, 2021, 11:42 a.m.