detect_language: Detect the language that the abstract or other fields are...

detect_language    R Documentation

Detect the language that the abstract or other fields are written in

Description

Detect the language that the abstract or other fields are written in

Usage

detect_language(
  CitDat,
  fieldsToDetectIn = c("Abstract"),
  wantedLanguage = c("english")
)

Arguments

CitDat

A data frame or tibble, typically one returned by read_Citavi_xlsx() or read_Citavi_ctv6().

fieldsToDetectIn

Character vector with names of fields whose text language should be detected. Default is c("Abstract"). When multiple fields are given (e.g. c("Abstract", "Title")), they are combined into a single string whose language is then detected.

wantedLanguage

Character vector with the names of the desired languages. Default is c("english"). Unless set to NULL, a new column det_lang_wanted is created, which is TRUE when the detected language in det_lang is one of the wanted languages.

Details

Lifecycle: experimental.
The underlying core function determining the language is textcat::textcat().
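The behaviour described under Arguments can be sketched roughly as follows. This is only an illustration, not the CitaviR source: the helper name sketch_detect_language() and the toy data are hypothetical, and treating missing values as empty text is an assumption. The only package call used is textcat::textcat().

```r
# Hypothetical sketch of the described behaviour (not CitaviR's actual code).
sketch_detect_language <- function(dat,
                                   fields = c("Abstract"),
                                   wanted = c("english")) {
  # Replace missing values with empty text (an assumption made here).
  parts <- lapply(dat[fields], function(x) {
    ifelse(is.na(x), "", as.character(x))
  })

  # Combine the chosen fields row-wise into one string per reference.
  combined <- trimws(do.call(paste, c(unname(parts), sep = " ")))

  # textcat::textcat() returns one language name per input string.
  dat$det_lang <- textcat::textcat(combined)

  # Flag rows whose detected language is among the wanted ones.
  if (!is.null(wanted)) {
    dat$det_lang_wanted <- dat$det_lang %in% wanted
  }
  dat
}

# Toy data, invented for illustration only.
refs <- data.frame(
  Title = c("Language detection in reference collections",
            "Spracherkennung in Literaturdatenbanken"),
  Abstract = c("This study describes how the language of an abstract can be detected automatically.",
               NA),
  stringsAsFactors = FALSE
)

res <- sketch_detect_language(
  refs,
  fields = c("Title", "Abstract"),
  wanted = c("english", "german")
)
res[, c("det_lang", "det_lang_wanted")]
```

Detection on very short strings can be unreliable, which is one reason to combine several fields before calling textcat::textcat().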

Value

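A tibble containing at least one additional column, det_lang, which holds the detected language. If wantedLanguage is not NULL, the column det_lang_wanted is added as well.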

Examples

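library(CitaviR)

# Read the example file, detect the language of each abstract,
# and keep only the columns relevant to the result.
example_path <- example_file("3dupsin5refs/3dupsin5refs.ctv6")
read_Citavi_ctv6(example_path) %>%
  detect_language() %>%
  dplyr::select(Abstract, det_lang, det_lang_wanted)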


SchmidtPaul/CitaviR documentation built on Jan. 31, 2023, 5 a.m.