Detect the Language of Text
Franc has no external dependencies and supports 310 languages, including all languages spoken by more than one million speakers. It is a port of the JavaScript project of the same name; see https://github.com/wooorm/franc.
devtools::install_github("mangothecat/franc")
library(franc)
Simply supply the text, and franc detects its language:
franc("Alle menslike wesens word vry")
franc("এটি একটি ভাষা একক IBM স্ক্রিপ্ট")
franc("Alle mennesker er født frie og")
head(franc_all("O Brasil caiu 26 posições"))
und is the code for the undefined language. It is returned if the input is too short (shorter than 10 characters by default).
franc("the")
franc("the", min_length = 3)
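When classifying many strings, it can be convenient to treat und as a missing value. The following is a minimal sketch, not part of franc: detect_languages is a hypothetical helper, and it assumes that franc() returns a single language code as a character string for each input.

```r
# Hypothetical helper (not part of franc): detect the language of each
# element of a character vector and report NA where the result is "und".
# Assumption: detect() returns one character string per input text.
detect_languages <- function(texts, detect = franc::franc, ...) {
  # Call the detector once per element; extra arguments such as
  # min_length are passed through unchanged.
  codes <- vapply(
    texts,
    function(x) detect(x, ...),
    character(1),
    USE.NAMES = FALSE
  )
  # Replace the "undefined" code with a missing value.
  codes[codes == "und"] <- NA_character_
  codes
}
```

For example, detect_languages(c("the", "Alle menslike wesens word vry")) should give NA for the first element, since "the" is shorter than the default minimum length. Passing min_length = 3 forwards that argument to franc().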
You can provide a whitelist or a blacklist:
franc_all("O Brasil caiu 26 posições", whitelist = c("por", "src", "glg", "spa"))
head(franc_all("O Brasil caiu 26 posições", blacklist = c("src", "glg", "lav")))
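If the set of expected languages is known in advance, a whitelist can be combined with a small wrapper that returns only the best match. This is a sketch under assumptions: best_among is a hypothetical helper, and it assumes that franc_all() returns a data frame with columns named language and score, where a higher score means a better match.

```r
# Hypothetical helper (not part of franc): pick the best-scoring language
# among a fixed set of candidates.
# Assumption: detect_all() returns a data frame with columns `language`
# and `score`, and a higher score indicates a better match.
best_among <- function(text, candidates, detect_all = franc::franc_all) {
  scores <- detect_all(text, whitelist = candidates)
  # No candidate could be scored: report a missing value.
  if (nrow(scores) == 0) {
    return(NA_character_)
  }
  # Select the row with the highest score explicitly, rather than
  # relying on the order of the rows.
  as.character(scores$language[which.max(scores$score)])
}
```

A call such as best_among("O Brasil caiu 26 posições", c("por", "spa", "glg")) would then return a single code from the candidate list.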
The R version of franc supports 310 languages. By default, only the 175 languages with more than 1 million speakers are used. The min_speakers argument relaxes this limit and allows using more languages:
head(franc_all("O Brasil caiu 26 posições"))
head(franc_all("O Brasil caiu 26 posições", min_speakers = 0))
MIT © Mango Solutions, Titus Wormer, Maciej Ceglowski, Jacob R. Rideout and Kent S. Johnson.