knitr::opts_chunk$set(
  collapse = TRUE,
  error = FALSE)
lang_output <- function(language, text) {
  writeLines(c(paste0("```", language),  text, "```"))
}
tree <- function(path, header = path) {
  paste1 <- function(a, b) {
    paste(rep_len(a, length(b)), b)
  }
  indent <- function(x, files) {
    paste0(if (files) "| " else "  ", x)
  }
  is_directory <- function(x) {
    unname(file.info(x)[, "isdir"])
  }
  sort_files <- function(x) {
    i <- grepl("^[A-Z]", x)
    c(x[i], x[!i])
  }
  prefix_file <- "|--="
  prefix_dir  <- "|-+="
  files <- sort_files(dir(path))
  files_full <- file.path(path, files)
  isdir <- is_directory(files_full)
  ret <- as.list(c(paste1(prefix_dir, files[isdir]),
                   paste1(prefix_file, files[!isdir])))
  files_full <- c(files_full[isdir], files_full[!isdir])
  isdir <- c(isdir[isdir], isdir[!isdir])
  n <- length(ret)
  if (n > 0) {
    ret[[n]] <- sub("|", "\\", ret[[n]], fixed = TRUE)
    tmp <- lapply(which(isdir), function(i)
      c(ret[[i]], indent(tree(files_full[[i]], NULL), !all(isdir))))
    ret[isdir] <- tmp
  }
  c(header, unlist(ret))
}

The traduire R package provides a wrapper around the i18next JavaScript library. It presents an alternative interface to R's built-in internationalisation functions, with a focus on the ability to change the target language within a session. Currently the package presents only a stripped down interface to the underlying library, though this may expand in future.

First, prepare a json file with your translations. For example, the included file examples/simple.json contains:

path <- system.file("examples/simple.json", package = "traduire",
                    mustWork = TRUE)
lang_output("json", readLines(path))

We can create a translator, setting the default language to English (en) as:

tr <- traduire::i18n(path, language = "en")
tr

With this object we can perform translations with the t method by passing in a key from within our translations:

tr$t("hello")

Specify the language argument to change language:

tr$t("hello", language = "fr")

Interpolation

String interpolation is done using a syntax very similar to glue (see the i18next documentation)

tr$t("interpolate", list(what = "i18next", how = "easy"),
     language = "en")
tr$t("interpolate", list(what = "i18next", how = "facile"),
     language = "fr")

Pluralisation

The example here is derived from a web API that we developed the package to support. We wanted to, as a service, validate incoming data and return information back to user about what to fix; if the data is missing one or more columns we will report back the columns that they are missing. This requires different translations for the singular case ("Data missing column X") and plural ("Data missing columns X, Y").

The translation file looks like:

path_validation <- system.file("examples/validation.json",
                               package = "traduire", mustWork = TRUE)
lang_output("json", readLines(path_validation))

where the _plural suffix is important for i18next for determining the string to return for a singular or plural case, and the count element determines if the string is singular or plural.

Then we can use this as:

tr <- traduire::i18n(path_validation)

Pluralisation of results is supported using keys that include _plural suffix (see the i18next documentation) and by passing a count argument in to the translation:

tr$t("nocols", list(missing = "A"), count = 1)
tr$t("nocols", list(missing = "A, B"), count = 2)

or, changing the language:

tr$t("nocols", list(missing = "A"), count = 1, language = "fr")
tr$t("nocols", list(missing = "A, B"), count = 2, language = "fr")

Fallback language

To illustrate this feature, we use a list of translations of Hello world! which includes many languages.

path_hello <- system.file("hello/inst/traduire.json", package = "traduire")

Most simply, if we want to fall back onto a single language for all translations, we can provide a fallback language as a string:

tr <- traduire::i18n(path_hello, fallback = "it")
tr$t("hello", language = "unknown")

Alternatively, a chain of languages to try can be provided:

tr <- traduire::i18n(path_hello, fallback = c("a", "b", "de"))
tr$t("hello", language = "unknown")

If you want to have different fallback languages for different target languages, provide a named list of mappings (each of which can be a scalar or vector of fallback languages as above):

tr <- traduire::i18n(path_hello, fallback = list(co = "fr", "default" = "en"))
tr$t("hello", language = "co")
tr$t("hello", language = "unknown")

Translating multiple keys in a block of text

The motivating use case we had was translating a json file for use in an upstream web application, so the text to translate might contain data like:

{
  "id": "area_scope",
  "label": "element_label",
  "type": "multiselect",
  "description": "element_description"
}

where the json contains a mix of elements to be internationalised (such as the values of label and description) and elements to be left as-is (such as the values of id and type). The snippet above is a simplified version of the full data where the values to translate might occur at any depth within the json.

To support this, the i18n object has a replace method, which performs string replacement of text wrapped in t_(...). So we rewrite our json:

string <- '{
  "id": "area_scope",
  "label": "t_(element_label)",
  "type": "multiselect",
  "description": "t_(element_description)"
}'

and we provide a set of translations:

translations <- '{
    "en": {
        "translation": {
            "element_label": "Country",
            "element_description": "Select your countries"
        }
    },
    "fr": {
        "translation": {
            "element_label": "Payes",
            "element_description": "Sélectionnez vos payes"
        }
    }
}'

We construct a translator object with these translations:

tr <- traduire::i18n(translations)

We can then use the replace method to translate all strings (wrapped here in writeLines to make it easier to read with all json quotes:

writeLines(tr$replace(string))

or, into French:

writeLines(tr$replace(string, language = "fr"))

Note that while the input text here is json, it could be anything at all, and will not be parsed as json.

Use within a package

We provide an optional workflow for using translations within a package, or some other piece of code where the translations will be fairly invasive to add, allowing you to write essentially:

traduire::t_(...)

and have all the ... arguments forwarded to the appropriate translator object. There are several details here:

To do this, we allow packages (or other similar code) to "register" a translator, like

traduire::translator_register(resources)

where resources is passed to traduire::i18n.

Here we show a complete example package that implements "hello-world-as-a-service" - i.e., a small web service that will reply with a version of "Hello world!" translated into the client's choice.

The full package is included as an example within traduire at system.file("hello", package = "traduire") and is

path_hello <- system.file("hello", package = "traduire", mustWork = TRUE)
lang_output("plain", tree(path_hello, "hello"))

Below is the code in hello.R, which can say rough translations of "hello world" in a number of languages:

content <- readLines(file.path(path_hello, "R", "hello.R"))
lang_output("r", content[!grepl("^#", content)])

Here,

The .onLoad function contains a call to traduire::translator_register which registers a translator database for the package. All calls to t_ that come from this package will use this registered translator.

Why would we want to do this? If we were using plumber to build an API we might want to allow the requests to come in with a header indicating the language. Our plumber api might look like:

content <- readLines(file.path(path_hello, "inst", "plumber.R"))
lang_output("r", content)

The first endpoint inspects the endpoint's req object to get the requested language, but the second gets it automagically. This can be understood by looking at the code used to run the API:

content <- readLines(file.path(path_hello, "R", "api.R"))
lang_output("r", content[!grepl("^#", content)])

So at the beginning of each api request we are calling traduire::translator_set_language, which affects only this package as a "preroute" hook and resetting this in the "postserialize" hook.

The full package is available at system.file("hello", package = "traduire"). If you run the API, it can be used like:

$ curl -H "Accept-Language: fr" http://localhost:8888

 -----
Salut le monde !
 ------
    \   ^__^
     \  (oo)\ ________
        (__)\         )\ /\
             ||------w|
             ||      ||
$ curl -H "Accept-Language: en" http://localhost:8888

 -----
Hello world!
 ------
    \   ^__^
     \  (oo)\ ________
        (__)\         )\ /\
             ||------w|
             ||      ||

$ curl -H "Accept-Language: ko" http://localhost:8888/hello/cat

 --------------
반갑다 세상아
 --------------
    \
      \
        \
            |\___/|
          ==) ^Y^ (==
            \  ^  /
             )=*=(
            /     \
            |     |
           /| | | |\
           \| | |_|/\
      jgs  //_// ___/
               \_)

Namespaces, and the structure of translation files

This section outlines how to write the translation (json) files, alongside a discussion of using namespaces. Consider again the first example:

path <- system.file("examples/simple.json", package = "traduire",
                    mustWork = TRUE)
lang_output("json", readLines(path))

In this format, the top level keys (en, fr) represent languages and the next level key (translation) which appears redundant represents a namespace. A translation set can have multiple namespaces, which can help with organising a large set of strings, and can be used to split the file up over smaller files that might be easier to work with (see below).

Below, we have file with two namespaces, common and login. These might represent strings used throughout the application and in a login component, for example.

path <- system.file("examples/namespaces.json", package = "traduire",
                    mustWork = TRUE)
lang_output("json", readLines(path))

When constructing the translator object we can provide a default namespace (it defaults to translation):

tr <- traduire::i18n(path, default_namespace = "common")

Keys that are provided without an explicit namespace, will be looked up in the default namespace:

tr$t("hello")

or provide a namespace when looking up keys:

tr$t("common:hello", language = "fr")
tr$t("login:username", language = "fr")

So far, this brings relatively little advantage as our file, while structured, is still going to end up really large as all the files end up in it. So we might want to break it up like so:

path <- system.file("examples/structured", package = "traduire",
                    mustWork = TRUE)
lang_output("plain", tree(path, "structured"))

where each file is orgnanised like:

lang_output("json", readLines(file.path(path, "en-login.json")))

to allow this, we need to load the files one by one into the translation object, rather than as a single resource bundle. To do this, we can use the add_resource_bundle method:

obj <- traduire::i18n(NULL)
obj$add_resource_bundle("en", "common", file.path(path, "en-common.json"))
obj$add_resource_bundle("en", "login", file.path(path, "en-login.json"))
obj$add_resource_bundle("fr", "common", file.path(path, "fr-common.json"))
obj$add_resource_bundle("fr", "login", file.path(path, "fr-login.json"))
obj$t("login:password", language = "fr")

This is clearly going to be error prone to do with a large number of translation files, though a loop could help:

obj <- traduire::i18n(NULL)
for (language in c("en", "fr")) {
  for (namespace in c("common", "login")) {
    bundle <- file.path(path, sprintf("%s-%s.json", language, namespace))
    obj$add_resource_bundle(language, namespace, bundle)
  }
}
obj$t("login:password", language = "fr")

An alternative is to pass in the pattern used to locate these files, though this approach works best if you also declare your namespaces and languages up front. The pattern uses glue's syntax, and the pattern must include placeholders language and namespace (and no others):

pattern <- file.path(path, "{language}-{namespace}.json")
obj <- traduire::i18n(NULL, resource_pattern = pattern,
                      languages = c("en", "fr"),
                      namespaces = c("common", "login"))
obj$t("login:password", language = "fr")


reside-ic/traduire documentation built on March 25, 2023, 8:21 a.m.