knitr::opts_chunk$set(
  comment = "#>",
  collapse = TRUE,
  warning = FALSE,
  message = FALSE
)

charlatan is a wee bit complex. This vignette aims to help you contribute to the package. For a general introduction to contributing to rOpenSci packages, see our Contributing guide.

Communication

Open an issue if you want to add a new provider or a new locale to an existing provider. This helps avoid duplicated effort, and it lets us make sure you have the knowledge you need.

Let's continue with some definitions.

Definitions

For the purposes of this package:

- A Localized provider is a provider specific to one locale, for example PhoneNumberProvider_en_US.
- A Parent provider is a provider that the Localized providers inherit from, for example PhoneNumberProvider.

We have made these terms bold in this vignette. We hope the following examples make this a bit clearer.

Example

There are Providers without locales, like CurrencyProvider.

And there are providers with locales. AddressProvider is a Parent provider: you cannot use it without a locale, but you can use its Localized provider AddressProvider_en_US, whose locale is en_US.

R6

If you aren't familiar with R6, have a look at the R6 website, in particular the introductory vignette.

Inheritance

At the heart of charlatan is the BareProvider. This class has all the basic number and text substitution that is used throughout the package.

All non-locale providers inherit directly from the BareProvider. For example, NumericsProvider inherits from BareProvider.

For all providers with locales, we have some added logic for locales in the BaseProvider.

Locale specific inheritance

All providers with locales inherit from a common provider (Parent Provider), for example the English (United States) AddressProvider (AddressProvider_en_US) inherits from AddressProvider, which inherits from BaseProvider, which inherits from BareProvider:

BareProvider > BaseProvider > AddressProvider > AddressProvider_en_US

With inheritance we can define common functionality that works for most locales, while keeping the ability to override it where a specific locale needs something different.

For example:

library(charlatan)
set.seed(2000)
en <- PersonProvider_en_US$new() # English
jp <- PersonProvider_ja_JP$new() # Japanese
en$first_name() # Georgia
jp$first_name() # Haruka
jp$first_kana_name() #  カオリ
jp$last_kana_name() # コイズミ
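
The same mechanism can be sketched with toy classes. This is a minimal illustration that assumes only the R6 package; ToyBare, ToyBase, ToyPhone, ToyPhone_en_US and the numerify() helper are hypothetical names made up for this sketch and are not charlatan's real implementation.

```r
library(R6)

# Hypothetical toy classes, not charlatan's real implementation.
ToyBare <- R6Class(
  "ToyBare",
  public = list(
    # Shared helper: replace every "#" in a template with a random digit.
    numerify = function(template) {
      chars <- strsplit(template, "")[[1]]
      hashes <- chars == "#"
      chars[hashes] <- as.character(sample(0:9, sum(hashes), replace = TRUE))
      paste(chars, collapse = "")
    }
  )
)

# Adds the locale field on top of the bare helpers.
ToyBase <- R6Class(
  "ToyBase",
  inherit = ToyBare,
  public = list(locale = NULL)
)

# Parent provider: default behaviour shared by most locales.
ToyPhone <- R6Class(
  "ToyPhone",
  inherit = ToyBase,
  public = list(
    render = function() self$numerify("###-####")
  )
)

# Localized provider: sets the locale and overrides render().
ToyPhone_en_US <- R6Class(
  "ToyPhone_en_US",
  inherit = ToyPhone,
  public = list(
    locale = "en_US",
    render = function() self$numerify("(###) ###-####")
  )
)

x <- ToyPhone_en_US$new()
class(x)    # the full inheritance chain, most specific class first
x$render()  # uses the override, which calls the inherited numerify()
```

The localized class only states what differs from its parent; everything else, including the helper defined three levels up, is inherited.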

Adding new providers or locales

Yes, we welcome new contributions. Look through the GitHub issues or scratch your own itch.

Adding a new locale step by step

Yes, we welcome new locales for existing Providers!

Here is what we want to see in the Pull Request:

Code work:

Documentation work:

We want to have great documentation for this package and that means some work for you.

- If you overwrite a method from the Parent provider, you have to add a docstring: #' @description what the thing does
- If you add information under public, you still have to document that field with a docstring: #' @field name_of_field description of the thing
- If you add new functionality, provide an example under #' @examples above the code.
- Run make doc or Rscript --no-init-file -e "library(methods); devtools::run_examples()" in the terminal, and make sure there are no warnings or errors.

Example

Here we add a new locale to LoremProvider. The LoremProvider generates random words, letters and paragraphs in a language to be used as placeholder text.

We add the language Klingon (locale: tlh) to this Provider.

lorem_word_list_tlh <- c("'Igh'aDmegh", "DIron", "Da'lar", "moQbID")


#' Lorem provider for Klingon (Klingon)
#'
#' Methods for Lorem Ipsum generation 
#' Lorem Ipsum is a placeholder text commonly used to demonstrate the visual
#' form of a document or a typeface without relying on meaningful content.
#' @family tlh
#' @export
#' @examples
#' x <- LoremProvider_tlh$new()
#' x$word()
#' x$words(3)
#' x$words(6)
#' x$sentence()
#' x$paragraph()
#' x$paragraphs(3)
#' cat(x$paragraphs(6), sep = "\n")
#' x$text(19)
#' x <- LoremProvider_tlh$new(word_connector = " --- ")
#' x$paragraph(4)
LoremProvider_tlh <- R6::R6Class(
  "LoremProvider_tlh",
  inherit = LoremProvider,
  public = list(
    #' @field locale (character) the locale
    locale = "tlh"
  ),
  private = list(
    word_list = lorem_word_list_tlh
  )
)

GitHub work:

Adding a new provider, step by step

Yes, we are open to new providers, but we need a use case. For example, is it something you want to use in your own work?

Here is what we want to see in the Pull Request:

Code work:

Documentation work:

We want to have great documentation for this package and that means some work for you.

- Make sure the Providers are described.
- All public fields and methods need a description.
- Add examples of functionality in the docs under #' @examples
- Run make doc or Rscript --no-init-file -e "library(methods); devtools::run_examples()" in the terminal, and make sure there are no warnings or errors.

GitHub work:

Guidelines for providers and locales

There are a few things we enforce in tests:

- All Providers that inherit from BaseProvider are considered Parent providers: they should never be directly initialized.
- Localized providers inherit from Parent providers and should work.
- Localized providers need at least an en_US locale.

So PhoneNumberProvider should error on instantiation, but PhoneNumberProvider_en_US should work.

But not everything can be tested, so here are some other requirements:

- New providers should go in the available_providers list.
- New locales should be in the available_locales_list.
- Parent providers should have locale = NULL.
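
One way these rules could fit together is sketched below. It assumes only the R6 package; ToyParent and ToyParent_en_US are hypothetical names, and the check inside initialize() is an assumption about how such a guard might look, not a copy of charlatan's own code.

```r
library(R6)

# Hypothetical toy classes; charlatan's real check may be implemented differently.
ToyParent <- R6Class(
  "ToyParent",
  public = list(
    # Parent providers keep locale = NULL.
    locale = NULL,
    initialize = function() {
      # No locale means this is a Parent provider: refuse to build it directly.
      if (is.null(self$locale)) {
        stop("This is a parent provider; use a localized provider instead.",
             call. = FALSE)
      }
    }
  )
)

# The Localized provider only sets the locale; initialize() is inherited.
ToyParent_en_US <- R6Class(
  "ToyParent_en_US",
  inherit = ToyParent,
  public = list(locale = "en_US")
)

# The Parent provider errors on instantiation, the Localized provider works.
parent_result <- tryCatch(ToyParent$new(), error = function(e) "error")
localized <- ToyParent_en_US$new()
```

Because the localized class overrides the locale field before the inherited initialize() runs, the same guard rejects the parent and accepts every localized child.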

Where should I add logic or data?

In general we put new logic and data close to where it is used. If you need something for one locale only, place it there. If the logic is reused across multiple locales of one Provider, consider whether it should go in the Parent provider.

Data generally goes into the private component of the R6 class:

ProviderName <- R6::R6Class(
  "ProviderName",
  inherit = BareProvider,
  public = list(
    # add new functions here
  ),
  active = list(
    # this one is special, you probably don't need it
  ),
  private = list(
    # here is where you place data
    provider_ = "ProviderName"
  )
)
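
The split between shared logic and locale-specific data can be sketched as follows. This assumes only the R6 package; ToyLorem, ToyLorem_tlh and the words() method are hypothetical names for illustration, and the two words are taken from the Klingon list used earlier in this vignette.

```r
library(R6)

# Hypothetical toy classes that only illustrate where logic and data live.
ToyLorem <- R6Class(
  "ToyLorem",
  public = list(
    # Logic shared by every locale lives in the parent.
    words = function(n = 3) {
      sample(private$word_list, n, replace = TRUE)
    }
  ),
  private = list(
    # Empty default; each locale supplies its own data.
    word_list = character(0)
  )
)

ToyLorem_tlh <- R6Class(
  "ToyLorem_tlh",
  inherit = ToyLorem,
  private = list(
    # Locale-specific data overrides the parent's empty default.
    word_list = c("DIron", "moQbID")
  )
)

x <- ToyLorem_tlh$new()
x$words(4)  # four words drawn from the locale's own word list
```

The parent's words() method reads private$word_list, so a new locale only has to provide data and never needs to repeat the sampling logic.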

Prior work and related



ropensci/charlatan documentation built on Oct. 24, 2023, 9:15 a.m.