knitr::opts_chunk$set( comment = "#>", collapse = TRUE, warning = FALSE, message = FALSE )
charlatan
is a wee bit complex. This vignette aims to help you contribute
to the package. For a general introduction on contributing to rOpenSci packages
see our Contributing guide.
Let's start with some definitions.
For the purposes of this package:
charlatan
. For example,
we have providers for phone numbers, addresses and people's names. Adding a provider
may involve a single file, more than one file; and a single R6 class or many
R6 classes.en-US
, en-GB
). Some fakers won't have any locales,
whereas others can have many.If you aren't familiar with R6, have a look at the R6 website, in particular the introductory vignette.
Open an issue if you want to add a new provider or locale to an existing provider; it helps make sure there's no duplicated effort and we can help make sure you have the knowledge you need.
Providers are generally first created by making an R6 class. Let's start with a
heavily simplified base R6 class that defines some utility methods. We
call it BaseProvider
in charlatan
, but here we'll call it MyBaseProvider
to avoid confusion.
library(R6) MyBaseProvider <- R6::R6Class( "MyBaseProvider", public = list( random_element = function(x) { if (length(x) == 0) { return("") } if (inherits(x, "character")) if (!any(nzchar(x))) { return("") } x[sample.int(n = length(x), size = 1)] }, random_int = function(min = 0, max = 9999, size = 1) { stopifnot(max >= min) num <- max - min + 1 sample.int(n = num, size = size, replace = TRUE) + (min - 1) } ) )
If you don't need to handle locales it becomes simpler:
FooBar <- R6::R6Class( "FooBar", inherit = charlatan::BaseProvider, public = list( integer = function(n = 1, min = 1, max = 1000) { super$random_int(min, max, n) } ) )
We can create an instance of the FooBar
class by calling $new()
on it.
It only has one method integer()
, which we can call to get a random
integer.
x <- FooBar$new() x x$integer()
If your provider will need to handle different locales, it gets a bit more complex. In the Python library faker from which this package draws inspiration, you can create separate folders for each provider within the Python library.
However, R doesn't allow this, so instead we categorize different locales for each provider within the file names. For example, for the address provider we have files in the package:
Where the latter two provides specific data for each locale, and the first
file has the AddressProvider
class that pulls in the locale specific data.
Here, we'll create a very simplified AddressProvider
class using an
example locale file.
library(charlatan) file <- system.file("examples", "address-provider-en_US.R", package = "charlatan") source(file) MyAddressProvider <- R6::R6Class( inherit = MyBaseProvider, "MyAddressProvider", lock_objects = FALSE, public = list( locale = NULL, city_suffixes = NULL, initialize = function() { self$locale <- "en_us" self$city_suffixes <- eval(parse(text = paste0("city_suffixes_", self$locale))) }, city_suffix = function() { super$random_element(self$city_suffixes) } ) )
We can create an instance of the MyAddressProvider
class by calling $new()
on it.
It only has one method city_suffix()
, which we can call to get a random
city suffix.
x <- MyAddressProvider$new() x x$city_suffix()
When you want to add a new locale to an existing provider, look in the R/
folder
of the package and the locales that are available are in the file names.
Pick one of the locale files for the provider you're extending, make a duplicate of it and rename the file with your new locale. Then modify the duplicate, copying the format but putting in place the appropriate information for the new locale.
Where the data comes from for the new locale may vary. One easy way to start may be
porting over locales in the faker Python library that are not yet in charlatan
.
If it's a locale for which you can't easily port over from another library, you need to get the data from a variety of sources. There are some R based packages that should help:
Keep in mind when using data to look at their license, if any, and any implications with respect to whether it can be used in this package.
It's a little tricky how this is done. In the initialize()
block of each main provider
file (e.g., address-provider.R
) we pull in the appropriate locale specific data
based on the user input locale. For example, here's an abbreviated initialize
block from
the AddressProvider
:
initialize = function(locale = NULL) { if (!is.null(locale)) { # check global locales super$check_locale(locale) # check address provider locales check_locale_(locale, address_provider_locales) self$locale <- locale } else { self$locale <- 'en_US' } self$city_prefixes <- parse_eval("city_prefixes_", self$locale) }
A few things to note:
en_US
parse_eval()
to pull in the data. Essentially, parse_eval()
makes the
string city_prefixes_en_US
, then finds that in the package environment
and eval()
's it to bring the data into the R6 object in the city_prefixes
slot. We repeat this for each data type. The result is the user initialized
class with locale specific data.Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.