default_simulation_params: Setup default column type parameters

default_simulation_params R Documentation

Set up default column type parameters

Description

All parameters (except regexp) are attached to a column definition when they are not specified in the YAML configuration file. These functions are used to specify the default configuration (see: default_faker_opts).
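
The fallback described above can be pictured with plain lists. The sketch below is not DataFakeR's implementation: `defaults` and `column` are hypothetical stand-ins, and base R `modifyList()` is used only to illustrate how values given in the YAML file could take priority over the defaults.

```r
# Hypothetical defaults for a character column, mirroring the default
# argument values of opt_default_character() (regexp is left out on purpose).
defaults <- list(
  nchar = 10,
  na_ratio = 0.05,
  not_null = FALSE,
  unique = FALSE,
  default = "",
  levels_ratio = 1
)

# Hypothetical column definition as it might look after reading the YAML:
# only `type` and `na_ratio` were specified by the user.
column <- list(type = "varchar", na_ratio = 0.2)

# modifyList() overwrites entries of its first argument with those of the
# second, so user-specified values win and missing ones fall back to defaults.
column_full <- modifyList(defaults, column)

str(column_full)
```

Here `na_ratio` keeps the user-supplied value, while `nchar`, `not_null`, and the remaining parameters come from the defaults.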

Usage

opt_default_character(
  regexp = "text|char|factor",
  nchar = 10,
  na_ratio = 0.05,
  not_null = FALSE,
  unique = FALSE,
  default = "",
  levels_ratio = 1,
  ...
)

opt_default_numeric(
  regexp = "^decimal|^numeric|real|double precision",
  na_ratio = 0.05,
  not_null = FALSE,
  unique = FALSE,
  default = 0,
  precision = 7,
  scale = 2,
  levels_ratio = 1,
  ...
)

opt_default_integer(
  regexp = "smallint|integer|bigint|smallserial|serial|bigserial",
  na_ratio = 0.05,
  not_null = FALSE,
  unique = FALSE,
  default = "",
  levels_ratio = 1,
  ...
)

opt_default_logical(
  regexp = "boolean|logical",
  na_ratio = 0.05,
  not_null = FALSE,
  unique = FALSE,
  default = FALSE,
  levels_ratio = 1,
  ...
)

opt_default_date(
  regexp = "date|Date",
  na_ratio = 0.05,
  not_null = FALSE,
  unique = FALSE,
  default = Sys.Date(),
  format = "%Y-%m-%d",
  min_date = as.Date("1970-01-01"),
  max_date = Sys.Date(),
  levels_ratio = 1,
  ...
)
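
The default regexp values shown above can be tried against type names with base R `grepl()`. This is an illustrative sketch only: `type_patterns` and `match_class()` are hypothetical names, and returning the first matching class is an assumption rather than a description of how DataFakeR resolves types.

```r
# Default patterns copied from the Usage section, keyed by target R class.
type_patterns <- c(
  character = "text|char|factor",
  numeric   = "^decimal|^numeric|real|double precision",
  integer   = "smallint|integer|bigint|smallserial|serial|bigserial",
  logical   = "boolean|logical",
  date      = "date|Date"
)

# Hypothetical helper: return the first class whose pattern matches the
# column type string, or NA when none of the patterns match.
match_class <- function(col_type) {
  hits <- vapply(type_patterns, grepl, logical(1), x = col_type)
  matched <- names(type_patterns)[hits]
  if (length(matched) == 0) NA_character_ else matched[[1]]
}

match_class("varchar(20)")
match_class("numeric(10, 2)")
match_class("bigint")
match_class("timestamp")
```

Passing a custom regexp to one of the functions above would correspond to changing the matching entry of `type_patterns` in this sketch.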

Arguments

regexp

Regular expression used to map a column type from the YAML configuration to the desired R class.

nchar

Maximum number of characters when simulating character values. When the source column is of type char(n), the parameter is ignored.

na_ratio

Ratio of NA values in the simulated sample.

not_null

Should NA values be excluded from the column? When TRUE, no NA values are simulated.

unique

Should column values be unique?

default

Default column value. Ignored during simulation.

levels_ratio

Ratio of the number of unique values to the sample length in the simulated sample.

...

Other default parameters attached to the column definition.

precision

Precision of numeric column values when simulating numeric values. When the source column is of a type such as numeric(precision), the parameter is ignored.

scale

Scale of numeric column values when simulating numeric values. When the source column is of a type such as numeric(precision, scale), the parameter is ignored.

format

Format of date used when simulating Date columns.

min_date, max_date

Minimum and maximum date used when simulating Date columns.
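
The following base R sketch illustrates what several of the arguments above mean for a simulated sample. It does not call DataFakeR and is not its simulation code. The variable names are hypothetical, precision and scale are assumed to follow the SQL numeric(precision, scale) convention (total digits and digits after the decimal point), and a fixed `max_date` replaces `Sys.Date()` to keep the example reproducible.

```r
set.seed(1)
n <- 20

# na_ratio: share of the sample replaced with NA (rounded to whole values).
na_ratio <- 0.25
# levels_ratio: number of distinct values relative to the sample length.
levels_ratio <- 0.5

# Draw from a pool of at most levels_ratio * n distinct values,
# then blank out na_ratio * n randomly chosen positions.
n_levels <- max(1, round(levels_ratio * n))
pool <- sprintf("value_%02d", seq_len(n_levels))
x <- sample(pool, n, replace = TRUE)
x[sample(n, round(na_ratio * n))] <- NA

# precision / scale: with precision = 7 and scale = 2 the values have
# at most 5 digits before and 2 digits after the decimal point.
precision <- 7
scale <- 2
max_abs <- 10^(precision - scale) - 10^(-scale)
num <- round(runif(n, -max_abs, max_abs), scale)

# min_date / max_date / format: sample whole days from the range,
# then render them with the requested format string.
min_date <- as.Date("1970-01-01")
max_date <- as.Date("2020-12-31")
days <- sample(as.integer(min_date):as.integer(max_date), n, replace = TRUE)
dates <- as.Date(days, origin = "1970-01-01")
formatted <- format(dates, "%Y-%m-%d")

summary(is.na(x))
range(num)
range(dates)
```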


DataFakeR documentation built on Feb. 16, 2023, 7:38 p.m.