yaml_write | R Documentation |
With yaml_write() we can take different pointblank objects (these are the
ptblank_agent, ptblank_informant, and tbl_store) and write them to YAML. With
an agent, for example, yaml_write() will write everything that is needed to
specify an agent and its validation plan to a YAML file. We can then modify
the YAML markup if so desired, or use it as is to create a new agent with the
yaml_read_agent() function. That agent will have a validation plan and will be
ready to interrogate() the data. We can go a step further and perform an
interrogation directly from the YAML file with the yaml_agent_interrogate()
function. That returns an agent with intel (having already interrogated the
target data table). An informant object can also be written to YAML with
yaml_write().
One requirement for writing an agent or an informant to YAML is that we need
to have a table-prep formula specified (it's an R formula that is used to read
the target table when interrogate() or incorporate() is called). This option
can be set when using create_agent()/create_informant() or with set_tbl()
(useful with an existing agent or informant object).
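As a minimal sketch of the set_tbl() route, the following assumes an agent that was first created with the table passed in directly (so it has no table-prep formula) and then given one before writing. The file name "agent-set_tbl.yml" and the use of tempdir() are arbitrary choices for illustration only.

```r
library(pointblank)

# An agent created with the table itself, not a table-prep formula
agent <- create_agent(tbl = small_table, tbl_name = "small_table")
agent <- col_vals_gt(agent, columns = vars(d), value = 100)

# Supply a table-prep formula afterwards so the agent can be written to YAML
agent <- set_tbl(agent, tbl = ~ small_table)

# Write to a temporary directory (hypothetical file name for this sketch)
yaml_write(
  agent,
  filename = "agent-set_tbl.yml",
  path = tempdir(),
  quiet = TRUE
)
```

The same pattern should apply to an informant, since set_tbl() is described above as working with either object type.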
yaml_write(
...,
.list = list2(...),
filename = NULL,
path = NULL,
expanded = FALSE,
quiet = FALSE
)
... |
Pointblank agents, informants, table stores
Any mix of pointblank objects such as the agent
( |
.list |
Alternative to ...
Allows for the use of a list as an input alternative to ... |
filename |
File name
The name of the YAML file to create on disk. It is recommended that either
the |
path |
File path
An optional path to which the YAML file should be saved (combined with
|
expanded |
Expand validation when repeating across multiple columns
Should the written validation expressions for an agent be expanded such
that tidyselect expressions for columns are evaluated, yielding a
validation function per column? By default, this is FALSE. |
quiet |
Inform (or not) upon file writing
Should the function not inform when the file is written? |
Invisibly returns TRUE if the YAML file has been written.
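To make the path, expanded, and quiet arguments and the return value more concrete, here is a hedged sketch that writes the same agent twice, once as written and once with expanded = TRUE. The file names and the use of tempdir() are assumptions made for this example, and vars() is used for column selection only because it is accepted across pointblank versions.

```r
library(pointblank)

# One validation step that spans two columns
agent <-
  create_agent(tbl = ~ small_table, tbl_name = "small_table") %>%
  col_vals_gt(columns = vars(c, d), value = 0, na_pass = TRUE)

out_dir <- tempdir()

# Default: the step is written with both columns in one expression
ok_compact <- yaml_write(
  agent,
  filename = "agent-compact.yml",
  path = out_dir,
  quiet = TRUE
)

# expanded = TRUE: the step is written out once per column
ok_expanded <- yaml_write(
  agent,
  filename = "agent-expanded.yml",
  path = out_dir,
  expanded = TRUE,
  quiet = TRUE
)

# Compare the two files by printing them
writeLines(readLines(file.path(out_dir, "agent-compact.yml")))
writeLines(readLines(file.path(out_dir, "agent-expanded.yml")))
```

Capturing the result (ok_compact, ok_expanded) is only needed when a script should confirm that the file was written, since the value is returned invisibly.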
Writing an agent object to a YAML file

Let's go through the process of developing an agent with a validation plan.
We'll use the small_table dataset in the following examples, which will
eventually offload the developed validation plan to a YAML file.
small_table
#> # A tibble: 13 x 8
#>    date_time           date           a b             c      d e     f
#>    <dttm>              <date>     <int> <chr>     <dbl>  <dbl> <lgl> <chr>
#>  1 2016-01-04 11:00:00 2016-01-04     2 1-bcd-345     3  3423. TRUE  high
#>  2 2016-01-04 00:32:00 2016-01-04     3 5-egh-163     8 10000. TRUE  low
#>  3 2016-01-05 13:32:00 2016-01-05     6 8-kdg-938     3  2343. TRUE  high
#>  4 2016-01-06 17:23:00 2016-01-06     2 5-jdo-903    NA  3892. FALSE mid
#>  5 2016-01-09 12:36:00 2016-01-09     8 3-ldm-038     7   284. TRUE  low
#>  6 2016-01-11 06:15:00 2016-01-11     4 2-dhe-923     4  3291. TRUE  mid
#>  7 2016-01-15 18:46:00 2016-01-15     7 1-knw-093     3   843. TRUE  high
#>  8 2016-01-17 11:27:00 2016-01-17     4 5-boe-639     2  1036. FALSE low
#>  9 2016-01-20 04:30:00 2016-01-20     3 5-bce-642     9   838. FALSE high
#> 10 2016-01-20 04:30:00 2016-01-20     3 5-bce-642     9   838. FALSE high
#> 11 2016-01-26 20:07:00 2016-01-26     4 2-dmx-010     7   834. TRUE  low
#> 12 2016-01-28 02:51:00 2016-01-28     2 7-dmx-010     8   108. FALSE low
#> 13 2016-01-30 11:23:00 2016-01-30     1 3-dka-303    NA  2230. TRUE  high
Creating an action_levels object is a common workflow step when creating a
pointblank agent. We designate failure thresholds to the warn, stop, and
notify states using action_levels().
al <- action_levels(
  warn_at = 0.10,
  stop_at = 0.25,
  notify_at = 0.35
)
Now let's create the agent and pass it the al object (which serves as a
default for all validation steps and can be overridden). The data will be
referenced in tbl with a leading ~ and this is a requirement for writing to
YAML since the preparation of the target table must be self-contained.
agent <- create_agent(
  tbl = ~ small_table,
  tbl_name = "small_table",
  label = "A simple example with the `small_table`.",
  actions = al
)
Then, as with any agent object, we can add steps to the validation plan by
using as many validation functions as we want.
agent <-
  agent %>%
  col_exists(columns = c(date, date_time)) %>%
  col_vals_regex(
    columns = b,
    regex = "[0-9]-[a-z]{3}-[0-9]{3}"
  ) %>%
  rows_distinct() %>%
  col_vals_gt(columns = d, value = 100) %>%
  col_vals_lte(columns = c, value = 5)
The agent can be written to a pointblank-readable YAML file with the
yaml_write() function. Here, we'll use the filename "agent-small_table.yml"
and, after writing, the YAML file will be in the working directory:
yaml_write(agent, filename = "agent-small_table.yml")
We can view the YAML file in the console with the yaml_agent_string() function.
yaml_agent_string(filename = "agent-small_table.yml")
type: agent
tbl: ~small_table
tbl_name: small_table
label: A simple example with the `small_table`.
lang: en
locale: en
actions:
  warn_fraction: 0.1
  stop_fraction: 0.25
  notify_fraction: 0.35
steps:
- col_exists:
    columns: c(date, date_time)
- col_vals_regex:
    columns: c(b)
    regex: '[0-9]-[a-z]{3}-[0-9]{3}'
- rows_distinct:
    columns: ~
- col_vals_gt:
    columns: c(d)
    value: 100.0
- col_vals_lte:
    columns: c(c)
    value: 5.0
Incidentally, we can also use yaml_agent_string() to print YAML in the console
when supplying an agent as the input. This can be useful for previewing YAML
output just before writing it to disk with yaml_write().
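A short sketch of that preview step follows. It builds a small stand-in agent (the label and the single rows_distinct() step are placeholders for this example) and passes the agent object, not a file, to yaml_agent_string().

```r
library(pointblank)

# A small stand-in agent with a table-prep formula
agent <-
  create_agent(
    tbl = ~ small_table,
    tbl_name = "small_table",
    label = "Preview example"
  ) %>%
  rows_distinct()

# Print the YAML that would be written, without creating a file
yaml_agent_string(agent = agent)
```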
Reading an agent object from a YAML file

There's a YAML file available in the pointblank package that's also called
"agent-small_table.yml". The path for it can be accessed through
system.file():
yml_file_path <- system.file(
  "yaml", "agent-small_table.yml",
  package = "pointblank"
)
The YAML file can be read as an agent with a pre-existing validation plan by
using the yaml_read_agent() function.
agent <- yaml_read_agent(filename = yml_file_path)
agent
This particular agent is using ~ tbl_source("small_table", "tbl_store.yml")
to source the table-prep from a YAML file that holds a table store (this can
be seen using yaml_agent_string(agent = agent)). Let's put that file in the
working directory (the pointblank package has the corresponding YAML file):
yml_tbl_store_path <- system.file(
  "yaml", "tbl_store.yml",
  package = "pointblank"
)
file.copy(from = yml_tbl_store_path, to = ".")
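A table store YAML file of this kind can itself be produced with yaml_write(). The following is a rough sketch under stated assumptions: the store holds a single entry named small_table, and the file name "tbl_store-example.yml" and the tempdir() location are made up for the example so that the packaged "tbl_store.yml" is left untouched.

```r
library(pointblank)

# A table store maps names (left-hand side) to table-prep expressions
store <- tbl_store(
  small_table ~ pointblank::small_table
)

# Write the store to YAML in a temporary directory
yaml_write(
  store,
  filename = "tbl_store-example.yml",
  path = tempdir(),
  quiet = TRUE
)

# Materialize a table from the store by name to confirm the entry works
tbl <- tbl_get("small_table", store = store)
```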
As can be seen from the validation report, no interrogation has been performed
yet. Saving an agent to YAML will remove any traces of interrogation data and
serve as a plan for a new interrogation on the same target table. We can
either follow this up with interrogate() and get an agent with intel, or we
can interrogate directly from the YAML file with yaml_agent_interrogate():
agent <- yaml_agent_interrogate(filename = yml_file_path)
agent
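The full round trip can be sketched in one place: interrogate an agent, write it out, then read the plan back and interrogate again from the file. The file name "agent-roundtrip.yml" and the tempdir() location are assumptions for this example, and the single validation step is only a placeholder.

```r
library(pointblank)

# Build and interrogate an agent so that it carries intel
agent <-
  create_agent(tbl = ~ small_table, tbl_name = "small_table") %>%
  col_vals_gt(columns = vars(d), value = 100) %>%
  interrogate()

# Write the agent to a temporary file (hypothetical name)
yml_path <- file.path(tempdir(), "agent-roundtrip.yml")
yaml_write(agent, filename = yml_path, quiet = TRUE)

# Reading gives back the validation plan only
agent_plan <- yaml_read_agent(filename = yml_path)

# Interrogating from the file gives an agent with fresh intel
agent_intel <- yaml_agent_interrogate(filename = yml_path)
all_passed(agent_intel)
```

Printing agent_plan and agent_intel side by side is one way to see the difference between a plan and an interrogated agent.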
Writing an informant object to a YAML file

Let's walk through how we can generate some useful information for a really
small table. We can create an informant object with create_informant() and
we'll again use the small_table dataset.
informant <- create_informant(
  tbl = ~ small_table,
  tbl_name = "small_table",
  label = "A simple example with the `small_table`."
)
Then, as with any informant object, we can add info text by using as many
info_*() functions as we want.
informant <-
  informant %>%
  info_columns(
    columns = a,
    info = "In the range of 1 to 10. (SIMPLE)"
  ) %>%
  info_columns(
    columns = starts_with("date"),
    info = "Time-based values (e.g., `Sys.time()`)."
  ) %>%
  info_columns(
    columns = date,
    info = "The date part of `date_time`. (CALC)"
  )
The informant can be written to a pointblank-readable YAML file with the
yaml_write() function. Here, we'll use the filename
"informant-small_table.yml" and, after writing, the YAML file will be in the
working directory:
yaml_write(informant, filename = "informant-small_table.yml")
We can inspect the YAML file in the working directory and expect to see the following:
type: informant
tbl: ~small_table
tbl_name: small_table
info_label: A simple example with the `small_table`.
lang: en
locale: en
table:
  name: small_table
  _columns: 8
  _rows: 13.0
  _type: tbl_df
columns:
  date_time:
    _type: POSIXct, POSIXt
    info: Time-based values (e.g., `Sys.time()`).
  date:
    _type: Date
    info: Time-based values (e.g., `Sys.time()`). The date part of `date_time`.
  a:
    _type: integer
    info: In the range of 1 to 10. (SIMPLE)
  b:
    _type: character
  c:
    _type: numeric
  d:
    _type: numeric
  e:
    _type: logical
  f:
    _type: character
Reading an informant object from a YAML file

There's a YAML file available in the pointblank package that's also called
"informant-small_table.yml". The path for it can be accessed through
system.file():
yml_file_path <- system.file(
  "yaml", "informant-small_table.yml",
  package = "pointblank"
)
The YAML file can be read as an informant by using the yaml_read_informant()
function.
informant <- yaml_read_informant(filename = yml_file_path)
informant
As can be seen from the information report, the available table metadata was
restored and reported. If you expect metadata to change with time, it might be
beneficial to use incorporate() to query the target table. Or, we can perform
this querying directly from the YAML file with yaml_informant_incorporate():
informant <- yaml_informant_incorporate(filename = yml_file_path)
There will be no apparent difference in this particular case since small_table
is a static table with no alterations over time. However, using
yaml_informant_incorporate() is good practice since this refreshing of data
will be important with real-world datasets.
11-1
Other pointblank YAML: yaml_agent_interrogate(), yaml_agent_show_exprs(),
yaml_agent_string(), yaml_exec(), yaml_informant_incorporate(),
yaml_read_agent(), yaml_read_informant()