The goal of `dcmodifydb` is to apply modification rules specified with `dcmodify`
to a database table, allowing for documented, reproducible data cleaning adjustments
in a database.
`dcmodify` separates intent from execution: a user specifies the what, why and how of an automatic data change and uses `dcmodifydb` to execute it on a `tbl` database table.
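As a minimal sketch of this separation, the rule below is written once with `dcmodify` and applied to a plain data frame. The rule and the sample values are illustrative assumptions, not taken from the package documentation. The same modifier object could instead be handed to `dcmodifydb` to run against a database table.

```r
library(dcmodify)

# Intent: one rule, stated independently of where the data lives.
# (Hypothetical rule, for illustration only.)
m <- modifier(
  if (age < 0) age <- NA_real_
)

# Execution: here on an in-memory data frame
dat <- data.frame(
  age    = c(11, 150, 25, -10),
  income = c(2000, 300, 2000, 2000)
)
cleaned <- modify(dat, m)
cleaned
```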
The development version can be installed from GitHub with:

```r
# install.packages("devtools")
devtools::install_github("data-cleaning/dcmodifydb")
```
```r
library(DBI)
library(dcmodify)
library(dcmodifydb)

# temporary SQLite database for the example
con <- dbConnect(RSQLite::SQLite())
```
The modification rules can be stored in a YAML file, for example `example.yml`.
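The contents of `example.yml` are not reproduced here. As a rough sketch of what such a rule file might look like, the snippet below writes a hypothetical one from R and reads it back. The two rules, their names and their descriptions are assumptions made for illustration; only the `modifier(.file = ...)` call comes from this README.

```r
library(dcmodify)

# Hypothetical rules; the real example.yml may differ.
rule_lines <- c(
  "rules:",
  "- expr: if (age > 130) age <- 130L",
  "  name: M1",
  "  label: Maximum age",
  "  description: Cap implausibly high ages at 130.",
  "- expr: if (age < 0) age <- NA",
  "  name: M2",
  "  label: Unknown age",
  "  description: Set negative ages to missing."
)

# Write to a temporary location so no existing file is overwritten
rule_file <- file.path(tempdir(), "example.yml")
writeLines(rule_lines, rule_file)

m <- modifier(.file = rule_file)
print(m)
```

Keeping a name, label and description next to each expression is what makes the cleaning steps documented rather than just executed.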
Let's load the rules and apply them to a data set:

```r
m <- modifier(.file = "example.yml")
# adjust the path if the file lives elsewhere, e.g. "example/example.yml"
print(m)
```

```r
# set up the data
"age, income
11, 2000
150, 300
25, 2000
-10, 2000
" -> csv
income <- read.csv(text = csv, strip.white = TRUE)
dbWriteTable(con, "income", income)
tbl_income <- dplyr::tbl(con, "income")

# this is the table in the database
tbl_income

# and now after modification
modify(tbl_income, m, copy = FALSE)
```

The generated SQL can be written to a file with `dump_sql`:

```r
dump_sql(m, tbl_income, file = "modify.sql")
```

To print the statements instead of writing them to `modify.sql`, omit the `file` argument:

```r
dump_sql(m, tbl_income)
```

```r
dbDisconnect(con)
```
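To check what `modify()` did to the stored data, the table can be read back through `DBI`. The following is a self-contained sketch under two assumptions: the inline rule is hypothetical and stands in for `example.yml`, and `copy = FALSE` is taken to mean that the `income` table itself is updated rather than a copy.

```r
library(DBI)
library(dcmodify)
library(dcmodifydb)

con <- dbConnect(RSQLite::SQLite())
dbWriteTable(
  con, "income",
  data.frame(age    = c(11, 150, 25, -10),
             income = c(2000, 300, 2000, 2000))
)
tbl_income <- dplyr::tbl(con, "income")

# Hypothetical rule, defined inline instead of read from a YAML file
m <- modifier(if (age > 130) age <- 130)

# copy = FALSE: assumed to update the "income" table in place
modify(tbl_income, m, copy = FALSE)

# Read the table back as a data frame to inspect what is now stored
after <- dbReadTable(con, "income")
after

dbDisconnect(con)
```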
Note: Modification rules can be written to YAML with `as_yaml` and `export_yaml`.

```r
dcmodify::export_yaml(m, "cleaning_steps.yml")
```
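A short round-trip sketch may help here: a modifier is exported, the resulting file is inspected, and the rules are loaded again with `modifier(.file = ...)`. The inline rule and the temporary file location are assumptions chosen for illustration.

```r
library(dcmodify)

# Hypothetical rule, for illustration only
m <- modifier(if (age < 0) age <- NA)

# Export to a temporary location so no existing file is overwritten
out_file <- file.path(tempdir(), "cleaning_steps.yml")
dcmodify::export_yaml(m, out_file)

# Inspect the exported text, then load it again as a modifier
yaml_text <- readLines(out_file)
cat(yaml_text, sep = "\n")
m_reloaded <- modifier(.file = out_file)
```

Storing the exported file under version control keeps a record of which cleaning steps were applied to the database.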