tricordr

Lifecycle: experimental

The goal of tricordr is to integrate process provenance and instrumentation into pipelines of tidyverse primitives.

Installation

You can install the released version of tricordr from CRAN with:

install.packages("tricordr")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("bvancil/tricordr")

Example

Warning: None of this actually works yet.

Eventually, we want full provenance tracking; the example below shows roughly the interface we're aiming for.

We'll start by creating some test data.

library('dplyr')
library('tibble')
library('tricordr')

# Here's some test data
data_size <- 100L
test_data <- tibble::tibble(
  x = base::sample(c(0L, 1L), data_size, replace = TRUE),
  y = base::sample(base::seq(0L, 2L), data_size, replace = TRUE)
)

In our data pipeline, we want to track what happens to our data. For instance, we might want to add another variable and transform the others.

final_data <- test_data %>% 
  dplyr::mutate(z = x * y, y1 = y, y = x, x = 2L - y1)
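Part of what makes this call hard to read is that dplyr::mutate evaluates its arguments in order, so `x = 2L - y1` uses the copy of the original `y` saved one step earlier. As a small check, here is the same call on a three-row tibble (`small` and `result` are names made up for this illustration):

```r
# Hypothetical three-row input, small enough to trace by hand.
small <- tibble::tibble(
  x = c(0L, 1L, 1L),
  y = c(2L, 0L, 1L)
)

# Same call as in the pipeline above; arguments are evaluated left to right.
result <- dplyr::mutate(small, z = x * y, y1 = y, y = x, x = 2L - y1)

result$z   # original x times original y
result$y1  # copy of the original y
result$y   # original x
result$x   # 2L minus the original y
```

In other words, the new y is the old x, and the new x is 2 minus the old y.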

What happened? It will be tricky to figure out later. Instead, we can use tricordr to decorate the dplyr::mutate function so that we keep track.

test_provenance <- tricordr::Provenance$new()
test <- test_provenance$wrap_operations()

# Now we change `dplyr::mutate` to `test$mutate`
final_data <- test_data %>% 
  test$mutate(z = x * y, y1 = y, y = x, x = 2L - y1)

print(test_provenance)
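Since tricordr itself does not work yet, the idea of decorating a function can be sketched in base R. This is not tricordr's implementation, only one possible approach: `make_recorder`, `wrap`, and `entries` are hypothetical names, and base R's `transform` stands in for `dplyr::mutate` to keep the sketch free of dependencies.

```r
# Hypothetical sketch: a recorder that wraps functions and logs each call.
make_recorder <- function() {
  entries <- list()

  wrap <- function(f, name) {
    force(f)
    force(name)
    function(...) {
      # Capture the unevaluated argument expressions as text.
      exprs <- as.list(substitute(list(...)))[-1L]
      entries[[length(entries) + 1L]] <<- list(
        operation = name,
        arguments = vapply(
          exprs,
          function(e) paste(deparse(e), collapse = " "),
          character(1)
        )
      )
      # Pass the arguments through untouched so f behaves as usual.
      f(...)
    }
  }

  list(wrap = wrap, entries = function() entries)
}

recorder <- make_recorder()
tracked_transform <- recorder$wrap(transform, "transform")

df <- data.frame(x = c(0L, 1L, 1L), y = c(2L, 0L, 1L))
out <- tracked_transform(df, z = x * y)

# One log entry per wrapped call, holding the operation name and its arguments.
log <- recorder$entries()
```

The wrapped function returns the same result as the original, while the log keeps the text of each argument expression so the pipeline can be inspected afterwards.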

We can create different provenances for different pipelines and combine them later.

Code of conduct

Please note that the 'tricordr' project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Similar work



bvancil/tricordr documentation built on Jan. 24, 2020, 1:25 a.m.