knitr::opts_chunk$set(echo = TRUE)
First, we create two sample dataframes.
dtf01 <- data.frame(
  ID = base::seq(1, 100),
  NAME = base::sample(
    x = c("John", "Steven", "Helena", "Adele", "JJ", "Amy"),
    size = 100,
    replace = TRUE
  ),
  AGE = base::as.integer(stats::runif(n = 100, min = 21, max = 30)),
  RANDOM = stats::runif(n = 100, min = 0, max = 1),
  stringsAsFactors = FALSE
)

dtf02 <- data.frame(
  ID = base::seq(1, 100),
  CITY = base::sample(
    x = c("Turin", "New York", "Milan", "Shanghai", "Paris", "Boston"),
    size = 100,
    replace = TRUE
  ),
  stringsAsFactors = FALSE
)
The first few rows of each dataframe look like this:
knitr::kable(utils::head(dtf01))
knitr::kable(utils::head(dtf02))
Now, we convert them into two dfi objects.
library(dtlng)

dfi01 <- asDfi(dtf01, str_name = "dfi01")
dfi02 <- asDfi(dtf02, str_name = "dfi02")
We are ready for some dplyr-style dataframe manipulations:
dfi03 <- dfi01 %>%
  select_("-RANDOM") %>%
  filter_(~(ID > 50)) %>%
  name("dfi03")

dfi04 <- dfi02 %>%
  filter_(~(ID > 20)) %>%
  name("dfi04")

dfi05 <- dfi03 %>%
  inner_join(dfi04, by = c("ID" = "ID")) %>%
  filter_(~(ID <= 90)) %>%
  name("dfi05")
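To check what these steps should produce, the same pipeline can be sketched in base R on the plain dataframes, without dtlng or dplyr. This is only an illustration: it assumes the dfi verbs behave like their ordinary dataframe counterparts, the names `dtf03`, `dtf04` and `dtf05` are hypothetical, and the `set.seed()` call is added here purely to make the sketch reproducible.

```r
# Base R re-creation of the pipeline on plain dataframes (no lineage tracking).
set.seed(1)  # assumption: not in the original; added for reproducibility only

dtf01 <- data.frame(
  ID = seq(1, 100),
  NAME = sample(c("John", "Steven", "Helena", "Adele", "JJ", "Amy"),
                size = 100, replace = TRUE),
  AGE = as.integer(runif(n = 100, min = 21, max = 30)),
  RANDOM = runif(n = 100, min = 0, max = 1),
  stringsAsFactors = FALSE
)
dtf02 <- data.frame(
  ID = seq(1, 100),
  CITY = sample(c("Turin", "New York", "Milan", "Shanghai", "Paris", "Boston"),
                size = 100, replace = TRUE),
  stringsAsFactors = FALSE
)

# Counterpart of dfi03: drop RANDOM, keep rows with ID > 50.
dtf03 <- subset(dtf01, ID > 50, select = -RANDOM)

# Counterpart of dfi04: keep rows with ID > 20.
dtf04 <- subset(dtf02, ID > 20)

# Counterpart of dfi05: inner join on ID, then keep rows with ID <= 90.
# merge() performs an inner join by default (all = FALSE).
dtf05 <- merge(dtf03, dtf04, by = "ID")
dtf05 <- subset(dtf05, ID <= 90)

nrow(dtf05)   # 40 rows: IDs 51 to 90
names(dtf05)  # "ID" "NAME" "AGE" "CITY"
```

The row count is independent of the random draws, because every filter acts on `ID` only.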
Then, we can generate the data lineage tree:
dtf_tree <- treeDtf()
knitr::kable(utils::head(dtf_tree))
Next, we plot the data lineage for the dataframes:
showLineage(dtf_tree = dtf_tree, str_type = "dataframes")
And, of course, the data lineage for every single column:
showLineage(dtf_tree = dtf_tree, str_type = "columns")
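To make the idea of dataframe-level lineage concrete, the dependencies created by the pipeline above can be written down by hand as an edge list and walked upstream. This is a sketch in base R, not the structure returned by `treeDtf()`: the `from` and `to` column names and the `ancestors()` helper are hypothetical and exist only for illustration.

```r
# Hypothetical edge list for the pipeline above. The column names `from` and
# `to` are illustrative and are NOT taken from the output of treeDtf().
edges <- data.frame(
  from = c("dfi01", "dfi02", "dfi03", "dfi04"),
  to   = c("dfi03", "dfi04", "dfi05", "dfi05"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recursively collect every upstream object of `node`.
ancestors <- function(node, edges) {
  parents <- edges$from[edges$to == node]
  found <- parents
  for (p in parents) {
    found <- c(found, ancestors(p, edges))
  }
  unique(found)
}

sort(ancestors("dfi05", edges))  # "dfi01" "dfi02" "dfi03" "dfi04"
ancestors("dfi01", edges)        # character(0): dfi01 is a source
```

Reading the result, `dfi05` depends on all four earlier objects, while `dfi01` has no upstream dependencies.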