Suppose you have a hypothetical company with customers from different regions. You have in-house data uses that usually follow a buffet style data pulling from the database and calculating the KPI's
``` get_data >> add_purchase >> calc_interest >> calc_recency >> plot >> report >> share
## Why have internal package?
The package will be developed for four main components:<br>
(i)Data pipeline API optimized for the company<br>
(ii)Company brand visualization themes for ggplot2<br>
(iii)R Markdown templates for different types of reports for each department,<br>
(iv)Vignettes and documentation for all SQL queries and one centralized repo of queries <br>
(v)Custom functions to optimize different parts of our workflow.<br>
(vi)Vignettes and help page for SQL Queries
(vii)Version control for SQL Queries
(viii)Easy to get rolling for non-technical users
**What is Buffet style data workflow?**
> Buffet style workflow usually save a lot on analyst's time because it’s much easier to continually document r functions of the same table and database than to let each analyst write their own sql queries and arrange different tables.This is automating the repetitive data workflow. Data changes but code remains centralized and well managed and documented by few maintainer.<br>
This kind of workflow can be vastly improved using piping in R and database instead of having each individual analyst write disparate SQL queries.
This package solves exactly the similar kind of problems in *internal data tooling* of the companies. FasteneR uses **fastener-functions** like above to build the workflow.
<br>
**fastener-functions** are the custom functions that have predefined usage.We know exactly what they fit into(our database and tables) and what they hold(our reporting workflow).<br>
Internal Data Tooling:
Different department are well aware of the table names in the databases. The function names right now assumes such. But these functions can be made general which might affect the readability of the code for common workflow. `dbplyr` already does this.
The same workflow can give you reports and plots along with database queries. Since, its an R-script, whole bunch of goodies that come with R/Rstudio is now available including automation of common reports , composing different R-scripts for projects that span across the department.This workflow Increase the frequency of analysis and reportings.It encourages analysis.
Below, the code is querying from database , applying filters and calculating metrics.
```#demo
project_sales_us<- fastener("../sales.db","sales_from_us")
project_sales_us %>%
fr_get_customer_from("US") %>%
fr_get_transaction(1) %>%
fr_calc_metric_recency() %>%
fr_calc_metric_frequency() %>%
fr_calc_metric_monetary() %>%
fr_generate_cluster()%>%
fr_plot() %>%
fr_generate_pdf_report() %>%
fr_end_project()
Those cases can also be represented as fastener-functions.
query = "SELECT {column} FROM {table} WHERE {filter_by} == {filter_value}"
#sql fasteners object Separates SQL command with parameters
query_example<-sql_fasteners(sql_string = query,parameters = list(column = 'CustomerID',
table = 'customers',
filter_by ='Country',
filter_value = "\'US\'"))
fr_get_customers_from<-generate_fr_func(query = query_example)
k %>% fr_get_customers_from() %>%
fr_get_transaction_after("2012-01-02") %>%
fr_add_product("Milk") %>%
fr_get_resulting_dataframe()
The analyst will have to make a pull request on the package to include the function they made along with SQL explanation and vignettes.Specialized SQL_fastener class to separate SQL command and Data parameter is used.
At any moment in piping , one can use project_sales_us$data
to get the dataframe nice and formatted, ready to use ggplot2 and rmarkdown for report. All the goodies in Rstudio like
-vignettes,
-help documentation,
-testing
-git
Features to be added A logging mechanism of what fasteneR functions were used to make the project object A rich color display in the console printing on what is happening like what table is being queried, number of rows and columns Using dbplyr to figure out to show/print the query in SQL Include workflowR automatically to make a rich project
To Install and Try
devtools::install_github('KapilKhanal/FasteneR')
inspiration:
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.