Few nice functions for every-day use
install.packages("devtools")
devtools
packagelibrary(devtools)
install_github("AZhyrenkov/niceR")
library(niceR)
x %+% y
Only one query per file.
read_query(path)
- read single query.
Read all queries from sql
directory and execute them:
library(magrittr)
library(purrr)
library(dplyr)
library(RPostgreSQL) #your_database_library
library(stringr)
queries <- list.files('sql/',full.names = T)
all_data <- tibble(queries) %>%
rowwise() %>%
mutate(query = map_chr(queries, read_query), # read all queries
name = str_sub(str_extract(string = queries, '/.+\\.'),2,-2), # extract file name as query name
result = map(query,~dbGetQuery(db, query)) # execute query
)
killMySQLConnections()
- will print all open connections and close them.
write_data_redshift(con,df,name, chunk = 100000)
read_data(con,chunk = 100000, table, order_column = 'Id')
Copy data from DBI-formal-class connection database (tested with MySQL) to redshift by chunks
copy_data(redshift, mysql, from_table, to_table, chunk = 100000, order_column = 'Id', rewrite = T, test = F)
redshift - RPostgreSQL formal class database connection? Targeted database
mysql - DBI formal class database connection (RMySQL, RPostgreSQL), source database
to_table - string with final table name
order_column - Name of Primary key to sort during splitting by chunks, must be unique.
chunk - chunk-size, default 100000
rewrite - If False
will append data to existing table
* test - If TRUE
will be limited to 3 iterations
Thanks to lxii for contribution!
Note: Dataset in tidy format, months must be factor variables. If not, run something like this:
mutate(x, month = factor(month, levels = c('Feb', 'Mar', 'Apr', 'May'), ordered = T)
Using:
getBiathlon(x, length)
x - the month or time period factor variable
length - the number of periods to build the biathlon for
Note: monthly data must be in columns
make_biatlon <- (dataset,columns,base_column)
dataset - input dataframe
columns - vector with strings names of montly data. Without NAs
* base_column - column to compare with.
Example:
make_biatlon(ds,month_range$MYM[1:4],threshold)
get_retentions(biathlon)
* biathlon - vector with all finpattern data
pool_ds
consist of two columns:
Example:
# Distribution
distribution <- distribute_honestly(pool_ds)
rm(pool_ds)
Gathering results:
# How many rows per one distribution_value
capacity <- distribution$capacity
# final dataset
distribution_result <- distribution$distribution_result
distribution <- distribute(pool_ds,capacity = capacity)
Log errors and warnings to slack channel.
Thanks to lxii for contribution!
Using: errorHandler(code, prefix)
Example:
library(slackr)
slackr_setup(config_file = slackr'), echo = T) # configure slackR bot
errorHandler({
# some code, connection to database e.g.
}, prefix = 'Connection to database is broken')
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.