qrmarkdown is a simple job queue control system for R users.
You can schedule daily or weekly tasks.
Many great workflow apps exist (Airflow, darg), but when it comes down to deploying on different runtime environments outside your control, they become a big headache.
With qrmarkdown you can control everything from inside RStudio. No additional third-party app is required.
Unix, AWS, Cloudera, Mac OSX
Download the R package and allocate a directory for the job queue.
devtools::install_github("okux/qrmarkdown") # requires the devtools package
Export the QWD environment variable in your OS. On Unix-like systems:
export QWD='~/Desktop/timebox'
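To confirm from R that the variable is visible, it can be read with Sys.getenv. The following is a minimal base R sketch, not part of qrmarkdown. Sys.setenv stands in for the shell export, and a temporary directory stands in for ~/Desktop/timebox. Note that a tilde inside single quotes is not expanded by the shell, so path.expand is applied on the R side.

```r
# Stand-in for the shell export (assumption: a temporary directory
# replaces ~/Desktop/timebox so the sketch is safe to run anywhere)
Sys.setenv(QWD = file.path(tempdir(), "timebox"))

# Read the variable; an empty string means it is not set
qwd <- Sys.getenv("QWD", unset = "")
if (!nzchar(qwd)) stop("QWD is not set")

# Expand a leading "~" and create the directory if it is missing
qwd <- path.expand(qwd)
dir.create(qwd, recursive = TRUE, showWarnings = FALSE)
```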
If you don't want to use the QWD environment variable, set the queue directory from the R console instead.
# Setup queue directory before calling
require(qrmarkdown)
q.wd('~/Desktop/timebox')
q.dispatcher(n=3)
jid1 <- q.push(script='wkfl.1.Rmd', name='mytest')
Note: q.dispatcher(n=3) starts 3 background processes for executing Rmd jobs.
q.show()
qrmarkdown stores job status in the file system in rda binary format:
- inbox: contains job tickets to run
- outbox: contains all completed tickets
- schedule: recurring workflow tickets
- n.running: number of jobs running
- n.queue: total number of current jobs
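As an illustration of how a ticket can live in an rda file, here is a minimal base R sketch. The directory names follow the list above, but the ticket fields, the file name, and the way n.queue is counted are assumptions for illustration and not qrmarkdown's actual layout.

```r
# Hypothetical queue directory under tempdir()
qwd <- file.path(tempdir(), "qwd-demo")
for (d in c("inbox", "outbox", "schedule")) {
  dir.create(file.path(qwd, d), recursive = TRUE, showWarnings = FALSE)
}

# Hypothetical ticket: a plain list saved in rda format
ticket <- list(jid = "demo-0001", script = "wkfl.1.Rmd",
               name = "mytest", params = list())
ticket.file <- file.path(qwd, "inbox", "demo-0001.rda")
save(ticket, file = ticket.file)

# Counting tickets in inbox gives a queue length
n.queue <- length(list.files(file.path(qwd, "inbox"), pattern = "\\.rda$"))

# Load into a separate environment so existing variables are not overwritten
restored <- new.env()
load(ticket.file, envir = restored)
restored$ticket$name
```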
qrmarkdown automates running an Rmd script on a specific weekday and hour.
Workflow types: weekly report, daily test run, model validation, anomaly monitoring.
SCRIPT <- 'fullpath/regression.test.rmd'
OUTPUT <- 'fullpath/regression.test.html'
jid <- q.schedule(wday='Monday',hour=5,
script=SCRIPT, output=OUTPUT)
q.monitor()
q.monitor starts an hourly monitoring daemon in the background that launches all scheduled jobs.
Note: use full paths for the script and output locations; otherwise the background daemons look for the files under QWD.
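To make the weekday and hour matching concrete, here is a minimal sketch of how an hourly check could decide whether a schedule entry is due. is.due is a hypothetical helper written for this example; it is not a qrmarkdown function and does not describe how q.monitor is implemented.

```r
# Hypothetical helper: TRUE when `now` falls on the given weekday and hour
is.due <- function(wday, hour, now = Sys.time()) {
  days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")
  target <- match(wday, days)
  if (is.na(target)) stop("unknown weekday: ", wday)
  # %u is the ISO weekday number (1 = Monday), independent of locale
  today <- as.integer(format(now, "%u"))
  this.hour <- as.integer(format(now, "%H"))
  today == target && this.hour == hour
}

# 2024-01-01 was a Monday
monday.5am <- as.POSIXct("2024-01-01 05:30:00", tz = "UTC")
is.due("Monday", 5, now = monday.5am)  # TRUE
is.due("Monday", 4, now = monday.5am)  # FALSE
```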
qrmarkdown creates a unique job id (jid) for each job, e.g. 'f2fed362-6e6b-4aa8-90d8-02765534d8e6'.
q.ls with view.output=TRUE shows the output of a specific Rmd job in a browser.
q.ls('f2fed362-6e6b-4aa8-90d8-02765534d8e6', view.output=TRUE)
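Because a jid has a fixed UUID-like shape, a script can check a string before passing it on. The helper below is a hypothetical base R sketch, not part of qrmarkdown, and it assumes every jid follows the lowercase 8-4-4-4-12 hexadecimal pattern of the example above.

```r
# Hypothetical helper: does the string look like a jid?
is.jid <- function(x) {
  grepl("^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", x)
}

is.jid("f2fed362-6e6b-4aa8-90d8-02765534d8e6")  # TRUE
is.jid("mytest")                                # FALSE
```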
Use q.ls and a jid to inspect your jobs and re-queue them.
require(ggplot2)
require(dplyr) # provides %>% and select()
qlist <- q.ls()
plot.data <- qlist %>% select(name, secs, status)
ggplot(data=plot.data, aes(x=name, y=secs, color=status)) + geom_boxplot() +
  ggtitle('job execution time') + theme_minimal() + xlab('job name')
failed <- q.ls('failed', detail = TRUE) # grab detail report
q.run( failed$jid[1] )
Open the script that failed in RStudio. Insert browser() or set a breakpoint.
q.run executes your script directly from the console, which allows you to debug it.
# open the script that failed in the RStudio editor
rstudioapi::navigateToFile(failed[1,]$script)
By adding the following lines at the start of your script, you can restore the exact parameters used for this job execution.
load(failed[1,]$jid) # restore job ticket parameter
params <- ticket$params # overwriting knitr params
Note: pass load() the full path to the jid file.
You can use a single Rmd file to generate multiple reports by varying its parameters.
queue.job <- function(country)
{
  ret <- q.push(script="dynamic.analysis.Rmd",
                params=list(country=country),
                output=sprintf("~/Desktop/%s.html",country))
  return(ret)
}
joblist <- lapply(countries, FUN=queue.job) # countries must be defined beforehand
q.schedule(joblist, wday="Monday",hour=4) # kick off every Monday
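The fan-out from one Rmd file to several reports comes down to building one parameter list and one output path per value. Here is a minimal base R sketch of that step alone, without calling qrmarkdown. The country codes, the report.path helper, and the use of tempdir() are assumptions for illustration.

```r
# Hypothetical input values
countries <- c("JP", "US", "DE")

# Hypothetical helper: one output file per country
report.path <- function(country, dir = tempdir()) {
  file.path(dir, sprintf("%s.html", country))
}

# Named character vector of output paths, one per country
outputs <- vapply(countries, report.path, character(1))

# One params list per country, in the shape passed as params= above
param.sets <- lapply(countries, function(country) list(country = country))

basename(outputs)
```

Each pair of param.sets[[i]] and outputs[[i]] corresponds to one queued job.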
Open the RStudio terminal or another terminal application. This allows you to start the qrmarkdown daemons in a separate screen.
Rscript -e "qrmarkdown::q.dispatcher(n=8, dir='your.queue.log.dir')"
Rscript -e "qrmarkdown::q.monitor(wdir='your.queue.log.dir')"
qrmarkdown shuts down safely by waiting for all running jobs to finish. There is no mechanism to force-kill a job.
q.shutdown()
q.dispatcher(n=10)
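The wait-until-finished behaviour can be pictured as a polling loop. The sketch below is a generic base R illustration with a simulated job counter. wait.until.idle and fake.count are hypothetical names, and nothing here describes how q.shutdown is actually implemented.

```r
# Hypothetical helper: poll until no jobs are running or the timeout expires
wait.until.idle <- function(count.running, poll.secs = 1, timeout.secs = 60) {
  deadline <- Sys.time() + timeout.secs
  while (count.running() > 0) {
    if (Sys.time() > deadline) return(FALSE)  # gave up, jobs still running
    Sys.sleep(poll.secs)
  }
  TRUE  # everything finished
}

# Simulated workload: one job finishes on each poll
remaining <- 3
fake.count <- function() {
  remaining <<- max(remaining - 1, 0)
  remaining
}

idle <- wait.until.idle(fake.count, poll.secs = 0.01)
```

The timeout is a design choice of this sketch: without a force-kill mechanism, a caller still needs some way to stop waiting on a job that never finishes.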
q.rm('outbox/*') # delete all completed job log
q.rm('*') # delete all job logs
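To see which files a pattern such as 'outbox/*' would select, the glob can be tried out with base R first. This sketch assumes the pattern behaves like a shell glob relative to the queue directory. It uses Sys.glob and unlink on a throwaway directory and does not call q.rm.

```r
# Throwaway queue directory with a few empty ticket files
qwd <- file.path(tempdir(), "qwd-glob-demo")
for (d in c("inbox", "outbox")) {
  dir.create(file.path(qwd, d), recursive = TRUE, showWarnings = FALSE)
}
file.create(file.path(qwd, "inbox", "a.rda"),
            file.path(qwd, "outbox", "b.rda"),
            file.path(qwd, "outbox", "c.rda"))

# Files matched by the 'outbox/*' pattern
done <- Sys.glob(file.path(qwd, "outbox", "*"))

# Delete only the matches; inbox is left untouched
unlink(done)
left <- list.files(qwd, recursive = TRUE)
```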
See the next chapter for more advanced usage.