This is my personal R package, which contains a number of functions that help me maintain an organized workflow. Most of the functions are just curried versions of functions from other packages; nevertheless, I added a decent amount of documentation. There are a few packages that I constantly use and that one should know about in order to understand the documentation.
pipelearner and twidlr. A lot of my code examples use these two packages, and I even use them in some of my functions. They have nice tutorials on their GitHub pages and a simple syntax that makes it easy to get the advantages of dataframe-based modelling concepts. Both packages are great, but in my opinion there is a major disadvantage. If you train a lot of models, pipelearner will always store them in memory. R models tend to be very large, so you quickly fill up your memory on weaker machines. Furthermore, you cannot train a model without defining some sort of test set, which sometimes had me define a test set with one observation. So at the moment I still use pipelearner for its great syntax, but tend to write my own modelling dataframes and to use the tools developed by Max Kuhn.
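The memory concern can be illustrated with a minimal sketch in base R. It is not how pipelearner or this package works; it only shows the idea of fitting each model, keeping a small summary, and discarding the fitted object. The dataset (mtcars), the formulas, and the helper name fit_and_summarise are assumptions chosen for illustration.

```r
# Hypothetical helper: fit one linear model and return only a small summary,
# so the (potentially large) model object can be garbage collected.
fit_and_summarise <- function(formula, data) {
  m <- lm(formula, data = data)
  data.frame(
    formula = deparse(formula),
    rmse    = sqrt(mean(residuals(m)^2)),  # in-sample error, no test set needed
    stringsAsFactors = FALSE
  )
}

# Illustrative formulas on a built-in dataset
formulas <- list(
  mpg ~ wt,
  mpg ~ wt + hp,
  mpg ~ wt + hp + disp
)

# One row per model; the fitted models themselves are not stored
results <- do.call(rbind, lapply(formulas, fit_and_summarise, data = mtcars))
print(results)
```

The resulting dataframe stays small no matter how large the individual model objects are, at the cost of not being able to inspect a model later without refitting it.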
rmarkdown to its full capacity. In my project folder I usually have one folder for Rmd and one for html files. I have one Rmd file for each step in my workflow, and all resulting htmls are rendered to the html folder. I have one execute.R file in the parent project folder which triggers rendering of all Rmds. The last Rmd file to be rendered generates an index.html file, which is basically a catalogue file with links to all rendered html files. The index.html file can be found in the project parent folder. I try to use widgets over static plots and tables whenever I can. It helps to know the following packages at least a bit.
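The folder layout described above could be driven by a script along the following lines. This is a rough sketch, not the actual execute.R: the helper names list_rmds and render_all, the folder names passed as defaults, and the assumption that file names sort into the intended rendering order are all illustrative. It relies on rmarkdown::render() and its output_dir argument, and it does not cover the special handling of the index.html file.

```r
# Hypothetical helper: list the Rmd files of a folder in alphabetical order,
# so that numbered file names (01_..., 02_...) define the rendering order.
list_rmds <- function(rmd_dir) {
  sort(list.files(rmd_dir, pattern = "\\.Rmd$", full.names = TRUE))
}

# Render every Rmd into the html folder; requires the rmarkdown package.
render_all <- function(rmd_dir = "Rmd", html_dir = "html") {
  dir.create(html_dir, showWarnings = FALSE)
  for (rmd in list_rmds(rmd_dir)) {
    rmarkdown::render(rmd, output_dir = html_dir)
  }
  invisible(list.files(html_dir, pattern = "\\.html$"))
}

# Typical call from a script in the project parent folder:
# render_all("Rmd", "html")
```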
The names of all functions start with f_ followed by another prefix that describes the role of the function inside my workflow.
When thinking about this step I did not know about the recipes package. I came up with something similar but less elaborate and
These functions help me to generate html output.
These functions start a Shiny server and let you run a simulation or a Shiny app.
f_shiny_multiview runs a tool for analysing labeled groups in a data set.
f_shiny_som runs a tool for training self-organizing maps. The subsequent clustering algorithm allows you to cluster only adjacent areas of the map.
You can pass your own data via the data argument to both functions, or select from a variety of sample datasets if data == NULL.
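The fallback behaviour of the data argument follows a common R pattern, which might look roughly like this. The function choose_data is a hypothetical stand-in and not one of the package's functions, and iris is only an assumed sample dataset.

```r
# Hypothetical stand-in, not one of the package's functions: return the data
# passed by the caller, or fall back to a built-in sample dataset.
choose_data <- function(data = NULL) {
  # is.null() is the reliable check; `data == NULL` evaluates to logical(0)
  if (is.null(data)) {
    return(datasets::iris)
  }
  data
}

head(choose_data())        # falls back to iris
head(choose_data(mtcars))  # uses the data passed by the caller
```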
Apart from the vignettes that I have already linked to above, I have added some files that I produced as proof of concept (POC) files.
When I try out a new package or get stuck somewhere with a problem, I like to write those as a minimal example of how to solve or not to solve a problem. One can use the function f_content() to produce and open an index html file with links to all those files.
Some of the POCs are also posted on RPubs.
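The idea of an index file with links to a set of html files can be sketched in base R as follows. This is not the implementation of f_content(); the helper write_index, the file names, and the plain list layout are assumptions made for illustration.

```r
# Hypothetical helper, not the package's f_content(): write an index file
# with one link per html file found in a folder.
write_index <- function(html_dir, index_file = file.path(html_dir, "index.html")) {
  files <- list.files(html_dir, pattern = "\\.html$")
  files <- setdiff(files, basename(index_file))  # do not link the index to itself
  links <- sprintf('<li><a href="%s">%s</a></li>', files, files)
  writeLines(c("<html><body><ul>", links, "</ul></body></html>"), index_file)
  invisible(index_file)
}

# Demo in a temporary folder with two placeholder files
demo_dir <- tempfile("poc_")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("poc_a.html", "poc_b.html")))
index <- write_index(demo_dir)
# browseURL(index)  # uncomment to open the index in a browser
```

Because the links are relative, the index only works as long as it stays in the same folder as the files it points to.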