library("purrr") library("tallyr")
The main goal of tallyr is to provide a quick and easy way to monitor progress whilst iterating through a data frame, applying a function to each row at a time. This can be useful when the time taken for this step is sufficiently long enough to run the script in the background, coming back to the console at regular intervals to check the progress. To help with monitoring this progress, the number of rows remaining can be displayed in the console so that you can see at a glance how far the script has progressed and how far there is left to go.
The package is on github and can be installed through the remotes package
# install.packages("remotes") remotes::install_github("gcfrench/tallyr")
There are two function in the package
tally_counter()
initiates the counter setting the count to the number of rows to be iterated
through. The counter counts downwards to zero during the iteration step. This behaviour can be
changed through the type argument so that instead of counting downwards the counter counts upwards
from zero to the total number of iterations. click()
updates the counter, either decreasing or increasing the count by one. This up-to-date
count can then be displayed in the console.The counter is initiated by passing the data frame into the tally_counter()
function. This can be
done within a pipeline containing the iteration step, for example in conjuction with the tidyverse
suite of packages.
To demonstrate this take the first five rows of the iris dataset passing each row, one at a time, into a function that returns the sepal length for that particular flower.
output <- iris[1:5, ] %>% pmap( function(...) { message(paste0("The sepal length is ", pluck(list(...), "Sepal.Length")), " cm") Sys.sleep(0.25) })
In this example case there is a small number of rows within the data frame but if it was larger enough to run the script in the background then there is no indication of how far the script is progressing and how many more rows there is left to go.
This is where using the tally counter can help. This counter can be initiated just before running
the iteration step by adding the tally_counter()
function within the pipeline. On running the pipe an initial message will show the counter set to the number of rows to be iterated over.
output <- iris[1:5, ] %>% tally_counter() %>% pmap( function(...) { message(paste0("The sepal length is ", pluck(list(...), "Sepal.Length")), " cm") Sys.sleep(0.25) })
Displaying the count during each iteration step is now easily done by adding the click()
function
to a message within your function. Each time the function is run the counter will be updated and
displayed in the console.
output <- iris[1:5, ] %>% tally_counter() %>% pmap( function(...) { message(paste0(click(), ": The sepal length is ", pluck(list(...), "Sepal.Length")), " cm") Sys.sleep(0.25) })
By default the counter counts down from the total number of rows to be iterated over, allowing a quick assessment of how far there is to go with the iteration step, but this behaviour can be changed so that instead the counter counts up from zero.
You can alter this count behaviour via the type
argument. On passing add to this argument the
counter will increase sequentially, whilst passing subtract the counter will decrease
sequentially.
output <- iris[1:5, ] %>% tally_counter(type = "add") %>% pmap( function(...) { message(paste0(click(), ": The sepal length is ", pluck(list(...), "Sepal.Length")), " cm") Sys.sleep(0.25) })
The number of digits displayed by the counter is set to four to mimic the appearance of a real tally counter. However this number of digits is not fixed and will increase to accommodate increasing number of iterations, so for example above 9,999 iterations the number of digits increases to five, above 99,999 iterations six digits will be displayed and so on.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.