README.md

tallyr

Lifecycle: experimental

Overview

The main goal of tallyr is to provide a quick and easy way to monitor progress whilst iterating through a data frame, applying a function to each row at a time. This can be useful when the time taken for this step is sufficiently long enough to run the script in the background, coming back to the console at regular intervals to check the progress. To help with monitoring this progress, the number of rows remaining can be displayed in the console so that you can see at a glance how far the script has progressed and how far there is left to go.

Installation

The package is on github and can be installed through the remotes package ```{r, eval = FALSE}

install.packages("remotes")

remotes::install_github("gcfrench/tallyr")


There are two function in the package

* `tally_counter()` initiates the counter setting the count to the number of rows to be iterated 
through. The counter counts downwards to zero during the iteration step. This behaviour can be 
changed through the type argument so that instead of counting downwards the counter counts upwards
from zero to the total number of iterations. 
* `click()` updates the counter, either decreasing or increasing the count by one. This up-to-date 
count can then be displayed in the console.

## Using the counter

The counter is initiated by passing the data frame into the `tally_counter()` function. This can be
done within a pipeline containing the iteration step, for example in conjuction with the tidyverse
suite of packages.

To demonstrate this take the first five rows of the iris dataset passing each row, one at a time,
into a function that returns the sepal length for that particular flower. The map and pluck functions
from the purrr package are used to pass and extract the sepal length from each row. 

```{r}
library(tallyr)
library(purrr)
output <- iris[1:5, ] %>%
  pmap(
    function(...) {
      message(paste0("The sepal length is ", pluck(list(...), "Sepal.Length")), " cm")
      Sys.sleep(0.25)
    })

In this example case there is a small number of rows within the data frame but if it was larger enough to run the script in the background then there is no indication of how far the script is progressing and how many more rows there is left to go.

This is where using the tally counter can help. This counter can be initiated just before running the iteration step by adding the tally_counter() function within the pipeline. On running the pipe an initial message will show the counter set to the number of rows to be iterated over.

output <- iris[1:5, ] %>%
  tally_counter() %>%
  pmap(
    function(...) {
      message(paste0("The sepal length is ", pluck(list(...), "Sepal.Length")), " cm")
      Sys.sleep(0.25)
    })

Displaying the count during each iteration step is now easily done by adding the click() function to a message within your function. Each time the function is run the counter will be updated and displayed in the console.

output <- iris[1:5, ] %>%
  tally_counter() %>%
  pmap(
    function(...) {
      message(paste0(click(), ": The sepal length is ", pluck(list(...), "Sepal.Length")), " cm")
      Sys.sleep(0.25)
    })

Changing the counter behaviour

By default the counter counts down from the total number of rows to be iterated over, allowing a quick assessment of how far there is to go with the iteration step, but this behaviour can be changed so that instead the counter counts up from zero.

You can alter this count behaviour via the type argument. On passing add to this argument the counter will increase sequentially, whilst passing subtract the counter will decrease sequentially.

output <- iris[1:5, ] %>%
  tally_counter(type = "add") %>%
  pmap(
    function(...) {
      message(paste0(click(), ": The sepal length is ", pluck(list(...), "Sepal.Length")), " cm")
      Sys.sleep(0.25)
    })

The number of digits displayed by the counter is set to four to mimic the appearance of a real tally counter. However this number of digits is not fixed and will increase to accommodate increasing number of iterations, so for example above 9,999 iterations the number of digits increases to five, above 99,999 iterations six digits will be displayed and so on.

More information

For more information, read the tallyr vignette

browseVignettes(package = "tallyr")


gcfrench/tallyr documentation built on Oct. 8, 2019, 5:17 p.m.