Creates a data.frame that helps batch long-running reads and writes.

Description

The function returns a data.frame that other functions use to separate long-running read and write REDCap calls into multiple, smaller REDCap calls. The goal is to (1) reduce the chance of time-outs, and (2) introduce brief pauses between batches so that the server isn't continually tied up.

Usage

create_batch_glossary(row_count, batch_size)

Arguments

row_count

The number of records in the large dataset, before it's split.

batch_size

The maximum number of subject records a single batch should contain.

Details

This function can also assist in splitting a large data.frame and saving it to disk as smaller files (such as .csv files). The zero-padded columns allow the OS to sort the batches/files in sequential order.
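As a rough illustration of the splitting described above, the sketch below writes each batch of a data.frame to its own .csv file. The example data, the `batch_` file-name prefix, and the use of `tempdir()` as the destination are assumptions made for illustration; only `create_batch_glossary()` and its documented `start_index`, `stop_index`, and `label` columns come from this page.

```r
library(REDCapR)

# Illustrative data; the column names are arbitrary.
d <- data.frame(record_id = 1:100, dv = rnorm(n = 100))

glossary <- create_batch_glossary(row_count = nrow(d), batch_size = 40)
out_dir  <- tempdir()  # Hypothetical destination; substitute a real directory.

for (i in seq_len(nrow(glossary))) {
  # Rows belonging to the current batch.
  rows <- seq.int(glossary$start_index[i], glossary$stop_index[i])

  # The padded label keeps the files in batch order when sorted by name.
  path <- file.path(out_dir, paste0("batch_", glossary$label[i], ".csv"))
  write.csv(d[rows, ], file = path, row.names = FALSE)
}

# List the files that were written.
list.files(out_dir, pattern = "^batch_.*\\.csv$")
```

Because the label is built from zero-padded values, an alphabetical file listing should match the numeric batch order, which would not hold for unpadded indices once they reach two or more digits.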

Value

Currently, a data.frame is returned with the following columns:

  1. id: an integer that uniquely identifies the batch, starting at 1.

  2. start_index: the index of the first row in the batch. integer.

  3. stop_index: the index of the last row in the batch. integer.

  4. id_pretty: a character representation of id, but padded with zeros.

  5. start_index_pretty: a character representation of start_index, but padded with zeros.

  6. stop_index_pretty: a character representation of stop_index, but padded with zeros.

  7. label: a character concatenation of id_pretty, start_index_pretty, and stop_index_pretty.
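To make the relationship between these columns concrete, here is a minimal base-R sketch that rebuilds them from `row_count` and `batch_size`. It is not the package's implementation: `sketch_batch_glossary()` is a hypothetical helper, and the `"_"` separator used in `label` is an assumption, since this page doesn't state one.

```r
# Hypothetical helper (not part of REDCapR) that mimics the documented columns.
sketch_batch_glossary <- function(row_count, batch_size) {
  row_count  <- as.integer(row_count)
  batch_size <- as.integer(batch_size)
  stopifnot(row_count >= 1L, batch_size >= 1L)

  # Each batch starts batch_size rows after the previous one;
  # the last batch stops at row_count, so it may be shorter.
  start_index <- seq.int(from = 1L, to = row_count, by = batch_size)
  stop_index  <- pmin(start_index + batch_size - 1L, row_count)
  id          <- seq_along(start_index)

  # Pad to the width of the largest value so text sorting matches numeric sorting.
  pad <- function(x, width) formatC(x, width = width, format = "d", flag = "0")
  id_width    <- nchar(max(id))
  index_width <- nchar(max(stop_index))

  d <- data.frame(
    id                 = id,
    start_index        = start_index,
    stop_index         = stop_index,
    id_pretty          = pad(id, id_width),
    start_index_pretty = pad(start_index, index_width),
    stop_index_pretty  = pad(stop_index, index_width),
    stringsAsFactors   = FALSE
  )

  # Assumed separator: an underscore between the three padded pieces.
  d$label <- paste(d$id_pretty, d$start_index_pretty, d$stop_index_pretty, sep = "_")
  d
}

glossary <- sketch_batch_glossary(row_count = 100, batch_size = 40)
print(glossary)
# Three batches: rows 1-40, 41-80, and 81-100.
```

In this sketch the final batch holds only 20 rows, which shows why `batch_size` is described as a maximum rather than an exact size.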

Author(s)

Will Beasley

See Also

See redcap_read for a function that uses create_batch_glossary.

Examples

library(REDCapR) #Load the package into the current R session.
create_batch_glossary(100, 50)
create_batch_glossary(100, 25)
create_batch_glossary(100, 3)
d <- data.frame(
  record_id = 1:100,
  iv        = sample(x=4, size=100, replace=TRUE),
  dv        = rnorm(n=100)
)
create_batch_glossary(nrow(d), batch_size=40)