oomdata_dbi: Iterate over 'tibble', 'DBIResult' or 'connection' in chunks.


View source: R/oomdata_funs.R

Description

Returns a function that, each time it is called, returns the next chunk_size rows from data. Once data is exhausted, the function returns NULL once, and the cycle then starts again.

Usage

oomdata_tbl(data, chunk_size, ...)

oomdata_dbi(data, chunk_size, ...)

oomdata_con(data, chunk_size, header = TRUE, col_names = NULL, ...)

Arguments

data

A tibble, DBIResult, or connection object.

chunk_size

The number of rows to return with each iteration.

...

Ignored.

header

When TRUE, column names are taken from the first row of the connection. When FALSE, col_names must be provided.

col_names

A character vector to use as column names. When header is TRUE, these names override the ones read from the first row of the connection.
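
The interaction between header and col_names can be illustrated in base R. This is only a sketch of the behavior described above, not ploom's implementation: `read_chunk_sketch` is a hypothetical helper, and it assumes comma-separated input.

```r
# Hypothetical helper (not part of ploom): read `n` rows from an open
# connection, resolving column names the way `header` and `col_names`
# are described above. Assumes comma-separated text.
read_chunk_sketch <- function(con, n, header = TRUE, col_names = NULL) {
  if (header) {
    # The first row is consumed as a header; it supplies the names
    # only when `col_names` was not given.
    first_row <- strsplit(readLines(con, n = 1), ",", fixed = TRUE)[[1]]
    if (is.null(col_names)) col_names <- first_row
  }
  if (is.null(col_names)) {
    stop("col_names must be provided when header is FALSE")
  }
  read.csv(text = readLines(con, n = n), header = FALSE, col.names = col_names)
}

csv_lines <- c("mpg,cyl,disp", "21.0,6,160", "22.8,4,108", "18.7,8,360")

# Names taken from the first row
con <- textConnection(csv_lines)
from_header <- read_chunk_sketch(con, n = 2)
close(con)

# Names supplied explicitly; the header row is still skipped
con <- textConnection(csv_lines)
overridden <- read_chunk_sketch(con, n = 2, col_names = c("a", "b", "c"))
close(con)

names(from_header)  # "mpg" "cyl" "disp"
names(overridden)   # "a" "b" "c"
```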

Details

oomdata_* functions create functions that return chunk_size rows from data on each call until all rows have been returned. The next call returns NULL once. This cycle then repeats ad infinitum.
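
The chunk, NULL, restart cycle can be sketched with a plain closure over an in-memory data frame. This is a minimal illustration of the pattern, not ploom's source: `make_chunker` and `next_chunk` are hypothetical names, and the sketch assumes a final chunk may hold fewer than chunk_size rows.

```r
# Hypothetical helper (not part of ploom): a closure that mimics the
# chunk / NULL / restart cycle for an in-memory data frame.
make_chunker <- function(data, chunk_size) {
  cursor <- 0L  # number of rows already returned in the current cycle
  function() {
    if (cursor >= nrow(data)) {
      cursor <<- 0L  # reset so the next call restarts the cycle
      return(NULL)
    }
    first <- cursor + 1L
    last  <- min(cursor + chunk_size, nrow(data))
    cursor <<- last
    data[first:last, , drop = FALSE]
  }
}

next_chunk <- make_chunker(mtcars, chunk_size = 12)

# mtcars has 32 rows, so one cycle yields chunks of 12, 12, and 8 rows
sizes <- c(nrow(next_chunk()), nrow(next_chunk()), nrow(next_chunk()))
sizes

is.null(next_chunk())  # TRUE: the data is exhausted
nrow(next_chunk())     # 12: the cycle has restarted
```

Keeping the cursor inside the closure means the caller only ever calls the returned function, which is what makes the `while(!is.null(...))` idiom in the examples work.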

Examples

# `oomdata_tbl()` returns an `oomdata` function that when called will
# return `chunk_size` rows from a `tbl_df`
chunks <- oomdata_tbl(mtcars, chunk_size = 16)

nrow(chunks())
nrow(chunks())

# when the data is exhausted the `oomdata` function
# will return NULL once.
chunks()

# subsequent calls restart the cycle
nrow(chunks())
nrow(chunks())
chunks()

# `while` loops are useful for iterating over 
# `oomdata` functions
while(!is.null(chunk <- chunks())){
  print(nrow(chunk))
}

# use `tidy()` to get information about `oomdata` status
tidy(chunks)

# `oomdata_dbi()` returns a function that when called will
# return `chunk_size` rows from a query result set.
con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
dplyr::copy_to(con, mtcars, "mtcars", temporary = FALSE)
rs  <- DBI::dbSendQuery(con, "SELECT mpg, cyl, disp FROM mtcars")

chunks <- oomdata_dbi(rs, 16)

while(!is.null(chunk <- chunks())){
  print(nrow(chunk))
}

# ploom model functions automatically iterate over `oomdata` until
# the source is exhausted (`oomlm`, `oomlm_robust`) or until
# IRLS convergence (`oomglm`)
x <- fit(oomlm(mpg ~ cyl + disp), chunks)
y <- fit(oomglm(mpg ~ cyl + disp), chunks)

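coef(x)
coef(y)

# release the result set and close the connection when finished
DBI::dbClearResult(rs)
DBI::dbDisconnect(con)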

blakeboswell/ploom documentation built on May 25, 2019, 3:24 p.m.