noctua depends on data.table to read data into R, because of the speed data.table offers when reading files. However, a new package with equally impressive read speeds has come onto the scene: vroom. Because vroom is designed only to read data into R, much like readr, data.table is still used for all of the heavy lifting. If you wish to use vroom as the file parser, the noctua_options function has been created to enable this:
```r
library(DBI)
library(noctua)

con = dbConnect(athena())

# file_parser accepts "data.table" (the default) or "vroom"
noctua_options(file_parser = c("data.table", "vroom"))
```
Setting file_parser to "vroom" changes the backend so that vroom's file parser is used instead of data.table's.
data.table
To go back to using data.table as the file parser, it is as simple as calling the noctua_options function with no arguments:
```r
# return to using data.table as file parser
noctua_options()
```
This makes it easy to swap between file parsers, even between individual query executions:
```r
library(DBI)
library(noctua)

con = dbConnect(athena())

# upload data
dbWriteTable(con, "iris", iris)

# use default data.table file parser
df1 = dbGetQuery(con, "select * from iris")

# use vroom as file parser
noctua_options("vroom")
df2 = dbGetQuery(con, "select * from iris")

# return back to data.table file parser
noctua_options()
df3 = dbGetQuery(con, "select * from iris")
```
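If you switch parsers often, a small wrapper can make sure the parser is always reset afterwards. The sketch below is not part of noctua: `query_with_parser` is a hypothetical name, and it relies only on the `noctua_options()` behaviour described above (passing a parser name selects it, and calling it with no arguments restores data.table).

```r
# Hypothetical helper (not part of noctua): run one query with the chosen
# file parser, then restore the default data.table parser.
query_with_parser = function(con, sql, file_parser = c("data.table", "vroom")) {
  file_parser = match.arg(file_parser)

  # fail early if vroom was requested but is not installed
  if (file_parser == "vroom" && !requireNamespace("vroom", quietly = TRUE)) {
    stop("Package 'vroom' must be installed to use file_parser = \"vroom\"")
  }

  noctua::noctua_options(file_parser = file_parser)
  # reset to data.table when the function exits, even if the query errors
  on.exit(noctua::noctua_options(), add = TRUE)

  DBI::dbGetQuery(con, sql)
}
```

With a connection `con` created as above, `query_with_parser(con, "select * from iris", "vroom")` would read the result with vroom and leave the session on the default parser afterwards.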
vroom?

If you aren't sure whether to use vroom over data.table, I draw your attention to vroom's reported throughput of 1.40 GB/sec, roughly ten times that of data.table in the same benchmark.
Statistics taken from vroom's GitHub README:
package | version | time (sec) | speed-up | throughput
---|---|---|---|---
vroom | 1.1.0 | 1.14 | 58.44 | 1.40 GB/sec
data.table | 1.12.8 | 11.88 | 5.62 | 134.13 MB/sec
readr | 1.3.1 | 29.02 | 2.30 | 54.92 MB/sec
read.delim | 3.6.2 | 66.74 | 1.00 | 23.88 MB/sec
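Published benchmarks depend heavily on the data and the machine, so it can be worth timing both parsers on a file that resembles your own query results. The following is a minimal local sketch, independent of Athena and noctua; the synthetic data, its size, and the column names are arbitrary assumptions, and the timings you see will differ from the table above.

```r
library(data.table)
library(vroom)

# Write a synthetic CSV to a temporary file (shape chosen arbitrarily)
n = 1e5
dat = data.table(
  id = seq_len(n),
  x  = rnorm(n),
  g  = sample(letters, n, replace = TRUE)
)
path = tempfile(fileext = ".csv")
fwrite(dat, path)

# Time each parser. altrep = FALSE makes vroom read every value up front,
# so both timings cover a complete read of the file.
t_dt = system.time(df_dt <- fread(path))
t_vroom = system.time(
  df_vroom <- vroom(path, delim = ",", altrep = FALSE,
                    show_col_types = FALSE, progress = FALSE)
)

# Elapsed wall-clock seconds for each parser
elapsed = c(data.table = t_dt[["elapsed"]], vroom = t_vroom[["elapsed"]])
print(elapsed)

unlink(path)
```

A single run on a small file is noisy; repeating the timing several times, or using a file closer in size to your real results, gives a more reliable comparison.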