knitr::opts_chunk$set(eval = FALSE)
library(chunked)
chunked? Short answer:
\begin{center}
\includegraphics[width=0.2\textwidth]{img/dplyr_logo}
\Huge{for data in text files}
\end{center}
\hfill\includegraphics[width=0.1\textwidth]{img/txtfile}
\vspace{-1.6cm}
readr::read_csv
data.table::fread
data.frame
does not!
sed, awk, grep
It is nice to stay in the R-universe (one data-processing tool)
sed, awk and grep voodoo.
\begin{center}
\includegraphics[height=0.8\textheight]{img/keep-calm-and-chop-chop-3}
\end{center}
dplyr verbs
LaF.
dplyr verbs on chunk_wise objects are recorded and replayed when writing.
read_chunkwise("my_data.csv", chunk_size = 5000) %>%
  select(col1, col2) %>%
  filter(col1 > 1) %>%
  mutate(col3 = col1 + 1) %>%
  write_chunkwise("output.csv")
This code:
db <- src_sqlite('test.db', create = TRUE)
tbl <-
  read_chunkwise("./large_file_in.csv") %>%
  select(col1, col2, col5) %>%
  filter(col1 > 10) %>%
  mutate(col6 = col1 + col2) %>%
  write_chunkwise(db, 'my_large_table')

tbl <-
  ( src_sqlite("test.db") %>% tbl("my_table") ) %>%
  read_chunkwise(chunk_size = 5000) %>%
  select(col1, col2, col5) %>%
  filter(col1 > 10) %>%
  mutate(col6 = col1 + col2) %>%
  write_chunkwise('my_large_table.csv')
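The record-and-replay idea behind these pipelines can be illustrated in base R. The following is only a rough sketch of the concept, not how chunked is implemented: the temporary files, the `steps` list and the chunk size of 4 are all made up for illustration. Processing steps are stored as functions ("recorded") and only applied ("replayed") to each chunk while the output file is written.

```r
# Toy input file (hypothetical data, for illustration only).
infile  <- tempfile(fileext = ".csv")
outfile <- tempfile(fileext = ".csv")
write.csv(data.frame(col1 = 1:10, col2 = 11:20), infile, row.names = FALSE)

# "Record": keep the processing steps as functions instead of running them.
steps <- list(
  function(d) d[d$col1 > 3, , drop = FALSE],  # plays the role of filter(col1 > 3)
  function(d) transform(d, col3 = col1 + 1)   # plays the role of mutate(col3 = col1 + 1)
)

# "Replay": read the file chunk by chunk and apply every step to each chunk.
con <- file(infile, open = "r")
header <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])
chunk_size <- 4  # assumed chunk size, tiny on purpose
first <- TRUE
repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break
  chunk <- read.csv(text = lines, header = FALSE, col.names = header)
  for (step in steps) chunk <- step(chunk)
  # Write the header only once, then append the following chunks.
  write.table(chunk, outfile, sep = ",", row.names = FALSE,
              col.names = first, append = !first)
  first <- FALSE
}
close(con)

result <- read.csv(outfile)
```

At no point is more than one chunk of the input held in memory, which is the reason for working chunkwise in the first place.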
filter, select, rename, mutate, mutate_each, transmute, do, tbl_vars, inner_join, left_join, semi_join, anti_join all work, also with name completion!
summarize and group_by work chunkwise (and not for all data!)
arrange, right_join, full_join: not implemented.
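What "summarize and group_by work chunkwise (and not for all data!)" means in practice can be shown with a small base R sketch. It does not use chunked itself; the toy data frame and the chunk size of 4 are assumptions made for illustration. Each chunk yields its own partial summary, so a second aggregation step is needed to get totals over the whole data set.

```r
# Toy data: 10 rows in two groups (hypothetical, for illustration only).
df <- data.frame(g = rep(c("a", "b"), times = 5), x = 1:10)

# Split the rows into chunks of 4 (assumed chunk size).
chunk_size <- 4
chunk_id <- ceiling(seq_len(nrow(df)) / chunk_size)
chunks <- split(df, chunk_id)

# Step 1: summarise every chunk on its own, as a chunkwise summarize does.
# The result has one row per group *per chunk*, not one row per group.
partial <- do.call(rbind, lapply(chunks, function(ch) {
  aggregate(x ~ g, data = ch, FUN = sum)
}))

# Step 2: combine the partial sums to obtain totals for all data.
total <- aggregate(x ~ g, data = partial, FUN = sum)
```

This two-step approach works for sums and counts, where partial results can simply be added up. Statistics such as the median cannot be combined from per-chunk results in this way.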
\Large{Interested?}
install.packages("chunked")
Or visit http://github.com/edwindj/chunked