csv_to_sqlite: Save a delimited text table into a single table 'sqlite'...

View source: R/csv_to_sqlite.R

csv_to_sqliteR Documentation

Save a delimited text table into a single table sqlite database

Description

The table can be a comma separated (csv) or a tab separated (tsv) or any other delimited text file. The file is read in chunks. Each chunk is copied in the same sqlite table database before the next chunk is loaded into memory. See the INBO tutorial Handling large files in R to learn more.

Usage

csv_to_sqlite(
  csv_file,
  sqlite_file,
  table_name,
  delim = ",",
  pre_process_size = 1000,
  chunk_size = 50000,
  show_progress_bar = TRUE,
  ...
)

Arguments

csv_file

Name of the text file to convert.

sqlite_file

Name of the newly created sqlite file.

table_name

Name of the table to store the data table in the sqlite database.

delim

Text file delimiter (default ",").

pre_process_size

Number of lines to check the data types of the individual columns (default 1000).

chunk_size

Number of lines to read for each chunk (default 50000).

show_progress_bar

Show progress bar (default TRUE).

...

Further arguments to be passed to read_delim.

Value

a SQLite database

Remark

The callback argument in the read_delim_chunked function call refers to the custom written callback function append_to_sqlite applied to each chunk.

See Also

Other Data_handling_utilities: convertdf_enc(), df_factors_to_char()

Examples

## Not run: 
library(R.utils)
library(dplyr)
csv.name <- "2016-04-20-processed-logs-big-file-example.csv"
db.name <- "2016-04-20-processed-logs-big-file-example.db"
# download the CSV file example
csv.url <- paste("https://s3-eu-west-1.amazonaws.com/lw-birdtracking-data/",
  csv.name, ".gz",
  sep = ""
)
download.file(csv.url, destfile = paste0(csv.name, ".gz"))
gunzip(paste0(csv.name, ".gz"))
# Make a SQLite database
sqlite_file <- "example2.sqlite"
table_name <- "birdtracks"
csv_to_sqlite(
  csv_file = csv.name,
  sqlite_file = sqlite_file,
  table_name = table_name
)
# Get access to SQLite database
my_db <- src_sqlite(sqlite_file, create = FALSE)
bird_tracking <- tbl(my_db, "birdtracks")
# Example query via dplyr
results <- bird_tracking %>%
  filter(device_info_serial == 860) %>%
  select(date_time, latitude, longitude, altitude) %>%
  filter(date_time < "2014-07-01") %>%
  filter(date_time > "2014-03-01") %>%
  as_tibble()
head(results)

## End(Not run)

inbo/inborutils documentation built on Nov. 23, 2023, 4:42 a.m.