build_ecotox_sqlite: Build an SQLite database from zip archived tables downloaded...

View source: R/init.r

build_ecotox_sqlite    R Documentation

Build an SQLite database from zip archived tables downloaded from EPA website


This function is called automatically after download_ecotox_data. The database files can also be downloaded manually from the EPA website, after which a local database can be built using this function.


build_ecotox_sqlite(source, destination = get_ecotox_path(), write_log = TRUE)



source: A character string pointing to the directory path where the text files with the raw tables are located. These can be obtained by extracting the zip archive from the EPA website (look for 'Download ASCII Data').


destination: A character string representing the destination path for the SQLite file. By default this is get_ecotox_path().


write_log: A logical value indicating whether a log file should be written in the destination path (default is TRUE). The log contains information on the source and destination path, the version of this package, the creation date, and the operating system on which the database was created.


Raw data downloaded from the EPA website is not very efficient to work with in R. The files are large and would put a heavy strain on R if loaded completely into the system's memory. Instead, use this function to build an SQLite database from the tables. That way, the data can be queried without having to load it all into memory.
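As an illustration of querying without loading everything into memory, the sketch below uses the DBI and RSQLite packages on a small in-memory database. The table name 'species' and its columns are made up for this example and are not taken from the ECOTOX schema; with a real build you would connect to the SQLite file in the destination path instead of ':memory:'.

```r
## Minimal sketch, assuming the DBI and RSQLite packages are installed.
## The table and column names below are hypothetical.
result <- NULL
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

  ## Stand-in for a table that would normally be created by the build step
  DBI::dbWriteTable(con, "species", data.frame(
    species_number = 1:3,
    latin_name     = c("Daphnia magna", "Danio rerio", "Lemna minor"),
    stringsAsFactors = FALSE
  ))

  ## Only the rows matching the query are returned to R
  result <- DBI::dbGetQuery(
    con,
    "SELECT latin_name FROM species WHERE latin_name LIKE ?",
    params = list("Daphnia%")
  )

  DBI::dbDisconnect(con)
}
```

Only the matching rows are materialised as a data.frame; the rest of the table stays in the database.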

EPA provides the raw tables from the ECOTOX database as text files with pipe-characters ('|') as table column separators. Although not documented, the tables appear not to contain comment or quotation characters. Some records contain the reserved pipe-character within a field, which would confuse the table parser. For these records, the pipe-character is replaced with a dash character ('-').
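The parsing settings described above can be sketched with base R's read.table. This is not the package's internal code; the file contents are invented, and the presence of a header row is an assumption. The point is that quote and comment handling are switched off, so that characters such as ', " and # are read as ordinary text.

```r
## Minimal sketch with an invented pipe-delimited file (not real ECOTOX data)
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "id|name|note",
  "1|Daphnia magna|5' end #1",
  "2|Danio rerio|said \"ok\""
), tmp)

## quote = "" and comment.char = "" disable quote and comment processing
tab <- read.table(tmp, sep = "|", header = TRUE, quote = "",
                  comment.char = "", stringsAsFactors = FALSE,
                  fileEncoding = "UTF-8")

unlink(tmp)
```

With read.table's defaults, the '#' would start a comment and the unbalanced quotes would merge fields or lines.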

In addition, while reading the tables as text files, this package attempts to decode the text as UTF-8. Unfortunately, this process appears to be platform-dependent, and may therefore produce different results on different platforms. This problem only seems to occur for characters that are listed as 'control characters' under UTF-8. This will have consequences for reproducibility, but only if you build search queries that look for such special characters. It is therefore advised to stick to common (non-accented) alphanumeric characters in your searches, for the sake of reproducibility.
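One way to follow this advice is to check search terms before using them. The helper below, is_plain_ascii, is hypothetical and not part of this package; it flags any term containing characters outside the printable ASCII range (code points 32 to 126).

```r
## Hypothetical helper (not part of ECOTOXr): TRUE if every character of a
## term is printable ASCII, i.e. no control or accented characters
is_plain_ascii <- function(x) {
  vapply(x, function(s) {
    cp <- utf8ToInt(s)
    all(cp >= 32L & cp <= 126L)
  }, logical(1), USE.NAMES = FALSE)
}

terms <- c("Daphnia magna", "caf\u00e9", "tab\there")
ok    <- is_plain_ascii(terms)
```

Terms for which the helper returns FALSE are the ones most likely to give platform-dependent search results.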

Use 'suppressMessages' to suppress the progress report.
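A minimal sketch of how this works, using a stand-in function (noisy_build, hypothetical) that reports progress through message() and returns NULL invisibly. The commented line shows the same pattern applied to this function; the path in it is a placeholder.

```r
## Stand-in for a function that reports progress via message()
noisy_build <- function() {
  message("Reading table 1 of 2 ...")
  message("Reading table 2 of 2 ...")
  invisible(NULL)
}

## Nothing is printed; the return value is passed through unchanged
result <- suppressMessages(noisy_build())

## Same pattern for the real function (placeholder path):
## suppressMessages(build_ecotox_sqlite(source = "path/to/extracted/tables"))
```

Warnings and errors are not affected by suppressMessages, so problems during the build still surface.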


Returns NULL invisibly.


Pepijn de Vries


## Not run: 
## This example will only work properly if 'dir' points to an existing directory
## with the raw tables from the ECOTOX database. This function will be called
## automatically after a call to 'download_ecotox_data()'.
test <- check_ecotox_availability()
if (test) {
  files   <- attributes(test)$files[1,]
  dir     <- gsub(".sqlite", "", files$database, fixed = TRUE)
  path    <- files$path
  if (dir.exists(file.path(path, dir))) {
    ## This will build the database in your temp directory:
    build_ecotox_sqlite(source = file.path(path, dir), destination = tempdir())
  }
}

## End(Not run)

ECOTOXr documentation built on Nov. 17, 2022, 5:07 p.m.