ctgov_create_duckdb: Create DuckDB Connection Object

View source: R/load.R

ctgov_create_duckdb    R Documentation

Create DuckDB Connection Object

Description

This function creates a local DuckDB version of the full CTrialsGov database from the pipe-delimited flat files. The connection returned by the function can be queried directly or passed to ctgov_create_data to create a more de-normalized version for use with the other functions in this package.

Usage

ctgov_create_duckdb(basedir, dbdir = "ctgov_db_all", verbose = TRUE)

Arguments

basedir

character giving the location where the pipe-delimited flat files have been unzipped

dbdir

Location for database files. Should be a path to an existing directory in the file system.

verbose

logical flag; should progress messages be printed? Defaults to TRUE.

Details

The function requires downloading and unzipping the current database dump files found at https://aact.ctti-clinicaltrials.org/pipe_files. Given their large size (around 1.4GB as of June 2022), we find it preferable to download the file directly through a browser or a command-line tool rather than through R's native functions, which are not well suited to restarting a partial download.
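
As a minimal sketch of the workflow described above, the following unzips an archive that was already downloaded by hand and then builds the database. The archive file name and the directories under tempdir() are hypothetical placeholders, not names used by the package; only ctgov_create_duckdb and its arguments come from this page. The call is guarded so that nothing runs unless both the archive and the package are present.

```r
# Hypothetical location of the manually downloaded archive (an assumption;
# use whatever name the file was saved under).
zip_path <- path.expand("~/Downloads/aact_pipe_files.zip")

# Illustrative working directories; any writable locations would do.
basedir <- file.path(tempdir(), "aact_pipe_files")
dbdir <- file.path(tempdir(), "ctgov_db_all")

# Only proceed when the archive exists and the package is installed.
ready <- file.exists(zip_path) &&
  requireNamespace("ctrialsgov", quietly = TRUE)

if (ready) {
  # dbdir is documented as a path to an existing directory, so create both.
  dir.create(basedir, recursive = TRUE, showWarnings = FALSE)
  dir.create(dbdir, recursive = TRUE, showWarnings = FALSE)

  # Extract the pipe-delimited flat files into basedir.
  utils::unzip(zip_path, exdir = basedir)

  # Build the local DuckDB database from the extracted files.
  result <- ctrialsgov::ctgov_create_duckdb(
    basedir,
    dbdir = dbdir,
    verbose = TRUE
  )
} else {
  message("Archive or package not found; nothing was built.")
}
```

Directories under tempdir() are removed when the R session ends, so a persistent location is likely a better choice for dbdir in real use.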

Value

A path to the DuckDB database.

Author(s)

Taylor B. Arnold, taylor@dvlab.io


presagia-analytics/ctrialsgov documentation built on March 25, 2024, 2:10 p.m.