library(DatabaseConnector)
This vignette describes how to use the DatabaseConnector
package through the DBI
and dbplyr
interfaces. It assumes you already know how to create a connection as described in the 'Connecting to a database' vignette.
All functions of the DatabaseConnector
DBI
interface apply SQL translation, thus making it an interface to a virtual database platform speaking OHDSISql as defined in SqlRender
.
We can use the dbConnect()
function, which is equivalent to the connect()
function:
connection <- dbConnect( DatabaseConnectorDriver(), dbms = "postgresql", server = "localhost/postgres", user = "joe", password = "secret" )
writeLines("Connecting using PostgreSQL driver")
dbIsValid(conn)
TRUE
Querying and executing SQL can be done through the usual DBI
functions. SQL statements are assumed to be written in 'OhdsiSql', a subset of SQL Server SQL (see the SqlRender
package for details), and are automatically translated to the appropriate SQL dialect. For example:
dbGetQuery(connection, "SELECT TOP 3 * FROM cdmv5.person")
data.frame(person_id = c(1,2,3), gender_concept_id = c(8507, 8507, 8507), year_of_birth = c(1975, 1976, 1977))
Or:
res <- dbSendQuery(connection, "SELECT TOP 3 * FROM cdmv5.person") dbFetch(res)
data.frame(person_id = c(1,2,3), gender_concept_id = c(8507, 8507, 8507), year_of_birth = c(1975, 1976, 1977))
dbHasCompleted(res)
TRUE
dbClearResult(res) dbDisconnect(res)
We can create a table based on a DatabaseConnector
connection. The inDatabaseSchema()
function allows us to use standard databaseSchema
notation promoted by SqlRender:
library(dpylr) person <- tbl(connection, inDatabaseSchema("cdmv5", "person")) person
data.frame(person_id = c(1,2,3), gender_concept_id = c(8507, 8507, 8507), year_of_birth = c(1975, 1976, 1977))
we can apply the usual dplyr
syntax:
person %>% filter(gender_concept_id == 8507) %>% count() %>% pull()
1234
The dbplyr
package does not support date functions, but DatabaseConnector
provides the dateDiff()
, dateAdd()
, eoMonth()
, dateFromParts()
, year()
, month()
, and day()
functions that will correctly translate to various data platforms:
observationPeriod <- tbl(connection, inDatabaseSchema("cdmv5", "observation_period")) observationPeriod %>% filter( dateDiff("day", observation_period_start_date, observation_period_end_date) > 365 ) %>% count() %>% pull()
987
Because of the many idiosyncrasies in how different dataplatforms store and transform table and field names, it is currently not possible to use any names that would require quotes. So for example the names person
, person_id
, and observation_period
are fine, but Person ID
and Obs. Period
are not. In general, it is highly recommend to use lower case snake_case for database table and field names.
The DBI
interface uses temp table emulation on those platforms that do not support real temp tables. This does require that for those platforms the user specify a tempEmulationSchema
, preferably using
option(sqlRenderTempEmulationSchema = "a_schema")
Where "a_schema"
refers to a schema where the user has write access. If we know we will need temp tables, we can use the assertTempEmulationSchemaSet()
to verify this option has been set. This function will throw an error if it is not set, but only if the provided dbms is a platform that requires temp table emulation.
In OHDSISql
, temp tables are referred to using a '#' prefix. For example:
dbWriteTable(connection, "#temp", cars)
message("Inserting data took 0.053 secs")
The copy_to
function creates a temp table:
carsTable <- copy_to(connection, cars)
writeLines("Created a temporary table named #cars")
The compute()
function also creates a temp table, for example:
tempTable <- person %>% filter(gender_concept_id == 8507) %>% compute()
message("Created a temporary table named #dbplyr_001")
Emulated temp tables are not really temporary, and therefore have to be removed when no longer needed. A convenient way to drop all emulated temp tables created so far in an R session is using the dropEmulatedTempTables()
function:
dropEmulatedTempTables(connection)
In our example, this does not do anything because were using a PostgreSQL server, which does natively support temp tables.
We can use the dbDisconnect()
function, which is equivalent to the disconnect()
function:
dbDisconnect(connection)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.