knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
This vignette is intended to help developers of the R package install needed dependencies.
While the odbc package contains some documentation on how to install and configure database drivers in vignette("setup")
, the documentation assumes that users are connecting to databases that have already been set up. In order to test package functionality, though, odbc sets up small example database deployments.
For the most part, this vignette assumes a MacOS system with aarch64 (e.g. M1 or M2) architecture. For Linux example code, see .github/workflows/db.yaml
, and for Windows, see .github/workflows/db-windows.yml
.
Posit employees have access to the Posit Professional drivers and the infrastructure used to build and test them in the rstudio/pro-drivers GitHub repository. The Posit Professional drivers are a set of drivers vendored from Magnitude Simba that support many of the most popular DBMS, including SQL Server, Oracle, Redshift, Databricks, Snowflake, etc. The repository they're developed in contains tooling to spin up a number of databases in docker containers to test against.
Note that Athena, Hive, Impala, MongoDB, and Oracle drivers are not available for macOS aarch64 (M1, M2, etc) at the time of writing.
The only documented installation method for these drivers on MacOS is via RStudio Desktop Pro. The Posit confluence page titled "[INTERNAL] Posit License Files for Employee Use" contains instructions for installing RStudio Desktop Pro. Once RStudio Pro is installed, install individual drivers as documented here.
Among other things, the rstudio/pro-drivers GitHub repository defines a MAKE
tool for setting up and tearing down databases in docker containers. Ensure that you have a docker daemon running (i.e. Docker Desktop open) and, if you're on macOS aarch64, have Settings > Use Rosetta for x86_64/amd64 emulation on Apple Silicon
enabled. To start a container for a given dbms
, run MAKE dist=none DB=dbms up
, and tear it down with MAKE dist=none DB=db down
. To see available dbms
options, see the names of .yml
files in the docker-compose
directory. Find connection details for each database in docker/shared/odbc.ini
.
DBMS-specific notes:
sqlserver
: Be sure to pass uid
and pwd
as arguments to dbConnect()
explicitly, even though they're in the odbc.ini
as well.On macOS aarch64, you will see a requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8)
warning when you start any of these containers. Feel free to ignore. :)
On MacOS, install PostgreSQL with:
brew install postgresql@14
You'll also need to install and configure the PostgreSQL driver psqlodbc
; see vignette("setup")
to learn more.
To launch a PostgreSQL server locally, run:
brew services start postgresql@14
Next, create a database called "test" (or by whatever name is in the entry Database
in your odbc.ini
file):
createdb test
At this point, you should be able to connect to PostgreSQL through the R interface. Connect with:
postgres <- dbConnect(odbc(), "PostgreSQL")
where "PostgreSQL"
is replaced with whatever DSN you've configured.
First, installing MySQL with Homebrew:
brew install mysql@8.2
MariaDB drivers are compatible with MySQL and are more easily installed than MySQL drivers themselves in most cases. To install the MariaDB driver:
brew install mariadb-connector-odbc
Then, link the MariaDB driver with your MySQL data source name. That is, with the driver name [MariaDB]
configuring your MariaDB install in odbcinst.ini
, the first lines of your odbc.ini
entry should look something like:
[MySQL] Driver = MariaDB
After running brew services start mysql
if needed, and confirming that the database is running with brew services info mysql
, you should be able to:
library(odbc) dbConnect(odbc(), "MySQL")
The second argument "MySQL"
refers to the data source name configured above.
To run Microsoft SQL Server on aarch64 (e.g. M1 or M2) MacOS, you will need:
If needed, install Docker with:
brew install --cask docker
The Docker Desktop app provides a GUI to monitor deployed Docker containers and lives in Docker.app > Show Package Contents > Contents > MacOS > Docker Desktop.app
.
To install the SQL Server ODBC driver and (optional) command line tool, use:
brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release brew install microsoft/mssql-release/msodbcsql18 microsoft/mssql-release/mssql-tools18
The odbc.ini
entry should look something like:
[MicrosoftSQLServer] driver = ODBC Driver 18 for SQL Server Server = 127.0.0.1 port = 1433 Encrypt = no
With the driver name in odbcinst.ini
being [ODBC Driver 18 for SQL Server]
.
With docker and the needed driver installed, deploy the container with:
sudo docker run \ --platform linux/amd64 \ -e "ACCEPT_EULA=Y" \ -e 'MSSQL_SA_PASSWORD=BoopBop123!' \ -p 1433:1433 \ --name sql1 \ --hostname sql1 \ -d mcr.microsoft.com/mssql/server:2022-latest
The --platform
tag is correct for M1; if you see Status: Exited (1)
in Docker Desktop or a warning about incompatible architectures, navigate to Settings > General
and ensure that Use Rosetta for x86/amd64 emulation on Apple Silicon
is checked.
To connect via odbc, we need to pass the UID and PWD via the connection string; configuring those arguments via odbc.ini
is not permitted. With the container deployed as above, the connection arguments would be:
con <- dbConnect( odbc::odbc(), dsn = "MicrosoftSQLServer", uid = "SA", pwd = "BoopBop123!" )
Then do some configuration of the server to add a testuser and create the test database
To configure a server to add a testing user and create a test database:
# Add a test user, but currently unused dbExecute(con, "USE test") dbExecute(con, "EXEC sp_configure 'contained database authentication', 1") dbExecute(con, "RECONFIGURE") dbExecute(con, "alter database test set containment = partial") dbExecute(con, "CREATE USER testuser with password = 'BoopBop123!'") dbExecute(con, "GRANT CONTROL TO testuser") dbExecute(con, "DROP USER testuser") # Create a test database dbExecute(con, "CREATE DATABASE test")
On Linux, create a docker container with:
docker run -v "$(pwd)":"/opt/$(basename $(pwd))":delegated --security-opt=seccomp:unconfined --link sql1 -it rstudio/r-base:3.6.1-bionic /bin/bash
Then run:
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - #Ubuntu 18.04 curl https://packages.microsoft.com/config/ubuntu/18.04/prod.list > /etc/apt/sources.list.d/mssql-release.list apt-get update ACCEPT_EULA=Y apt-get install -y msodbcsql17 apt-get install -y unixodbc-dev
The resulting odbc.ini
file will look something like:
[MicrosoftSQLServer] driver = ODBC Driver 17 for SQL Server Server = sql1 port = 1433 Database = test
MacOS provides SQLite natively. With the SQLite odbc driver installed (see vignette("setup")
if needed), run:
library(odbc) dbConnect(odbc(), "SQLite")
The above example assumes the configured SQLite DSN is called "SQLite"
.
A huge pain.
Get the DB container:
docker login docker pull store/oracle/database-enterprise:12.2.0.1
Start the container:
docker run -d -it --name oracle_db -P store/oracle/database-enterprise:12.2.0.1
The -P
is important to set up the port forwarding from the docker container.
Then, query the port and edit the ports in tnsnames.ora
:
docker port oracle_db
The contents of snsnames.ora
should look like:
ORCLCDB=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=127.0.0.1)(PORT=32769)) (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=ORCLCDB.localdomain))) ORCLPDB1=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=127.0.0.1)(PORT=32769)) (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=ORCLPDB1.localdomain)))
Ensure that the current working directly is set appropriately.
Then, to add a new user to the database:
docker exec -it oracle_db bash -c "source /home/oracle/.bashrc; sqlplus SYS/Oradoc_db1 AS SYSDBA"
alter session set "_ORACLE_SCRIPT"=true; create user test identified by 12345; GRANT ALL PRIVILEGES TO TEST;
Finally, in R:
Sys.setenv("TNS_ADMIN" = getwd()) con <- dbConnect(odbc::odbc(), "OracleODBC-19")
To obtain a Snowflake account:
Install drivers using the instructions on the Snowflake documentation. For MacOS, feel free to ignore sections noting that iODBC is required.
On MacOS and Linux:
simba.snowflake.ini
file, likely in opt/snowflake/
. You will need sudo
access to edit it.DriverManagerEncoding
to UTF-16
.ODBCInstLib
entry to the path of your unixODBC install.SELECT SYSTEM$ALLOWLIST()
. In the output, copy the URL corresponding to "type":"SNOWFLAKE_DEPLOYMENT"
. This Deployment URL should look like a random string, an AWS region, and the snowflake API URL, e.g. wmc39401.us-east-1.snowflakecomputing.com
.The odbc.ini
file should look like:
[Snowflake] Driver = Snowflake Description = uid = <uid> server = <deployment-url>
The above assumes Driver = Snowflake
is configured in odbcinst.ini
. Then, connect by supplying the password as an argument:
con <- dbConnect(odbc(), dsn = "Snowflake", pwd = Sys.getenv("snowflakePass"))
In the above, the password has been set as Sys.setenv(snowflakePass = "actualPass")
.
If you're a Posit employee, you should have access to the rstudio/warehouse GitHub repository. Follow the instructions there to get access to the internal Redshift cluster. Access to the cluster is only enabled through Posit's internal Workbench instance, where the professional drivers will already be installed.
Note that Redshift is based on a modified version of PostgreSQL.
We need to install the RODBC package for benchmarking in the vignette vignette("benchmarks")
. The CRAN version of RODBC uses iODBC, so to use unixODBC we need to recompile it from source, specifying the odbc manager explicitly:
install.packages("RODBC", type = "source", INSTALL_opts="--configure-args='--with-odbc-manager=odbc'")
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.