knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE )
The {brickster} package connects to a Databricks workspace in two ways:
(1) OAuth user-to-machine (U2M) authentication
(2) Personal Access Tokens (PAT)
Option (1) is recommended when using {brickster} interactively. If you need to run code via an automated process, the only option currently is (2).
{brickster} will automatically detect when a session has Posit Workbench managed Databricks OAuth credentials enabled. For more information about this authentication flow see the section Posit Workbench Managed Databricks OAuth Credentials.
Personal Access Tokens can be generated in a few steps. For a step-by-step breakdown, refer to the documentation.
Once you have a token you'll be able to store it alongside the workspace URL in an .Renviron file. The .Renviron file is used for storing variables, such as those which may be sensitive (e.g. credentials), and decoupling them from the code (additional reading).
To get started add the following to your .Renviron:
DATABRICKS_HOST: The workspace URL
DATABRICKS_TOKEN: Personal access token (not required if using OAuth U2M)
DATABRICKS_WSID: The workspace ID (docs)
DATABRICKS_WSID is only required for the RStudio IDE integration with the connection pane.
Example of entries in .Renviron:
DATABRICKS_HOST=xxxxxxx.cloud.databricks.com
DATABRICKS_TOKEN=dapi123456789012345678a9bc01234defg5
DATABRICKS_WSID=123123123123123
Note: It's recommended to create an .Renviron for each project. You can create an .Renviron within your user home directory if required.
Restarting your R session will allow those variables to be picked up by the {brickster} package.
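After restarting, a quick check along these lines can confirm the variables were picked up. This is a minimal base R sketch and not part of {brickster}: the helper name missing_env_vars() is hypothetical. It reports only which names are unset, so the token value is never printed.

```r
# Hypothetical helper (not part of {brickster}): return the names of
# environment variables that are unset or empty.
missing_env_vars <- function(vars) {
  # unset = NA distinguishes "not defined" from a defined value
  vals <- Sys.getenv(vars, unset = NA, names = FALSE)
  vars[is.na(vals) | !nzchar(vals)]
}

# DATABRICKS_TOKEN can be dropped from this check when using OAuth U2M
not_set <- missing_env_vars(c("DATABRICKS_HOST", "DATABRICKS_TOKEN"))
if (length(not_set) > 0) {
  message("Not set: ", paste(not_set, collapse = ", "))
}
```

If a variable is reported as not set, check that the .Renviron file is in the project root (or your home directory) and that the session was restarted.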
{brickster}
Authentication should now be possible without specifying the credentials in your R code. You can load {brickster} and list the clusters within the workspace using db_cluster_list(). To access the host and token, use db_host() and db_token() respectively.
library(brickster)
# using db_host() and db_token() to get credentials
clusters <- db_cluster_list(host = db_host(), token = db_token())
All {brickster} functions have their host/token parameters default to calling db_host()/db_token(), so the explicit calls can be omitted.
# all host/token parameters default to db_host()/db_token()
clusters <- db_cluster_list()
When using OAuth U2M authentication you don't define a token in .Renviron and therefore db_token() will return NULL.
There are two methods that {brickster} supports to simplify switching of credentials within an R project/session:
(1) .Renviron, where each additional set of credentials is differentiated via a suffix (e.g. DATABRICKS_TOKEN_DEV)
(2) A .databrickscfg file (primary method in Databricks CLI)
To differentiate between (1) and (2) the option use_databrickscfg is used. The following example shows how to switch the session to use .databrickscfg.
# will use the `DEFAULT` profile in `.databrickscfg`
options(use_databrickscfg = TRUE)
# values returned should be those in profile of `.databrickscfg`
db_host()
db_token()
The default behaviour is to read credentials from .Renviron. If you wish to change this, it's recommended to set the option within .Rprofile so that it's set during initialization of the R session.
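As a sketch, a project-level .Rprofile could contain the lines below. The option name comes from this document; placing the file in the project root is an assumption about your setup.

```r
# .Rprofile (project root): runs at the start of every R session
# read credentials from .databrickscfg instead of .Renviron
options(use_databrickscfg = TRUE)
```

Because .Rprofile is sourced at startup, the option is already in place before {brickster} is loaded.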
The db_profile option controls which profile's credentials are returned by db_host()/db_token()/db_wsid().
Profiles enable you to switch contexts between:
Different workspaces (e.g. development or production)
Different permissions (e.g. admin or restricted user)
This behaviour works when using credentials specified in either .Renviron or .databrickscfg:
# using .Renviron
db_host() # returns `DATABRICKS_HOST` (.Renviron)
# switch profile to 'prod'
options(db_profile = "prod")
db_host() # returns `DATABRICKS_HOST_PROD` (.Renviron)
# set back to default (NULL)
options(db_profile = NULL)
# use .databrickscfg
options(use_databrickscfg = TRUE)
db_host() # returns host from `DEFAULT` profile (.databrickscfg)
options(db_profile = "prod")
db_host() # returns host from `prod` profile (.databrickscfg)
It is expected that profiles in .Renviron
will adhere to the same naming convention as default but add an additional suffix.
Here is an example of an .Renviron file that has three profiles (default, dev, prod):
# default
DATABRICKS_HOST=xxxxxxx.cloud.databricks.com
DATABRICKS_TOKEN=dapixxxxxxxxxxxxxxxxxxxxxxxxx
DATABRICKS_WSID=123123123123123
# dev
DATABRICKS_HOST_DEV=xxxxxxx-dev.cloud.databricks.com
DATABRICKS_TOKEN_DEV=dapixxxxxxxxxxxxxxxxxxxxxxxxx
DATABRICKS_WSID_DEV=123123123123124
# prod
DATABRICKS_HOST_PROD=xxxxxxx-prod.cloud.databricks.com
DATABRICKS_TOKEN_PROD=dapixxxxxxxxxxxxxxxxxxxxxxxxx
DATABRICKS_WSID_PROD=123123123123125
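The suffix convention can be sketched in a few lines of base R. This is an illustration only, not {brickster}'s internal code: the helper profile_var() is hypothetical, and upper-casing the profile name is an assumption inferred from the dev/prod examples above.

```r
# Hypothetical helper, for illustration only: build the .Renviron variable
# name for a given profile (NULL means the default, unsuffixed name).
profile_var <- function(base, profile = getOption("db_profile")) {
  if (is.null(profile)) {
    return(base)
  }
  # assumption: the profile name is upper-cased to form the suffix
  paste(base, toupper(profile), sep = "_")
}

profile_var("DATABRICKS_HOST", NULL)    # default profile, no suffix
profile_var("DATABRICKS_TOKEN", "dev")  # dev profile
profile_var("DATABRICKS_WSID", "prod")  # prod profile
```

Keeping every profile on the same base names means a new profile only needs three extra lines in .Renviron.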
.databrickscfg
For details on configuring this file, refer to the Databricks CLI documentation.
There is only one {brickster} specific feature: the inclusion of wsid alongside host/token.
wsid is used by the connections pane integration in RStudio, as the underlying APIs require it.
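To show what such a file might look like, the sketch below writes an example to a temporary location, so an existing configuration file is left untouched. The section and key layout follows the Databricks CLI profile format. All values are placeholders reused from the .Renviron example, and the prod profile is only an example name. Check the Databricks CLI documentation for the exact host format your setup expects.

```r
# Example .databrickscfg contents with placeholder values; `wsid` is the
# {brickster}-specific key, the rest follows the Databricks CLI format.
cfg_lines <- c(
  "[DEFAULT]",
  "host = xxxxxxx.cloud.databricks.com",
  "token = dapixxxxxxxxxxxxxxxxxxxxxxxxx",
  "wsid = 123123123123123",
  "",
  "[prod]",
  "host = xxxxxxx-prod.cloud.databricks.com",
  "token = dapixxxxxxxxxxxxxxxxxxxxxxxxx",
  "wsid = 123123123123125"
)

# write to a temporary file rather than the real config location
cfg_path <- tempfile(fileext = ".databrickscfg")
writeLines(cfg_lines, cfg_path)

# print the file back to confirm its layout
cat(readLines(cfg_path), sep = "\n")
```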
Posit Workbench has a managed Databricks OAuth credentials feature, which allows users to sign into a Databricks workspace from the home page of Workbench when launching a session and then access Databricks resources as their own identity. When in an RStudio Pro session running on Posit Workbench with managed Databricks OAuth credentials selected, {brickster} functions that rely on db_host()/db_token() should work without any credentials being specified in your R code. See the code below as an example.
library(brickster)
db_cluster_list()
{brickster} will automatically detect when a session has Workbench managed OAuth credentials and then use the workbench profile defined in the .databrickscfg file at the location specified by DATABRICKS_CONFIG_FILE. Workbench generates this .databrickscfg file in a temporary directory, and it should not be modified directly.
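To see which configuration file a session points at, the variable can be inspected without opening or changing the file. This is a minimal base R sketch and does not use any {brickster} functions.

```r
# Inspect where the session's Databricks config file lives, without
# reading or modifying it. cfg_file is NA when the variable is not set.
cfg_file <- Sys.getenv("DATABRICKS_CONFIG_FILE", unset = NA)

if (is.na(cfg_file)) {
  message("DATABRICKS_CONFIG_FILE is not set in this session")
} else {
  message("Config file: ", cfg_file, " (exists: ", file.exists(cfg_file), ")")
}
```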
To use an alternative .databrickscfg file, a different profile, an alternative DATABRICKS_HOST environment variable, or a DATABRICKS_TOKEN environment variable, launch an RStudio Pro session without the Databricks managed credentials box selected.