The RAthena
package aims to make it easier to work with data stored in AWS Athena
. RAthena
package attempts to provide three levels of interacting with AWS Athena:
AWS Athena
backend utilising the AWS SDK paws
. This includes configuring AWS Athena Work Groups
to assuming different roles within AWS
when connecting to AWS Athena
.RAthena
, by providing a DBI
interface to AWS Athena
. Users are able to interact with AWS Athena
utilising familiar functions and methods they have used for other Databases from R.dplyr
is coming more popular, RAthena
aims to give dplyr
a seamless interface into AWS Athena
.RAthena
:As RAthena
utilising the python AWS SDK boto3
, Python 3+ is required. Please install Python 3+ either by Python or Python Anaconda. To install RAthena
:
# cran version install.packages("RAthena") # Dev version remotes::install_github("dyfanjones/RAthena")
Next is to install Python boto3
. This can be done either by RAthena
's installation method:
RAthena::install_boto()
Or pip method:
pip install boto3
If RAthena
doesn't pick up boto3
after using install_boto()
, please consider specifying the python environment.install_boto()
creates RAthena
environment. This is either a Python virtual environment or a conda environment depending on your system.
library(DBI) # Specify python conda environment and force reticulate to use it reticulate::use_condaenv("RAthena", required = TRUE) # Or specify python virtual environment and force reticulate to use it reticulate::use_virtualenv("RAthena", required = TRUE) con <- dbConnect(RAthena::athena())
Note: Python environments are not required if boto3
is either in the root Python or if R and Python are in their own environment (for example conda environment).
To help with users wishing to run RAthena
in a docker, a simple docker file has been created here. To set up the docker please refer to link. For demo purposes we will use the example docker and run it locally:
# build docker image docker build . -t rathena # start container with aws credentials passed from local docker run \ -e AWS_ACCESS_KEY_ID="$(aws configure get aws_access_key_id)" \ -e AWS_SECRET_ACCESS_KEY="$(aws configure get aws_secret_access_key)" \ -e AWS_SESSION_TOKEN="$(aws configure get aws_session_token)" \ -e AWS_DEFAULT_REGION="$(aws configure get region)" \ -it rathena
When running RAthena
in the docker environment you might be required to let reticulate
know what python you are using.
reticulate::use_python("/usr/bin/python3") library(DBI) con <- dbConnect(RAthena::athena(), s3_staging_dir = "s3://mybucket/")
library(DBI) library(RAthena) con <- dbConnect(athena()) # list all current work groups in AWS Athena list_work_groups(con) # Create a new work group create_work_group(con, "demo_work_group", description = "This is a demo work group", tags = tag_options(key= "demo_work_group", value = "demo_01"))
library(DBI) con <- dbConnect(RAthena::athena()) # Get metadata dbGetInfo(con) # $profile_name # [1] "default" # # $s3_staging # [1] ######## NOTE: Please don't share your S3 bucket to the public # # $dbms.name # [1] "default" # # $work_group # [1] "primary" # # $poll_interval # NULL # # $encryption_option # NULL # # $kms_key # NULL # # $expiration # NULL # # $region_name # [1] "eu-west-1" # # $boto3 # [1] "1.11.5" # # $RAthena # [1] "1.7.1" # create table to AWS Athena dbWriteTable(con, "iris", iris) dbGetQuery(con, "select * from iris limit 10") # Info: (Data scanned: 860 Bytes) # sepal_length sepal_width petal_length petal_width species # 1: 5.1 3.5 1.4 0.2 setosa # 2: 4.9 3.0 1.4 0.2 setosa # 3: 4.7 3.2 1.3 0.2 setosa # 4: 4.6 3.1 1.5 0.2 setosa # 5: 5.0 3.6 1.4 0.2 setosa # 6: 5.4 3.9 1.7 0.4 setosa # 7: 4.6 3.4 1.4 0.3 setosa # 8: 5.0 3.4 1.5 0.2 setosa # 9: 4.4 2.9 1.4 0.2 setosa # 10: 4.9 3.1 1.5 0.1 setosa
library(dplyr) athena_iris <- tbl(con, "iris") athena_iris %>% select(species, sepal_length, sepal_width) %>% head(10) %>% collect() # Info: (Data scanned: 860 Bytes) # # A tibble: 10 x 3 # species sepal_length sepal_width # <chr> <dbl> <dbl> # 1 setosa 5.1 3.5 # 2 setosa 4.9 3 # 3 setosa 4.7 3.2 # 4 setosa 4.6 3.1 # 5 setosa 5 3.6 # 6 setosa 5.4 3.9 # 7 setosa 4.6 3.4 # 8 setosa 5 3.4 # 9 setosa 4.4 2.9 # 10 setosa 4.9 3.1
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.