RedshiftQuery: Query data from a Redshift database


Description

This function runs queries on Redshift via the UNLOAD command. The query result is unloaded to an S3 bucket and then read locally. The function handles connecting, querying, reading from S3 and disconnecting. The query can be passed either directly as a string or as a path to a file containing a SQL statement.
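To illustrate the unload step, here is a minimal sketch of how a SELECT statement can be wrapped in a Redshift UNLOAD command. The helper buildUnload, its arguments and the chosen UNLOAD options are assumptions made for illustration; the source does not show how RedshiftQuery builds its statement.

```r
# Hypothetical helper, not part of dafR: wraps a SELECT statement in a
# Redshift UNLOAD command that writes the result to an S3 prefix.
buildUnload <- function(query, s3Path, accessKey, secretKey) {
  # UNLOAD takes the query as a quoted string, so single quotes inside
  # the query are doubled
  escaped <- gsub("'", "''", query, fixed = TRUE)
  sprintf(
    "UNLOAD ('%s') TO '%s' ACCESS_KEY_ID '%s' SECRET_ACCESS_KEY '%s' DELIMITER ',' GZIP;",
    escaped, s3Path, accessKey, secretKey
  )
}

# Placeholder bucket path and keys, for illustration only
cat(buildUnload("select * from t where a = 'x'",
                "s3://my-bucket/out/", "KEY", "SECRET"), "\n")
```

The files written under the S3 prefix would then be downloaded and combined locally, which is the part RedshiftQuery is described as handling.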

The yaml file (see yamlConfig) must contain the host, dbname, S3 bucket name, user and password.
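As a rough sketch of that requirement, the snippet below checks a parsed configuration for missing entries. The group names ("warehouse", "exports"), the key names and the helper missingKeys are hypothetical; the source lists the required values but not the exact keys or how they are split between groups.

```r
# Assumed layout: a nested list, as a YAML parser would return it, with one
# named group per credential set. All names and values are placeholders.
config <- list(
  warehouse = list(host = "example-cluster.redshift.amazonaws.com",
                   dbname = "analytics",
                   user = "analyst",
                   password = "secret"),
  exports = list(bucket = "my-bucket")
)

# Hypothetical helper: returns the required keys that a group is missing
missingKeys <- function(config, group, required) {
  setdiff(required, names(config[[group]]))
}

missingKeys(config, "warehouse", c("host", "dbname", "user", "password"))
missingKeys(config, "exports", c("bucket", "user"))
```

Running a check like this before calling the function can surface an incomplete group early, instead of at connection time.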

Unlike PostgreSQLQuery, this function is intended for queries that may return a large amount of data or take a long time to complete.

Arguments

query

Character vector of length 1. Can be either a SQL query or a path to a text file containing a SQL query.

dbID

The name of the yaml group containing the database credentials

s3ID

The name of the yaml group containing the S3 bucket credentials

yamlConfig

The path to the yaml file

acceleration

Use S3 bucket acceleration

parallel

Use the parallel package for simultaneous connections to S3
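
The query argument accepts either SQL text or a file path. One way such an argument could be resolved is sketched below; resolveQuery is a hypothetical helper, and the source does not say how RedshiftQuery tells the two cases apart.

```r
# Hypothetical helper, not part of dafR: returns the SQL text for a query
# argument that is either a SQL string or a path to a file containing SQL.
resolveQuery <- function(query) {
  stopifnot(is.character(query), length(query) == 1)
  if (file.exists(query) && !dir.exists(query)) {
    # An existing file: read it and join its lines into one string
    paste(readLines(query, warn = FALSE), collapse = "\n")
  } else {
    # Anything else is treated as the SQL statement itself
    query
  }
}

# Usage with a temporary file and with a literal query
sqlFile <- tempfile(fileext = ".sql")
writeLines(c("SELECT 1", "FROM t"), sqlFile)
resolveQuery(sqlFile)
resolveQuery("SELECT 2")
```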

Value

The function returns a data.table object with the results of the query. If a timeout occurs, it instead returns a string with the path of the created files on the S3 bucket.
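
Because the return value can be either a table or a path string, calling code may want to check which one it received. A minimal sketch follows; describeResult is a hypothetical helper, and the example inputs stand in for real return values.

```r
# Hypothetical helper, not part of dafR: reports which kind of value
# a call returned.
describeResult <- function(result) {
  if (is.character(result)) {
    # Timeout case: the value is the S3 path of the unloaded files
    paste("Timed out; unloaded files are at", result)
  } else {
    # Normal case: a data.table, which is also a data.frame
    paste("Query returned", nrow(result), "rows")
  }
}

describeResult("s3://my-bucket/out/")
describeResult(data.frame(x = 1:3))
```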

Author(s)

Henrique Cabral, Ricardo Vladimiro and João Monteiro


rvladimiro/dafR documentation built on June 26, 2019, 4:37 a.m.