Description Usage Arguments Examples
View source: R/redshift_query_n.R
Retrieves unloaded redshift query components from an AWS library. The process is parallelised to speed it up. This means much larger datasets can be queried and compiled. If parallel = FALSE parameter is not needed.
1 2 |
sql.string |
SQL Query string |
conn |
Redshift database connection object |
bucket |
AWS bucket to store unload files |
transform.function |
function object detailing additional transformations to perform on the data |
parallel |
TRUE or FALSE, do you want to do parallel processing. Defaults to FALSE |
cores |
Cores to utilise, defaults to NULL, if NULL will detect cores on machine and utilise all of them. If parallel = FALSE parameter is not needed. |
package.list |
packages packages needed in the transformation function. If parallel = FALSE parameter is not needed. NB: If parallel = FALSE the package must be loaded in the local environment. |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 | # Firstly you'll need to set up AWS permissions
Sys.setenv("AWS_ACCESS_KEY_ID" = "key",
"AWS_SECRET_ACCESS_KEY" = "key",
"AWS_DEFAULT_REGION" = "key")
# Define SQL string for database query
# e.g.
sql.string <- "SELECT * FROM some.table"
# Setup a database connection object using redshift.connect
conn <- redshift.connect("connection string")
# You'll also need an existing bucket where you can deposit interim files
# e.g.
bucket <- "an-aws-bucket"
# A transformation function used to make additional alterations on the data
transform.function <- function(x) {data.table(x)}
# Decide if you want to do parallel processing, if no
parallel <- FALSE
# if yes
parallel <- TRUE
# If parallel = TRUE
# Define number of threads, if undefined it'll detect the number of cores on the system and use these.
cores <- 10
# If parallel = TRUE
# A list of packages that must be used in the transformation function
# e.g.
package.list <- "data.table"
# Execute the function
data <- redshift_query_n(sql.string, conn, bucket, cores, transform.function, package.list)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.