copy_to_aws: Copy a Local Data Frame to a DBI Backend on AWS.


Description

copy_to_aws() assumes an AWS Redshift database that is accessible from S3. Unlike dbplyr::copy_to(), on which it is loosely modeled, it is optimized for large data frames. It is essentially a bulk loader of R data frames, or tbls, into a Redshift table, which may or may not already exist. For the initial release, the target table must already exist.

Usage

copy_to_aws(con, df, schema = "public", tname = deparse(substitute(df)),
  s3_bucket, s3_folder = "ds4ci_temp", overwrite = FALSE, clear_s3 = TRUE,
  temporary = FALSE, identity_column = FALSE, types = NULL, ...)

Arguments

con

A DBI connection (e.g. via odbc) to the AWS Redshift database to copy to

df

The data.frame or tbl_df to copy

schema

The schema in the database

tname

The table name in the schema

s3_bucket

The S3 bucket to use as a "buffer"

s3_folder

A folder in the bucket. Will be created if it doesn't exist already

overwrite

If TRUE, truncate the table before COPYing

clear_s3

If TRUE, delete the temporary S3 file before returning

temporary

Placeholder - not used in initial release

identity_column

Placeholder - not used in initial release

types

Placeholder - not used in initial release

Details

copy_to_aws() writes the data frame to a pipe-delimited, gzip-compressed file, uploads it to the specified S3 bucket and folder, and then loads it into the target table with the Redshift COPY command. The staged S3 file is deleted if no error is thrown.
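
The sequence is roughly the sketch below: write the data frame to a gzip-compressed, pipe-delimited file, push it to S3, and run COPY. This is a minimal illustration rather than the package's internal code; the aws.s3 upload call, the iam_role credential string, and the helper name stage_and_copy are assumptions for the example.

# Minimal sketch of the staged COPY workflow (assumed names throughout)
library(DBI)

stage_and_copy <- function(con, df, schema, tname, s3_bucket, s3_folder, iam_role) {
  # 1. Write the data frame as a pipe-delimited, gzip-compressed file
  local_file <- tempfile(fileext = ".gz")
  gz <- gzfile(local_file, "w")
  write.table(df, gz, sep = "|", row.names = FALSE, col.names = FALSE, quote = FALSE)
  close(gz)

  # 2. Upload the file to the S3 "buffer" (aws.s3 is one option, not necessarily
  #    what copy_to_aws() uses internally)
  s3_key <- paste(s3_folder, paste0(tname, ".gz"), sep = "/")
  aws.s3::put_object(file = local_file, object = s3_key, bucket = s3_bucket)

  # 3. Load the staged file into the existing Redshift table with COPY
  copy_sql <- sprintf(
    "COPY %s.%s FROM 's3://%s/%s' IAM_ROLE '%s' DELIMITER '|' GZIP",
    schema, tname, s3_bucket, s3_key, iam_role
  )
  DBI::dbExecute(con, copy_sql)
}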

Assumptions:

- The Redshift database can read (COPY) from the specified S3 bucket and folder.
- The target table already exists in the given schema (initial release).
- con is a live DBI/odbc connection to that Redshift database.

Value

If the COPY was successful, the number of rows copied. If unsuccessful, the Redshift error.
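
A minimal call might look like the following sketch; the DSN name, the data frame flights_df, the bucket, and the table name are assumptions for illustration, and the target table is assumed to already exist in the schema.

# Assumed DSN and objects, for illustration only
con <- DBI::dbConnect(odbc::odbc(), dsn = "my_redshift")
n <- copy_to_aws(con, flights_df, schema = "public", tname = "flights",
                 s3_bucket = "my-staging-bucket", overwrite = TRUE)
n   # number of rows copied if the COPY succeeded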

