copy_to_aws: Copy a Local Data Frame to a DBI Backend on AWS.


Description

copy_to_aws() assumes an AWS Redshift database that is accessible from S3. Unlike dbplyr::copy_to(), on which it is loosely modeled, it is optimized for large data frames. It is essentially a bulk loader of R data frames, or tbls, into a Redshift table, which may or may not already exist. For the initial release, the target table must already exist.

Usage

copy_to_aws(con, df, schema = "public", tname = deparse(substitute(df)),
  s3_bucket, s3_folder = "ds4ci_temp", overwrite = FALSE, clear_s3 = TRUE,
  temporary = FALSE, identity_column = FALSE, types = NULL, ...)

Arguments

con

A DBI connection (e.g. via odbc) to the AWS Redshift database to copy to

df

The data.frame or tbl_df to copy

schema

The schema in the database

tname

The table name in the schema

s3_bucket

The S3 bucket to use as a "buffer"

s3_folder

A folder in the bucket. Will be created if it doesn't exist already

overwrite

If TRUE, truncate the table before COPYing

clear_s3

If TRUE, delete the temporary S3 file before returning

temporary

Placeholder - not used in initial release

identity_column

Placeholder - not used in initial release

types

Placeholder - not used in initial release

Details

copy_to_aws() writes the data frame to a pipe-delimited, gzip-compressed file, uploads it to the specified S3 bucket and folder, and then loads it into the target table with the Redshift COPY command. The staged S3 file is deleted if no error is thrown.
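
The sequence is roughly the sketch below: write the data frame to a gzip-compressed, pipe-delimited file, push it to S3, and run COPY. This is a minimal illustration rather than the package's internal code; the aws.s3 upload call, the iam_role credential string, and the helper name stage_and_copy are assumptions for the example.

# Minimal sketch of the staged COPY workflow (assumed names throughout)
library(DBI)

stage_and_copy <- function(con, df, schema, tname, s3_bucket, s3_folder, iam_role) {
  # 1. Write the data frame as a pipe-delimited, gzip-compressed file
  local_file <- tempfile(fileext = ".gz")
  gz <- gzfile(local_file, "w")
  write.table(df, gz, sep = "|", row.names = FALSE, col.names = FALSE, quote = FALSE)
  close(gz)

  # 2. Upload the file to the S3 "buffer" (aws.s3 is one option, not necessarily
  #    what copy_to_aws() uses internally)
  s3_key <- paste(s3_folder, paste0(tname, ".gz"), sep = "/")
  aws.s3::put_object(file = local_file, object = s3_key, bucket = s3_bucket)

  # 3. Load the staged file into the existing Redshift table with COPY
  copy_sql <- sprintf(
    "COPY %s.%s FROM 's3://%s/%s' IAM_ROLE '%s' DELIMITER '|' GZIP",
    schema, tname, s3_bucket, s3_key, iam_role
  )
  DBI::dbExecute(con, copy_sql)
}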

Assumptions:

- The Redshift database can read (COPY) from the specified S3 bucket and folder.
- The target table already exists in the given schema (initial release).
- con is a live DBI/odbc connection to that Redshift database.

Value

If the COPY was successful, the number of rows copied. If unsuccessful, the Redshift error.
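
A minimal call might look like the following sketch; the DSN name, the data frame flights_df, the bucket, and the table name are assumptions for illustration, and the target table is assumed to already exist in the schema.

# Assumed DSN and objects, for illustration only
con <- DBI::dbConnect(odbc::odbc(), dsn = "my_redshift")
n <- copy_to_aws(con, flights_df, schema = "public", tname = "flights",
                 s3_bucket = "my-staging-bucket", overwrite = TRUE)
n   # number of rows copied if the COPY succeeded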

