dwh_table_upsert: Upsert DWH Table

View source: R/redshift_manipulation.R

Description

Upsert (update/insert) data using a data.frame and the provided keys. If an existing row in the target table matches the keys of a row in the supplied data frame, it is deleted before the insert; the subsequent insert therefore acts as the update in this context. This delete-then-insert pattern is the fastest way to do upserts with Amazon Redshift.
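
Conceptually, the delete-then-insert corresponds to SQL along the lines below. This is a minimal sketch, assuming the data has already been staged into a temporary table; the staging-table name and the exact DELETE ... USING / INSERT statements are illustrative, not the package's actual implementation.

library(DBI)

upsert_sketch <- function(con, table_name, staging_table, keys) {
  # Build "target.key = staging.key" conditions for every key column.
  join_cond <- paste(
    sprintf("%s.%s = %s.%s", table_name, keys, staging_table, keys),
    collapse = " AND "
  )
  # Delete target rows whose keys match rows in the staging table ...
  DBI::dbExecute(con, sprintf(
    "DELETE FROM %s USING %s WHERE %s",
    table_name, staging_table, join_cond
  ))
  # ... then insert the full staging table, which acts as the "update".
  DBI::dbExecute(con, sprintf(
    "INSERT INTO %s SELECT * FROM %s",
    table_name, staging_table
  ))
}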

Usage

dwh_table_upsert(df, table_name, keys, split_num = 64,
  bucket = Sys.getenv("STAGINGBUCKET_NAME"),
  region = Sys.getenv("STAGINGBUCKET_REGION"),
  iam_role_arn = Sys.getenv("REDSHIFT_IAM_ROLE"), access_key = "",
  secret_key = "", pcon = NULL)

Arguments

df

The data.frame with the data to insert; tibbles are also supported.

table_name

The name of the table in Amazon Redshift

keys

A vector with the key columns to join on for the update part of the upsert

split_num

The number of files to split the data.frame into. It should be a multiple of the number of slices in the DWH; you can check the current number by querying stv_slices (https://docs.aws.amazon.com/redshift/latest/dg/r_STV_SLICES.html), as shown in the sketch after this argument list

bucket

The S3 bucket to which the data is dumped before loading it into Amazon Redshift

region

The region where the bucket resides

iam_role_arn

The ARN of the IAM role set in Amazon Redshift to access the S3 bucket (you only need this or the access/secret keys)

access_key

The access key used by Amazon Redshift to access the S3 bucket (you only need this or the IAM role)

secret_key

The secret key used by Amazon Redshift to access the S3 bucket (you only need this or the IAM role)

pcon

Optionally, an existing connection to use; if not provided, a temporary connection is opened for the operation
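
To pick a suitable split_num, you can count the cluster's slices directly. A minimal sketch, assuming con is an existing DBI connection to the DWH:

# stv_slices has one row per slice in the cluster.
slices <- DBI::dbGetQuery(con, "SELECT COUNT(*) AS n FROM stv_slices")$n
split_num <- slices * 4  # a multiple of the slice count; the factor 4 is arbitrary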

Examples

a <- data.frame(column_a = c(1, 2, 3), column_b = c("a", "b", "c"), column_c = c("x", "y", "z"))
dwh_table_upsert(a, "test_table", keys = c("column_a", "column_b"))
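
A fuller call that reuses an existing connection and passes the staging details explicitly might look like the sketch below; the RPostgres driver, host, bucket, and role ARN are illustrative assumptions, not values required by the package.

# Open a connection to the DWH (driver and connection details are illustrative).
con <- DBI::dbConnect(
  RPostgres::Postgres(),
  host     = "my-cluster.example.redshift.amazonaws.com",
  port     = 5439,
  dbname   = "analytics",
  user     = Sys.getenv("DWH_USER"),
  password = Sys.getenv("DWH_PASSWORD")
)

# Reuse the connection via pcon and pass the S3 staging details explicitly.
dwh_table_upsert(
  a, "test_table", keys = c("column_a", "column_b"),
  split_num    = 64,
  bucket       = "my-staging-bucket",
  region       = "us-east-1",
  iam_role_arn = "arn:aws:iam::123456789012:role/redshift-s3-access",
  pcon         = con
)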
