dwh_table_upsert: Upsert (Update/Insert) data using a data.frame and provided keys
View source: R/redshift_manipulation.R
Description

Upsert (update/insert) data using a data.frame and the provided keys. If an existing row matches a key in the supplied data frame, it is deleted before the insert; the subsequent insert therefore acts as an update. This delete-then-insert approach is the fastest way to do upserts with Amazon Redshift.
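As an illustration of the delete-then-insert pattern, the work done is roughly equivalent to the sketch below. This is not the package's actual internals: the function name, staging table, and DBI connection are assumptions, and it presumes the new rows have already been loaded from S3 into a staging table via COPY.

library(DBI)

# Hypothetical sketch of a delete-then-insert upsert on Redshift.
upsert_sketch <- function(con, table_name, staging_table, keys) {
  # Join condition on the key columns
  join_cond <- paste(
    sprintf("%s.%s = %s.%s", table_name, keys, staging_table, keys),
    collapse = " AND "
  )
  dbBegin(con)
  # Delete target rows whose keys appear in the staging table (the "update" part)
  dbExecute(con, sprintf(
    "DELETE FROM %s USING %s WHERE %s", table_name, staging_table, join_cond
  ))
  # Insert all staged rows (the "insert" part)
  dbExecute(con, sprintf(
    "INSERT INTO %s SELECT * FROM %s", table_name, staging_table
  ))
  dbCommit(con)
}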
Usage

dwh_table_upsert(df, table_name, keys, split_num = 64,
  bucket = Sys.getenv("STAGINGBUCKET_NAME"),
  region = Sys.getenv("STAGINGBUCKET_REGION"),
  iam_role_arn = Sys.getenv("REDSHIFT_IAM_ROLE"), access_key = "",
  secret_key = "", pcon = NULL)
Arguments

df
    The data.frame with the data to insert; tibbles are also supported.

table_name
    The name of the table in Amazon Redshift.

keys
    Vector of the key columns to join by for the update part of the upsert.

split_num
    The number of files to split the data.frame into. It should be a multiple of the number of slices in the DWH; you can check the current count by querying stv_slices (https://docs.aws.amazon.com/redshift/latest/dg/r_STV_SLICES.html), as sketched after this list.

bucket
    The S3 bucket to which the data is dumped before loading it into Amazon Redshift.

region
    The region where the bucket resides.

iam_role_arn
    The IAM role ARN configured in Amazon Redshift to access the S3 bucket (only this or the access/secret keys are needed).

access_key
    The access key configured in Amazon Redshift to access the S3 bucket (only this or the IAM role is needed).

secret_key
    The secret key configured in Amazon Redshift to access the S3 bucket (only this or the IAM role is needed).

pcon
    Optionally, an existing connection to use; if not provided, a temporary connection is opened.
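To pick a sensible split_num, you can query stv_slices for the cluster's slice count. A minimal sketch, assuming `con` is an open DBI connection to the cluster:

library(DBI)

# Count the slices in the cluster so split_num can be a multiple of it;
# `con` is assumed to be an existing Redshift connection.
slices <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM stv_slices")$n
split_num <- slices * 4  # e.g. four files per slice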
Examples

a <- data.frame(column_a = c(1, 2, 3), column_b = c('a', 'b', 'c'), column_c = c('x', 'y', 'z'))
dwh_table_upsert(a, 'test_table', c('column_a', 'column_b'))
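A variant reusing an existing connection via pcon, with explicit S3 credentials. This is a sketch, not part of the package's documented examples: the driver, host, and credential values are placeholders.

# Reuse an existing connection instead of opening a temporary one;
# connection details and credentials below are placeholders.
con <- DBI::dbConnect(RPostgres::Redshift(),
                      host = "example.redshift.amazonaws.com",
                      dbname = "analytics", user = "user", password = "pass")
dwh_table_upsert(a, 'test_table', c('column_a', 'column_b'),
                 access_key = Sys.getenv("AWS_ACCESS_KEY_ID"),
                 secret_key = Sys.getenv("AWS_SECRET_ACCESS_KEY"),
                 pcon = con)
DBI::dbDisconnect(con)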