rs_cols_upsert_table: Upsert Redshift table by specifying a column list

Description Usage Arguments See Also Examples

View source: R/upsert2.R

Description

Upload a table to S3 and then load to Redshift, replacing the target value in existing rows that have the same keys, and inserting rows with new keys. New rows must match structure and column ordering of existing Redshift table.

Usage

1
2
3
4
5
6
7
8
rs_cols_upsert_table(dat, dbcon, table_name, values, keys, split_files,
  bucket = Sys.getenv("AWS_BUCKET_NAME"),
  region = Sys.getenv("AWS_DEFAULT_REGION"),
  access_key = Sys.getenv("AWS_ACCESS_KEY_ID"),
  secret_key = Sys.getenv("AWS_SECRET_ACCESS_KEY"),
  session_token = Sys.getenv("AWS_SESSION_TOKEN"),
  iam_role_arn = Sys.getenv("AWS_IAM_ROLE_ARN"), wlm_slots = 1,
  additional_params = "")

Arguments

dat

a data frame

dbcon

an RPostgres/RJDBC connection to the redshift server

table_name

the name of the table to update/insert

values

the columns that will be updated

keys

this optional vector contains the variables by which to upsert. If not defined, the upsert becomes an append.

split_files

optional parameter to specify amount of files to split into. If not specified will look at amount of slices in Redshift to determine an optimal amount.

bucket

the name of the temporary bucket to load the data. Will look for AWS_BUCKET_NAME on environment if not specified.

region

the region of the bucket. Will look for AWS_DEFAULT_REGION on environment if not specified.

access_key

the access key with permissions for the bucket. Will look for AWS_ACCESS_KEY_ID on environment if not specified.

secret_key

the secret key with permissions for the bucket. Will look for AWS_SECRET_ACCESS_KEY on environment if not specified.

session_token

the session key with permissions for the bucket, this will be used instead of the access/secret keys if specified. Will look for AWS_SESSION_TOKEN on environment if not specified.

iam_role_arn

an iam role arn with permissions fot the bucket. Will look for AWS_IAM_ROLE_ARN on environment if not specified. This is ignoring access_key and secret_key if set.

wlm_slots

amount of WLM slots to use for this bulk load http://docs.aws.amazon.com/redshift/latest/dg/tutorial-configuring-workload-management.html

additional_params

Additional params to send to the COPY statement in Redshift

See Also

https://docs.aws.amazon.com/redshift/latest/dg/merge-specify-a-column-list.html

http://docs.aws.amazon.com/redshift/latest/dg/tutorial-configuring-workload-management.html

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
library(DBI)

a=data.frame(a=seq(1,10000), b=seq(10000,1))
n=head(a,n=5000)
n$b=n$a
nx=rbind(n, data.frame(a=seq(99999:104000), b=seq(104000:99999)))

## Not run: 
con <- dbConnect(RPostgres::Postgres(), dbname="dbname",
host='my-redshift-url.amazon.com', port='5439',
user='myuser', password='mypassword',sslmode='require')

rs_cols_upsert_table(df=nx, dbcon=con, table_name='testTable',
bucket="my-bucket", split_files=4, values=c('b'), keys=c('a'))

## End(Not run)

sicarul/redshiftTools documentation built on July 23, 2021, 2:59 a.m.