load_table_from_sql_f: Load data from one SQL table to another
In PHSKC-APDE/apde: Common functions used in APDE's work

load_table_from_sql_f

R Documentation

Load data from one SQL table to another

Description

load_table_from_sql loads data from one SQL table to another using specified variables or a YAML config file.

Usage

load_table_from_sql_f(
  conn,
  server = NULL,
  config = NULL,
  config_url = NULL,
  config_file = NULL,
  from_schema = NULL,
  from_table = NULL,
  to_schema = NULL,
  to_table = NULL,
  archive_schema = NULL,
  archive_table = NULL,
  truncate = F,
  truncate_date = F,
  auto_date = F,
  date_var = "from_date",
  date_cutpoint = NULL,
  drop_index = T,
  test_schema = NULL
)

Arguments

`conn`	SQL server connection created using `odbc` package.
`server`	Name of server being used (only applies if using a YAML file). Useful if the same table is loaded to multiple servers but with different names or schema.
`config`	Name of object in global environment that contains configuration information. Use one of `config`, `config_url`, or `config_file`. Should be in a YAML format with at least the following variables: from_schema, from_table, to_schema and to_table. All mandatory variables should all be nested under the server name if applicable, other variables should not.
`config_url`	URL of a YAML config file. Use one of `config`, `config_url`, or `config_file`. Note the requirements under `config`.
`config_file`	File path of a YAML config file. Use one of `config`, `config_url`, or `config_file`. Note the requirements under `config`.
`from_schema`	Name of the schema that data will be loaded from (if not using YAML input).
`from_table`	Name of the table that data will be loaded from (if not using YAML input).
`to_schema`	Name of the schema that data will be loaded to (if not using YAML input).
`to_table`	Name of the table that data will be loaded to (if not using YAML input).
`archive_schema`	Name of the schema where archived data live (if not using YAML input). Must be provided if using truncate_date (either directly or from the YAML).
`archive_table`	Name of the table where archived data live (if not using YAML input). Must be provided if using truncate_date (either directly or from the YAML).
`truncate`	Truncate existing table prior to loading. Default is FALSE.
`truncate_date`	Truncate existing table at a certain date. Assumes existing table has older data. Must provide archive_schema and archive_table values (either directly or from the YAML) if using this option because existing data needs to go somewhere.
`auto_date`	Attempt to use from_table data to ascertain the date to use for truncation cutoff.
`date_var`	Name of the date variable
`date_cutpoint`	Date at which to truncate existing data (if not using YAML input or auto_date).
`drop_index`	Drop any existing indices prior to loading data. This can speed loading times substantially. Use `add_index` to restore the index after. Default is TRUE.
`test_schema`	Write to a temporary/development schema when testing out table creation. Will use the to_schema (specified or in the YAML file) to make a new table name of {to_schema}_{to_table}. Schema must already exist in the database. Most useful when the user has an existing YAML file and does not want to overwrite it. Only 5,000 rows will be loaded to each table (4000 from the archive table if it exists and 1000 from the from_table). Default is NULL.

Details

This function loads data file to an already existing SQL table using specified variables or a YAML configuration file. The function is essentially a shortcut for SQL code to truncate a table and insert new rows, with added functionality for truncating at a certain date and loading from an archive table. Users can specify some input functions (e.g., to_table) and rely on the config file for the rest of the necessary information. For all arguments that could be specified or come from a YAML file, the hierarchy is specified > argument under server in YAML > argument not under server in YAML.

Example YAML file with no server or individual years

(Assume the indentation is appropriate)

from_schema: stage
from_table: mcaid_elig
to_schema: final
to_table: mcaid_elig
*optional other components like a qa_schema and qa_table, index name, vars, etc.*

Example YAML file with servers (phclaims, hhsaw) and individual years

(Assume the indentation is appropriate)

phclaims:
    from_schema: stage
    from_table: mcaid_elig
    to_schema: final
    to_table: mcaid_elig
hhsaw:
    from_schema: claims
    from_table: stage_mcaid_elig
    to_schema: claims
    to_table: final_mciad_elig
*optional other components like a qa_schema and qa_table, index name, vars, etc.*

Examples

## Not run: 
load_table(conn = db_claims, server = "hhsaw", config = load_config)
load_table(conn = db_claims, server = "phclaims", 
config_url = "https://raw.githubusercontent.com/PHSKC-APDE/claims_data/load_mcaid_raw.R",
overall = F, ind_yr = T)

## End(Not run)

PHSKC-APDE/apde documentation built on April 14, 2025, 10:46 a.m.