create_table: Create a SQL table

View source: R/create_table.R

create_tableR Documentation

Create a SQL table

Description

create_table creates a SQL table using specified variables or a YAML config file.

Usage

create_table(
  conn,
  server = NULL,
  config = NULL,
  config_url = NULL,
  config_file = NULL,
  to_schema = NULL,
  to_table = NULL,
  vars = NULL,
  overwrite = T,
  external = F,
  ext_data_source = NULL,
  ext_schema = NULL,
  ext_object_name = NULL,
  overall = T,
  ind_yr = F,
  years = NULL,
  years_vars = NULL,
  test_schema = NULL
)

Arguments

conn

SQL server connection created using odbc package.

server

Name of server being used (only applies if using a YAML file). Useful if the same table is loaded to multiple servers but with different names or schema.

config

Name of object in global environment that contains configuration information. Use one of config, config_url, or config_file. Should be in a YAML format with at least the following variables: to_schema, to_table, vars. to_schema and to_table should be nested under the server name if applicable. The vars variable should not be listed under the server name but should list all variables in the table along with data type, e.g., id_apde VARCHAR(20). If a server is being specified, the to_schema and to_table variables should be nested under that server name.

config_url

URL of a YAML config file. Use one of config, config_url, or config_file. Note the requirements under config.

config_file

File path of a YAML config file. Use one of config, config_url, or config_file. Note the requirements under config.

to_schema

Name of the schema to apply the index to (if not using YAML input).

to_table

Name of the table to apply the index to (if not using YAML input).

vars

Named vector of variables to create in the table (if not using YAML input). Should take the format c("a" = "VARCHAR(255)", "b" = "DATE", "c" = "INTEGER").

overwrite

Drop table first before creating it, if it exists. Default is TRUE.

external

Create external table. This option requires specifying data source details in the ext_data_source, ext_schema, and ext_object_name variables (either when calling the function or in the YAML file). If in the YAML file, these variables should not be nested under a server name. Default is FALSE.

ext_data_source

Name of the external data source (if not using YAML input). This pointer should already be established in the SQL database.

ext_schema

Name of the external data schema (if not using YAML input).

ext_object_name

Name of the external data table (if not using YAML input).

overall

Create single table instead of a table for each calendar year. Mutually exclusive with ind_yr option. Default is TRUE.

ind_yr

Create multiple tables with the same core structure, one for each calendar year, with a year suffix on each table name (e.g., mcaid_elig_2014). Mutually exclusive with overall option. If using this option, the list of years should be provided via the years argument or a years variable in the YAML file. If a given year has additional specific fields, these should be listed in the vars variable, nested under the calendar year (see example in description). Default is FALSE.

years

Vector of individual years to make tables for (if not using YAML input).

years_vars

List of named vectors of additional variables that are specific to a given year (if not using YAML input). Should take the format list("2014" = c("DUAL_ELIG" = "VARCHAR(255)", "TPL_FULL_FLAG" = "VARCHAR(255)"), "2016" = c("SECONDARY_RAC_CODE" = "INTEGER", "SECONDARY_RAC_NAME" = "VARCHAR(255)")) where the name matches the years to load..

test_schema

Use a temporary/development schema to test out table creation. Will use the to_schema (specified or in the YAML file) to make a new table name of {to_schema}_{to_table}. Schema must already exist in the database. Most useful when the user has an existing YAML file and does not want to overwrite it. Default is NULL.

Details

This function creates tables in a SQL database using specified variables or a YAML configuration file. Users can specify some input functions (e.g., to_table) and rely on the config file for the rest of the necessary information. For all arguments that could be specified or come from a YAML file, the hierarchy is specified > argument under server in YAML > argument not under server in YAML.

Example YAML file with no server or individual years

(Assume the indentation is appropriate)

to_schema: raw
to_table: mcaid_elig
*optional other components like a qa_schema and qa_table, file path to data, index name, etc.*
vars:
    CLNDR_YEAR_MNTH: INTEGER
    MEDICAID_RECIPIENT_ID: VARCHAR(255)
    RPRTBL_RAC_CODE: INTEGER
    RPRTBL_RAC_NAME: VARCHAR(255)
    RPRTBL_BSP_GROUP_CID: INTEGER
    RPRTBL_BSP_GROUP_NAME: VARCHAR(255)
    FROM_DATE: DATE
    TO_DATE: DATE

Example YAML file with servers (phclaims, hhsaw) and individual years

(Assume the indentation is appropriate)

phclaims:
    to_schema: raw
    to_table: mcaid_elig
hhsaw:
    to_schema: raw
    to_table: mciad_elig
*optional other components like a qa_schema and qa_table, file path to data, index name, etc.*
vars:
    CLNDR_YEAR_MNTH: INTEGER
    MEDICAID_RECIPIENT_ID: VARCHAR(255)
    RPRTBL_RAC_CODE: INTEGER
    RPRTBL_RAC_NAME: VARCHAR(255)
    RPRTBL_BSP_GROUP_CID: INTEGER
    RPRTBL_BSP_GROUP_NAME: VARCHAR(255)
    FROM_DATE: DATE
    TO_DATE: DATE
years:
    - 2014
    - 2015
    - 2016
2014:
    vars:
        DUAL_ELIG: VARCHAR(255)
        TPL_FULL_FLAG: VARCHAR(255)
2016:
    vars:
        SECONDARY_RAC_CODE: INTEGER
        SECONDARY_RAC_NAME: VARCHAR(255)

Examples

## Not run: 
create_table(conn = db_claims, server = "hhsaw", config = load_config)
create_table(conn = db_claims, server = "phclaims", 
config_url = "https://raw.githubusercontent.com/PHSKC-APDE/claims_data/load_mcaid_raw.R",
overall = F, ind_yr = T)

## End(Not run)


PHSKC-APDE/apde documentation built on April 14, 2025, 10:46 a.m.