View source: R/load_table_from_file.R
load_table_from_file | R Documentation |
load_table_from_file
loads a file data to a SQL table using specified variables or a YAML config file.
load_table_from_file(
conn,
server = NULL,
overall = T,
ind_yr = F,
years = NULL,
combine_yr = F,
config = NULL,
config_url = NULL,
config_file = NULL,
to_schema = NULL,
to_table = NULL,
server_path = NULL,
db_name = NULL,
azure = F,
azure_uid = NULL,
azure_pwd = NULL,
file_path = NULL,
field_term = NULL,
row_term = NULL,
first_row = 2,
truncate = T,
drop_index = T,
tablock = F,
test_schema = NULL,
use_sys = F
)
conn |
SQL server connection created using |
server |
Name of server being used (only applies if using a YAML file).
Useful if the same table is loaded to multiple servers but with different names
or schema. Note that this is different from the |
overall |
Load a single table instead of a table for each calendar year.
Mutually exclusive with |
ind_yr |
Load multiple tables, one for each calendar year, with a year suffix
on each table name (e.g., mcaid_elig_2014). Mutually exclusive with |
years |
Vector of individual years to make tables for (if not using YAML input). |
combine_yr |
Union year-specific files into a single table. Only applies
if ind_yr = T. Default is |
config |
Name of object in global environment that contains configuration
information. Use one of |
config_url |
URL of a YAML config file. Use one of |
config_file |
File path of a YAML config file. Use one of |
to_schema |
Name of the schema that data will be loaded to (if not using YAML input). |
to_table |
Name of the table that data will be loaded to (if not using YAML input). |
server_path |
Name of the SQL server to connect to (if not using YAML input). If using Azure, only seems to work if you specify an existing DSN connection. |
db_name |
Name of the database to use (if not using YAML input). |
azure |
Flag to indicate data are being loaded to an Azure SQL server. Default is |
azure_uid |
Username for connecting to Azure Active Directory. Only use if azure = T. |
azure_pwd |
Password for connecting to Azure Active Directory. Only use if azure = T. |
file_path |
File path of data to be loaded (if not using YAML input). If
ind_yr = T, this should be a named vector with the format
|
field_term |
Field terminator in the data (if not using YAML input). If
using ind_yr = T and the terminator differs between calendar years, this should
be a named vector with the format |
row_term |
Row terminator in the data (if not using YAML input). If
using ind_yr = T and the terminator differs between calendar years, this should
be a named vector with the format |
first_row |
Row number of the first line of data (if not using YAML input).
Default is |
truncate |
Truncate existing table prior to loading. Default is |
drop_index |
Drop any existing indices prior to loading data. This can speed
loading times substantially. Use |
tablock |
Logical ( |
test_schema |
Write to a temporary/development schema when testing out table creation.
Will use the to_schema (specified or in the YAML file) to make a new table name of
{to_schema}_{to_table}. Schema must already exist in the database. Most useful
when the user has an existing YAML file and does not want to overwrite it.
Only 1,000 rows will be loaded to each table. Default is |
use_sys |
If the sys package is installed, use this to call BCP and see a more
informative interface in the R console. Helpful for debugging when BSP doesn't work.
Default is |
This function loads a data file to an already existing SQL table using specified variables or a YAML configuration file. The function is essentially a wrapper for the SQL bulk copy program (BCP) utility using integrated authorization. Users can specify some input functions (e.g., to_table) and rely on the config file for the rest of the necessary information. For all arguments that could be specified or come from a YAML file, the hierarchy is specified > argument under server in YAML > argument not under server in YAML. Note that arguments that should not vary between servers (e.g., row_terminator) should not be listed under a server in the YAML file. Note: This function does not work to load to an Azure SQL DB (as of May 2022)
(Assume the indentation is appropriate)
to_schema: raw to_table: mcaid_elig *optional other components like a qa_schema and qa_table, index name, vars, etc.* server_path: KCITSQLABCD51 db_name: PHClaims file_path: //path123/importdata/Data/kc_elig_20210519.txt field_term: \t row_term: \n
(Assume the indentation is appropriate)
phclaims: to_schema: raw to_table: mcaid_elig server_path: KCITSQLABCD51 db_name: PHClaims hhsaw: to_schema: raw to_table: mciad_elig server_path: kcitazxyz20.database.windows.net db_name: hhs_analytics_workspace *optional other components like a qa_schema and qa_table, index name, vars, etc.* field_term: \t row_term: \n years: 2014 2015 2016 2014: file_path: //path123/importdata/Data/kc_elig_2014.txt field_term: \| row_term: \r 2015: file_path: //path123/importdata/Data/kc_elig_2015.txt 2016: file_path: //path123/importdata/Data/kc_elig_2016.txt field_term: \0 row_term: \0
## Not run:
load_table(conn = db_claims, server = "hhsaw", config = load_config)
load_table(conn = db_claims, server = "phclaims",
config_url = "https://raw.githubusercontent.com/PHSKC-APDE/claims_data/load_mcaid_raw.R",
overall = F, ind_yr = T)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.