build_db: building a RDS database from SQL source data
In socioskop/grit: Function library to handle Danish health registry data efficiently for prediction purposes.

The health authority uses access-restricted SQL tables to deliver data to users. This is probably a link to underlying SAS data. But access is also restricted by server side memory. So pulling big data chunks often gives a 'java heap memory' error. To overcome this, we slice big data-sets in chunks by 1- or 2-level prefixes of table keys (cpr or record id) and then store these chunks in a local database made up of RDS files in a structure resembling the original SQL table structure. This solves the problem with java heap memory limits and it also enables us to have parallel access to data (which is not really feasible with one SQL connection).

build_db(
  projct,
  db_dir,
  prefix,
  prefix.add = NULL,
  mode,
  drv = RJDBC::JDBC("com.microsoft.sqlserver.jdbc.SQLServerDriver", "c:/sqljdbc42.jar"),
  conn_def = "jdbc:sqlserver://fm-sql;databaseName=Forsker;integratedSecurity=true"
)

`projct`	4-digit project number, assuming first four are zeros
`db_dir`	path to where local db should be written. Will overwrite db_dir
`prefix`	the initial number of characters to use for data slicing. Will be extended by one, if a first character in a key is constant across the table
`prefix.add`	add names of any tables that will need an extra character on the slicing prefix
`mode`	'clear' will overwrite everything again, 'fill' will fill those chunks that are expected based on SQL source, but missing in local db. If really big data generates java errors even when slicing, then this option will work for iterative db-building in a loop. Check log and data index carefully when using 'fill' option.
`drv`	Driver for the sql connection. Is a long URL with MS SQL parameters you get with data delivery.
`conn_def`	the 'conn' parameter in DBI-requests. This is the definition of the SQL connection. Also acquired when you get data from the Health Data Authority

this function will return a RDS database in a folder-file structure resembling the original SQL tables, but with all data content slices. Will also add a log file in the db_dir.

socioskop/grit documentation built on Dec. 23, 2021, 3:30 a.m.

socioskop/grit index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

socioskop/grit
Function library to handle Danish health registry data efficiently for prediction purposes.

build_db: building a RDS database from SQL source data
In socioskop/grit: Function library to handle Danish health registry data efficiently for prediction purposes.

Description

Usage

Arguments

Value

Related to build_db in socioskop/grit...

R Package Documentation

Browse R Packages

We want your feedback!

socioskop/grit Function library to handle Danish health registry data efficiently for prediction purposes.

build_db: building a RDS database from SQL source data In socioskop/grit: Function library to handle Danish health registry data efficiently for prediction purposes.

Description

Usage

Arguments

Value

Related to build_db in socioskop/grit...

R Package Documentation

Browse R Packages

We want your feedback!

socioskop/grit
Function library to handle Danish health registry data efficiently for prediction purposes.

build_db: building a RDS database from SQL source data
In socioskop/grit: Function library to handle Danish health registry data efficiently for prediction purposes.