blocks_to_rowrecs_q | R Documentation |
Transform data facts from rows into additional columns using SQL and controlTable.
blocks_to_rowrecs_q(
tallTable,
keyColumns,
controlTable,
my_db,
...,
columnsToCopy = NULL,
tempNameGenerator = mk_tmp_name_source("mvtcq"),
strict = FALSE,
controlTableKeys = colnames(controlTable)[[1]],
checkNames = TRUE,
checkKeys = FALSE,
showQuery = FALSE,
defaultValue = NULL,
dropDups = TRUE,
temporary = FALSE,
resultName = NULL,
incoming_qualifiers = NULL,
outgoing_qualifiers = NULL,
executeQuery = TRUE
)
tallTable |
name of table containing data to be mapped (db/Spark data) |
keyColumns |
character list of column defining row groups |
controlTable |
table specifying mapping (local data frame) |
my_db |
db handle |
... |
force later arguments to be by name. |
columnsToCopy |
character list of column names to copy |
tempNameGenerator |
a tempNameGenerator from cdata::mk_tmp_name_source() |
strict |
logical, if TRUE check control table name forms |
controlTableKeys |
character, which column names of the control table are considered to be keys. |
checkNames |
logical, if TRUE check names |
checkKeys |
logical, if TRUE check keying of tallTable |
showQuery |
if TRUE print query |
defaultValue |
if not NULL literal to use for non-match values. |
dropDups |
logical if TRUE suppress duplicate columns (duplicate determined by name, not content). |
temporary |
logical, if TRUE make result temporary. |
resultName |
character, name for result table. |
incoming_qualifiers |
optional named ordered vector of strings carrying additional db hierarchy terms, such as schema. |
outgoing_qualifiers |
optional named ordered vector of strings carrying additional db hierarchy terms, such as schema. |
executeQuery |
logical, if TRUE execute the query and return result. |
This is using the theory of "fluid data"n (https://github.com/WinVector/cdata), which includes the principle that each data cell has coordinates independent of the storage details and storage detail dependent coordinates (usually row-id, column-id, and group-id) can be re-derived at will (the other principle is that there may not be "one true preferred data shape" and many re-shapings of data may be needed to match data to different algorithms and methods).
The controlTable defines the names of each data element in the two notations: the notation of the tall table (which is row oriented) and the notation of the wide table (which is column oriented). controlTable[ , 1] (the group label) cross colnames(controlTable) (the column labels) are names of data cells in the long form. controlTable[ , 2:ncol(controlTable)] (column labels) are names of data cells in the wide form. To get behavior similar to tidyr::gather/spread one builds the control table by running an appropriate query over the data.
Some discussion and examples can be found here: https://winvector.github.io/FluidData/FluidData.html and here https://github.com/WinVector/cdata.
wide table built by mapping key-grouped tallTable rows to one row per group
build_pivot_control_q
, blocks_to_rowrecs
if (requireNamespace("DBI", quietly = TRUE) &&
requireNamespace("RSQLite", quietly = TRUE)) {
my_db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
# pivot example
d <- data.frame(meas = c('AUC', 'R2'), val = c(0.6, 0.2))
rquery::rq_copy_to(my_db,
'd',
d,
temporary = TRUE)
cT <- build_pivot_control_q('d',
columnToTakeKeysFrom= 'meas',
columnToTakeValuesFrom= 'val',
my_db = my_db)
tab <- blocks_to_rowrecs_q('d',
keyColumns = NULL,
controlTable = cT,
my_db = my_db)
qlook(my_db, tab)
DBI::dbDisconnect(my_db)
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.