- mutate method and modified saveSQLDataFrame accordingly.
- .join_union_prepare() function to have "mutate + rbind/join" work!
- saveSQLDataFrame returns SQLDataFrame constructed from new database and table.
- makeSQLDataFrame function added to read text files into SQLite database and return the SQLDataFrame constructed according to the database, table, and dbkey.
- *_join and union functions with utility function to avoid duplicate code.
- union()
- @dbconcatKey to be more efficient.
- .join_union_prepare for attaching database to current connection.
- rbind do take-one-at-a-time union.
- saveSQLDataFrame write unique index with dbkey(x). Save ridx(x) if input SQLDataFrame comes from rbind (with non-null ridx(x)).
- SQLDataFrame() constructor read ridx() if exists.
- dbtable() method. Only works for @tbldata with real database on disk. Doesn't work for lazy tbl.
- apply()?? (e.g., print...) row/colSums, etc.
- .join_union_prepare.
- overwrite = FALSE check dbname or dbtable? (in makeSQLDataFrame() and saveSQLDataFrame()). Use 2 arguments.
- add "index = TRUE" for saveSQLDF.
- mutate().
- 2 additional tables to add to database: 1) dbkey info. serialize(dbkey(x), NULL) to convert to "raw", in "BLOB" (long binary??).
- op_double that from join or union.
- saveSQLDataFrame? See how to retrieve the SQL index table and make use of it.
- .fun() to do complicated calls like $src$con@dbname.
- ident(dbtable(sdf)), ridx(sdf), normalizeRowindex(sdf) ...
- union, lazy? dbplyr::union.tbl_lazy
- printROWS for show method, subset again [1:nhead, ]
- .printROWS(), rewrite.
- .extract_tbl_rows_by_key,
- rbind
  1st trick: attach database as aux.
  2nd trick: insert into from aux. with queries. (lazy tbl from aux.)
  3rd trick: unique database table, updated indexes with returned sdf.
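
The first two rbind tricks can be sketched at the SQLite level. This is only an illustration using Python's sqlite3 module, not the package's R code. The table name "state", its columns, the helper names, and the use of INSERT OR IGNORE on a composite primary key (standing in for the "unique database table" of the 3rd trick) are all assumptions made for the example.

```python
import os
import sqlite3
import tempfile


def make_db(path, rows):
    """Create a small database file holding one hypothetical 'state' table."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE state (region TEXT, population TEXT, size REAL, "
        "PRIMARY KEY (region, population))"
    )
    con.executemany("INSERT INTO state VALUES (?, ?, ?)", rows)
    con.commit()
    con.close()


def append_from_aux(main_path, aux_path, table):
    """Attach aux_path to the connection on main_path and copy its rows over."""
    con = sqlite3.connect(main_path)
    try:
        # 1st trick: attach the second database file under the schema name "aux".
        con.execute("ATTACH DATABASE ? AS aux", (aux_path,))
        # 2nd trick: insert into the main table with a query over the aux table.
        # OR IGNORE skips rows whose primary key already exists (assumption).
        con.execute(
            f'INSERT OR IGNORE INTO main."{table}" SELECT * FROM aux."{table}"'
        )
        con.commit()
        con.execute("DETACH DATABASE aux")
        return con.execute(
            f'SELECT * FROM "{table}" ORDER BY region, population'
        ).fetchall()
    finally:
        con.close()


def demo():
    """Build two temporary database files and merge the second into the first."""
    with tempfile.TemporaryDirectory() as tmp:
        main_path = os.path.join(tmp, "main.db")
        aux_path = os.path.join(tmp, "aux.db")
        make_db(main_path, [("West", "Large", 1.0)])
        make_db(aux_path, [("West", "Large", 1.0), ("East", "Small", 2.0)])
        return append_from_aux(main_path, aux_path, "state")
```

Calling demo() merges the two files without ever loading the tables into memory as a whole data frame; the duplicated ("West", "Large") row is kept once.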
- saveSQLDataFrame create a new path if !exists()?
- sdf[list(), ]
- copy_to/db_insert_into/dbWriteTable ?? (saveSQLDataFrame print message).
- col subsetting using "select(col1, col2, ...)", translate into col id and pass into .extractCOLS.
- must realize the key columns. -- done!
- Add ROWNAMES(SQLDataFrame) for composite key, by concatenating key columns. -- done!
  (unite does not work on tbl_dbi directly, but paste works, or collect %>% unite)
- Add row subsetting using [list(key1 = c(...), key2 = c(...)), ], and translate into concatenated key values, then translate into numeric indexes (normalizeSingleBracketSubscript) for [ subsetting. -- done!
- write out subsetted SQL table? Write out into a new path, new database (local?) file. -- done!
- [, subsetting using filter(c1 = , c2 %in%, c3 > ...) or using "&"
- message for [numeric, ], to suggest language filtering.
- coercion of DataFrame/data.frame/ANY into SQLDataFrame.
- numeric calculations, e.g., sum(), max(), min, row/colSums, row/colMeans, row/colMin, row/colMaxs, apply, ... need to define.
- dbplyr::join.tbl_sql, left_join,
- reordering... rbind,SQLDataFrame? Maybe not, now one SQLDataFrame only corresponds to just one tbl_dbi object... Will there be any need for rbinding data from different database table?
- inst/test.db? only keep "colDatal" and rename as "colData"?? update testthat, examples... -- done!
- extractCOLS,SQLDataFrame, add generic in S4Vectors, update DelayedDataFrame and SQLDataFrame...

3/5/2019
- modify the "state" table, for unique "region+population".
- add "union" method for SDF, which returns union with unique rows with automatic sorting.
- reimplement "rbind" method, which extend "union" and update slots of "dbconcatKey" and "indexes".
- add @dbconcatKey slot which corresponds to @tblData (has '.0' for numeric columns).
  now each SQLDF has @dbconcatKey, which is heavy... But anyway it will have the key cols evaluated when [ subsetting. ?? connect the current slot with the
- ROWNAMES() applies ridx(x) to dbconcatKey(x), good for filter(, condition) where 'condition' has or doesn't have '.0' at the end. But for '[rnm, ]', has to include '.0' to match. Do not encourage.
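
The relationship between "union" (unique, sorted rows) and an "rbind" built on top of it can be sketched as follows. This is a hedged Python/sqlite3 illustration, not the package's implementation: the table layout, the function names union_rows and rbind_index, and the idea of recording rbind order as positions into the union result are assumptions drawn from the notes above.

```python
import sqlite3


def union_rows(rows_a, rows_b):
    """Return the SQL UNION of two row sets: duplicates dropped, rows sorted by key."""
    con = sqlite3.connect(":memory:")
    try:
        for name, rows in (("a", rows_a), ("b", rows_b)):
            con.execute(
                f"CREATE TABLE {name} (region TEXT, population TEXT, size REAL)"
            )
            con.executemany(f"INSERT INTO {name} VALUES (?, ?, ?)", rows)
        # UNION removes duplicate rows; the explicit ORDER BY makes the
        # sorting deterministic instead of relying on engine behaviour.
        return con.execute(
            "SELECT * FROM a UNION SELECT * FROM b ORDER BY region, population"
        ).fetchall()
    finally:
        con.close()


def rbind_index(rows_a, rows_b):
    """Sketch of rbind on top of union: keep the unioned rows plus a row index.

    The index lists, for every input row in rbind order, its position in the
    unioned table (0-based here; R would use 1-based positions).
    """
    unioned = union_rows(rows_a, rows_b)
    # Composite key = (region, population), mirroring "region+population".
    position = {(r[0], r[1]): i for i, r in enumerate(unioned)}
    ridx = [position[(r[0], r[1])] for r in list(rows_a) + list(rows_b)]
    return unioned, ridx
```

With this split, the stored table stays unique while the row index alone carries the repeated rows and the original order that rbind is expected to preserve.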
- includeKey slot.
- SQLDataFrame to keep key columns as fixed and show on the left-hand-side, with | in between to separate key columns with other columns. ncol, and colnames would only correspond to the non-key-columns.
- @includeKey slot.
- .wheredbkey()? -- keep for now
- dimnames, then colnames() would automatically work.
- dim, then nrow/ncol() would automatically work.
- extract_tbl_from_SQLDataFrame, .printROWS.
- [[ extract and realize from the non-key-columns.
- extractROWS, .extractCOLS_SQLDataFrame.
- [ subsetting, add to @indexes. only refer to the non-key columns.
- show method refer to rowRanges(airway) for formatting.
- [["key", or ["key", to return realized key value? single-key-SQLDataFrame only.
- @includeKey = TRUE by default.
- dbkey() returns key column name. wheredbkey() returns the positions of key column e.g., match(dbkey(), colnames(SQLDataFrame)).
- colnames(SQLDataFrame), check if @includeKey.
- update ncol(x), if !x@includeKey, return nc-1.
- "[,SQLDataFrame" always keep key column in @indexes[[2]].
- ! wheredbkey(x) %in% j: @indexes[[2]] by adding wheredbkey(x).
- @includeKey as FALSE.
- [i,j], "j" should correspond to the original col orders.
- "show, SQLDataFrame" Always show key column as first column. e.g., "key | original columns...". key column save once, show twice!
- show (don't show) key column after | if @includeKey is TRUE (FALSE).
- .extract_tbl_from_SQLDataFrame will always have key col in returned tbl. (no work needed)
- .extract_tbl_rows_by_key, (no work)
- .printROWS, print "key | original columns... ". check @includeKey(x) to see if print the key column after |.
- [j] list_style_subsetting returns SQLDataFrame. show doesn't need change.
- [, j] single column returns realized by default drop=TRUE. (no change need)
- DataFrame, save tbl_dbi object as one separate slot.
- show,SQLDataFrame to work as show,DataFrame. Ignore the existing internal functions from DataFrame, including extractROWS, as.list, lapply...

slots
- add extra slots @indexes in SQLDataFrame to save the row/col
indexes.
- ?? remove the @colnames slot? and keep colnames() accessor?
(consistent with DataFrame)
- renamed the SQLDataFrame@rownames slot into dbrownames.
validity check
- validity check for dbtable() name.
- validity check for the length of @indexes slot.
constructor
- update SQLDataFrame() constructor for @indexes slot.
- update SQLDataFrame() constructor with specified columns and error
message for "col.names" argument.
accessors
- update accessors of nrow(), ncol(), dim(), length(),
colnames(), rownames() to reflect the @indexes slot.
show
- add utility function .extract_tbl_from_SQLDataFrame() to return
tbl_dbi object with row/col filter/selection from @indexes.
- ??? add utility function .extract_tbl_rows_by_key() to extract
certain rows by key. Need to rewrite, by removing the call of
dbkey(x).
- update show,SQLDataFrame to reflect the row/col indexes.
[, [[
- add extractROWS,SQLDataFrame, with both input and output to be
SQLDataFrame object.
- define [[,SQLDataFrame to return SQLDataFrame object. Only for
single column extraction and do realize automatically. (works like
"drop=TRUE")
- define $,SQLDataFrame method, which calls [[,SQLDataFrame.
- define [,SQLDataFrame to return SQLDataFrame object by
adding/updating @indexes slot.
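
The idea of subsetting by updating stored row/col indexes, rather than touching the data, can be sketched like this. It is a minimal illustration in Python under stated assumptions: the LazyFrame class, its subset() and realize() methods, and the 0-based positions are hypothetical and only mirror the role of the @indexes slot described above.

```python
class LazyFrame:
    """Toy table that records row/col selections instead of copying data."""

    def __init__(self, columns, data, row_idx=None, col_idx=None):
        self._columns = list(columns)  # all column names of the backing table
        self._data = data              # backing rows, never modified
        self.row_idx = row_idx         # None means "all rows"
        self.col_idx = col_idx         # None means "all columns"

    def _rows(self):
        return list(range(len(self._data))) if self.row_idx is None else self.row_idx

    def _cols(self):
        return list(range(len(self._columns))) if self.col_idx is None else self.col_idx

    def subset(self, i=None, j=None):
        """Return a new LazyFrame whose indexes are composed with i and j.

        i and j are positions relative to the current (already subsetted) view.
        """
        rows, cols = self._rows(), self._cols()
        new_rows = rows if i is None else [rows[k] for k in i]
        new_cols = cols if j is None else [cols[k] for k in j]
        return LazyFrame(self._columns, self._data, new_rows, new_cols)

    @property
    def shape(self):
        # Dimensions reflect the stored indexes, not the backing table.
        return (len(self._rows()), len(self._cols()))

    @property
    def colnames(self):
        return [self._columns[c] for c in self._cols()]

    def realize(self):
        """Materialize the selected rows and columns, only when asked."""
        return [tuple(self._data[r][c] for c in self._cols()) for r in self._rows()]
```

For example, LazyFrame(["a", "b", "c"], rows).subset(i=[2, 0], j=[0, 2]).subset(i=[1]) never copies rows; it only composes the two selections, and the accessors report the dimensions of the final view.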