knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(canny.tudor)
Can you find the type of a tibble::tibble
, meaning the types plural; that is,
one for each of the table columns? If you ask for base::typeof
you only see
"list"
.
Two columns: one of int
, the other of chr
.
tibble::tibble(a = 1:10, b=letters[1:10])
You can access the column names using base::names
and then apply the resulting
names to the table using the [[
sub-setting operator, turning each column to
its underlying vector each in turn. Tibbles recommend using [[
in newer code.
Finally base:typeof
tells you the vector type.
types.of <- function(x) sapply(names(x), function(y) typeof(x[[y]]))
It answers a named vector of character describing the type of each column paired with the name of the column.
library(canny.tudor) types.of(tibble::tibble(a=1:10, b=letters[1:10]))
Does the same approach work on non-tibble data frames? It does because it only relies on subset access operators.
df <- data.frame(x = 1.1, y = I(letters[1:10])) canny.tudor::types.of(df)
And finally, what about data frames lazily-loaded from a database? Do the underlying column types manifest themselves? And do they appear before or after collection of data by query execution? The simple answer is no, not without first collecting some rows. This includes no rows, or limiting rows to zero.
You might expect this result since the data frame does not yet exist, strictly speaking; only the promise exists until some SQL runs.
First create an SQLite database and set up a table, called df
short for data
frame. Fill it with some test data.
DBI::dbConnect(RSQLite::SQLite()) -> db DBI::dbWriteTable(db, "df", data.frame(a=1:10, b=letters[1:10])) dplyr::tbl(db, "df") dplyr::tbl(db, "df") %>% types.of()
Above uses R 4.1+ built-in pipe operator; also uses dplyr
to read back the
table lazily and derive the types. No joy. You cannot access lazy table columns
as you would an in-memory data frame.
Column subset answers NULL
because the frame is not yet available locally.
dplyr::tbl(db, "df")[["a"]] sapply(dplyr::tbl(db, "df") %>% class(), function(x) methods(class = x))
However, you can effectively select no rows to derive an empty data frame carrying empty column vectors but one empty vector for each column and with the appropriate column type.
dplyr::tbl(db, "df") %>% head(0) %>% dplyr::collect() %>% canny.tudor::types.of() DBI::dbDisconnect(db)
Joy!
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.