R6Frame provides a R6 "frame" around a data.frame
(or data.table
), which allows one to create more complex objects/operations based on the underlying data. R6Frame is meant to be used as a template for R6 subclasses, and comes with S3 methods for most data.frame
operations included.
As an example, R6Frames can be used to handle labels for variables in a data.frame
. This allows us to preserve the labels more easily than using a label
attribute for each variable.
To subclass R6Frame, we can simply specify it under inherit
. In the code below, we create a minimal subclass with a private field called .label
and public getter/setter methods for the field. Because we want to be able to for instance rbind
and preserve labels, we also need a merge_vectors
function which prioritizes the label for the first R6Frame in e.g. rbind
:
# Subclass "Labelled_df" for R6Frame. library(R6Frame) is.labelled_df <- function(x) inherits(x, "Labelled_df") Labelled_df <- R6::R6Class("Labelled_df", inherit = R6Frame::R6Frame, # Private methods private = list(.label = NULL), # Public methods public = list( set_label = function(..., list = NULL) { # Set label for a variable. new <- merge_vectors(..., list, private$.label, default = self$names()) private$.label <- new invisible(self) }, get_label = function(which = NULL) { # Get label for a specified variable. (NULL = all labels.) res <- private$.label if (!is.null(res) && !is.null(which)) { res <- res[match_all(which, names(res))] if (!length(res)) res <- NULL } res } ) ) # Utility function that merges named vectors for private field (.label) in Labelled_df. # Duplicates are dropped from the end of the named vector (after unlisting). merge_vectors <- function(..., default = NULL) { dots <- list(...) if (!length(dots)) stop("No vectors supplied.") # Use unnamed default's as names for a vector of NA's. # (So we can pass colummnames as a default.) if (!is.null(default) && is.null(names(default))) { default <- setNames(rep(NA, length(default)), default) } res <- c(unlist(dots), default) nms <- names(res) if (is.null(nms) && any(is.na(nms)) && any(nms == "")) stop("All elements must be named.") # Return should be ordered by default if it exists res <- res[!duplicated(names(res), fromLast = FALSE)] if (!is.null(default)) { res[names(default)] } else { res } }
R6Frame$do()
Each function call on a R6Frame is sent to either $do()
or $do_merge()
(all binds and joins), which can be modified to update the .label
field everytime we perform an operation on the data. For this example, we need to update $do_merge()
for our labels to follow if we merge or join two or more Labelled_df
:
Labelled_df$set( "public", "do_merge", function(f, dots, env) { # Get existing labels from data lab <- lapply(dots, function(x) { if (is.labelled_df(x)) x$get_label() }) # Assign to private field (self$do and in turn self$update will remove duplicates) private$.label <- merge_vectors(private$.label, lab) # Call the default R6Frame method "do_merge" to complete the operation. super$do_merge(f, dots, env) } )
R6Frame$update()
Whenever the result of an operation is a new (or the same) data.frame
, R6Frame will call $update()
on the result. In order to drop labels as variables are removed for instance, or add a "slot" for a new variable, we have to change $update()
as follows:
Labelled_df$set( "public", "update", function(renamed = NULL) { # If renamed is not NULL, it is a vector with the same length as the data # and contains the new names for the variables. (Usually from dplyr methods.) if (!is.null(renamed)) { private$.label <- setNames(private$.label, renamed) } self$set_label() self } )
In the code below, we can see usage examples of our new Labelled_df
class and how the labels are kept updated through various function calls.
org <- data.frame("A" = c("Yes","No"), "B" = c(1, 2), stringsAsFactors = FALSE) # Initialize df <- Labelled_df$new(org) # Set label df$set_label(A = "Yes/No", B = "Numbers") df$get_label() # Add a column df[, "test"] <- "test" # Split and set label df2 <- df[, "test", drop = FALSE] df[["test"]] <- NULL df2$set_label(test = "Test label") # Join again df3 <- cbind(df2, df) df3$get_label()
To see more examples, check out the code at: https://github.com/itsdalmo/reporttoolDT
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.