knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Data validation rules control what constants can be entered into a cell, e.g. any whole number between 0 and 9, or one of several values from another part of the spreadsheet.
‘xlsx_validation()’ returns each of the data validation rules in an xlsx file, and the ranges of cells to which each rule applies.
Here is a rule that restricts input to integers between 0 and 9 inclusive, or no value (blank). If any other value is attempted, then an error message is displayed with the imaginative title "message title", the informative body text "message body", and a "stop" symbol.
library(tidyxl) library(dplyr) library(tidyr) examples <- system.file("extdata/examples.xlsx", package = "tidyxl") glimpse(xlsx_validation(examples)[1, ])
The gamut of possible rules is given in the examples for xlsx_validation()
.
as.data.frame(xlsx_validation(examples))
There are no built-in functions for joining ranges like
A1:D5,G8
to single cells likeB3
. For now, use the snippets in this section. In future I might develop a dplyr-like join function (this is hard currently because dplyr doesn't yet join on arbitrary functions, or even the standard inequalities like>=
). Help and advice would be gratefully accepted!
To join rules to cells, a naive method is to use the sheet
and ref
columns
to match the sheet
and address
columns to the output of xlsx_cells()
.
rules <- xlsx_validation(examples) cells <- filter(xlsx_cells(examples), row >= 106, col == 1) rules cells inner_join(rules, cells, by = c("sheet" = "sheet", "ref" = "address"))
Notice that only 9 cells were joined, even though 15 rules were defined. Surely
at least 15 cells ought to be joined? The reason why they are not is that the
cells for the other 6 rules don't exist -- rules can be defined for cells that
have no value, and cells with no value are not returned by xlsx_cells()
,
otherwise all r sprintf("%11.0f", 16384 * 1048576)
cells in a worksheet must
be returned.
A more subtle reason for certain cells not to have joined successfully is that
the ref
column of the rules sometimes refers to more than one cell, and can
even refer to several, non-contiguous ranges of cells. Specifically, the
seventh rule's ref
column has r rules$ref[7]
.
Special treatment is needed here. Ideally, some kind of join function would be
defined that can compare indidual cells with ranges. But I haven't written one,
so what follows is a workaround. First, the two ranges of cells must be
unnested
into A115
and A121:122
. Then the range A121:122
must be
'unranged' into A121
and A122
.
unrange <- function(x) { limits <- cellranger::as.cell_limits(x) rows <- seq(limits$ul[1], limits$lr[1]) cols <- seq(limits$ul[2], limits$lr[2]) rowcol <- expand.grid(rows, cols) cell_addrs <- cellranger::cell_addr(rowcol[[1]], rowcol[[2]]) cellranger::to_string(cell_addrs, fo = "A1", strict = FALSE) } unnest_ref <- function(x, ref) { UseMethod("unnest_ref") } unnest_ref.default <- function(x, ref_col = ref) { stopifnot(is.character(x), length(x) == 1L) refs <- unlist(strsplit(x, ",", fixed = TRUE)) unlist(lapply(refs, unrange)) } unrange("A121:A122") unnest_ref("A115,A121:A122")
The unnest_ref()
function can also be defined for whole data frames, unnesting
them by a column of references.
unnest_ref.data.frame <- function(x, ref_col) { ref <- rlang::enquo(ref_col) x[[rlang::quo_name(ref)]] <- lapply(x[[rlang::quo_name(ref)]], unnest_ref) tidyr::unnest(x, rlang::UQ(ref)) } (nested_rule <- slice(rules, 7)) unnest_ref(nested_rule, ref)
Finally the new data frame of rules can be joined to a data frame of cells in
any of the usual ways, via the sheet
and ref
columns.
Problems with this approach occur with rules that are defined over large ranges of cells: the 'unnesting' of those ranges results in very long vectors of individual cell addresses, or (worse) huge data frames of rules. Such cases are commonplace, because rules are often defined for entire columns of a spreadsheet, and a column has 1048576 rows.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.