iface: Construct an interface specification
In interfacer: Define and Enforce Contracts for Dataframes as Function Parameters

iface

R Documentation

Construct an interface specification

Description

An iface specification defines the expected structure of a dataframe, in terms of the column names, column types, grouping structure and uniqueness constraints that the dataframe must conform to. A dataframe can be tested for conformance to an iface specification using ivalidate().

Usage

iface(..., .groups = NULL, .default = NULL)

Arguments

`...`	The specification of the interface (see details), or an unnamed `iface` object to extend, or both.
`.groups`	either `FALSE` for no grouping allowed or a formula of the form `~ var1 + var2 + ...` which defines what columns must be grouped in the dataframe (and in which order). If `NULL` (the default) then any grouping is permitted. If the formula contains a dot e.g. `~ . + var1 + var2` then the grouping must include `var1` and `var2` but other groups are also allowed.
`.default`	a default value to supply if there is nothing given in a function parameter using the `iface` as a formal. This is either `NULL` in which case there is no default, `TRUE` in which case the default is a zero row dataframe conforming to the specification, or a provided dataframe, which is checked to conform, and used as the default.

Details

An iface specification is designed to be used to define the type of a parameter in a function. This is done by using the iface specification as the default value of the parameter in the function definition. The definition can then be validated at runtime by a call to ivalidate() inside the function.

When developing a function output an iface specification may also be used in ireturn() to enforce that the output of a function is correct.

iface definitions can be printed and included in roxygen2 documentation and help us to document input dataframe parameters and dataframe return values in a standardised way by using the ⁠@iparam⁠ roxygen2 tag.

iface specifications are defined in the form of a named list of formulae with the structure column_name = type ~ "documentation".

type can be one of anything, character, complete, date, default, double, enum, factor, finite, group_unique, in_range, integer, logical, not_missing, numeric, of_type, positive_double, positive_integer, proportion, unique_id (e.g. enum(level1,level2,...), in_range(min,max)) or alternatively anything that resolves to a function e.g. as.ordered.

If type is a function name, then the function must take a single vector parameter and return a single vector of the same size. The function must also return a zero length vector of an appropriate type if passed NULL.

type can also be a concatenation of rules separated by +, e.g. integer + group_unique for an integer that is unique within a group.

Value

the definition of an interface as a iface object

Examples


test_df = tibble::tibble(
  grp = c(rep("a",10),rep("b",10)), 
  col1 = c(1:10,1:10)
) %>% dplyr::group_by(grp)

my_iface = iface( 
  col1 = integer + group_unique ~ "an integer column",
  .default = test_df
)

print(my_iface)

# the function x defines a formal `df` with default value of `my_iface`
# this default value is used to validate the structure of the user supplied
# value when the function is called.
x = function(df = my_iface, ...) {
  df = ivalidate(df,...)
  return(df)
}

# this works
x(tibble::tibble(col1 = c(1,2,3)))

# this fails as x is of the wrong type
try(x(tibble::tibble(col1 = c("a","b","c"))))

# this fails as x has duplicates
try(x(tibble::tibble(col1 = c(1,2,3,3))))

# this gives the default value
x()


my_iface2 = iface(
  first_col = numeric ~ "column order example",
  my_iface, 
  last_col = character ~ "another col", .groups = ~ first_col + col1
)
print(my_iface2)



my_iface_3 = iface( 
  col1 = integer + group_unique ~ "an integer column",
  .default = test_df_2
)
x = function(d = my_iface_3) {ivalidate(d)}

# Doesn't work as test_df_2 hasn't been defined
try(x())

test_df_2 = tibble::tibble(
  grp = c(rep("a",10),rep("b",10)), 
  col1 = c(1:10,1:10)
) %>% dplyr::group_by(grp)

# now it works as has been defined
x()

# it still works as default has been cached.
rm(test_df_2)
x()

interfacer documentation built on April 4, 2025, 6:13 a.m.