knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(dplyr) library(matsbyname) library(matsindf) library(tidyr)

`matsindf_apply()`

is a powerful and versatile function
that enables analysis with lists and data frames by applying
`FUN`

in helpful ways.
The function is called `matsindf_apply()`

,
because it can be used to apply `FUN`

to a `matsindf`

data frame,
a data frame that contains matrices as individual entries in a data frame.
(A `matsindf`

data frame can be created by
calling `collapse_to_matrices()`

, as demonstrated below.)

But `matsindf_apply()`

can apply `FUN`

across much more:
data frames of single numbers,
lists of matrices,
lists of single numbers, and
individual numbers.
This vignette demonstrates `matsindf_apply()`

,
starting with simple examples and
proceeding toward sophisticated analyses.

The basis of all analyses conducted with `matsindf_apply()`

is a function (`FUN`

) to be applied across data
supplied in `.dat`

or `...`

.
`FUN`

must return a named list of variables as
its result.
Here is an example function that both adds and subtracts its arguments,
`a`

and `b`

, and
returns a list containing its result, `c`

and `d`

.

example_fun <- function(a, b){ return(list(c = matsbyname::sum_byname(a, b), d = matsbyname::difference_byname(a, b))) }

Similar to `lapply()`

and its siblings,
additional argument(s) to `matsindf_apply()`

include
the data over which `FUN`

is to be applied.
These arguments can, in the first instance,
be supplied as named arguments to the `...`

argument
of `matsindf_apply()`

.
All arguments in `...`

must be named.
The `...`

arguments to `matsindf_apply()`

are passed to `FUN`

according to their names.
In this case, the output of `matsindf_apply()`

is the the named list returned by `FUN`

.

matsindf_apply(FUN = example_fun, a = 2, b = 1)

Passing an additional argument (`z = 2`

)
causes an unused argument error,
because `example_fun`

does not have a `z`

argument.

tryCatch( matsindf_apply(FUN = example_fun, a = 2, b = 1, z = 2), error = function(e){e} )

Failing to pass a needed argument (`b`

)
causes an error that indicates the missing argument.

tryCatch( matsindf_apply(FUN = example_fun, a = 2), error = function(e){e} )

Alternatively, arguments to `FUN`

can be given
in a named list to `.dat`

, the first argument of `matsindf_apply()`

.
When a value is assigned to `.dat`

,
the return value from `matsindf_apply()`

contains all named variables in `.dat`

(in this case both `a`

and `b`

)
in addition to the results provided by `FUN`

(in this case both `c`

and `d`

).

matsindf_apply(list(a = 2, b = 1), FUN = example_fun)

Extra variables are tolerated in `.dat`

,
because `.dat`

is considered to be a store of data
from which variables can be drawn as needed.

matsindf_apply(list(a = 2, b = 1, z = 42), FUN = example_fun)

In contrast, arguments to `...`

are named explicitly by the user,
so including an extra argument in `...`

is considered an error,
as shown above.

If a named argument is supplied by both `.dat`

and `...`

,
the argument in `...`

takes precedence,
overriding the argument in `.dat`

.

matsindf_apply(list(a = 2, b = 1), FUN = example_fun, a = 10)

When supplying **both** `.dat`

and `...`

,
`...`

can contain named strings of length `1`

which are interpreted as mappings
from named items in `.dat`

to arguments in the signature of `FUN`

.
In the example below,
`a = "z"`

indicates that argument `a`

to `FUN`

should be supplied by item `z`

in `.dat`

.

matsindf_apply(list(a = 2, b = 1, z = 42), FUN = example_fun, a = "z")

If a named argument appears in both `.dat`

and the output of `FUN`

,
a name collision occurs in the output of `matsindf_apply()`

, and
a warning is issued.

tryCatch( matsindf_apply(list(a = 2, b = 1, c = 42), FUN = example_fun), warning = function(w){w} )

`FUN`

can accept more than just numerics.
`example_fun_with_string()`

accepts a character string and a numeric.
However, because `...`

argument that is a character string
of length `1`

has special meaning
(namely mapping variables in `.dat`

to arguments of `FUN`

),
passing a character string of length `1`

can cause an error.
To get around the problem, wrap the single string
in a list, as shown below.

example_fun_with_string <- function(str_a, b) { a <- as.numeric(str_a) list(added = matsbyname::sum_byname(a, b), subtracted = matsbyname::difference_byname(a, b)) } # Causes an error tryCatch( matsindf_apply(FUN = example_fun_with_string, str_a = "1", b = 2), error = function(e){e} ) # To solve the problem, wrap "1" in list(). matsindf_apply(FUN = example_fun_with_string, str_a = list("1"), b = 2) matsindf_apply(FUN = example_fun_with_string, str_a = list("1"), b = list(2)) matsindf_apply(FUN = example_fun_with_string, str_a = list("1", "3"), b = list(2, 4)) matsindf_apply(.dat = list(str_a = list("1"), b = list(2)), FUN = example_fun_with_string) matsindf_apply(.dat = list(m = list("1"), n = list(2)), FUN = example_fun_with_string, str_a = "m", b = "n")

`matsindf_apply()`

and data frames`.dat`

can also contain a data frame (or tibble),
both of which are fancy lists.
When `.dat`

is a data frame or tibble,
the output of `matsindf_apply()`

is a tibble, and
`FUN`

acts like a specialized `dplyr::mutate()`

,
adding new columns at the right of `.dat`

.

matsindf_apply(.dat = data.frame(str_a = c("1", "3"), b = c(2, 4)), FUN = example_fun_with_string) matsindf_apply(.dat = data.frame(str_a = c("1", "3"), b = c(2, 4)), FUN = example_fun_with_string, str_a = "str_a", b = "b") matsindf_apply(.dat = data.frame(m = c("1", "3"), n = c(2, 4)), FUN = example_fun_with_string, str_a = "m", b = "n")

Additional niceties are available when `.dat`

is a data frame or a tibble.
`matsindf_apply()`

works when the data frame is filled with single numeric values,
as is typical.

df <- data.frame(a = 2:4, b = 1:3) matsindf_apply(df, FUN = example_fun)

But `matsindf_apply()`

also works with `matsindf`

data frames,
data frames in which each cell of the data frame is filled with a single matrix.
To demonstrate use of `matsindf_apply()`

with a `matsindf`

data frame,
we'll construct a simple `matsindf`

data frame (`midf`

)
using functions in this package.

# Create a tidy data frame containing data for matrices tidy <- tibble::tibble(Year = rep(c(rep(2017, 4), rep(2018, 4)), 2), matnames = c(rep("U", 8), rep("V", 8)), matvals = c(1:4, 11:14, 21:24, 31:34), rownames = c(rep(c(rep("p1", 2), rep("p2", 2)), 2), rep(c(rep("i1", 2), rep("i2", 2)), 2)), colnames = c(rep(c("i1", "i2"), 4), rep(c("p1", "p2"), 4))) |> dplyr::mutate( rowtypes = case_when( matnames == "U" ~ "Product", matnames == "V" ~ "Industry", TRUE ~ NA_character_ ), coltypes = case_when( matnames == "U" ~ "Industry", matnames == "V" ~ "Product", TRUE ~ NA_character_ ) ) tidy # Convert to a matsindf data frame midf <- tidy |> dplyr::group_by(Year, matnames) |> collapse_to_matrices(rowtypes = "rowtypes", coltypes = "coltypes") |> tidyr::pivot_wider(names_from = "matnames", values_from = "matvals") # Take a look at the midf data frame and some of the matrices it contains. midf midf$U[[1]] midf$V[[1]]

With `midf`

in hand, we can demonstrate use of
`tidyverse`

-style
functional programming to perform
matrix algebra within a data frame.
The functions of the `matsbyname`

package
(such as `difference_byname()`

below)
can be used for this purpose.

result <- midf |> dplyr::mutate( W = difference_byname(transpose_byname(V), U) ) result result$W[[1]] result$W[[2]]

This way of performing matrix calculations works equally well
within a 2-row `matsindf`

data frame
(as shown above) or
within a 1000-row `matsindf`

data frame.

`matsindf_apply()`

Users can write their own functions using `matsindf_apply()`

.
A flexible `calc_W()`

function can be written as follows.

calc_W <- function(.DF = NULL, U = "U", V = "V", W = "W") { # The inner function does all the work. W_func <- function(U_mat, V_mat){ # When we get here, U_mat and V_mat will be single matrices or single numbers, # not a column in a data frame or an item in a list. if (length(U_mat) == 0 & length(V_mat == 0)) { # Tolerate zero-length arguments by returning a zero-length # a list with the correct name and return type. return(list(numeric()) |> magrittr::setnames(W)) } # Calculate W_mat from the inputs U_mat and V_mat. W_mat <- matsbyname::difference_byname( matsbyname::transpose_byname(V_mat), U_mat) # Return a named list. list(W_mat) |> magrittr::set_names(W) } # The body of the main function consists of a call to matsindf_apply # that specifies the inner function in the FUN argument. matsindf_apply(.DF, FUN = W_func, U_mat = U, V_mat = V) }

This style of writing `matsindf_apply()`

functions is incredibly versatile,
leveraging the capabilities of both the `matsindf`

and `matsbyname`

packages.
(Indeed, the `Recca`

package
uses `matsindf_apply()`

heavily and
is built upon the functions in the `matsindf`

and `matsbyname`

packages.)

Functions written like `calc_W()`

can operate in ways similar to `matsindf_apply()`

itself.
To demonstrate, we'll use `calc_W()`

in all the ways that `matsindf_apply()`

can be used,
going in the reverse order to our demonstration of the capabilities of `matsindf_apply()`

above.

`calc_W()`

can be used as a specialized `mutate`

function
that operates on `matsindf`

data frames.

midf |> calc_W()

The added column could be given a different name from the default ("`W`

")
using the `W`

argument.

midf |> calc_W(W = "W_prime")

As with `matsindf_apply()`

,
column names in `midf`

can be mapped to the arguments of `calc_W()`

by the arguments to `calc_W()`

.

midf |> dplyr::rename(X = U, Y = V) |> calc_W(U = "X", V = "Y")

`calc_W()`

can operate on lists of single matrices, too.
This approach works, because the default values for the
`U`

and `V`

arguments to `calc_W()`

are
"U" and "V", respectively.
The input list members (in this case `midf$U[[1]]`

and `midf$V[[1]]`

)
are returned with the output, because
`list(U = midf$U[[1]], V = midf$V[[1]])`

is passed to the `.dat`

argument
of `matsindf_apply()`

.

calc_W(list(U = midf$U[[1]], V = midf$V[[1]]))

It may be clearer to name the arguments as required by the `calc_W()`

function
without wrapping in a list first,
as shown below.
But in this approach, the input matrices are not returned with the output,
because arguments `U`

and `V`

are passed to the `...`

argument of `matsindf_apply()`

,
not the `.dat`

argument of `matsindf_apply()`

.

calc_W(U = midf$U[[1]], V = midf$V[[1]])

`calc_W()`

can operate on data frames containing single numbers.

data.frame(U = c(1, 2), V = c(3, 4)) |> calc_W()

Finally, `calc_W()`

can be applied to single numbers,
and the result is 1x1 matrix.

calc_W(U = 2, V = 3)

It is good practice to write internal functions
that tolerate zero-length inputs, as `calc_W()`

does.
Doing so, enables results from different calculations to be `rbind`

ed together.

calc_W(U = numeric(), V = numeric()) calc_W(list(U = numeric(), V = numeric())) res <- calc_W(list(U = c(2, 3, 4, 5), V = c(3, 4, 5, 6))) res0 <- calc_W(list(U = numeric(), V = numeric())) dplyr::bind_rows(res, res0)

This vignette demonstrated use of
the versatile `matsindf_apply()`

function.
Inputs to `matsindf_apply()`

can be

- single numbers,
- matrices, or
- data frames with appropriately-named columns.

`matsindf_apply()`

can be used for programming, and
functions constructed as demonstrated above
share characteristics with `matsindf_apply()`

:

- they can be used as specialized
`dplyr::mutate()`

operators, and - they can be applied to single numbers, matrices, or data frames with appropriately-named columns.

**Any scripts or data that you put into this service are public.**

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.