knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(dplyr)
library(tidyr)
library(matsbyname)
library(matsindf)
Matrices are important mathematical objects, and they often describe networks of flows among nodes. Example networks are given in the following table.
| System type   | Flows           | Nodes                   |
|:--------------|:----------------|:------------------------|
| Ecological    | nutrients       | organisms               |
| Manufacturing | materials       | factories               |
| Economic      | money           | economic sectors        |
| Energy        | energy carriers | energy conversion steps |
The power of matrices lies in their ability to organize network-wide calculations,
thereby simplifying the work of analysts who study entire systems.
However, three problems arise when performing matrix operations in R and other languages.
First, although built-in matrix functions ensure size conformity of matrix operands,
they do not respect the names of rows and columns (known as dimnames in R).
In the following example, U represents a use matrix
that contains the quantity of each product used by each industry,
and Y represents a final demand matrix
that contains the quantity of each product consumed by final demand industries.
If the rows and columns are not in the same order,
the sum of the matrices is nonsensical.
productnames <- c("p1", "p2")
industrynames <- c("i1", "i2")
U <- matrix(1:4, ncol = 2, dimnames = list(productnames, industrynames))
U
Y <- matrix(1:4, ncol = 2, dimnames = list(rev(productnames), rev(industrynames)))
Y
# This sum is nonsensical. Neither row nor column names are respected.
U + Y
As a result, analysts performing matrix operations must maintain strict order of rows and columns across all calculations.
# Make a new version of Y (Y2), this time with dimnames in same order as U
Y2 <- matrix(4:1, ncol = 2, dimnames = list(productnames, industrynames))
Y2
# Now the sum is sensible. Both row and column names are respected.
U + Y2
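The reordering can also be done by indexing with names rather than by rebuilding the matrix by hand. The following is a minimal base-R sketch (not a matsbyname function) that assumes both matrices carry exactly the same set of row and column names; the variable names are illustrative.

```r
productnames <- c("p1", "p2")
industrynames <- c("i1", "i2")
U <- matrix(1:4, ncol = 2, dimnames = list(productnames, industrynames))
Y <- matrix(1:4, ncol = 2, dimnames = list(rev(productnames), rev(industrynames)))

# Reorder Y so that its rows and columns follow the order used by U.
Y_aligned <- Y[rownames(U), colnames(U)]

# The element-wise sum now pairs entries that share row and column names.
name_aligned_sum <- U + Y_aligned
name_aligned_sum
```

This only works when the name sets match; indexing by a name that is absent from Y raises a "subscript out of bounds" error.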
In many cases, operand matrices may have different numbers or different names of rows or columns. This situation can occur when, for example, products or industries change across time periods. When performing matrix operations, rows or columns of zeros must be added to ensure name conformity.
Y3 <- matrix(5:8, ncol = 2, dimnames = list(c("p1", "p3"), c("i1", "i3")))
Y3
# Nonsensical because neither row names nor column names are respected.
# The "p3" rows and "i3" columns of Y3 have been added to
# "p2" rows and "i2" columns of U.
# Row and column names for the sum are taken from the first operand (U).
U + Y3
# Rather, need to insert missing rows in both U and Y before summing.
U_2000 <- matrix(c(1, 3, 0,
                   2, 4, 0,
                   0, 0, 0),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(c("p1", "p2", "p3"), c("i1", "i2", "i3")))
Y_2000 <- matrix(c(5, 0, 7,
                   0, 0, 0,
                   6, 0, 8),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(c("p1", "p2", "p3"), c("i1", "i2", "i3")))
U_2000
Y_2000
U_2000 + Y_2000
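The manual padding shown above can be generalized. The sketch below uses a hypothetical helper, pad_with_zeros() (not part of matsbyname), to embed each operand in a zero matrix that spans the union of the row and column names. It is an illustration of the bookkeeping involved, under the assumption that every matrix has complete dimnames.

```r
# Hypothetical helper (not a matsbyname function): embed m in a zero matrix
# whose rows and columns are given by the supplied name vectors.
pad_with_zeros <- function(m, rows, cols) {
  out <- matrix(0, nrow = length(rows), ncol = length(cols),
                dimnames = list(rows, cols))
  out[rownames(m), colnames(m)] <- m
  out
}

U <- matrix(1:4, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2")))
Y3 <- matrix(5:8, ncol = 2, dimnames = list(c("p1", "p3"), c("i1", "i3")))

# Union of names, sorted so that both operands share one ordering.
all_rows <- sort(union(rownames(U), rownames(Y3)))
all_cols <- sort(union(colnames(U), colnames(Y3)))

padded_sum <- pad_with_zeros(U, all_rows, all_cols) +
  pad_with_zeros(Y3, all_rows, all_cols)
padded_sum
```

Every operation that combines two matrices would need this padding step, which is the burden described in the text.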
The analyst's burden is cumbersome. But worse problems await.
Second, respecting names (and adding rows and columns of zeros) can lead to an inability to invert matrices downstream, as shown in the following example.
# The original U matrix is invertible.
solve(U)
# The version of U that contains zero rows and columns (U_2000)
# is singular and cannot be inverted.
tryCatch(solve(U_2000), error = function(err) {print(err)})
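One way around the singularity is to drop the zero-filled rows and columns before inverting. The following base-R sketch shows the idea; the names U_padded and U_clean are illustrative, and the approach assumes that only entirely zero rows and columns should be removed.

```r
# A 3 x 3 matrix whose third row and third column contain only zeros.
U_padded <- matrix(c(1, 3, 0,
                     2, 4, 0,
                     0, 0, 0),
                   ncol = 3, byrow = TRUE,
                   dimnames = list(c("p1", "p2", "p3"), c("i1", "i2", "i3")))

# Keep only rows and columns that have at least one non-zero entry.
keep_rows <- rowSums(U_padded != 0) > 0
keep_cols <- colSums(U_padded != 0) > 0
U_clean <- U_padded[keep_rows, keep_cols, drop = FALSE]

# The cleaned matrix is the original 2 x 2 block and can be inverted.
U_clean_inverse <- solve(U_clean)
U_clean_inverse
```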
Third, matrix functions provided by R and other languages
do not ensure type conformity for the operands of matrix algebra functions.
In matrix multiplication, for example,
the columns of the multiplicand
must contain the same type of information as the rows of the multiplier.
If the columns of A are countries,
then the rows of B must also be countries
(and in the same order)
if A %*% B is to make sense.
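To make the idea of type conformity concrete, the sketch below attaches type labels to matrices as attributes and checks them before multiplying. The helper typed_product() and the attribute names "rowtype" and "coltype" are assumptions chosen for illustration; this is not how any particular package implements the check, and it does not verify the order of names.

```r
# Hypothetical helper: multiply only when the "coltype" attribute of a
# matches the "rowtype" attribute of b.
typed_product <- function(a, b) {
  if (!identical(attr(a, "coltype"), attr(b, "rowtype"))) {
    stop("column type of a must equal row type of b")
  }
  a %*% b
}

A <- matrix(1:4, ncol = 2, dimnames = list(c("p1", "p2"), c("c1", "c2")))
attr(A, "rowtype") <- "Products"
attr(A, "coltype") <- "Countries"

B <- matrix(5:8, ncol = 2, dimnames = list(c("c1", "c2"), c("s1", "s2")))
attr(B, "rowtype") <- "Countries"
attr(B, "coltype") <- "Sectors"

# Types agree (Countries with Countries), so the product is computed.
AB <- typed_product(A, B)
AB

# Types disagree (Sectors versus Products), so an error is raised and caught.
BA_result <- tryCatch(typed_product(B, A),
                      error = function(err) conditionMessage(err))
BA_result
```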
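matsbyname

The matsbyname package automatically addresses all three problems above.
It performs smart matrix operations that respect row and column names, insert rows and columns of zeros where an operand lacks them, can remove zero-filled rows and columns before inversion, and check row and column types.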
These features are available without analyst intervention, as shown in the following example.
# Same as U + Y2, without needing to create Y2.
sum_byname(U, Y)
# Same as U_2000 + Y_2000, but U and Y3 are unmodified.
sum_byname(U, Y3)
# Eliminate zero-filled rows and columns. Same result as solve(U).
U_2000 %>%
  clean_byname(margin = c(1, 2), clean_value = 0) %>%
  solve()
In addition to sum_byname() and clean_byname(),
the matsbyname package contains many other matrix algebra functions
that respect the names of rows and columns.
Commonly used functions are
sum_byname(), difference_byname(), hadamardproduct_byname(), matrixproduct_byname(),
quotient_byname(), rowsums_byname(), colsums_byname(), invert_byname(), and transpose_byname().
The full list of functions can be found by running ?matsbyname and clicking the Index link.
Furthermore, matsbyname works well with its sister package, matsindf.
matsindf creates data frames whose entries are not numbers but entire matrices,
thereby enabling the use of matsbyname functions in tidyverse functional programming.
When used together, matsbyname and matsindf allow analysts to wield simultaneously
the power of both matrix mathematics and tidyverse functional programming.
This vignette demonstrates the power of matrix mathematics performed by name.
matsbyname features

The matsbyname package has several features
that both simplify analyses and ensure their correctness.
In the preceding examples, row and column names were provided by the
dimnames argument to the matrix() function.
However, matsbyname provides the
setcolnames_byname() and setrownames_byname() functions to perform the same tasks
using the pipe operator (%>% or |>).
U_2 <- matrix(1:4, ncol = 2) %>%
  setrownames_byname(productnames) %>%
  setcolnames_byname(industrynames)
U_2
Row and column types can be understood by analogy:
row and column types are to matrices in matrix algebra as
units are to scalars in scalar algebra.
Just as careful tracking of units is necessary in scalar calculations,
careful tracking of row and column types is necessary in matrix operations.
Because matsbyname keeps track of row and column types automatically,
much of the burden of dealing with row and column types is removed from the analyst.
Row and column types are character strings stored
as attributes of the matrix object, and
matsbyname functions ensure correctness
of matrix operations by checking row and column types,
throwing errors if needed.
Row and column types can be set by the functions
setrowtype() and setcoltype()
and retrieved by the functions
rowtype() and coltype().
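As a brief illustration of the setter and getter functions just named, the sketch below sets types on a small matrix and reads them back. The matrix M and its names are illustrative.

```r
library(matsbyname)

M <- matrix(1:4, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2")))
M <- setrowtype(M, "Products")
M <- setcoltype(M, "Industries")

# Retrieve the types that were just set.
rowtype(M)
coltype(M)
```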
Consider matrices A, B, and C:
A <- matrix(1:4, ncol = 2) %>%
  setrownames_byname(productnames) %>%
  setcolnames_byname(industrynames) %>%
  setrowtype("Products") %>%
  setcoltype("Industries")
A
B <- matrix(8:5, ncol = 2) %>%
  setrownames_byname(productnames) %>%
  setcolnames_byname(industrynames) %>%
  setrowtype("Products") %>%
  setcoltype("Industries")
B
C <- matrix(1:4, ncol = 2) %>%
  setrownames_byname(industrynames) %>%
  setcolnames_byname(productnames) %>%
  setrowtype("Industries") %>%
  setcoltype("Products")
C
B can be added to A, because row and column types are identical.
sum_byname(A, B)
However, C cannot be added to A (or B), because row and column types disagree.
tryCatch(sum_byname(A, C), error = function(err){print(err)})
In this case, a sum is possible if C is transposed prior to adding to A, because the row and column types of A and the transpose of C agree.
sum_byname(A, transpose_byname(C))
Matrices A and B can be element-multiplied and element-divided for the same reason they can be summed: row and column types agree.
hadamardproduct_byname(A, B)
quotient_byname(A, B)
Note that A and C can be matrix-multiplied,
because the column type of A and the row type of C
are identical (Industries).
The result is a Products-by-Products matrix.
matrixproduct_byname(A, C)
However, A and B cannot be matrix-multiplied, because
the column type of A (Industries) and
the row type of B (Products)
are different.
tryCatch(matrixproduct_byname(A, B), error = function(err){print(err)})
Analysts are encouraged
to set row and column types on matrices,
thereby taking advantage of matsbyname's type-tracking feature
to improve their matrix-based analyses.
*_byname functions work well with lists

Another feature of the matsbyname package
is that it works
when arguments to functions are lists of matrices,
returning lists as appropriate.
sum_byname(A, list(B, B))
hadamardproduct_byname(list(A, A), B)
matrixproduct_byname(list(A, A), list(C, C))
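For intuition, this list behavior resembles pairwise mapping in base R. The sketch below is only an analogy using Map(), not a description of how matsbyname is implemented, and it ignores name and type checking entirely.

```r
A <- matrix(1:4, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2")))
B <- matrix(8:5, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2")))

# Map() applies `+` pairwise. The length-1 list is recycled against the
# length-2 list, so A is added to each copy of B.
sums <- Map(`+`, list(A), list(B, B))
length(sums)
sums[[1]]
```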
matsbyname works well with matsindf

The matsindf package provides functions that collapse
tidy data frames of matrix elements into data frames of matrices.
Data frames of matrices,
such as those created by matsindf,
are like magic spreadsheets
in which single cells contain entire matrices.
The following example demonstrates an approach to creating a data frame of matrices.
tidy <- data.frame(
  matrix = c("A", "A", "A", "A", "B", "B", "B", "B"),
  row = c("p1", "p1", "p2", "p2", "p1", "p1", "p2", "p2"),
  col = c("i1", "i2", "i1", "i2", "i1", "i2", "i1", "i2"),
  vals = c(1, 3, 2, 4, 8, 6, 7, 5)
) %>%
  mutate(
    # Rows are products (p1, p2) and columns are industries (i1, i2),
    # matching matrices A and B defined earlier.
    rowtype = "Products",
    coltype = "Industries"
  )
tidy
mats <- tidy %>%
  group_by(matrix) %>%
  matsindf::collapse_to_matrices(matnames = "matrix", matvals = "vals",
                                 rownames = "row", colnames = "col",
                                 rowtypes = "rowtype", coltypes = "coltype") %>%
  rename(
    matrix.name = matrix,
    matrix = vals
  )
mats
mats$matrix[[1]]
mats$matrix[[2]]
matsbyname with matsindf

Because matsbyname works well with lists and
the matsindf package can create data frames of matrices,
matsbyname functions can be used with
tidyr and dplyr functions
(such as spread and mutate)
to perform matrix algebra within data frames of matrices.
A single matsbyname instruction performs the same operation
on all rows of a matsindf data frame.
Loops begone!
result <- mats %>%
  tidyr::spread(key = matrix.name, value = matrix) %>%
  # Duplicate the row to demonstrate byname operating simultaneously
  # on all rows of the data frame.
  dplyr::bind_rows(., .) %>%
  dplyr::mutate(
    # Create a column of constants.
    c = RCLabels::make_list(x = 1:2, n = 2, lenx = 2),
    # Sum all rows of the data frame with a single instruction.
    sum = sum_byname(A, B),
    # Multiply matrices in the sum column by corresponding constants in the c column.
    product = hadamardproduct_byname(c, sum)
  )
result
result$sum[[1]]
result$sum[[2]]
result$product[[1]]
result$product[[2]]
matsbyname and matsindf

A suggested analysis workflow is as follows:

1. Organize the data in a tidy data frame of matrix elements, similar to tidy above.
2. Use matsindf::collapse_to_matrices() to create a data frame of matrices with columns for matrix names and matrices themselves, similar to mats above.
3. tidyr::pivot_wider() the matrices to obtain a data frame with columns for each matrix, similar to result above.
4. Perform matrix algebra on the columns of matrices using *_byname functions.
5. tidyr::pivot_longer() the columns into a data frame with a single column of matrices.
6. Use matsindf::expand_to_tidy() to create a tidy data frame of matrix elements.
7. tidyr::pivot_wider() the data as necessary.

For more information and a detailed example of this workflow,
see the vignette for the matsindf package.
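The first six steps of the workflow can be sketched end to end as follows. This is an illustrative outline rather than a definitive implementation: the data frame tidy_in and the column names are made up for the example, and the argument names passed to expand_to_tidy() are assumed to mirror those of collapse_to_matrices().

```r
library(dplyr)
library(tidyr)
library(matsbyname)
library(matsindf)

# Step 1: a tidy data frame of matrix elements (illustrative data).
tidy_in <- data.frame(
  matrix = rep(c("A", "B"), each = 4),
  row = rep(c("p1", "p1", "p2", "p2"), times = 2),
  col = rep(c("i1", "i2"), times = 4),
  vals = c(1, 3, 2, 4, 8, 6, 7, 5),
  rowtype = "Products",
  coltype = "Industries"
)

tidy_out <- tidy_in %>%
  # Step 2: one row per matrix, with matrices stored in the vals column.
  group_by(matrix) %>%
  collapse_to_matrices(matnames = "matrix", matvals = "vals",
                       rownames = "row", colnames = "col",
                       rowtypes = "rowtype", coltypes = "coltype") %>%
  ungroup() %>%
  # Step 3: one column per matrix.
  pivot_wider(names_from = matrix, values_from = vals) %>%
  # Step 4: matrix algebra on columns of matrices.
  mutate(sum = sum_byname(A, B)) %>%
  # Step 5: back to a single column of matrices.
  pivot_longer(cols = c(A, B, sum), names_to = "matrix", values_to = "vals") %>%
  # Step 6: back to one row per matrix element.
  # Argument names here are assumed to match collapse_to_matrices().
  expand_to_tidy(matnames = "matrix", matvals = "vals",
                 rownames = "row", colnames = "col",
                 rowtypes = "rowtype", coltypes = "coltype")
tidy_out
```

The resulting data frame can then be reshaped for reporting or plotting, as in step 7.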
The matsbyname package simplifies analyses
in which row and column names ought to be respected.
It provides optional row and column types, thereby
ensuring that only valid matrix operations are performed.
Finally, matsbyname functions work equally well with lists
to allow use of *_byname functions with
tidyr and dplyr
approaches to manipulating data.