# Use Cases and Examples for matsindf

EnergyMats_2000$matrix[[3]] # The Y matrix

### Duplicate (for purposes of illustration)

Larger studies will include data for multiple countries and years. The ECC data from the UK in year 2000 can be duplicated for 2001 and for a fictitious country AB. Although the data are unchanged, the additional rows serve to illustrate the functional programming aspects of the matsindf and matsbyname packages.

Energy <- EnergyMats_2000 %>%
  # Create rows for a fictitious country "AB".
  # Although the rows for "AB" are the same as the "GB" rows,
  # they serve to illustrate functional programming with matsindf.
  rbind(EnergyMats_2000 %>% mutate(Country = "AB")) %>%
  spread(key = Year, value = matrix) %>%
  mutate(
    # Create a column for a second year (2001).
    `2001` = `2000`
  ) %>%
  gather(key = Year, value = matrix, `2000`, `2001`) %>%
  # Now spread to put each matrix in a column.
  spread(key = matrix.name, value = matrix)

glimpse(Energy)

### Verify data

An important step in any analysis is data verification. For an ECC analysis, it is important to verify that energy is conserved (i.e., energy is in balance) across all industries. Equations 1 and 2 in Heun, Owen, and Brockway (2018) show that energy balance is verified by

$$\mat{W} = \transpose{\mat{V}} - \mat{U},$$

and

$$\mat{W}\colvec{i} - \mat{Y}\colvec{i} = \colvec{0}.$$

Energy balance verification can be implemented with matsbyname functions and tidyverse functional programming:

Check <- Energy %>%
  mutate(
    W = difference_byname(transpose_byname(V), U),
    # Need to change column name and type on y so it can be subtracted from row sums of W
    err = difference_byname(rowsums_byname(W),
                            rowsums_byname(Y) %>%
                              setcolnames_byname("Industry") %>% setcoltype("Industry")),
    EBalOK = iszero_byname(err)
  )

Check %>% select(Country, Year, EBalOK)

all(Check$EBalOK %>% as.logical())


This example demonstrates that energy balance can be verified for all combinations of Country and Year with a few lines of code. In fact, the exact same code can be applied to the Energy data frame, regardless of the number of rows in it.

Secure in the knowledge that energy is conserved across all ECCs in the Energy data frame, other analyses can proceed.
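The two balance equations can also be checked by hand on a toy ECC. The following is a minimal base R sketch, not the vignette's method: the product names (`Crude`, `Fuel`), industry names (`Extraction`, `Refinery`), the final demand sector (`Transport`), and all values are hypothetical and are not taken from UKEnergy2000. Because it uses base R operators rather than matsbyname functions, it assumes that rows and columns are already in matching order.

```r
# Hypothetical labels and values, for illustration only.
products   <- c("Crude", "Fuel")
industries <- c("Extraction", "Refinery")

# Make matrix V (industry x product): what each industry produces.
V <- matrix(c(100,  0,
                0, 80), nrow = 2, byrow = TRUE,
            dimnames = list(industries, products))

# Use matrix U (product x industry): what each industry consumes.
U <- matrix(c(0, 100,
              5,   0), nrow = 2, byrow = TRUE,
            dimnames = list(products, industries))

# Final demand matrix Y (product x sector).
Y <- matrix(c(0, 75), nrow = 2,
            dimnames = list(products, "Transport"))

# W = t(V) - U is the net production of each product by each industry.
W <- t(V) - U

# W i - Y i: row sums of W minus row sums of Y should be a zero vector.
err <- rowSums(W) - rowSums(Y)
balance_ok <- all(abs(err) < 1e-9)
balance_ok
```

In this toy case every unit of `Crude` is consumed by the refinery and the remaining 75 units of `Fuel` go to final demand, so `balance_ok` is `TRUE`. The matsbyname version in the text does the same arithmetic but aligns rows and columns by name first.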

### Efficiencies

To further illustrate the power of matsbyname functions in the context of matsindf, consider calculating the efficiency of every industry in the ECC as the column vector $\eta$, as shown by Equation 11 of Heun, Owen, and Brockway (2018).

$$\colvec{g} = \mat{V}\colvec{i}$$

$$\colvec{\eta} = \hatinv{\transpose{\mat{U}} \colvec{i}} \colvec{g}$$

Etas <- Energy %>%
  mutate(
    g = rowsums_byname(V),
    eta = transpose_byname(U) %>% rowsums_byname() %>%
      hatize_byname(keep = "rownames") %>% invert_byname() %>%
      matrixproduct_byname(g) %>%
      setcolnames_byname("eta") %>% setcoltype("Efficiency")
  ) %>%
  select(Country, Year, eta)

Etas$eta[[1]]


Note that only a few lines of code are required to perform the same series of matrix operations on every combination of Country and Year. In fact, the same code will be used to calculate the efficiency of any number of industries in any number of countries and years!
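To make the efficiency formula concrete, here is a minimal base R sketch for a single toy ECC. The industry names (`Refinery`, `PowerPlant`), product names, and values are hypothetical. The sketch uses `diag()` and `solve()` to build and invert the "hat" matrix, which is one assumed way to reproduce the equation in base R and is not how matsbyname implements it.

```r
# Hypothetical labels and values, for illustration only.
products   <- c("Crude", "Fuel", "Elect")
industries <- c("Refinery", "PowerPlant")

# Use matrix U (product x industry): energy consumed by each industry.
U <- matrix(c(100,  0,
                0, 50,
                0,  0), nrow = 3, byrow = TRUE,
            dimnames = list(products, industries))

# Make matrix V (industry x product): energy produced by each industry.
V <- matrix(c(0, 90,  0,
              0,  0, 20), nrow = 2, byrow = TRUE,
            dimnames = list(industries, products))

# Column vector of ones, one entry per product.
i_prod <- rep(1, length(products))

# g = V i: total output of each industry.
g <- V %*% i_prod

# t(U) i: total input to each industry.
inputs <- t(U) %*% i_prod

# eta = hatinv(t(U) i) g: invert the diagonal matrix of inputs, then multiply by g.
eta <- solve(diag(as.vector(inputs))) %*% g
dimnames(eta) <- list(industries, "eta")
eta
```

Because the hat matrix is diagonal, the result is the same as the elementwise ratio `g / inputs`. The matrix form is shown only because it mirrors the equation.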

### Expand

Plotting values from a matsindf data frame can be accomplished by expanding the matrices of the matsindf data frame (in this example, Etas) back out to a tidy data frame. Expanding is the reverse of collapsing, and the following information must be supplied to the expand_to_tidy function:

| argument to expand_to_tidy | identifies |
|---------------------------:|:-----------|
| matnames | Name of the input column of matrix names |
| matvals | Name of the input column of matrices to be expanded |
| rownames | Name of the output column of matrix row names |
| colnames | Name of the output column of matrix column names |
| rowtypes | Optional name of the output column of matrix row types |
| coltypes | Optional name of the output column of matrix column types |
| drop | Optional value to be dropped from output (often 0) |

Prior to expanding, it is usually necessary to gather columns of matrices.

etas_forgraphing <- Etas %>%
  gather(key = matrix.names, value = matrix, eta) %>%
  expand_to_tidy(matnames = "matrix.names", matvals = "matrix",
                 rownames = "Industry", colnames = "etas",
                 rowtypes = "rowtype", coltypes = "Efficiencies") %>%
  mutate(
    # Eliminate columns we no longer need.
    matrix.names = NULL,
    etas = NULL,
    rowtype = NULL,
    Efficiencies = NULL
  ) %>%
  rename(
    eta = matrix
  )

# Compare to Figure 8 of Heun, Owen, and Brockway (2018)
etas_forgraphing %>% filter(Country == "GB", Year == 2000)


etas_forgraphing is a data frame of efficiencies, one for each Country, Year, and Industry, in a format that is amenable to plotting with packages such as ggplot2.
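Conceptually, expanding turns each matrix element into one row that carries its row name, column name, and value. The following base R sketch shows that idea for a single hypothetical efficiency vector. It is an illustration of the concept only, not the implementation of expand_to_tidy, and the output column names `rownames`, `colnames`, and `matvals` are chosen here simply to echo the argument names in the table above.

```r
# A hypothetical 2 x 1 efficiency matrix, for illustration only.
eta <- matrix(c(0.9, 0.4), nrow = 2,
              dimnames = list(c("Refinery", "PowerPlant"), "eta"))

# One output row per matrix element (column-major order).
tidy <- data.frame(
  rownames = rownames(eta)[as.vector(row(eta))],
  colnames = colnames(eta)[as.vector(col(eta))],
  matvals  = as.vector(eta),
  stringsAsFactors = FALSE
)
tidy
```

The result has one row per industry, which is the shape that plotting functions expect.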

### Report

The following code creates a bar graph of efficiency results for the UK in 2000:

etas_UK_2000 <- etas_forgraphing %>% filter(Country == "GB", Year == 2000)

etas_UK_2000 %>%
  # aes() replaces aes_string(), which is deprecated in current ggplot2.
  ggplot(mapping = aes(x = Industry, y = eta,
                       fill = Industry, colour = Industry)) +
  geom_bar(stat = "identity") +
  labs(x = NULL, y = expression(eta[UK*","*2000]), fill = NULL) +
  scale_y_continuous(breaks = seq(0, 1, by = 0.2)) +
  # White bars with dark gray outlines.
  scale_fill_manual(values = rep("white", nrow(etas_UK_2000))) +
  scale_colour_manual(values = rep("gray20", nrow(etas_UK_2000))) +
  # Suppress legends; "none" replaces the deprecated FALSE.
  guides(fill = "none", colour = "none") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.4, hjust = 1))


## Conclusion

This vignette demonstrated the use of the matsindf and matsbyname packages and suggested a workflow to accomplish sophisticated analyses using matrices in data frames (matsindf).

The workflow is as follows:

• Reshape data into a tidy data frame with columns for matrix name, element value, row name, column name, row type, and column type, similar to UKEnergy2000 above.
• Use collapse_to_matrices to create a data frame of matrices with columns for matrix names and matrices themselves, similar to EnergyMats_2000 above.
• tidyr::spread the matrices to obtain a data frame with columns for each matrix, similar to Energy above.
• Validate the data, similar to Check above.
• Perform matrix algebra operations on the columns of matrices using matsbyname functions in a manner similar to the process of generating the Etas data frame above.
• tidyr::gather the columns to obtain a tidy data frame of matrices.
• Use expand_to_tidy to create a tidy data frame of matrix elements, similar to etas_forgraphing above.
• Plot and report results as demonstrated by the graph above.

Data frames of matrices, such as those created by matsindf, are like magic spreadsheets in which single cells contain entire matrices. With this data structure, analysts can wield simultaneously the power of both matrix mathematics and tidyverse functional programming.
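The "matrix in a cell" idea rests on list columns. As a minimal sketch in base R, assuming hypothetical industries, products, and values (and using none of the matsindf or matsbyname functions), a data frame can hold one matrix per row, and the same operation can be applied to every row:

```r
# Hypothetical make (V) matrices for two countries; values are illustrative only.
industries <- c("Refinery", "PowerPlant")
products   <- c("Fuel", "Elect")
V_gb <- matrix(c(90,  0,
                  0, 20), nrow = 2, byrow = TRUE,
               dimnames = list(industries, products))
V_ab <- 2 * V_gb

# One row per Country-Year combination; V is a list column of matrices.
ecc <- data.frame(Country = c("GB", "AB"), Year = c(2000, 2000),
                  stringsAsFactors = FALSE)
ecc$V <- list(V_gb, V_ab)

# Apply the same matrix operation to every row: g = V i (row sums of V).
ecc$g <- lapply(ecc$V, rowSums)

ecc$g[[2]]
```

Adding more countries or years only adds rows; the line that computes `g` does not change.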

matsindf documentation built on Aug. 18, 2023, 5:06 p.m.