knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
The innermap
package presents the innermap
family functions. These in turn, allow the use of map family function but using elements of the same vectors. This is not the same thing of purrr::accumulate
function, since it does not uses any output made by the function in the process. Now, we also present 2 bimap
(bilinear map) functions, that extend the same functionality of purrr::map_depth
for map2
.
You can install the development version from GitHub with:
# install.packages("devtools") devtools::install_github("MarceloRTonon/innermap")
library(innermap) library(purrr)
Bellow we offer a simple example. Important to notice that when dealing with atomic vectors one could use apply
instead of purrr
functions, however, the latter offer a greater flexebility with the inputs, specially when dealing with nested lists which the last elements have matrix
class.
inputList <- c(1:30) |> as.list() |> purrr::map(rep, 9) |> purrr::map(matrix, nrow = 3, ncol =3)
Now suppose you want to sum every matrix of the inputLest
with the following one. Using base R
, a intuitive way of doing this is through for
(we discuss in other possibilities using base R
):
#1: For Method outputListFor <- list() for(i in 1:(length(inputList)-1)){ outputListFor[[i]] <- inputList[[i]] + inputList[[i+1]] }
Howver using for
in R is neither very optimized or a good practice. In this sense, the innermap
function family produce strong elements
outputList <- innermap::innermap(inputList, function(x,y) x+y) all.equal(outputList, outputListFor)
The function innermap::innermap
is built upon the package
In this case, we used each element with the following one, but we can also use it with different distances, using the argument distance
. Supposing we want to multiply the n
element with the n+2
element we can do:
outputList <- innermap::innermap(inputList, distance =2, function(x,y) x+y)
The default value for the distance
argument 1.
We could however use lapply
, using two methods. The first is creating a list containing indexes of .x
and .y
:
# 2_a: Lapply Index Method outputList <- list(.x = as.list(c(1:(length(inputList)-1))), .y = as.list(c(2:length(inputList)))) |> purrr::transpose() |> lapply(FUN = function(.index) inputList[[.index$.x]]+inputList[[.index$.y]])
Another way is to attribute directly the variables in the list:
# 2_b: Lapply Subset Method outputList <- list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> lapply(FUN = function(.input) .input$.x+ .input$.y)
It is possible to do this using the purrr
package by, at least, four ways: 2 using map
and 2 using map2
. As for lapply
, we can use an Index Method, where the indexes are given as .x
and .y
parameters for map
and map2
, and the input vector is given inside the function in the .f
argument:
#3_a Purrr::map Index Method outputList <- list(as.list(c(1:(length(inputList)-1))), as.list(c(2:length(inputList)))) |> purrr::transpose() |> map(.f = function(x) inputList[[x[[1]]]]+ inputList[[x[[2]]]]) #4_a: Purrr::map2 Index Method outputList <- purrr::map2(.x = c(1:(length(inputList)-1)), .y = c(2:length(inputList)), .f = function(x,y) inputList[[x]] + inputList[[y]])
The second can be through subsetting the
#3_b: purrr::map Subset Method with purrr::reduce outputList <- list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> map(.f = reduce, `+`) #4_b: Purrr::map2 Subset Method outputList <- purrr::map2(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))], .f = `+`)
To say the truth when using map
, there is another choice to be made: using purrr::reduce
or not. There is consequence in using it, as we shall discuss latter. Bellow we present a tabble using the microbenchmark
package. The code for the benchmark can be seen in the end of this document
library(microbenchmark) bench <- microbenchmark::microbenchmark( "1_For" = {outputList <- list() for(i in 1:(length(inputList)-1)){ outputList[[i]] <- inputList[[i]] + inputList[[i+1]] }}, "2a_lapplyIndex" = {list(.x = as.list(c(1:(length(inputList)-1))), .y = as.list(c(2:length(inputList)))) |> purrr::transpose() |> lapply(FUN = function(.index) inputList[[.index$.x]]+inputList[[.index$.y]])}, "2a2_lapplyIndex" = {list(.x = as.list(c(1:(length(inputList)-1))), .y = as.list(c(2:length(inputList)))) |> purrr::transpose() |> lapply(FUN = function(.index, .f0 = `+`, ...) .f0(inputList[[.index$.x]],inputList[[.index$.y]],...))}, "2b_lapplySubset" = { list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> lapply(FUN = function(.input) .input$.x+ .input$.y)}, "2b2_lapplySubset" = { list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> lapply(FUN = function(.input, .f0 = `+`,...) .f0(.input$.x, .input$.y,...))}, "3a_mapIndex" = {list(as.list(c(1:(length(inputList)-1))), as.list(c(2:length(inputList)))) |> purrr::transpose() |> map(.f = function(x) inputList[[x[[1]]]]+ inputList[[x[[2]]]])}, "3a2_mapIndex" = {list(as.list(c(1:(length(inputList)-1))), as.list(c(2:length(inputList)))) |> purrr::transpose() |> map(.f = function(x, .f0 = `+`, ...) .f0(inputList[[x[[1]]]], inputList[[x[[2]]]],...))}, "3aReduce_mapIndex" = { {list(as.list(c(1:(length(inputList)-1))), as.list(c(2:length(inputList)))) |> purrr::transpose() |> map(.f = function(.index)purrr::reduce(list(inputList[[.index[[1]]]], inputList[[.index[[2]]]]), `+`))}}, "3b_mapSubset" = {list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> map(.f = function(x) x[[1]]+x[[2]])}, "3b2_mapSubset" = {list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> map(.f = function(x, .f0 = `+`, ...) .f0(x[[1]],x[[2]], ...))}, "3bReduce_mapSubset" = {list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> map(.f = reduce, `+`)}, "4a_map2Index" = {purrr::map2(.x = c(1:(length(inputList)-1)), .y = c(2:length(inputList)), .f = function(x,y) inputList[[x]] + inputList[[y]])}, "4a2_map2Index" = {purrr::map2(.x = c(1:(length(inputList)-1)), .y = c(2:length(inputList)), .f = function(x,y, .f0 = `+`, ...) .f0(inputList[[x]],inputList[[y]],...))}, "4b_map2Subset" = {purrr::map2(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))], `+`)}, "4b2_map2Subset" = {purrr::map2(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))], .f = function(x,y,.f0 = `+`, ...) .f0(x,y,...))}, "4Formula_map2Index" = {purrr::map2(.x = c(1:(length(inputList)-1)), .y = c(2:length(inputList)), .f = ~ inputList[[.x]]+inputList[[.y]])}, "5_mapmap2" = {list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> map(function(x) map2( .x = x$.x, .y = x$.y, `+`))}, "innermap" = innermap::innermap(inputList, .f0 = `+`), "innerLapply" = innerLapply(inputList, `+`), times= 100L ) bench |> print()
From the table above we can draw some pretty sweet conclusions:
- Using for
is very ineficient.
- Using reduce
inside purrr::map
is also very ineficient (even more than for)
- lapply
is faster then purrr::map
and purrr::map2
- innerLapply
is faster than innermap
We now select only those results that can be generalized and are not strongly inneficient:
bench2 <- microbenchmark::microbenchmark( "2a2_lapplyIndex" = {list(.x = as.list(c(1:(length(inputList)-1))), .y = as.list(c(2:length(inputList)))) |> purrr::transpose() |> lapply(FUN = function(.index, .f0 = `+`, ...) .f0(inputList[[.index$.x]],inputList[[.index$.y]],...))}, "2b2_lapplySubset" = { list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> lapply(FUN = function(.input, .f0 = `+`,...) .f0(.input$.x, .input$.y,...))}, "3a2_mapIndex" = {list(as.list(c(1:(length(inputList)-1))), as.list(c(2:length(inputList)))) |> purrr::transpose() |> map(.f = function(x, .f0 = `+`, ...) .f0(inputList[[x[[1]]]], inputList[[x[[2]]]],...))}, "3b2_mapSubset" = {list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> map(.f = function(x, .f0 = `+`, ...) .f0(x[[1]],x[[2]], ...))}, "4a2_map2Index" = {purrr::map2(.x = c(1:(length(inputList)-1)), .y = c(2:length(inputList)), .f = function(x,y, .f0 = `+`, ...) .f0(inputList[[x]],inputList[[y]],...))}, "4aFormula_map2Index" = {purrr::map2(.x = c(1:(length(inputList)-1)), .y = c(2:length(inputList)), .f = ~ inputList[[.x]]+inputList[[.y]])}, "innerLapply" = innerLapply(inputList, `+`), "innerLapply2" = innerLapply(inputList, function(x,y) x+y), times= 800L ) bench2 |> print()
innerLapply
is more restrictive then innermap
, since it was built upon lapply
and not map
. In this sense it does not make any use of pluck
and does not support functions written as formula.
But one can easily argue that writing explicitly the lengths of the vector in the arguments is at least very anoying, when very confusing to read. The innermap
approach do the same thing that was done in the example prior, but in the backstage. Let's use innermap_dbl
in this example:
It is undoubtely more pratical to do this than it is to do the former way. This also allows you to be even more pratical:
outputVec <- innermap::innermap_dbl(inputVec, sum) outputVec <- innermap::innermap_dbl(inputVec, `+`)
We used until now the innermap_dbl
function, but in the same way we have map
, map_lgl
, map_chr
,map_int
, map_raw
, map_dfr
, map_dfc
there is a innermap
, innermap_lgl
, innermap_chr
,innermap_int
, innermap_dbl
, innermap_raw
, innermap_dfr
, innermap_dfc
.
There is no native function in purrr
that takes elements of the same vector or list and applies a function to them. As a researcher in the field of Structural Decomposition Analysis (SDA), I use a purrr::map quite often, and need quite often functions like the ones from innermap
family. Since the Input-Output tables are quite heavy, there is no way one can give up the use of the optimized purrr
functions. Also, the SDA methodology needs as input the original matrix, making functions like accumulate
not suited for this.
I hope other will find this useful too! Really hope to receive comments about it!
We display below the code used to generate the microbenchmark of the options
library(microbenchmark) microbenchmark::microbenchmark( "1_For" = {outputList <- list() for(i in 1:(length(inputList)-1)){ outputList[[i]] <- inputList[[i]] + inputList[[i+1]] }}, "2a_lapplyIndex" = {list(.x = as.list(c(1:(length(inputList)-1))), .y = as.list(c(2:length(inputList)))) |> purrr::transpose() |> lapply(FUN = function(.index) inputList[[.index$.x]]+inputList[[.index$.y]])}, "2b_lapplySubset" = { list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> lapply(FUN = function(.input) .input$.x+ .input$.y)}, "3a1_mapIndex" = {list(as.list(c(1:(length(inputList)-1))), as.list(c(2:length(inputList)))) |> purrr::transpose() |> map(.f = function(x) inputList[[x[[1]]]]+ inputList[[x[[2]]]])}, "3aReduce_mapIndex" = { {list(as.list(c(1:(length(inputList)-1))), as.list(c(2:length(inputList)))) |> purrr::transpose() |> map(.f = function(.index)purrr::reduce(list(inputList[[.index[[1]]]], inputList[[.index[[2]]]]), `+`))}}, "3b_mapSubset" = {list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> map(.f = function(x) x[[1]]+x[[2]])}, "3bReduce_mapSubset" = {list(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))]) |> purrr::transpose() |> map(.f = reduce, `+`)}, "4a_map2Index" = {purrr::map2(.x = c(1:(length(inputList)-1)), .y = c(2:length(inputList)), .f = function(x,y) inputList[[x]] + inputList[[y]])}, "4b_map2Subset" = {purrr::map2(.x = inputList[c(1:(length(inputList)-1))], .y = inputList[c(2:length(inputList))], .f = `+`)}, "innermap" = innermap::innermap(inputList, .f0 = `+`), times= 100L )
Big thanks to pedroocava and kimjoaoun for their early comments on this matter. Any mistakes are my own.
These functions were much improved by the feedback and suggestions of Giuliano Sposito.
As I am an Economics PhD Student, I do not have the time, nor the skills, to make a hardcore low level coding in C++ to create super fast functions as is in the purrr
package. Nonetheless, I still think this functions I presented perfom quite well, since I kept them simple and based in the purrr
package. I already set an issue in the purrr
Github repository today. If they add this feature in the purrr
package, what I hope they do, I will not take this project forward or update it in anyway.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.