The goal of {indexvctrs} is to provide a DSL for indexed vector operations
Pretty experimental right now!
devtools::install_github("jameelalsalam/indexvctrs")
This package provides idx_tbl
objects which behave somewhat like
sparse multi-dimensional arrays with named indices, but remain data
frames as well. One of the columns is the value
column, which is
manipulated directly through math operations.
When idx_tbl
objects are created, columns are marked as index columns
(the idx_cols attribute), and a single value column is stored as
value
. Additional columns are dropped.
library(indexvctrs)
crop_acres <- idx_tibble(
tibble::tribble(
~crop, ~year, ~value,
"corn", 2005, 4,
"wheat", 2005, 5,
"corn", 2010, 6,
"wheat", 2010, 8
),
idx_cols = c("crop", "year")
)
EF <- idx_tibble(
tibble::tribble(
~crop, ~value,
"corn", 1.5,
"wheat", 2
), idx_cols = "crop"
)
idx_cols(crop_acres)
#> [1] "crop" "year"
crop_acres
#> # A tibble: 4 × 3
#> crop year value
#> <chr> <dbl> <dbl>
#> 1 corn 2005 4
#> 2 corn 2010 6
#> 3 wheat 2005 5
#> 4 wheat 2010 8
Math operations can be performed between idx_tbls with common indices via join semantics. The result of this is that where indices do not share an axis, values are broadcast (or recycled) across that axis. This applies, for example, to scalars.
EF * crop_acres
#> # A tibble: 4 × 3
#> crop year value
#> <chr> <dbl> <dbl>
#> 1 corn 2005 6
#> 2 corn 2010 9
#> 3 wheat 2005 10
#> 4 wheat 2010 16
EF * 2
#> # A tibble: 2 × 2
#> crop value
#> <chr> <dbl>
#> 1 corn 3
#> 2 wheat 4
It can be convenient to express relationships using functions (e.g., so that they can be stated out-of-order):
calc_emissions <- function(activity, EF) {activity * EF}
emissions_by_crop <- calc_emissions(activity = crop_acres, EF = EF * 2)
emissions_by_crop
#> # A tibble: 4 × 3
#> crop year value
#> <chr> <dbl> <dbl>
#> 1 corn 2005 12
#> 2 corn 2010 18
#> 3 wheat 2005 20
#> 4 wheat 2010 32
Vector operations act on the value
column:
sum(emissions_by_crop)
#> # A tibble: 1 × 1
#> value
#> <dbl>
#> 1 82
If something isn’t implemented, never fear because you should be able to drop back to data frame operations.
library(tidyverse)
#> ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
#> ✔ dplyr 1.1.1 ✔ readr 2.1.4
#> ✔ forcats 1.0.0 ✔ stringr 1.5.0
#> ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
#> ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
#> ✔ purrr 1.0.2
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
emissions_by_crop %>%
filter(year == 2010)
#> # A tibble: 2 × 3
#> crop year value
#> <chr> <dbl> <dbl>
#> 1 corn 2010 18
#> 2 wheat 2010 32
I hope that this package benefits from learning from some of these:
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.