Applying Functions

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(matrixset)
library(tidyverse)
animals <- as.matrix(MASS::Animals)
log_animals <- log(animals)
animal_info <- MASS::Animals %>% 
  rownames_to_column("Animal") %>% 
  mutate(is_extinct = case_when(Animal %in% c("Dipliodocus", "Triceratops", "Brachiosaurus") ~ TRUE,
                                TRUE ~ FALSE),
         class = case_when(Animal %in% c("Mountain beaver", "Guinea pig", "Golden hamster", "Mouse", "Rabbit", "Rat") ~ "Rodent",
                           Animal %in% c("Potar monkey", "Gorilla", "Human", "Rhesus monkey", "Chimpanzee") ~ "Primate",
                           Animal %in% c("Cow", "Goat", "Giraffe", "Sheep") ~ "Ruminant",
                           Animal %in% c("Asian elephant", "African elephant") ~ "Elephantidae",
                           Animal %in% c("Grey wolf") ~ "Canine",
                           Animal %in% c("Cat", "Jaguar") ~ "Feline",
                           Animal %in% c("Donkey", "Horse") ~ "Equidae",
                           Animal == "Pig" ~ "Sus",
                           Animal == "Mole" ~ "Talpidae",
                           Animal == "Kangaroo" ~ "Macropodidae",
                           TRUE ~ "Dinosaurs")) %>% 
  select(-body, -brain)
animals_ms <- matrixset(msr = animals, log_msr = log_animals, row_info = animal_info,
                row_key = "Animal")
animals_ms <- animals_ms %>% 
  annotate_column(unit = case_when(.colname == "body" ~ "kg",
                                   TRUE ~ "g")) 

Apply Functions to matrixset Matrices

There are two ways to apply functions to the matrices of a matrixset object. The first one is through the apply_* family, which will be covered here.

The second is through mutate_matrix(), covered in the next section.

There are three functions in the apply_* family: apply_matrix(), apply_row() and apply_column().

Each of these functions loops over the matrices of the matrixset object and applies the supplied functions to each one. For apply_row() and apply_column(), an additional loop runs over the margin (rows or columns, as applicable), so that the functions are applied to every row or column of every matrix.
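The looping described above can be pictured with plain lists. The following is a minimal base R sketch, not matrixset's implementation: the toy matrices in `mats` and the function list `funs` are hypothetical stand-ins, and `lapply()` plays the role of the internal loops.

```r
# Two toy matrices standing in for the matrices of a matrixset object
# (hypothetical data, not the animals example)
m <- matrix(1:6, nrow = 3, dimnames = list(c("a", "b", "c"), c("x", "y")))
mats <- list(msr = m, log_msr = log(m))

# The functions to apply
funs <- list(mean = mean, rng = range)

# Matrix-level loop: every function is applied to each whole matrix
by_matrix <- lapply(mats, function(mat) {
  lapply(funs, function(f) f(mat))
})

# Column-level loop: an extra loop over the columns of each matrix
by_column <- lapply(mats, function(mat) {
  cols <- setNames(seq_len(ncol(mat)), colnames(mat))
  lapply(cols, function(j) {
    lapply(funs, function(f) f(mat[, j]))
  })
})

by_matrix$msr$mean     # one value for the whole matrix
by_column$msr$y$rng    # one result per matrix, column and function
```

In this sketch `by_matrix` has one layer per matrix and one per function, while `by_column` has an extra layer for the columns in between.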

To see the functions in action, we will use the following object:

animals_ms

To keep the printed output compact, we will use the following custom printing functions.

show_matrix <- function(x) {
  if (nrow(x) > 4) {
    newx <- head(x, 4)
    storage.mode(newx) <- "character"
    newx <- rbind(newx, rep("...", ncol(x)))
  }  else newx <- x
  newx
}
show_vector <- function(x) {
  newx <- if (length(x) > 4) {
    c(as.character(x[1:4]), "...")
  } else x
  newx
}
show_lst <- function(x) {
  lapply(x, function(u) {
    if (is.matrix(u)) show_matrix(u) else if (is.vector(u)) show_vector(u) else u
  })
}

Let's now see apply_matrix() in action.

library(magrittr)
library(purrr)
out <- animals_ms %>% 
   apply_matrix(exp,
                ~ mean(.m, trim=.1),
                foo=asinh,
                pow = ~ 2^.m,
                reg = ~ {
                  is_alive <- !is_extinct
                  lm(.m ~ is_alive + class)
                  })
# out[[1]] %>% map(~ if (is.matrix(.x)) {head(.x, 5)} else .x)
show_lst(out[[1]])

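This example showcases several features of the apply_* functions: several functions can be supplied in a single call; each one can be given as a plain function (exp, asinh) or as a formula (~ mean(.m, trim=.1)); results can be named (foo, pow, reg); and a formula can span several statements and refer to annotation traits such as is_extinct and class.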

You have probably noticed the use of .m. This is a pronoun that is accessible inside apply_matrix() and refers to the current matrix in the internal loop. Similar pronouns exist for apply_row() and apply_column(): they are .i and .j, respectively.

The returned object is a list of lists. The first layer is for each matrix and the second layer is for each function call.

Let's now showcase the row/column version with an apply_column() example:

out <- animals_ms %>% 
   apply_column(exp,
                ~ mean(.j, trim=.1),
                foo=asinh,
                pow = ~ 2^.j,
                reg = ~ {
                  is_alive <- !is_extinct
                  lm(.j ~ is_alive + class)
                  })
out[[1]] %>% map(show_lst)

The idea is similar, but in the returned object, there is a third list layer: the first layer for the matrices, the second layer for the columns (it would be rows for apply_row()) and the third layer for the functions.

Note as well the use of the .j pronoun instead of .m.

Grouped Data

The apply_* functions understand data grouping and will execute on the proper matrix/vector subsets.

animals_ms %>% 
  row_group_by(class) %>% 
  apply_matrix(exp,
               ~ mean(.m, trim=.1),
               foo=asinh,
               pow = ~ 2^.m,
               reg = ~ {
                 is_alive <- !is_extinct
                 lm(.m ~ is_alive)
                 })

As one can see, the output format differs when the data is grouped. We still end up with a list with one element per matrix, but each of these elements is now a tibble.

Each tibble has a column called .vals, where the function results are stored. This column is a list, one element per group. The group labels are given by the other columns of the tibble. For a given group, things are like the ungrouped version: further sub-lists for rows/columns - if applicable - and function values.
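To make this layout concrete, here is a minimal base R sketch of the same shape. It is not how matrixset builds its result: the toy matrix `m` and the grouping vector `grp` are hypothetical, and a plain data frame with a list column stands in for the tibble.

```r
# Toy matrix and grouping vector (hypothetical data)
m <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2,
            dimnames = list(paste0("r", 1:4), c("body", "brain")))
grp <- c("A", "A", "B", "B")

# Row indices belonging to each group
idx <- split(seq_len(nrow(m)), grp)

# One row per group: a label column plus a list column holding the results
res <- data.frame(class = names(idx))
res$.vals <- unname(lapply(idx, function(i) {
  sub <- m[i, , drop = FALSE]   # the matrix subset for this group
  list(mean = mean(sub), range = range(sub))
}))

res$class            # the group labels
res$.vals[[1]]$mean  # function results for the first group
```

Each element of `.vals` is itself a list of function results, which mirrors the ungrouped output for a single matrix.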

Simplified Results

Similar to the apply() function, which has a simplify argument, the output structure can be simplified, barring two conditions:

If the conditions are met, each apply_* function has two simplified versions available: _dfl and _dfw.

Below is the _dfl flavor in action. We point out two things to notice:

animals_ms %>% 
    apply_matrix_dfl(~ mean(.m, trim=.1),
                     MAD=mad,
                     reg = ~ {
                         is_alive <- !is_extinct
                         list(lm(.m ~ is_alive + class))
                     })
animals_ms %>% 
    apply_column_dfl(~ mean(.j, trim=.1),
                     MAD=mad,
                     reg = ~ {
                         is_alive <- !is_extinct
                         list(lm(.j ~ is_alive + class))
                     })

If you used apply_column_dfw() in this context, you would not notice a difference in output format.

The two flavors differ only when the functions return vectors of length > 1.

animals_ms %>% 
    apply_row_dfl(rg = ~ range(.i),
                  qt = ~ quantile(.i, probs = c(.25, .75)))   
animals_ms %>% 
    apply_row_dfw(rg = ~ range(.i),
                  qt = ~ quantile(.i, probs = c(.25, .75)))   

We can observe three things:

  1. dfl stands for long and stacks the elements of the function output into different rows, adding a column to identify the different elements.
  2. dfw stands for wide and puts the elements of the function output into different columns.
  3. Element names are made unique if necessary.
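The long and wide layouts can be sketched by hand in base R. This is only an illustration of the two shapes: the toy matrix `m` is hypothetical, and the column names used here (`row`, `element`, `rg`, `rg_1`, `rg_2`) are made up for the example and are not necessarily the names matrixset produces.

```r
# Toy matrix: one row per animal (hypothetical data)
m <- matrix(c(1, 5, 2, 8), nrow = 2, byrow = TRUE,
            dimnames = list(c("cat", "dog"), c("body", "brain")))

# range() returns two values per row; collect them as one row per animal
rg <- t(apply(m, 1, range))

# Wide layout: one row per animal, one column per element of the output
wide <- data.frame(row = rownames(m), rg_1 = rg[, 1], rg_2 = rg[, 2])

# Long layout: one row per animal and element, with an identifying column
long <- data.frame(row     = rep(rownames(m), each = 2),
                   element = rep(c("1", "2"), times = 2),
                   rg      = as.vector(t(rg)))

wide
long
```

Both data frames hold the same four numbers; only the arrangement differs.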

Knowing the current context

It may happen that you need information about the current group. For this reason, context functions such as current_n_row(), current_row_info() and current_column_info() are made available.

For instance, a simple way of knowing the number of animals per group could be

animals_ms %>% 
    row_group_by(class) %>% 
    apply_matrix_dfl(n = ~ current_n_row()) %>% 
    .$msr

With common row and column annotation trait

The context functions can also be of use when one or more traits are shared (in name) between rows and columns.

Here's a pseudo-code example:

# ms_object %>% 
#     apply_matrix( ~ {
#       ctrt <- current_column_info()$common_trait
#       rtrt <- current_row_info()$common_trait
#       
#       do something with ctrt and rtrt
#     })

Pronouns, or dealing with ambiguous variables

It may happen that a variable in the calling environment shares its name with a trait of a matrixset object.

You can make explicit which version of the variable you mean by using the pronouns .data (the trait annotation version) and .env (the calling environment version).
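As an illustration of the two pronouns, here is a small sketch. It builds a hypothetical two-row matrixset object (`ms`, with a made-up trait `flag`) rather than reusing the animals data, and it assumes that .data and .env can be used with `$` inside a formula, as in other tidy evaluation interfaces.

```r
library(matrixset)

# Hypothetical toy object: one matrix and one row trait called `flag`
m <- matrix(c(1, 2, 3, 4), nrow = 2,
            dimnames = list(c("a", "b"), c("x", "y")))
info <- data.frame(id = c("a", "b"), flag = c(TRUE, FALSE))
ms <- matrixset(msr = m, row_info = info, row_key = "id")

# A variable in the calling environment that shares its name with the trait
flag <- "from the calling environment"

# Ask for both versions explicitly (assumed usage of the pronouns)
out <- apply_matrix(ms, ~ list(trait = .data$flag, env = .env$flag))

res <- out[[1]][[1]]   # first matrix, first (and only) function
res$trait              # the row annotation
res$env                # the variable defined above
```

Without the pronouns, a bare `flag` would be ambiguous to the reader, so spelling out the intended source makes the code easier to follow.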

Quasi quotation

reg_expr <- expr({
    is_alive <- !is_extinct
    list(lm(.j ~ is_alive + class))
})

animals_ms %>% 
    apply_column_dfl(~ mean(.j, trim=.1),
                     MAD=mad,
                     reg = ~ !!reg_expr)

Multivariate

mutate_matrix




matrixset documentation built on April 3, 2025, 6:32 p.m.