knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(matrixset)
A matrixset is a container of matrices, each having the same number of rows and columns and the same dimnames. Moreover, the dimnames must uniquely identify elements: no row name and no column name may be duplicated.
Both matrix objects and special matrices (from the Matrix package) can be used.
While there is minimal support for NULL dimnames (and that is bound to change at some point in the future), it is strongly recommended to provide meaningful dimnames. One of the main reasons for this is that annotation is impossible with NULL dimnames.
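One way to check that a matrix meets these requirements before handing it to matrixset is sketched below in base R. The helper has_unique_dimnames() is hypothetical (it is not part of the matrixset package) and only mirrors the rules stated above.

```r
# A small matrix with explicit, unique row and column names
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("gene1", "gene2"), c("s1", "s2", "s3")))

# Hypothetical helper (not part of matrixset): TRUE when both sets of
# dimnames are present and free of duplicates
has_unique_dimnames <- function(x) {
  dn <- dimnames(x)
  !is.null(dn) &&
    !any(vapply(dn, is.null, logical(1))) &&
    !any(vapply(dn, anyDuplicated, integer(1)) > 0)
}

has_unique_dimnames(m)                       # named matrix
has_unique_dimnames(matrix(1:6, nrow = 2))   # matrix with NULL dimnames
```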
In addition, as alluded to above, a matrixset can store independent row and column annotations. This meta information is stored, and available, in the form of data frames: one for row information and one for column information. The annotation names are referred to as traits.
This latter feature makes matrixset especially attractive even if it stores only a single matrix, because several methods have been developed to manipulate matrixsets, accounting for annotations.
Many problems that matrixset can tackle could be solved via a data.frame and more specifically using the tidyverse suite.
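As a rough illustration of what the data frame route can cost in memory, the sketch below stores the same values once as a sparse matrix from the Matrix package and once as a long data frame with one row per cell. The simulated data and the object names (m, long) are assumptions made for this example only; actual sizes depend on the data.

```r
library(Matrix)  # recommended package distributed with R

# Simulated 1000 x 1000 sparse matrix with about 1% non-zero entries
set.seed(1)
m <- rsparsematrix(1000, 1000, density = 0.01)
dimnames(m) <- list(paste0("r", 1:1000), paste0("c", 1:1000))

# Long-format data frame holding every cell of the same matrix
long <- data.frame(row   = rep(rownames(m), times = ncol(m)),
                   col   = rep(colnames(m), each = nrow(m)),
                   value = as.vector(as.matrix(m)))

# Compare memory footprints of the two representations
format(object.size(m), units = "MB")
format(object.size(long), units = "MB")
```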
Reasons for which you may want to use a matrixset instead include:
* Object size. The data.frame needed to store the same information as a matrixset can be significantly bigger.
* Special matrices from the Matrix package. This is one of very few ways (maybe even the only way) to annotate special matrices. In addition, dealing with special matrices makes the object size argument even more striking when comparing to the data frame strategy.

To illustrate how to build a matrixset, we will use the Animals dataset from the MASS package.
animals <- as.matrix(MASS::Animals)
head(animals)
This is a matrix with 28 rows and 2 columns.
There are two ways to create a matrixset object: matrixset() or as_matrixset().
Let's first take a look at matrixset().
animals_ms <- matrixset(animals)
Oops! Looks like something didn't work. That's because matrices must be named when building a matrixset.
animals_ms <- matrixset(msr = animals)
animals_ms
The as_matrixset() function will provide a generic name.
as_matrixset(animals)
It is recommended to use the matrixset() function and provide a meaningful matrix name, as some functionalities will make use of matrix names. It will also be easier to refer to the matrices when there is more than one.
We saw in the animals_ms printout that default row annotation (row_info) and column annotation (column_info) were created at the time of object building.
Meta info always contains the row names and column names. We will see later how to add more annotation.
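The default annotation can be inspected directly with the accessor functions described later in this document. The snippet below is a small sketch that rebuilds the single-matrix object so that it runs on its own; it assumes the default tags .rowname and .colname.

```r
library(matrixset)

animals <- as.matrix(MASS::Animals)
animals_ms <- matrixset(msr = animals)

# Default annotation: one data frame per dimension, holding only the tag
row_info(animals_ms)      # tag column .rowname, one entry per animal
column_info(animals_ms)   # tag column .colname, one entry per column

# No traits have been added yet; the tag is not reported as a trait
row_traits(animals_ms)
column_traits(animals_ms)
```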
Let's see an example of how to include more than one matrix.
log_animals <- log(animals)
ms <- matrixset(msr = animals, log_msr = log_animals)
ms
It may happen that the matrices are stored in a list. No problem, the list can be used as input for matrixset.
ms2 <- matrixset(list(msr = animals, log_msr = log_animals))
identical(ms, ms2)
It is also possible to add a matrix to an already existing object.
animals_ms <- add_matrix(animals_ms, log_msr = log_animals)
identical(ms, animals_ms)
One way to add annotation is to do it at the object creation step. Suppose we have the following information about the animals (which are the rows in our object).
library(tidyverse)

# Names are spelled exactly as they appear in MASS::Animals
animal_info <- MASS::Animals %>%
  rownames_to_column("Animal") %>%
  mutate(is_extinct = case_when(Animal %in% c("Dipliodocus", "Triceratops", "Brachiosaurus") ~ TRUE,
                                TRUE ~ FALSE),
         class = case_when(Animal %in% c("Mountain beaver", "Guinea pig", "Golden hamster", "Mouse", "Rabbit", "Rat") ~ "Rodent",
                           Animal %in% c("Potar monkey", "Gorilla", "Human", "Rhesus monkey", "Chimpanzee") ~ "Primate",
                           Animal %in% c("Cow", "Goat", "Giraffe", "Sheep") ~ "Ruminant",
                           Animal %in% c("Asian elephant", "African elephant") ~ "Elephantidae",
                           Animal %in% c("Grey wolf") ~ "Canine",
                           Animal %in% c("Cat", "Jaguar") ~ "Feline",
                           Animal %in% c("Donkey", "Horse") ~ "Equidae",
                           Animal == "Pig" ~ "Sus",
                           Animal == "Mole" ~ "Talpidae",
                           Animal == "Kangaroo" ~ "Macropodidae",
                           TRUE ~ "Dinosaurs")) %>%
  select(-body, -brain)
animal_info
We can then use the matrixset() function (it works with as_matrixset() too).
ms <- matrixset(msr = animals, log_msr = log_animals,
                row_info = animal_info, row_key = "Animal")
ms
Notice how we used the row_key argument to specify how to link the two objects together.
Another option is the row_info<-() function. Note that here the data frame needs a column named after the tag of the matrixset object (.rowname in our example).
row_info(ms2) <- animal_info %>% rename(.rowname = Animal)
identical(ms, ms2)
We'll finish the annotation section by introducing the join and the annotate methods.
animals_ms <- animals_ms %>%
  join_row_info(animal_info, by = c(".rowname" = "Animal"))
identical(ms, animals_ms)

animals_ms <- animals_ms %>%
  annotate_column(unit = case_when(.colname == "body" ~ "kg",
                                   TRUE ~ "g"))
animals_ms
The join operation joins the second meta info (animal_info in our example) to the relevant matrixset meta info.
The annotate operation works like dplyr's mutate() does: you can modify existing annotations, create new ones or even remove them. It can use already existing annotations as well.
animals_ms %>%
  annotate_column(us_unit = case_when(unit == "kg" ~ "lb",
                                      TRUE ~ "oz")) %>%
  column_info()
The only annotation that can't be modified or deleted via annotate is the tag (typically, .rowname and .colname).
animals_ms %>% annotate_column(.colname = NULL)
There is one last method to provide annotation: annotate_row_from_apply()/annotate_column_from_apply(). With this method, a matrix operation (for instance, the row average) can be stored as an annotation.
We strongly encourage you to have a look at the functions, and also apply_row_dfw()/apply_column_dfw(), upon which they are based.
Many of the array operations are available for matrixset:
* Extraction: [ and [[. The subset operator [ works like that of an array. The first argument is for rows, the second for columns and the third for matrices. It will return a matrixset, even if there is a single matrix returned as part of the subset. The keep_annotation argument (which should always be supplied after the 3 row/col/matrix arguments), when set to FALSE, will return only the matrix list. The operation x[[m]] is a wrapper to x[,,m].
* Replacement method [<-. Replaces matrix or matrices, in full or in part, by new values.
* dimnames() and the related rownames() and colnames().
* The replacement versions: dimnames<-() and the related rownames<-() and colnames<-(). Note that rownames<-() and colnames<-() are what should be used for names replacement. The special annotation tag is modified this way.
* dim() and the related nrow() and ncol().
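The array-like operations listed above can be sketched as follows. The example rebuilds a two-matrix object so that it runs on its own; the object names (sub, one, same, mats) are illustrative, and the behaviour of keep_annotation = FALSE is as described in the list above.

```r
library(matrixset)

animals <- as.matrix(MASS::Animals)
ms <- matrixset(msr = animals, log_msr = log(animals))

# Dimensions and names are shared by every matrix in the object
dim(ms)
head(rownames(ms))
colnames(ms)

# Subsetting follows the row, column, matrix order of arguments
sub  <- ms[1:3, , ]            # first three rows, all columns, all matrices
one  <- ms[, , "log_msr"]      # still a matrixset, holding a single matrix
same <- ms[["log_msr"]]        # wrapper for the previous line

# Dropping the annotation returns the matrices as a plain list
mats <- ms[1:3, , , keep_annotation = FALSE]
```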
Other property methods specific to the matrixset object are available:
* matrixnames() (and the replacement version, matrixnames<-()) and nmatrix()
* matrix_elm(): extracts a single matrix of the object, and returns it as a matrix
* row_info() and column_info(): extract the data frames that store the annotation
* row_traits() and column_traits(): return the annotation names. The tag (column that stores the row/column names) is not returned.
* matrix_elm<-(): replaces a single matrix of the object
* row_info<-() and column_info<-(): replace the whole annotation data frame
* row_traits<-() and column_traits<-(): replace annotation names. Tags cannot be replaced
* add_matrix() and remove_matrix(): add or remove a matrix.

Finally, many functions to manipulate or use the matrixset object, often taking inspiration from the tidyverse, have been made available:
* mutate_matrix(): create, modify or delete matrices. Supports tidy evaluation and tidy selection (for matrix names and annotations)
* apply_matrix(), apply_row() and apply_column() (and all of their variants): apply functions/expressions to matrices, their rows or their columns. They have intuitive and direct access to annotations.
* arrange_row() and arrange_column(): reorder rows and columns. Supports tidy selection for annotation
* filter_row() and filter_column(): subset rows or columns of matrices based on conditions that may use annotations.

All of these functions (except for mutate_matrix()) can be performed within groups, which can be defined using annotation. This is done by a prior call to row_group_by() and/or column_group_by().
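As a small sketch of annotation-aware manipulation, the example below filters rows on a trait and then pulls out one matrix. It rebuilds a reduced version of the object so that it runs on its own: the data frame info is a hypothetical, trimmed-down stand-in for animal_info that carries only the is_extinct trait.

```r
library(matrixset)

animals <- as.matrix(MASS::Animals)

# Hypothetical, reduced annotation: a single logical trait per animal
# (names are spelled exactly as they appear in MASS::Animals)
info <- data.frame(
  Animal = rownames(animals),
  is_extinct = rownames(animals) %in%
    c("Dipliodocus", "Triceratops", "Brachiosaurus")
)
ms <- matrixset(msr = animals, row_info = info, row_key = "Animal")

# Keep only the rows whose annotation flags them as extinct
extinct <- filter_row(ms, is_extinct)
rownames(extinct)

# Extract the filtered values as an ordinary matrix
matrix_elm(extinct, "msr")
```

Filtering on a trait rather than on hard-coded row indices keeps the selection valid if rows are later reordered or removed.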