knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(matrixset)
We will use the same example as in the introduction vignette ,
the Animals
object.
library(tidyverse) animals <- as.matrix(MASS::Animals) log_animals <- log(animals) animal_info <- MASS::Animals %>% rownames_to_column("Animal") %>% mutate(is_extinct = case_when(Animal %in% c("Dipliodocus", "Triceratops", "Brachiosaurus") ~ TRUE, TRUE ~ FALSE), class = case_when(Animal %in% c("Mountain beaver", "Guinea pig", "Golden hamster", "Mouse", "Rabbit", "Rat") ~ "Rodent", Animal %in% c("Potar monkey", "Gorilla", "Human", "Rhesus monkey", "Chimpanzee") ~ "Primate", Animal %in% c("Cow", "Goat", "Giraffe", "Sheep") ~ "Ruminant", Animal %in% c("Asian elephant", "African elephant") ~ "Elephantidae", Animal %in% c("Grey wolf") ~ "Canine", Animal %in% c("Cat", "Jaguar") ~ "Feline", Animal %in% c("Donkey", "Horse") ~ "Equidae", Animal == "Pig" ~ "Sus", Animal == "Mole" ~ "Talpidae", Animal == "Kangaroo" ~ "Macropodidae", TRUE ~ "Dinosaurs")) %>% select(-body, -brain)
Annotations are internally stored as [tibble::tibble()] objects and can be viewed as simple data bases. As such, a key is needed to uniquely identify the rows or the columns. This key is the rownames for row annotation and colnames for column annotation.
This key is called the tag and, unless specified otherwise at the matrixset
creation, is stored as .rowname
/.colname
. This special tag can almost be
used as any other annotation traits - see Applying Functions.
When using an external data.frame
to create new annotations, the data frame
must contain this key - it doesn't have to be called .rowname
/.colname
- in
a single column.
Moreover, the key values must correspond to rownames/colnames. Values that do not match will simply left out.
To use the annotation at creation, simply use a command like this
ms <- matrixset(msr = animals, log_msr = log_animals, row_info = animal_info, row_key = "Animal") ms
Notice how we used the row_key
argument to specify how to link the two objects
together.
The internal tibble
can be replaced by a new one. This could be an interesting
possibility to add annotations to an existing matrixset
object where none were
registered.
To do so, you can simply do
row_info(ms) <- animal_info %>% rename(.rowname = Animal)
For the operation to work, a column called .rowname
(or more generally, what
is returned by row_tag()
) must be part of the data frame.
The column equivalents are column_info
and column_tag
.
Annotation tibble replacement works even if annotations were registered. Be aware of two things:
This is equivalent to performing a mutating join (default: [dplyr::left_join()],
though all mutating joins - except cross-joins - are available via the type
argument) between the matrixset
(.ms
) object's annotation tibble
and a
data.frame
(.y
).
The by
argument will determine how to join the two data.frame
s together, so
it is not necessary for y
to have a .rowname
/.colname
column. But when the
by
argument is not provided, a natural join is performed.
One behavior that differs with a true mutating join, is that when a row from
.ms
matches more than one row in .y
, no row duplication will be performed.
Instead, a condition error will be issued. This is to preserve the matrixset
property that all row names (and column names) must be unique.
matrixset(msr = animals, log_msr = log_animals) %>% join_row_info(animal_info, by = c(".rowname" = "Animal"))
Indeed! In using join_row_info()
/join_column_info()
, .y
can be a
matrixset
object, in which case the appropriate annotation tibble
will be
used.
The only difference is when using the default by = NULL
argument. In that case
the row/column tag of each object is used.
If you are familiar with dplyr::mutate()
, then you know almost everything you
need to know about using annotate_row()
and annotate_column()
.
ms <- matrixset(msr = animals, log_msr = log_animals) %>% join_row_info(animal_info, by = c(".rowname" = "Animal")) %>% annotate_column(unit = case_when(.colname == "body" ~ "kg", TRUE ~ "g")) %>% annotate_column(us_unit = case_when(unit == "kg" ~ "lb", TRUE ~ "oz")) column_info(ms)
You can decide that you don't need two unit systems and keep only one
ms <- ms %>% annotate_column(us_unit = NULL) column_info(ms)
Applying functions to a matrixset
's matrices is covered in the
Applying Functions vignette.
The idea here is the same, but with the added benefit that the function result
is stored directly as annotation for the matrixset
object.
ms %>% annotate_row_from_apply(msr, ratio_brain_body = ~ .i[2]/(10*.i[1])) %>% row_info()
When groups are registered, results are spread using tidyr::pivot_wider()
.
ms %>% row_group_by(class) %>% annotate_column_from_apply(msr, mean) %>% column_info()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.