to_named_triplet | R Documentation |
Matrices can be in named matrix form or triplet form.
Named matrix form is the usual representation for the matsindf
package,
wherein names for rows and columns are included in the dimnames
attribute of every matrix or Matrix object, consuming memory.
Typically, neither zero rows nor zero columns are present.
In some instances,
many sparse matrices with the same names will be created,
leading to inefficiencies due to dimname storage with every matrix object.
It can be more memory-efficient to store named matrices in
integer triplet form,
a table format in which matrix data are represented as
a data frame with
row integer (i),
column integer (j), and
value (value) columns.
(Row names and column names can also be stored as character strings
in the i and j columns,
an arrangement called character triplet form.)
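The three representations described above can be compared side by side in a minimal base R sketch. It does not call the package functions; the matrix `m`, the mapping data frames `r_map` and `c_map`, and their column names are hypothetical and chosen only for illustration.

```r
# Named matrix form: names live in the dimnames attribute.
m <- matrix(c(1, 0, 0, 2), nrow = 2, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))

# Hypothetical external mappings between names and integer indices.
r_map <- data.frame(indices = 1:2, names = c("r1", "r2"))
c_map <- data.frame(indices = 1:2, names = c("c1", "c2"))

# Positions of the non-zero entries, one row per entry.
nz <- which(m != 0, arr.ind = TRUE)

# Character triplet form: names are stored directly in i and j.
char_triplet <- data.frame(i = rownames(m)[nz[, "row"]],
                           j = colnames(m)[nz[, "col"]],
                           value = m[nz])

# Integer triplet form: names are replaced by indices looked up in the maps.
int_triplet <- data.frame(
  i = r_map$indices[match(char_triplet$i, r_map$names)],
  j = c_map$indices[match(char_triplet$j, c_map$names)],
  value = char_triplet$value)
```

In the integer form the names are no longer part of the data, which is why the mappings have to be kept alongside it.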
Integer triplet form is required for databases that
do not recognize a matrix as a native storage format.
In integer triplet form,
the caller is responsible for maintaining
a separate (external) and consistent mapping between
row and column indices and row and column names.
(However, rowtype and coltype are retained as attributes
of both integer and character triplet data frames.)
These functions convert from named matrix form to integer triplet form
(to_triplet())
and vice versa (to_named_matrix())
using row and column name mappings
supplied in the index_map argument.
to_triplet() and to_named_matrix()
are inverses of each other,
with row and column order not necessarily preserved.
See examples.
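What "inverses, with order not necessarily preserved" can mean in practice is sketched below in base R. This is an illustration under stated assumptions, not the package's implementation: the mappings `r_map` and `c_map` and the helper `sort_triplet()` are hypothetical.

```r
# A small integer triplet and hypothetical name mappings.
triplet <- data.frame(i = c(9L, 7L, 5L), j = c(3L, 3L, 4L), value = c(1, 2, 3))
r_map <- data.frame(indices = 1:10, names = paste0("r", 1:10))
c_map <- data.frame(indices = 1:10, names = paste0("c", 1:10))

# Triplet -> named matrix: keep only rows and columns that actually appear.
rn <- r_map$names[match(sort(unique(triplet$i)), r_map$indices)]
cn <- c_map$names[match(sort(unique(triplet$j)), c_map$indices)]
m <- matrix(0, nrow = length(rn), ncol = length(cn), dimnames = list(rn, cn))
m[cbind(r_map$names[match(triplet$i, r_map$indices)],
        c_map$names[match(triplet$j, c_map$indices)])] <- triplet$value

# Named matrix -> triplet: entries come back in column-major order.
nz <- which(m != 0, arr.ind = TRUE)
triplet2 <- data.frame(
  i = r_map$indices[match(rownames(m)[nz[, "row"]], r_map$names)],
  j = c_map$indices[match(colnames(m)[nz[, "col"]], c_map$names)],
  value = m[nz])

# Hypothetical helper: put a triplet into a canonical row order.
sort_triplet <- function(df) {
  out <- df[order(df$i, df$j), ]
  rownames(out) <- NULL
  out
}

# Same content, different row order: compare after sorting both.
same <- isTRUE(all.equal(sort_triplet(triplet), sort_triplet(triplet2)))
```

Comparing round-tripped triplets row by row can therefore report a difference even when the content is the same; sorting on the index columns first avoids that.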
to_triplet(
a,
index_map,
retain_zero_structure = FALSE,
row_index_colname = "i",
col_index_colname = "j",
value_colname = "value",
rownames_colname = "rownames",
colnames_colname = "colnames"
)
to_named_matrix(
a,
index_map,
matrix_class = c("matrix", "Matrix"),
row_index_colname = "i",
col_index_colname = "j",
value_colname = "value",
.rnames = "rownames",
.cnames = "colnames"
)
a |
For |
index_map |
A mapping between row and column names and row and column indices. See details. |
retain_zero_structure |
A boolean that tells whether to retain
the structure of zero matrices when creating triplets.
Default is FALSE. |
row_index_colname , col_index_colname |
The names of row and column index columns in data frames. Defaults are "i" and "j", respectively. |
value_colname |
The name of the value column in data frames. Default is "value". |
rownames_colname , colnames_colname |
The names of the row name and column name columns in data frames. Defaults are "rownames" and "colnames", respectively. |
matrix_class |
One of "matrix" (standard) or "Matrix" (sparse) representation for matrices. Default is "matrix". |
.rnames , .cnames |
Column names used internally. Defaults are "rownames" and "colnames". |
index_map must be one of the following:
A single data frame of two columns, one an integer column and the other a character column. A single data frame is applied to both rows and columns.
An unnamed list of exactly two data frames,
each of which must have only
an integer column and a character column.
The first data frame of index_map
is interpreted as the mapping
between row names and row indices
and
the second data frame of index_map
is interpreted as the mapping
between column names and column indices.
A named list of two or more data frames,
in which the names of index_map
are interpreted as row and column types,
with named data frames applied as the mapping for the
associated row or column type.
For example, the data frame named "Industry" would be applied
to the dimension (row or column)
with an "Industry" type.
When both row and column have "Industry" type,
the "Industry" mapping is applied to both.
When sending named data frames in index_map,
a must have both a row type and a column type.
If an appropriate mapping cannot be found in index_map,
an error is raised.
Both matching data frames must have only
an integer column and
a character column.
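The three accepted shapes of index_map can be sketched with plain data frames and lists. The data frame contents and the column names `indices` and `names` below are hypothetical; only the overall shapes follow the description above.

```r
# Hypothetical mappings, each with one integer column and one character column.
industry_map <- data.frame(indices = 1:3,
                           names = c("Mining", "Refining", "Transport"))
product_map <- data.frame(indices = 1:2,
                          names = c("Crude", "Diesel"))

# 1. A single data frame, applied to both rows and columns.
map_single <- industry_map

# 2. An unnamed list of exactly two data frames: rows first, columns second.
map_unnamed <- list(industry_map, product_map)

# 3. A named list: the names are matched against the row and column types of a.
map_named <- list(Industry = industry_map, Product = product_map)
```

The named form is the only one that depends on the row and column types of a, so it is the one that fails when a type is missing.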
When converting to indexed form,
rowtype and coltype
are retained as attributes.
See rowtype() and coltype().
If any indices are unavailable in the index_map,
an error is raised.
It is an error to repeat a name in the name column of an index_map
data frame.
It is an error to repeat an index in the index column
of an index_map data frame.
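The uniqueness rules above can be checked before calling the conversion functions. The helper `check_index_map()` below is hypothetical and written in base R; it is a sketch of the stated rules, not the validation the package itself performs, and its error messages are invented for illustration.

```r
# Hypothetical helper: check one index map data frame against the stated rules.
check_index_map <- function(map) {
  is_int <- vapply(map, is.integer, logical(1))
  is_chr <- vapply(map, is.character, logical(1))
  # Exactly two columns: one integer, one character.
  stopifnot(ncol(map) == 2, sum(is_int) == 1, sum(is_chr) == 1)
  if (anyDuplicated(map[[which(is_int)]]) > 0) stop("repeated index in index map")
  if (anyDuplicated(map[[which(is_chr)]]) > 0) stop("repeated name in index map")
  invisible(TRUE)
}

good <- data.frame(indices = 1:3, names = c("r1", "r2", "r3"))
bad  <- data.frame(indices = c(1L, 2L, 2L), names = c("r1", "r2", "r3"))

check_index_map(good)
# Capture the error message instead of stopping the script.
bad_result <- tryCatch(check_index_map(bad),
                       error = function(e) conditionMessage(e))
```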
If a is NULL, NULL is returned.
If a is a list and any member of the list is NULL,
NULL is returned in that position.
By default, to_triplet() will return
a zero-row data frame when
a is a zero matrix.
Set retain_zero_structure = TRUE
to return all entries in the zero matrix.
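The difference between the two behaviours can be pictured with a base R sketch. It does not call to_triplet(); the objects `all_entries` and `dropped` are hypothetical stand-ins for what the two settings are described as returning, and the exact columns of the package's output may differ.

```r
# A 2 x 2 zero matrix with names.
z <- matrix(0, nrow = 2, ncol = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))

# Every entry of the matrix as a triplet row, zeros included
# (analogous to the description of retain_zero_structure = TRUE).
all_entries <- data.frame(i = as.vector(row(z)),
                          j = as.vector(col(z)),
                          value = as.vector(z))

# Only non-zero entries: a zero-row data frame for a zero matrix
# (analogous to the description of the default).
dropped <- all_entries[all_entries$value != 0, ]
```

Dropping the zeros loses the information about which rows and columns the matrix had, which is the structure the argument name refers to.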
to_triplet() returns a
as a data frame or list of data frames in triplet form.
to_named_matrix() returns a
as a named matrix or a list of matrices in named form.
triplet <- data.frame(i = as.integer(c(9, 7, 5, 9, 7, 5)),
j = as.integer(c(3, 3, 3, 4, 4, 4)),
value = c(1, 2, 3, 4, 5, 6)) |>
setrowtype("rows") |> setcoltype("cols")
triplet
rowtype(triplet)
coltype(triplet)
# We have more indices than actual entries in the matrix
r_indices <- data.frame(names = paste0("r", 1:101),
indices = 1:101)
head(r_indices)
c_indices <- data.frame(names = paste0("c", 1:101),
indices = 1:101)
head(c_indices)
# Names are interpreted as row and column types
indices <- list(cols = c_indices, rows = r_indices)
named <- to_named_matrix(triplet, indices)
named
triplet2 <- to_triplet(named, indices)
# Although not in the same row order,
# triplet and triplet2 are the same.
triplet2
rowtype(triplet2)
coltype(triplet2)
# And the same matrix can be recovered from triplet2
to_named_matrix(triplet2, indices)