loop | R Documentation |
The apply_matrix function applies functions to each matrix of a matrixset.
The apply_row/apply_column functions do the same but separately for each
row/column. The functions can be applied to all matrices or only a subset.
The dfl/dfw versions differ in their output format and return a
tibble::tibble() whenever possible.
Empty matrices are simply left unevaluated. How that impacts the returned result depends on which flavor of apply_* has been used. See ‘Value’ for more details.
If .matrix_wise is FALSE, the function (or expression) is multivariate in
the sense that all matrices are accessible at once, as opposed to each of them
in turn. See section "Multivariate".
apply_row(.ms, ..., .matrix = NULL, .matrix_wise = TRUE, .input_list = FALSE)
apply_row_dfl(
.ms,
...,
.matrix = NULL,
.matrix_wise = TRUE,
.input_list = FALSE,
.force_name = FALSE
)
apply_row_dfw(
.ms,
...,
.matrix = NULL,
.matrix_wise = TRUE,
.input_list = FALSE,
.force_name = FALSE
)
apply_column(
.ms,
...,
.matrix = NULL,
.matrix_wise = TRUE,
.input_list = FALSE
)
apply_column_dfl(
.ms,
...,
.matrix = NULL,
.matrix_wise = TRUE,
.input_list = FALSE,
.force_name = FALSE
)
apply_column_dfw(
.ms,
...,
.matrix = NULL,
.matrix_wise = TRUE,
.input_list = FALSE,
.force_name = FALSE
)
apply_matrix(
.ms,
...,
.matrix = NULL,
.matrix_wise = TRUE,
.input_list = FALSE
)
apply_matrix_dfl(
.ms,
...,
.matrix = NULL,
.matrix_wise = TRUE,
.input_list = FALSE,
.force_name = FALSE
)
apply_matrix_dfw(
.ms,
...,
.matrix = NULL,
.matrix_wise = TRUE,
.input_list = FALSE,
.force_name = FALSE
)
A list for every matrix in the matrixset object. Each list is itself a
list, or NULL for NULL matrices. For apply_matrix, it is a list of
the function values. Otherwise, it is a list with one element for each
row/column. Finally, for apply_row/apply_column, each of these
sub-lists is itself a list holding the results of each function.
When .matrix_wise == FALSE, the output format differs only in that there is
no list for matrices.
If each function returns a vector of the same dimension, you can use either
the _dfl or the _dfw version. Both return a list of tibbles. The dfl
version stacks the function results in a long format, while the dfw
version puts them side by side, in a wide format. An
empty matrix will be returned for empty input matrices.
If the functions return vectors of more than one element, there will be a column to store the values and one for the function ID (dfl), or one column per combination of function/result (dfw).
See the grouping section to learn about the result format in the grouping context.
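As a rough illustration of the long and wide layouts, the base R sketch below builds both shapes by hand from made-up range() results. The column names (column, fn, value, rg_1, rg_2) and the values are hypothetical and are not the names or output produced by matrixset; only the two shapes are the point.

```r
# Made-up range() results for two columns of one matrix (hypothetical values).
res <- list(col_a = c(0.1, 0.9), col_b = c(0.2, 0.8))

# Long layout: results are stacked, one row per (column, value) pair,
# with an extra column identifying the function.
long <- data.frame(
  column = rep(names(res), each = 2),
  fn     = "rg",
  value  = unlist(res, use.names = FALSE)
)

# Wide layout: one row per column, one column per result element.
wide <- data.frame(
  column    = names(res),
  rg_1      = vapply(res, `[`, numeric(1), 1),
  rg_2      = vapply(res, `[`, numeric(1), 2),
  row.names = NULL
)
```

The long table has one row per returned value, while the wide table keeps one row per column of the matrix.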
The rlang pronouns .data and .env are available. Two scenarios for
which they can be useful are:
The annotation names are stored in a character variable. You can make use
of the variable by using .data[[var]]. See the example for an
illustration of this.
You want to make use of a global variable that has the same name as an
annotation. You can use .env[[var]] or .env$var to make sure to use
the proper variable.
The matrixset package defines its own pronouns: .m, .i and .j, which
are discussed in the function specification argument (...).
It is not necessary to import any of the pronouns (or load rlang in the
case of .data and .env) in an interactive session.
It is useful, however, when writing a package, to avoid R CMD check
notes.
As needed, you can import .data and .env (from rlang) or any of .m,
.i or .j from matrixset.
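In a package, one common way to declare these imports is through roxygen2 tags, as sketched below. The file name is hypothetical, and the sketch assumes that roxygen2 generates the package's NAMESPACE file; only the pronouns actually used need to be listed.

```r
# R/imports.R (hypothetical file name)
# Assumes the package uses roxygen2 to generate its NAMESPACE file.

#' @importFrom rlang .data .env
#' @importFrom matrixset .m .i .j
NULL
```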
The default behavior is to apply a function or expression to a single
matrix: each matrix of the matrixset object is provided sequentially
to the function/expression.
If .matrix_wise is FALSE, all matrices are provided at once to the
functions/expressions. They can be provided in two fashions:
Separately (default behavior). Each matrix can be referred to by .m1, ...,
.mn, where n is the number of matrices. Note that this is the number
as determined by .matrix.
For apply_row (and its dfl/dfw variants), use .i1, .i2 and so on
instead. What the functions/expressions have access to in this case is
the first row of the first matrix, the first row of the second matrix
and so on. Then, continuing the loop, the second row of each matrix
will be accessible, and so on.
Similarly, use .j1 and so on for the apply_column family.
Anonymous functions are understood as functions with multiple
arguments. In the example apply_row(ms, mean, .matrix_wise = FALSE),
if there are 3 matrices in the ms object, mean is understood as
mean(.i1, .i2, .i3). Note that this call would fail, because mean()
treats only its first argument as data.
In a list (.input_list = TRUE). The list will have an element per matrix.
The list can be referred to using the same pronouns (.m, .i, .j), and
each matrix by its name or position.
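To see why the mean example fails, the base R sketch below mimics the call with three made-up row vectors standing in for .i1, .i2 and .i3 (it does not use matrixset). Because mean() treats only its first argument as data, the other two vectors are matched to its trim and na.rm arguments; combining the vectors first is one possible workaround.

```r
# Made-up rows standing in for .i1, .i2 and .i3 (hypothetical values).
i1 <- c(0.2, 0.4, 0.6)
i2 <- c(0.3, 0.5, 0.7)
i3 <- c(0.1, 0.8, 0.9)

# mean(i1, i2, i3) matches i2 to 'trim' and i3 to 'na.rm', which raises an error.
failed <- tryCatch(
  {
    mean(i1, i2, i3)
    FALSE
  },
  error = function(e) TRUE
)

# Pooling the rows first gives a single overall average.
pooled <- mean(c(i1, i2, i3))
```

With matrixset, the analogous expression would presumably be a formula such as ~ mean(c(.i1, .i2, .i3)), which passes a single vector to mean().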
For the multivariate setting, empty matrices are given as is, so it is
important that provided functions can deal with such a scenario. An
alternative is to skip the empty matrices with the .matrix argument.
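The two ways of supplying the matrices in the multivariate setting can be sketched as follows. This is an untested sketch: it assumes that matrixset is installed and that the student_results example object holds at least two numeric matrices with the same dimensions. The function name d is purely illustrative.

```r
library(matrixset)

# Separate pronouns: .i1 is the current row of the first matrix,
# .i2 the same row of the second matrix.
diff_sep <- apply_row(student_results, d = ~ .i2 - .i1, .matrix_wise = FALSE)

# List input: .i is a list with one element per matrix,
# indexed here by position.
diff_lst <- apply_row(student_results, d = ~ .i[[2]] - .i[[1]],
                      .matrix_wise = FALSE, .input_list = TRUE)
```

Both calls are meant to compute the same row-wise difference; the list form is convenient when the number of matrices is not known in advance.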
If groups have been defined, functions will be evaluated within them. When both row and column groupings have been registered, functions are evaluated at each cross-combination of row/column groups.
The output format is different when the .ms matrixset object is grouped.
A list for every matrix is still returned, but each of these lists now holds
a tibble.
Each tibble has a column called .vals, where the function results are
stored. This column is a list, one element per group. The group labels are
given by the other columns of the tibble. For a given group, things are like
the ungrouped version: further sub-lists for rows/columns - if applicable -
and function values.
The dfl/dfw versions are more similar in their output format to their ungrouped version. The format is almost identical, except that additional columns are reported to identify the group labels.
See the examples.
# The first example takes the whole matrix average, while the second takes
# every row average
(mn_mat <- apply_matrix(student_results, mean))
(mn_row <- apply_row(student_results, mean))
# More than one function can be provided. It's a good idea in this case to
# name them
(mn_col <- apply_column(student_results, avr=mean, med=median))
# the dfl/dfw versions return nice tibbles - if the functions return values
# of the same length.
(mn_l <- apply_column_dfl(student_results, avr=mean, med=median))
(mn_w <- apply_column_dfw(student_results, avr=mean, med=median))
# There is no difference between the two versions for length-1 vector results.
# These will differ, however
(rg_l <- apply_column_dfl(student_results, rg=range))
(rg_w <- apply_column_dfw(student_results, rg=range))
# More complex examples can be used, by using pronouns and data annotation
(vals <- apply_column(student_results, avr=mean, avr_trim=~mean(.j, trim=.05),
                      reg=~lm(.j ~ teacher)))
# You can wrap complex function results, such as for lm, into a list, to use
# the dfl/dfw version
(vals_tidy <- apply_column_dfw(student_results, avr=mean, avr_trim=~mean(.j, trim=.05),
                               reg=~list(lm(.j ~ teacher))))
# You can provide complex expressions by using formulas
(r <- apply_column(student_results,
                   res= ~ {
                     log_score <- log(.j)
                     p <- predict(lm(log_score ~ teacher + class))
                     .j - exp(p)
                   }))
# the .data pronoun can be useful to use names stored in variables
fn <- function(nm) {
  # stop unless nm is a single character string
  if (!is.character(nm) || length(nm) != 1) stop("this example won't work")
  apply_column(student_results, ~lm(.j ~ .data[[nm]]))
}
fn("teacher")
# You can use variables that are outside the scope of the matrixset object.
# You don't need to do anything special if that variable is not named as an
# annotation
pass_grade <- 0.5
(passed <- apply_row_dfw(student_results, pass = ~ .i >= pass_grade))
# use .env if the variable shares an annotation name
previous_year_score <- 0.5
(passed <- apply_row_dfw(student_results, pass = ~ .i >= .env$previous_year_score))
# Grouping structure makes looping easy. Look at the output format
cl_prof_gr <- row_group_by(student_results, class, teacher)
(gr_summ <- apply_column(cl_prof_gr, avr=mean, med=median))
(gr_summ_tidy <- apply_column_dfw(cl_prof_gr, avr=mean, med=median))
# to showcase how we can play with format
(gr_summ_tidy_long <- apply_column_dfl(cl_prof_gr, summ = ~ c(avr=mean(.j), med=median(.j))))
# It is even possible to combine groupings
cl_prof_program_gr <- column_group_by(cl_prof_gr, program)
(mat_summ <- apply_matrix(cl_prof_program_gr, avr = mean, med = median, rg = range))
# it doesn't make much sense, but this is to showcase format
(summ_gr <- apply_matrix(cl_prof_program_gr, avr = mean, med = median, rg = range))
(summ_gr_long <- apply_column_dfl(cl_prof_program_gr,
                                  ct = ~ c(avr = mean(.j), med = median(.j)),
                                  rg = range))
(summ_gr_wide <- apply_column_dfw(cl_prof_program_gr,
                                  ct = ~ c(avr = mean(.j), med = median(.j)),
                                  rg = range))
# This is an example where you may want to use the .force_name argument
(apply_matrix_dfl(column_group_by(student_results, program), FC = ~ colMeans(.m)))
(apply_matrix_dfl(column_group_by(student_results, program), FC = ~ colMeans(.m),
                  .force_name = TRUE))