View source: R/fsubset_ftransform_fmutate.R
across | R Documentation |
across() can be used inside fmutate and fsummarise to apply one or more functions to a selection of columns. It is overall very similar to dplyr::across, but does not support some rlang features, has some additional features (arguments), and is optimized to work with collapse's .FAST_FUN, yielding much faster computations.
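As a minimal sketch of the description above, the following applies one fast function to two columns by group. It uses the built-in mtcars data rather than the wlddev data from the examples; the choice of columns and the object name `res` are purely illustrative.

```r
library(collapse)

# Grouped means of two columns of the built-in mtcars data
res <- mtcars |>
  fgroup_by(cyl) |>
  fsummarise(across(c(mpg, hp), fmean))

# One row per value of cyl; with a single function the column names are kept
res
```

With a single function the computed columns keep their original names, so the result has the columns cyl, mpg and hp.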
across(.cols = NULL, .fns, ..., .names = NULL,
.apply = "auto", .transpose = "auto")
# acr(...) can be used to abbreviate across(...)
.cols |
select columns using column names and expressions (e.g. |
.fns |
A function, character vector of functions or list of functions. Vectors / lists can be named to yield alternative names in the result (see |
... |
further arguments to |
.names |
controls the naming of computed columns. |
.apply |
controls whether functions are applied column-by-column ( |
.transpose |
with multiple |
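To illustrate how .names affects the result, here is a small sketch on the built-in mtcars data (an assumption for illustration, not part of the original examples). It compares the default naming for a named list of functions with .names = "flip", the option also used in the examples on this page.

```r
library(collapse)

# Two named functions applied to two columns of the built-in mtcars data
default_names <- mtcars |>
  fgroup_by(cyl) |>
  fsummarise(across(c(mpg, hp), list(mu = fmean, sd = fsd)))

# Same computation, with the function label placed before the column name
flipped_names <- mtcars |>
  fgroup_by(cyl) |>
  fsummarise(across(c(mpg, hp), list(mu = fmean, sd = fsd), .names = "flip"))

names(default_names)
names(flipped_names)
```

Comparing the two name vectors shows the column-then-function pattern of the default against the function-then-column pattern of "flip".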
across() does not support purrr-style lambdas, and does not support dplyr-style predicate functions, e.g. across(where(is.numeric), sum); simply use across(is.numeric, sum) instead. In contrast to dplyr, you can also compute on grouping columns.
Also note that across() is NOT a function in collapse but a known expression that is internally transformed by fsummarise()/fmutate() into something else. Thus, it cannot be called using qualified names, i.e., collapse::across() does not work, and qualifying across() is also not necessary when collapse is not attached.
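The two points in this note can be sketched together, assuming the built-in iris data as a stand-in. The predicate is passed directly as .cols, and only the outer call is qualified, so collapse does not need to be attached; `res` is an illustrative name.

```r
# collapse is intentionally not attached; only the outer call is qualified
res <- collapse::fsummarise(iris, across(is.numeric, mean))

# One row holding the mean of each numeric column; the factor Species is skipped
res
```

Writing across(where(is.numeric), mean) or collapse::across(...) in the same place would not work, as described above.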
fsummarise, fmutate, Fast Data Manipulation, Collapse Overview
# Basic (Weighted) Summaries
fsummarise(wlddev, across(PCGDP:GINI, fmean, w = POP))
wlddev |> fgroup_by(region, income) |>
fsummarise(across(PCGDP:GINI, fmean, w = POP))
# Note that for these we don't actually need across...
fselect(wlddev, PCGDP:GINI) |> fmean(w = wlddev$POP, drop = FALSE)
wlddev |> fgroup_by(region, income) |>
fselect(PCGDP:GINI, POP) |> fmean(POP, keep.w = FALSE)
collap(wlddev, PCGDP + LIFEEX + GINI ~ region + income, w = ~ POP, keep.w = FALSE)
# But if we want to use some base R function that requires argument splitting...
wlddev |> na_omit(cols = "POP") |> fgroup_by(region, income) |>
fsummarise(across(PCGDP:GINI, weighted.mean, w = POP, na.rm = TRUE))
# Or if we want to apply different functions...
wlddev |> fgroup_by(region, income) |>
fsummarise(across(PCGDP:GINI, list(mu = fmean, sd = fsd), w = POP),
POP_sum = fsum(POP), OECD = fmean(OECD))
# Note that the above still detects fmean as a fast function: the names of the list
# are irrelevant, but the function name must be typed or passed as a character vector.
# Otherwise functions will be executed by groups, e.g. function(x) fmean(x) won't vectorize
# Same, naming in a different way
wlddev |> fgroup_by(region, income) |>
fsummarise(across(PCGDP:GINI, list(mu = fmean, sd = fsd), w = POP, .names = "flip"),
sum_POP = fsum(POP), OECD = fmean(OECD))
# Or we want to do more advanced things..
# Such as nesting data frames..
qTBL(wlddev) |> fgroup_by(region, income) |>
fsummarise(across(c(PCGDP, LIFEEX, ODA),
function(x) list(Nest = list(x)),
.apply = FALSE))
# Or linear models..
qTBL(wlddev) |> fgroup_by(region, income) |>
fsummarise(across(c(PCGDP, LIFEEX, ODA),
function(x) list(Mods = list(lm(PCGDP ~., x))),
.apply = FALSE))
# Or computing grouped correlation matrices
qTBL(wlddev) |> fgroup_by(region, income) |>
fsummarise(across(c(PCGDP, LIFEEX, ODA),
function(x) qDF(pwcor(x), "Variable"), .apply = FALSE))
# Here calculating 1- and 10-year lags and growth rates of these variables
qTBL(wlddev) |> fgroup_by(country) |>
fmutate(across(c(PCGDP, LIFEEX, ODA), list(L, G),
n = c(1, 10), t = year, .names = FALSE))
# Same but variables in different order
qTBL(wlddev) |> fgroup_by(country) |>
fmutate(across(c(PCGDP, LIFEEX, ODA), list(L, G), n = c(1, 10),
t = year, .names = FALSE, .transpose = FALSE))
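The comment in the examples about fast-function detection can be checked with a small sketch, again assuming the built-in mtcars data. Both calls should return the same values; per the note above, they differ only in whether fmean is recognised by name and vectorized across groups, or executed group by group. The names `fast` and `wrapped` are illustrative.

```r
library(collapse)

# fmean is typed by name, so it can be recognised as one of .FAST_FUN
fast <- mtcars |>
  fgroup_by(cyl) |>
  fsummarise(across(c(mpg, hp), fmean))

# Wrapping it in an anonymous function hides the name; it is then run by groups
wrapped <- mtcars |>
  fgroup_by(cyl) |>
  fsummarise(across(c(mpg, hp), function(x) fmean(x)))

# The computed values agree; only the execution path differs
all.equal(fast$mpg, wrapped$mpg)
all.equal(fast$hp, wrapped$hp)
```

On a small dataset the speed difference is negligible; it is expected to matter mainly with many groups.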