| gather_rvars | R Documentation |
Extract draws from a Bayesian model for one or more variables (possibly with named dimensions) into one of two types of long-format data frames of posterior::rvar objects.
gather_rvars(model, ..., ndraws = NULL, seed = NULL)
spread_rvars(model, ..., ndraws = NULL, seed = NULL)
model |
A supported Bayesian model fit. Tidybayes supports a variety of model objects; for a full list of supported models, see tidybayes-models. |
... |
Expressions in the form of
|
ndraws |
The number of draws to return, or |
seed |
A seed to use when subsampling draws (i.e. when |
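As a hedged sketch of the subsampling arguments, the following uses the RankCorr example fit from ggdist (the same one used in the Examples) and relies on posterior::ndraws() to count the draws held in the resulting rvar. The values 100 and 1234 are arbitrary choices for illustration.

```r
library(tidybayes)

data(RankCorr, package = "ggdist")

# Keep 100 draws; supplying a seed makes the subsample reproducible
sub <- spread_rvars(RankCorr, typical_r, ndraws = 100, seed = 1234)

# Number of draws stored in the resulting rvar column
n_kept <- posterior::ndraws(sub$typical_r)
n_kept
```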
Imagine a JAGS or Stan fit named model. The model may contain a variable named
b[i,v] (in the JAGS or Stan language) with dimension i in 1:100 and
dimension v in 1:3. However, the default format for draws returned from
JAGS or Stan in R will not reflect this indexing structure; instead,
they will have multiple columns with names like "b[1,1]", "b[2,1]", etc.
spread_rvars and gather_rvars provide a straightforward
syntax to translate these columns back into properly-indexed rvars in two different
tidy data frame formats, optionally recovering dimension types (e.g. factor levels) as they do so.
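To make the contrast concrete, here is a minimal sketch using the RankCorr example fit from ggdist rather than the hypothetical model above. It assumes tidy_draws() returns one column per flattened element, and it uses the index names i and j as they appear in the Examples.

```r
library(tidybayes)

data(RankCorr, package = "ggdist")

# Flat format: one column per element (e.g. "b[1,1]"), alongside
# the .chain, .iteration, and .draw bookkeeping columns
flat_names <- names(tidy_draws(RankCorr))
head(flat_names)

# Indexed format: a single rvar column "b" plus index columns "i" and "j"
indexed <- spread_rvars(RankCorr, b[i, j])
names(indexed)
```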
spread_rvars will spread names of variables in the model across the data frame as column names,
whereas gather_rvars will gather variable names into a single column named ".variable" and place
values of variables into a column named ".value". To use naming schemes from other packages
(such as broom), consider passing
results through functions like to_broom_names() or to_ggmcmc_names().
For example, spread_rvars(model, a[i], b[i,v]) might return a data frame with:
column "i": value in 1:100
column "v": value in 1:3
column "a": rvar containing draws from "a[i]"
column "b": rvar containing draws from "b[i,v]"
gather_rvars(model, a[i], b[i,v]) on the same model would return a data frame with:
column "i": value in 1:100
column "v": value in 1:3, or NA on rows where ".variable" is "a"
column ".variable": value in c("a", "b")
column ".value": rvar containing draws from "a[i]" (when ".variable" is "a") or "b[i,v]" (when ".variable" is "b")
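The two layouts can be compared side by side. This sketch substitutes the RankCorr example fit from ggdist for the hypothetical model, with tau[i] and b[i, j] standing in for a[i] and b[i,v], so the index ranges differ from those listed above.

```r
library(tidybayes)

data(RankCorr, package = "ggdist")

# Wide layout: one column per variable
wide <- spread_rvars(RankCorr, tau[i], b[i, j])
names(wide)

# Long layout: variable names in .variable, rvars in .value
long <- gather_rvars(RankCorr, tau[i], b[i, j])
names(long)

# tau has no j index, so j is expected to be NA on its rows
j_for_tau <- long$j[long$.variable == "tau"]
j_for_tau
```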
spread_rvars and gather_rvars can use type information
applied to the model object by recover_types() to convert columns
back into their original types. This is particularly helpful if some of the dimensions in
your model were originally factors. For example, if the v dimension
in the original data frame data was a factor with levels c("a","b","c"),
then we could use recover_types before spread_rvars:
model %>% recover_types(data) %>% spread_rvars(b[i,v])
Which would return the same data frame as above, except the "v" column
would be a value in c("a","b","c") instead of 1:3.
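A runnable sketch of the same idea, under stated assumptions: RankCorr does not ship with its original data, so original_data below is a hypothetical stand-in whose factor column i has three made-up levels, matching the three values of the i index in RankCorr.

```r
library(dplyr)
library(tidybayes)

data(RankCorr, package = "ggdist")

# Hypothetical data frame: only the column name "i" and its factor
# levels matter here; the levels are invented for illustration
original_data <- data.frame(i = factor(c("a", "b", "c")))

typed <- RankCorr %>%
  recover_types(original_data) %>%
  spread_rvars(b[i, j])

# i should now carry the factor levels instead of 1:3;
# j has no matching column in original_data and stays numeric
typed$i
```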
For variables that do not share the same subscripts (or share
some but not all subscripts), we can supply their specifications separately.
For example, if we have a variable d[i] with the same i subscript
as b[i,v], and a variable x with no subscripts, we could do this:
spread_rvars(model, x, d[i], b[i,v])
Which is roughly equivalent to this:
spread_rvars(model, x) %>% inner_join(spread_rvars(model, d[i])) %>% inner_join(spread_rvars(model, b[i,v]))
Similarly, this:
gather_rvars(model, x, d[i], b[i,v])
Is roughly equivalent to this:
bind_rows( gather_rvars(model, x), gather_rvars(model, d[i]), gather_rvars(model, b[i,v]) )
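The rough equivalences above can be checked on the RankCorr example fit from ggdist. This is only a sketch: it compares row counts rather than full contents, and it names the join column explicitly with by = "i" to avoid dplyr's join message.

```r
library(dplyr)
library(tidybayes)

data(RankCorr, package = "ggdist")

# gather_rvars with several specifications vs. stacking separate calls
combined <- gather_rvars(RankCorr, typical_r, tau[i])
stacked <- bind_rows(
  gather_rvars(RankCorr, typical_r),
  gather_rvars(RankCorr, tau[i])
)

# spread_rvars with several specifications vs. joining separate calls
spread_once <- spread_rvars(RankCorr, tau[i], b[i, j])
joined <- inner_join(
  spread_rvars(RankCorr, tau[i]),
  spread_rvars(RankCorr, b[i, j]),
  by = "i"
)

c(nrow(combined), nrow(stacked), nrow(spread_once), nrow(joined))
```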
The c and cbind functions can be used to combine multiple variable names that have
the same dimensions. For example, if we have several variables with the same
subscripts i and v, we could do either of these:
spread_rvars(model, c(w, x, y, z)[i,v])
spread_rvars(model, cbind(w, x, y, z)[i,v]) # equivalent
Each of which is roughly equivalent to this:
spread_rvars(model, w[i,v], x[i,v], y[i,v], z[i,v])
Besides being more compact, the c()-style syntax is currently also slightly
faster (though that may change).
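As a small illustration on the RankCorr example fit from ggdist, tau and u_tau share the index i, so they can be requested with the compact syntax. The sketch compares only column names and row counts, not the draws themselves.

```r
library(tidybayes)

data(RankCorr, package = "ggdist")

# Compact form: both variables share the subscript i
compact <- spread_rvars(RankCorr, c(tau, u_tau)[i])

# Verbose form: one specification per variable
verbose <- spread_rvars(RankCorr, tau[i], u_tau[i])

names(compact)
names(verbose)
```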
Dimensions can be left nested in the resulting rvar objects by leaving their names
blank; e.g. spread_rvars(model, b[i,]) will place the first index (i) into
rows of the data frame but leave the second index nested in the b column
(see Examples below).
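The effect of a blank index on the shape of the result can be sketched with the RankCorr example fit from ggdist: with both indices named there is one row per (i, j) pair, while leaving the second index blank should give one row per i.

```r
library(tidybayes)

data(RankCorr, package = "ggdist")

# Both indices named: one row per combination of i and j
full <- spread_rvars(RankCorr, b[i, j])

# Second index left blank: one row per i, with the remaining
# dimension kept inside the rvar column b
nested <- spread_rvars(RankCorr, b[i, ])

c(rows_full = nrow(full), rows_nested = nrow(nested))
```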
A data frame.
Matthew Kay
spread_draws(), recover_types(), compose_data(). See also
posterior::rvar() and posterior::as_draws_rvars(), the functions that power
spread_rvars and gather_rvars.
library(dplyr)
library(tidybayes)

data(RankCorr, package = "ggdist")

# spread_rvars places each variable in its own column
RankCorr %>%
  spread_rvars(b[i, j])

# leaving an index out nests the index in the column containing the rvar
RankCorr %>%
  spread_rvars(b[i, ])

# several variables can be requested at once
RankCorr %>%
  spread_rvars(b[i, j], tau[i], u_tau[i])

# gather_rvars places variables and values in a longer format data frame
RankCorr %>%
  gather_rvars(b[i, j], tau[i], typical_r)