gather_rvars | R Documentation |
Extract draws from a Bayesian model for one or more variables (possibly with named dimensions) into one of two types of long-format data frames of posterior::rvar objects.
gather_rvars(model, ..., ndraws = NULL, seed = NULL)
spread_rvars(model, ..., ndraws = NULL, seed = NULL)
model |
A supported Bayesian model fit. Tidybayes supports a variety of model objects; for a full list of supported models, see tidybayes-models. |
... |
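Expressions in the form of variable_name[dimension_1, dimension_2, ...]. See Details. |
ndraws |
The number of draws to return, or NULL to return all draws. |
seed |
A seed to use when subsampling draws (i.e. when ndraws is not NULL). |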
Imagine a JAGS or Stan fit named model. The model may contain a variable named b[i,v] (in the JAGS or Stan language) with dimension i in 1:100 and dimension v in 1:3. However, the default format for draws returned from JAGS or Stan in R will not reflect this indexing structure; instead, the draws will have multiple columns with names like "b[1,1]", "b[2,1]", etc.
spread_rvars and gather_rvars provide a straightforward syntax to translate these columns back into properly-indexed rvars in two different tidy data frame formats, optionally recovering dimension types (e.g. factor levels) as they do so.
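To make the flat naming concrete, the following base R sketch builds a hypothetical matrix of draws whose columns are named like "b[1,1]" and parses the indices back out of those names. It only illustrates the bookkeeping that spread_rvars and gather_rvars take care of; it is not how tidybayes is implemented, and the objects idx, draws, flat_names and index_table are made up for this example.

```r
# Hypothetical flat draws: 4 draws of b[i,v] with i in 1:2 and v in 1:3
set.seed(1)
idx <- expand.grid(i = 1:2, v = 1:3)
flat_names <- sprintf("b[%d,%d]", idx$i, idx$v)
draws <- matrix(
  rnorm(4 * nrow(idx)),
  nrow = 4,
  dimnames = list(NULL, flat_names)
)

# The indexing structure is only encoded in the column names
colnames(draws)

# Parse the variable name and both indices out of each column name
pattern <- "^([A-Za-z_.]+)\\[([0-9]+),([0-9]+)\\]$"
parsed <- regmatches(flat_names, regexec(pattern, flat_names))
index_table <- data.frame(
  variable = vapply(parsed, `[`, character(1), 2),
  i = as.integer(vapply(parsed, `[`, character(1), 3)),
  v = as.integer(vapply(parsed, `[`, character(1), 4)),
  column = flat_names
)
index_table
```

Each row of index_table links one flat column to its (i, v) position, which is the information the tidy formats described here expose as ordinary data frame columns.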
spread_rvars will spread names of variables in the model across the data frame as column names, whereas gather_rvars will gather variable names into a single column named ".variable" and place values of variables into a column named ".value". To use naming schemes from other packages (such as broom), consider passing results through functions like to_broom_names() or to_ggmcmc_names().
For example, spread_rvars(model, a[i], b[i,v]) might return a data frame with:
column "i": value in 1:100
column "v": value in 1:3
column "a": rvar containing draws from "a[i]"
column "b": rvar containing draws from "b[i,v]"
gather_rvars(model, a[i], b[i,v]) on the same model would return a data frame with:
column "i": value in 1:100
column "v": value in 1:3, or NA on rows where ".variable" is "a"
column ".variable": value in c("a", "b")
column ".value": rvar containing draws from "a[i]" (when ".variable" is "a") or "b[i,v]" (when ".variable" is "b")
spread_rvars and gather_rvars can use type information applied to the model object by recover_types() to convert columns back into their original types. This is particularly helpful if some of the dimensions in your model were originally factors. For example, if the v dimension in the original data frame data was a factor with levels c("a","b","c"), then we could use recover_types before spread_rvars:
model %>%
  recover_types(data) %>%
  spread_rvars(b[i,v])
Which would return the same data frame as above, except the "v" column would be a value in c("a","b","c") instead of 1:3.
For variables that do not share the same subscripts (or share some but not all subscripts), we can supply their specifications separately. For example, if we have a variable d[i] with the same i subscript as b[i,v], and a variable x with no subscripts, we could do this:
spread_rvars(model, x, d[i], b[i,v])
Which is roughly equivalent to this:
spread_rvars(model, x) %>%
  inner_join(spread_rvars(model, d[i])) %>%
  inner_join(spread_rvars(model, b[i,v]))
Similarly, this:
gather_rvars(model, x, d[i], b[i,v])
Is roughly equivalent to this:
bind_rows(
  gather_rvars(model, x),
  gather_rvars(model, d[i]),
  gather_rvars(model, b[i,v])
)
The c and cbind functions can be used to combine multiple variable names that have the same dimensions. For example, if we have several variables with the same subscripts i and v, we could do either of these:
spread_rvars(model, c(w, x, y, z)[i,v])
spread_rvars(model, cbind(w, x, y, z)[i,v]) # equivalent
Each of which is roughly equivalent to this:
spread_rvars(model, w[i,v], x[i,v], y[i,v], z[i,v])
Besides being more compact, the c()-style syntax is currently also slightly faster (though that may change).
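As a concrete sketch of the c()-style syntax, the RankCorr example fit shipped with ggdist has variables tau[i] and u_tau[i] that share the subscript i, so the two calls below should produce data frames with the same columns. This assumes the tidybayes, dplyr and ggdist packages are installed; the names combined and separate are only illustrative.

```r
library(dplyr)
library(tidybayes)

data(RankCorr, package = "ggdist")

# Combined specification: both variables share the subscript i
combined <- RankCorr %>%
  spread_rvars(c(tau, u_tau)[i])

# Roughly equivalent: one specification per variable
separate <- RankCorr %>%
  spread_rvars(tau[i], u_tau[i])

names(combined)
names(separate)
```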
Dimensions can be left nested in the resulting rvar objects by leaving their names blank; e.g. spread_rvars(model, b[i,]) will place the first index (i) into rows of the data frame but leave the second index nested in the b column (see Examples below).
A data frame.
Matthew Kay
spread_draws(), recover_types(), compose_data(). See also posterior::rvar() and posterior::as_draws_rvars(), the functions that power spread_rvars and gather_rvars.
library(dplyr)
library(tidybayes)

data(RankCorr, package = "ggdist")

# spread_rvars places each variable in its own column
RankCorr %>%
  spread_rvars(b[i, j])

# leaving an index out nests the index in the column containing the rvar
RankCorr %>%
  spread_rvars(b[i, ])

# variables that share some subscripts can be requested together
RankCorr %>%
  spread_rvars(b[i, j], tau[i], u_tau[i])

# gather_rvars places variables and values in a longer format data frame
RankCorr %>%
  gather_rvars(b[i, j], tau[i], typical_r)
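The ndraws and seed arguments can be tried on the same example fit. The sketch below assumes that RankCorr contains more than 100 draws and that the tidybayes, dplyr, ggdist and posterior packages are installed; the name subsampled is only illustrative.

```r
library(dplyr)
library(tidybayes)

data(RankCorr, package = "ggdist")

# Subsample 100 draws; the seed is used for the subsampling step
subsampled <- RankCorr %>%
  spread_rvars(tau[i], ndraws = 100, seed = 1234)

# Number of draws stored in the resulting rvar column
posterior::ndraws(subsampled$tau)
```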