The normal tidystats workflow consists of running statistics functions such as
lm()
, saving the output into variables, and then using the add_stats()
function to add the statistics to a list. This works as long as tidystats
has built-in support for the statistics functions you ran. So what should you do
when this is not the case?
The first thing would be to let me know that there's a function you would like tidystats to support. There are various ways to contact me. You can go to the Github page and create an issue. This is the preferred method because it is easy to paste code snippets and to ask follow-up questions. For alternative ways to contact me, see the tidystats website.
Of course, it's not always possible to wait for me to add support for the
function to tidystats. Nor is it always possible for me to add support for the
function. This can happen when the statistic you want to report is not
identifiable as belonging to a particular type of analysis (for example, the
result of confint()
returns a matrix, which does not contain any information
about it being a matrix containing confidence intervals).
For these reasons, it is useful to know that there are two possible solutions.
The first solution is that I can still add support for functions that return
objects without sufficient information. You can tell add_stats()
what kind of
object it is using the class
argument. This only works if I explicitly coded
support for a specific class. You can see which classes are supported in the
help document of add_stats()
(?add_stats
).
The second solution is to manually extract the statistics yourself and create
an object to add to add_stats()
. I've created a few helper functions to help
you do this: custom_stats()
and custom_stat()
. These two functions work
together to help you store the statistics in a format needed for tidystats to
function.
custom_stats()
has two arguments: method
and statistics
. The method should
contain a description of the type of method you used. The statistics
argument
requires a vector of statistics created with the custom_stat()
function.
The custom_stat()
function serves to create a statistic, along with the
necessary information to report the statistic. At a minimum, it requires
specifying the name
and value
of the statistic. Optionally you can also
specify a symbol and a subscript so that the text editor add-ins can correctly
report the statistic. Finally, it's also possible that the statistic is a ranged
statistic, i.e., it has a lower and upper bound. These statistics require that
you specify the type of interval
(e.g., "CI", "HDI"), level
(e.g., .95),
lower
and upper
bound.
Below I show a few examples of adding custom statistics.
class
argumentSay we want to calculate the confidence intervals for several parameters in a
linear model, using the confint()
function.
# Run a linear model fit <- lm(100 / mpg ~ disp, data = mtcars) # Compute the confidence intervals fit_confint <- confint(fit) # Create an empty list statistics <- list() # Add linear model and confidence intervals to the list statistics <- statistics %>% add_stats(fit) %>% add_stats(fit_confint)
Unfortunately, we get an error:
Error in UseMethod("tidy_stats") : no applicable method for 'tidy_stats' applied to an object of class "c('matrix', 'array', 'double', 'numeric')"
That's because confint()
return a standard matrix, rather than an object
specific to the confint()
function. You can check this yourself by running
the class()
function on the output of confint()
(e.g., class(fit_confint)
). You'll see that it simply says
"matrix" "array"
. That's not enough for tidystats to work with. Ideally it
would say something like "confint"
so that tidystats knows what it is working
with and extract the statistics.
Thankfully, we can still add statistics from confint()
to a list via
add_stats()
, using the class
argument. We can specify that the statistics
are from the confint()
function by saying class = "confint"
.
statistics <- statistics %>% add_stats(fit) %>% add_stats(fit_confint, class = "confint")
We don't get an error this time, indicating that it worked.
custom_stats()
Say you want to calculate a Bayes Factor using the BIC approach (Wagenmakers, 2007). An example of this approach can be found here; which I'll repeat down below.
# Set seed for reproducibility set.seed(14) # Simulate some data intercept_data <- data.frame(score = scale(rnorm(40), center = 0.72)) # Run two models and calculate the BIC full_lm <- lm(score ~ 1, intercept_data) null_lm <- lm(score ~ 0, intercept_data) BF_BIC <- exp((BIC(null_lm) - BIC(full_lm)) / 2)
The Bayes Factor is r round(BF_BIC, 2)
. Now, how do you add this value to a
tidystats list? If we try it the standard way, we'll see that it fails.
# Load the tidystats package library(tidystats) # Create an empty list statistics <- list() # Add BIC to the list using add_stats() statistics <- add_stats(statistics, BF_BIC)
This produces an error message that says:
Error in UseMethod("tidy_stats") : no applicable method for 'tidy_stats' applied to an object of class "c('double', 'numeric')"
It's because BF_BIC
is simply a number and not the output of a statistics
function, so tidystats doesn't know how to store this number. Let's fix this
using custom_stats()
and custom_stat()
.
# Create a list of custom statistics BIC <- custom_stats( method = "BIC", statistics = custom_stat( name = "BIC Bayes Factor", value = BF_BIC, symbol = "BF", subscript = "10" ) ) # Add the statistics to the list statistics <- add_stats(statistics, BIC)
Now we don't get an error. Thanks to custom_stats()
and custom_stat()
we
correctly structured the statistic so it can be added to the list via
add_stats()
.
tidystats works by taking the output of statistical tests, extracting the statistics, and reorganizing them into a particular structure. This works if 1) tidystats has built-in support for the function and 2) if the function used to run the statistical test returns an object that tidystats can use to identify the test.
If you want to use tidystats on a function that is not supported yet, please contact me to let me know that I should add support for it.
Alternatively, you can manually create a list of statistics and supply it to
the add_stats()
function using custom_stats()
and custom_stat()
.
The goal is to have tidystats support as many tests as possible, so that you rarely have to resort to this solution.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.