Coala uses a modular system based on R6 Classes for integrating summary statistics and coalescent simulators. This document contains instructions on adding both.
Summary statistics are derived from the sumstat_class base class. They primarily consist of a calculate function that computes the statistic's value from the simulation results. A simple example is the sumstat_seg_sites() statistic:
    library(R6)
    library(coala)

    stat_segsites_class <- R6Class("stat_segsites", inherit = sumstat_class,
      private = list(req_segsites = TRUE),
      public = list(
        calculate = function(segsites, trees, files, model, sim_task) segsites
      )
    )

    sumstat_seg_sites <- function(name = "seg_sites", transformation = identity) {
      stat_segsites_class$new(name, transformation)
    }
The calculate function is called after the simulation with the arguments segsites, trees, files, model and sim_task.
Of these, model is always passed to the calculate function. The segsites, trees and files arguments are generated on demand and have to be requested by the statistic. To do so, set the private variables req_segsites, req_trees or req_files, respectively, to TRUE. Arguments that were not requested can still be present if they were created for a different summary statistic, but will be NULL in most cases.
In this example we only use the segregating sites, and hence it is the only argument requested. All the summary statistic does is return the unmodified segregating sites.
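To see such a statistic in use, the sketch below repeats the class definition and adds the statistic to a small model. The model setup with coal_model(), feat_mutation() and simulate() is not described in this document and is assumed here from coala's general interface; the sample size, number of loci and mutation rate are arbitrary values chosen for illustration.

```r
library(R6)
library(coala)

# The statistic from this section: it requests the segregating sites
# and returns them unmodified.
stat_segsites_class <- R6Class("stat_segsites", inherit = sumstat_class,
  private = list(req_segsites = TRUE),
  public = list(
    calculate = function(segsites, trees, files, model, sim_task) segsites
  )
)

sumstat_seg_sites <- function(name = "seg_sites", transformation = identity) {
  stat_segsites_class$new(name, transformation)
}

# Assumed usage: 5 haplotypes, 2 loci, mutation rate 1 (arbitrary values).
model <- coal_model(sample_size = 5, loci_number = 2) +
  feat_mutation(rate = 1) +
  sumstat_seg_sites()

# The value returned by calculate() should appear in the result
# under the name given to the statistic.
result <- simulate(model)
names(result)
```

If this runs as assumed, result$seg_sites holds whatever calculate returned, which makes it a convenient way to inspect the input a new statistic will receive.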
Warning: I am currently only satisfied with the structure of the segregating sites. The format of the trees and the files arguments might still change.
If the statistic has additional options that can be set on creation, overwrite the initialize function. Take, for example, a simplified version of sumstat_file:
    stat_file_class <- R6Class("stat_file", inherit = sumstat_class,
      private = list(folder = NULL, req_files = TRUE),
      public = list(
        initialize = function(folder) {
          dir.create(folder, showWarnings = FALSE)
          private$folder <- folder
          super$initialize("file", identity)
        },
        calculate = function(seg_sites, trees, files, model, sim_task) {
          file.copy(files, private$folder, overwrite = FALSE)
          file.path(private$folder, basename(files))
        }
      )
    )
This statistic requires only the folder argument on initialization, which is the folder into which the files are copied. In the initialize function, the folder is created and its name is stored in a private variable. Finally, it calls the constructor of sumstat_class via super$initialize. This is essential when defining your own constructors! See ?sumstat_class for further details.
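As a quick check of this class, the sketch below creates the statistic and calls calculate by hand instead of running a simulation. Passing a character vector of file paths as files is an assumption, since the format of that argument might still change; the temporary input file and the target folder name are made up for the example.

```r
library(R6)
library(coala)

stat_file_class <- R6Class("stat_file", inherit = sumstat_class,
  private = list(folder = NULL, req_files = TRUE),
  public = list(
    initialize = function(folder) {
      dir.create(folder, showWarnings = FALSE)
      private$folder <- folder
      # Call the base class constructor with a fixed name and no transformation.
      super$initialize("file", identity)
    },
    calculate = function(seg_sites, trees, files, model, sim_task) {
      file.copy(files, private$folder, overwrite = FALSE)
      file.path(private$folder, basename(files))
    }
  )
)

# Stand-in for a file written by a simulator (hypothetical content).
sim_output <- tempfile(fileext = ".txt")
writeLines("simulated data", sim_output)

# The constructor creates the target folder if it does not exist yet.
target <- file.path(tempdir(), "coala_files")
stat <- stat_file_class$new(target)

# Only `files` is used by this statistic, so the other arguments are NULL.
copied <- stat$calculate(NULL, NULL, sim_output, NULL, NULL)
file.exists(copied)
```

Calling calculate directly like this is only a testing convenience; in normal use coala invokes it after each simulation.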
Adding support for new coalescent simulators is more difficult than adding summary statistics. If you are planning to do so, I highly recommend opening an issue in coala's bug tracker first, so that I can assist with the implementation.
The most important part is to create a simulate function that takes the model and the model parameters as arguments, conducts the simulation and parses the output to create the segsites and/or trees arguments for the summary statistics. It should throw an error with stop() if it is given a model that is not supported.
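The overall shape of such a function might look like the following sketch. It is not coala's simulator interface: the model here is a plain list with a features vector, and supports_model() and simulate_sketch() are hypothetical names introduced only to illustrate the control flow (check the model, stop() if it is unsupported, otherwise return parsed output). The returned values are placeholders, not real simulation output.

```r
# Hypothetical helper: decides whether this toy simulator can handle a model.
# A "model" is assumed to be a list with a character vector `features`;
# this is an illustration and not coala's real model structure.
supports_model <- function(model) {
  all(model$features %in% c("sample", "mutation"))
}

# Hypothetical simulate function taking the model and its parameters.
simulate_sketch <- function(model, parameters) {
  if (!supports_model(model)) {
    stop("This simulator does not support the given model")
  }

  # Placeholder for calling the simulator and parsing its output.
  # An empty 0/1 matrix per locus stands in for parsed segregating sites.
  segsites <- list(matrix(0L, nrow = 2, ncol = 0))

  # Return the inputs that summary statistics may request.
  list(segsites = segsites, trees = NULL)
}

# A supported toy model returns the placeholder output ...
ok <- simulate_sketch(list(features = c("sample", "mutation")), c(theta = 5))

# ... while an unsupported one raises an error via stop().
msg <- tryCatch(
  simulate_sketch(list(features = "selection"), c(theta = 5)),
  error = function(e) conditionMessage(e)
)
```

Failing early with stop() keeps an unsupported model from producing silently wrong summary statistics.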