We load the package and the data set from the individual-based simulations. To this data we fit both the LSN and OU models,
Comparing the models by the bootstrapping procedure is intensive. compare
simulates under each model, and then estimates both models on both simulated sets of data (for a total of 4 model fits). As only the LSN fit is computationally intensive, this takes about twice the time as the above fitting step of model B. We will need to repeat this many times to build up a bootstrap distribution. Consequently, it is convient to run this step in parallel over multiple processors or clusters of computers.
Fortunately R has a diverse array of parallel tools, Schmidberger et al., 2009) give a nice review. Here we set up the flexible snowfall
parallel envirnoment, which can run in serial, on a multicore chip, or over a cluster of machines.
Rather than build this choice into the earlywarning signals package, the user can choose whatever looping implementation they desire to compute replicates of the comparison. Our results are cached in the package, so anyone seeking to explore this example can run data(example_analysis)
instead. The actual analysis takes just one line:
Once this step is finished, the remaining analysis is just a matter of shuffling and plotting the data. We extract the likelihood ratios (deviances) from the null and test simulations with a simple function.
We can visualize this data as overlapping distributions
Or as beanplot
We can make the ROC curve by calculating the true and false positive rates from the overlap of these distributions using roc_data
. This gives us a data frame of threshold values for our statistic, and the corresponding false positive rate and true positive rate. We can simply plot those rates against each other to create the ROC curve.
We can also look at the bootstraps of the parameters. Another helper function will reformat this data from reps list. The fit column uses a two-letter code to indicate first what model was used to simulate the data, and then what model was fit to the data. For instance, AB means the data was simulated from model A (the null, OU model) but fit to B.
There are lots of options for visualizing this relatively high-dimensional data, which we can easily explore with a few commands from the ggplot2
package. For instance, we can look at average and range of parameters estimated in the bootstraps of each model:
The reps
list also contains the simulation data from each replicate
Repeating the analysis for other data sets used in the study. The analysis described above is essentially six steps, each corresponding to a single line of code.
For instance, we can run these six steps on the chemostat data with:
We highlighted two addtional functions format the parameter data and simulation results so that they can plotted or explored further.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.