- Replaced size = 0.5 with linewidth = 0.5 in geom_hcintersect() and geom_xribbon().
- Replaced aes_string() with aes() in examples (and internally).
- tidyverse package.
- ssd_hp() to be percent affected rather than percent protected.
- Added delta = 7 argument to ssd_plot_cdf().
ssdtools version 1.0.0 is the first major release of ssdtools with some important improvements and breaking changes.
An important change to the functionality of ssd_fit_dists() was to switch from model fitting using fitdistrplus to TMB, which has resulted in improved handling of censored data.
Although it was hoped that model fitting would be faster, this is currently not the case.
As a result of the change, the fitdists objects returned by ssd_fit_dists() from previous versions of ssdtools are not compatible with the major release and should be regenerated.
As a result of an international collaboration, British Columbia, Canada, Australia and New Zealand selected a set of recommended distributions for model averaging and settings when generating final guidelines.
The distributions are:
> ssd_dists_bcanz()
[1] "gamma" "lgumbel" "llogis" "lnorm" "lnorm_lnorm" "weibull"
The ssd_fit_bcanz() and ssd_hc_bcanz() functions were added to the package to facilitate the fitting of these distributions and estimation of hazard concentrations using the recommended settings.
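A minimal sketch of how these two functions might be used together, assuming the ssdtools and ssddata packages are installed and using the ccme_boron dataset from ssddata. The reduced nboot value is an assumption made only to keep the example quick; it is not a recommended setting.

```r
library(ssdtools)

# Fit the recommended set of distributions to the CCME boron data
fits <- ssd_fit_bcanz(ssddata::ccme_boron)

# Estimate hazard concentrations with the recommended settings;
# nboot is lowered here (an illustrative choice) to shorten run time
set.seed(1)
hc <- ssd_hc_bcanz(fits, nboot = 100)
print(hc)
```

The returned data frame has one or more rows of model-averaged estimates; the exact columns may differ between package versions.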
In the previous version of ssdtools a distribution was considered to have converged if the following condition was met:
1) stats::optim() returns a code of 0 (indicating successful completion).
In the new version an additional two conditions must also be met:
2) Bounded parameters are not at a boundary (this condition can be turned off by setting at_boundary_ok = TRUE or the user can specify different boundary values - see below).
3) Standard errors are computable for all the parameter values (this condition can be turned off by setting computable = FALSE).
Censoring can now be specified by providing a data set with one or more rows that have
It is currently not possible to fit distributions to data sets that have
Rows that have a zero or missing value for the left column and an infinite or missing value for the right column (fully censored) are uninformative and will result in an error.
For uncensored data, Akaike Weights are calculated using AICc (which corrects for small sample size). In the case of censored data, Akaike Weights are calculated using AIC (as the sample size cannot be estimated) but only if all the distributions have the same number of parameters (to ensure the weights are valid).
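The weighting described above can be sketched in base R using the standard definitions of AIC, AICc and Akaike weights. The log-likelihoods, sample size and the helper name akaike_weights() are hypothetical values chosen for illustration; this is not the code ssdtools uses internally.

```r
# Akaike weights from a vector of information criterion values
# (akaike_weights is a hypothetical helper, not an ssdtools function)
akaike_weights <- function(ic) {
  delta <- ic - min(ic)        # difference from the best supported model
  rel_lik <- exp(-delta / 2)   # relative likelihood of each model
  rel_lik / sum(rel_lik)       # normalise so the weights sum to 1
}

# Hypothetical log-likelihoods for three two-parameter distributions
# fitted to n = 28 uncensored observations
loglik <- c(gamma = -116.8, llogis = -118.5, lnorm = -117.5)
k <- 2
n <- 28

aic <- 2 * k - 2 * loglik
aicc <- aic + (2 * k * (k + 1)) / (n - k - 1)  # small-sample correction

wt_aicc <- akaike_weights(aicc)  # as for uncensored data
wt_aic <- akaike_weights(aic)    # as for censored data
round(wt_aicc, 3)
```

Because every distribution in this sketch has the same number of parameters, the AICc correction adds the same constant to each AIC value, so wt_aic and wt_aicc are identical.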
Weighting must be positive with values <= 1000.
Previously the density functions for the available distributions were exported as R functions to make them accessible to fitdistrplus.
This meant that ssdtools had to be loaded to fit distributions.
The density functions are now defined in C++ as TMB templates and are no longer exported.
The distribution, quantile and random generation functions are more generally useful and are still exported but are now prefixed by ssd_ to prevent clashes with existing functions in other packages.
Thus for example plnorm(), qlnorm() and rlnorm() have been renamed ssd_plnorm(), ssd_qlnorm() and ssd_rlnorm().
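As a brief illustration of the renamed functions, the sketch below calls the ssd_-prefixed log-normal functions and compares them with their base R counterparts. It assumes ssdtools is installed and that the meanlog and sdlog argument names match those of the stats functions.

```r
library(ssdtools)

p <- ssd_plnorm(1, meanlog = 0, sdlog = 1)    # cumulative probability
q <- ssd_qlnorm(0.5, meanlog = 0, sdlog = 1)  # quantile
set.seed(42)
x <- ssd_rlnorm(5, meanlog = 0, sdlog = 1)    # random draws

# For the log-normal distribution the results should agree with base R
all.equal(p, stats::plnorm(1))
all.equal(q, stats::qlnorm(0.5))
```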
The following distributions were added (or in the case of burrIII3 readded) to the new version
- burrIII3 - burrIII three parameter distribution
- invpareto - inverse pareto (with bias correction in scale order statistic)
- lnorm_lnorm - log-normal/log-normal mixture distribution
- llogis_llogis - log-logistic/log-logistic mixture distribution
The following arguments were added to ssd_fit_dists()
- rescale (by default FALSE) to specify whether to rescale concentration values by dividing by the largest (finite) value. This alters the parameter estimates, which can help some distributions converge, but not the estimates of the hazard concentrations/protections.
- reweight (by default FALSE) to specify whether to reweight data points by dividing by the largest weight.
- at_boundary_ok (by default FALSE) to specify whether a distribution with one or more parameters at a boundary has converged.
- min_pmix (by default 0) to specify the boundary for the minimum proportion for a mixture distribution.
- range_shape1 (by default c(0.05, 20)) to specify the lower and upper boundaries for the shape1 parameter of the burrIII3 distribution.
- range_shape2 (by default the same as range_shape1) to specify the lower and upper boundaries for the shape2 parameter of the burrIII3 distribution.
- control (by default an empty list) to pass a list of control parameters to stats::optim().
It is also worth noting that the default value of the computable argument was switched from FALSE to TRUE to enforce stricter requirements on convergence (see above).
The following were added to handle multiple distributions
- ssd_dists() to specify subsets of the available distributions.
- delta argument (by default 7) to the subset() generic to only keep those distributions within the specified AIC(c) difference of the best supported distribution.
The function ssd_fit_burrlioz() was added to approximate the behaviour of Burrlioz.
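The ssd_fit_dists() arguments and the subset() generic described above might be combined as in the following sketch. It assumes ssdtools and ssddata are installed; the choice of three distributions is arbitrary and made only for illustration.

```r
library(ssdtools)

fits <- ssd_fit_dists(
  ssddata::ccme_boron,
  dists = c("gamma", "llogis", "lnorm"),
  rescale = TRUE,          # divide concentrations by the largest finite value
  at_boundary_ok = FALSE,  # parameters at a boundary count as non-converged
  computable = TRUE        # standard errors must be computable
)

# Keep only distributions within an AIC(c) difference of 7 of the best one
kept <- subset(fits, delta = 7)
names(kept)
```

Passing delta explicitly avoids relying on the default, which may differ between package versions.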
Hazard concentration estimation is performed by ssd_hc() (which is wrapped by predict()) and hazard protection estimation by ssd_hp().
By default confidence intervals are estimated by parametric bootstrapping.
To reduce the time required for bootstrapping, parallelization was implemented using the future package.
The following arguments were added to ssd_hc() and ssd_hp()
- delta (by default 7) to only keep those distributions within the specified AIC difference of the best supported distribution.
- min_pboot (by default 0.90) to specify the minimum proportion of bootstrap samples that must successfully fit.
- parametric (by default TRUE) to allow non-parametric bootstrapping.
- control (by default an empty list) to pass a list of control parameters to stats::optim().
and the following columns were added to the output data frame
- wt to specify the Akaike weight.
- method to indicate whether parametric or non-parametric bootstrap was used.
- nboot to indicate how many bootstrap samples were used.
- pboot to indicate the proportion of bootstrap samples which fitted.
It is also worth noting that the dist column was moved from the last to the first position in the output data frame.
Confidence intervals cannot be estimated for interval censored data.
Confidence intervals cannot be estimated for unequally weighted data.
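A minimal sketch of bootstrapped hazard concentration estimation with ssd_hc(), assuming ssdtools and ssddata are installed. The small nboot value and the choice of distributions are illustrative assumptions only, and the commented future::plan() call shows one way parallel processing might be enabled rather than a required step.

```r
library(ssdtools)

fits <- ssd_fit_dists(ssddata::ccme_boron, dists = c("gamma", "lnorm"))

# Optional: run the bootstrap fits in parallel via the future package, e.g.
# future::plan(future::multisession)

set.seed(99)
hc <- ssd_hc(fits, ci = TRUE, nboot = 100, min_pboot = 0.9)
print(hc)
```

The printed data frame should include the dist, wt, nboot and pboot columns described above.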
The pvalue argument (by default FALSE) was added to ssd_gof() to specify whether to return p-values for the test statistics as opposed to the test statistics themselves.
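For example, the two modes of ssd_gof() could be compared as follows. This sketch assumes ssdtools and ssddata are installed; the two distributions are an arbitrary choice.

```r
library(ssdtools)

fits <- ssd_fit_dists(ssddata::ccme_boron, dists = c("gamma", "lnorm"))

gof_stat <- ssd_gof(fits)               # test statistics (the default)
gof_p <- ssd_gof(fits, pvalue = TRUE)   # p-values for the same tests

print(gof_stat)
print(gof_p)
```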
There have also been some substantive changes to the plotting functionality.
Added the following functions
- ssd_plot_data() to plot censored and uncensored data by calling geom_ssdpoint() for the left and for the right column (alpha parameter values should be adjusted accordingly).
- geom_ssdsegment() to allow plotting of the range of censored data points using segments.
- scale_colour_ssd() (and scale_color_ssd()) to provide an 8-color color-blind scale.
Made the following changes to ssd_plot()
- bounds (by default c(left = 1, right = 1)) argument to specify how many orders of magnitude to extend the plot beyond the minimum and maximum (non-missing) values.
- linetype (by default NULL) argument to specify line type.
- linecolor (by default NULL) argument to specify line color.
- ylab from "Percent of Species Affected" to "Species Affected".
Renamed
- GeomSsd to GeomSsdpoint.
- StatSsd to StatSsdpoint.
Soft-deprecated
- geom_ssd() for geom_ssdpoint().
- stat_ssd().
- ssd_plot_cf() for fitdistrplus::descdist().
ssddata
The dataset boron_data was renamed ccme_boron and moved to the ssddata R package together with the other CCME datasets.
The ssddata package provides a suite of datasets for testing and comparing species sensitivity distribution fitting software.
Added
- ssd_data() to return original data for a fitdists object.
- ssd_ecd_data() to get empirical cumulative density for data.
- ssd_sort_data() to sort data by empirical cumulative density.
npars() now orders by distribution name.
Implemented the following generics for fitdists objects
- glance() to get the model likelihoods, information-theoretic criteria etc.
- augment() to return original data set.
- logLik() to return the log-likelihood.
- summary.fitdists() to summarize.
- wt (Akaike weight) column to predict(), ssd_hc() and ssd_hp().
- ic to predict(), ssd_hc() and ssd_hp() because unused.
- ssd_fit_dists().
- actuar package.
- comma_signif() so that now rounds to 3 significant digits by default and only applies scales::comma() to values >= 1000.
- ... argument to comma_signif().
- rdist() functions now use length of n if length(n) > 1.
- slnorm() to get starting values for 'dlnorm' distribution.
- rllog() that was causing error.
- ssd_hc() and predict() where ci = TRUE to explicit ssd_hc(ci = FALSE) and predict(ci = FALSE).
- Replaced shape and scale arguments to llog() with lshape and lscale.
- Replaced location and scale arguments to lgumbel() with llocation and lscale.
- burrIII2).
- ssd_hp() to calculate hazard percent at specific concentrations.
- ssd_exposure() to calculate proportion exposed based on distribution of concentrations.
- predict() and added parallel argument.
- ssd_fit_dists() now checks if standard errors computable.
- burrIII3).
- sdist(x) functionality to set starting values for distributions.
- ssd_plot_cdf() to plot cumulative distribution function (equivalent to autoplot()).
- nobs() for censored data now returns a missing value.
- ssd_fit_dists() distributions now ordered alphabetically.
- ssd_hc() argument hc = 5L for percent = 5L.
- dllog() etc for dllogis().
- ssd_cfplot() for ssd_plot_cf().
- llog distribution with small concentrations.