Infer log Likelihoods using simulated distributions of summary statistics
For each simulated distribution of summary statistics,
infer_logLs infers a probability density function, and the density of the observed values of the summary statistics is deduced. By default, inference of each density is performed by
infer_logL_by_Rmixmod, which fits a distribution of summary statistics using procedures from the
infer_logLs(object, stat.obs, logLname = Infusion.getOption("logLname"),
            verbose = list(most=interactive(), final=FALSE),
            method="infer_logL_by_Rmixmod", ...)

infer_tailp(object, refDensity, stat.obs,
            tailNames=Infusion.getOption("tailNames"),
            verbose=interactive(), method=NULL, ...)

infer_logL_by_GLMM(EDF, stat.obs, logLname, verbose)
infer_logL_by_Rmixmod(EDF, stat.obs, logLname, verbose)
infer_logL_by_mclust(EDF, stat.obs, logLname, verbose)
infer_logL_by_Hlscv.diag(EDF, stat.obs, logLname, verbose)
object: A list of simulated distributions (the return object of
EDF: An empirical distribution, with a required
stat.obs: Named numeric vector of observed values of summary statistics.
logLname: The name to be given to the log likelihood in the return object, or the root of the latter name in case of conflict with other names in this object.
tailNames: Names of “positives” and “negatives” in the binomial response for the inference of tail probabilities.
refDensity: An object representing a reference density (such as an
verbose: A list as shown by the default, or simply a vector of booleans, indicating respectively whether to display (1) some information about progress; (2) a final summary of the results after all elements of
method: A function for density estimation. See Description for the default behaviour and Details for the constraints on input and output of the function.
...: Further arguments passed to or from other methods (currently not used).
By default, density estimation is based on
Rmixmod methods. Other available methods are not routinely used, and not all
Infusion features may work with them. The function
mixmodCluster is called, with arguments
If Infusion.getOption("nbCluster") specifies a sequence of values, then several clusterings are computed and AIC is used to select among them.
The functions infer_logL_by_GLMM, infer_logL_by_Rmixmod, infer_logL_by_mclust and infer_logL_by_Hlscv.diag are examples of methods that may be provided for density estimation. Other
methods may be provided with the same arguments. Their return value must include the element
logL, an estimate of the log-density of
stat.obs, and the element
isValid with values
FALSE/TRUE (or 0/1). The standard format for the return value is
isValid is primarily intended to indicate whether the log likelihood of
stat.obs inferred by a given density estimation method was suitable input for inference of the likelihood surface.
isValid has two effects: to distinguish points for which isValid is FALSE in the plot produced by
plot.SLik; and more critically, to control the sampling of new parameter points within
refine so that points for which isValid is FALSE are less likely to be sampled.
Invalid values may for example indicate a likelihood estimated as zero (since log(0) is not suitable input), or (for density estimation methods which may infer erroneously large values when extrapolating) that
stat.obs lies outside the convex hull of the EDF. In user-defined
methods, an invalid inferred logL should be replaced by some alternative low estimate, as is done by all methods included in the package.
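As an illustration of these constraints, here is a minimal sketch of what a user-defined method might look like. It is not part of Infusion. The function name infer_logL_by_gaussian, the single multivariate Gaussian used as density estimate, the assumption that the columns of EDF holding the statistics are named as in stat.obs, and the fallback value low_logL are all hypothetical choices made for the example. The sketch returns a plain list with the logL and isValid elements and does not attempt to reproduce the package's standard return format.

```r
# Hypothetical user-defined density estimation method (not part of Infusion).
# It takes the same arguments as the infer_logL_by_* functions.
infer_logL_by_gaussian <- function(EDF, stat.obs, logLname, verbose) {
  low_logL <- -1e4  # arbitrary low value substituted for invalid estimates

  # Assumption: columns of EDF named as in stat.obs hold the simulated statistics.
  sims <- as.matrix(EDF[, names(stat.obs), drop = FALSE])
  d <- length(stat.obs)

  # Fit a single multivariate Gaussian to the simulated statistics.
  mu <- colMeans(sims)
  Sigma <- stats::cov(sims)
  logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)

  # Squared Mahalanobis distance; NA if the covariance matrix cannot be inverted.
  maha <- tryCatch(
    stats::mahalanobis(stat.obs, center = mu, cov = Sigma),
    error = function(e) NA_real_
  )

  # Gaussian log-density of stat.obs.
  logL <- -0.5 * (d * log(2 * pi) + logdet + maha)

  # Flag non-finite estimates and replace them by the low value.
  isValid <- is.finite(logL)
  if (!isValid) logL <- low_logL

  if (isTRUE(verbose)) {
    message(logLname, " = ", signif(logL, 4), "; isValid = ", isValid)
  }
  list(logL = logL, isValid = isValid)
}

# Usage on a made-up empirical distribution of two statistics.
set.seed(123)
EDF <- data.frame(mean = rnorm(500, mean = 1, sd = 0.1),
                  var  = rnorm(500, mean = 2, sd = 0.3))
res <- infer_logL_by_gaussian(EDF, stat.obs = c(mean = 1.02, var = 1.9),
                              logLname = "logL", verbose = FALSE)
```

A single Gaussian is used here only to keep the example short; it is a much cruder estimator than the mixture models used by default.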
The source code of
infer_logL_by_Hlscv.diag illustrates how to test whether
stat.obs is within the convex hull of the EDF, using functions
isPointInCHull (exported from the
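For intuition only, a convex hull membership test can be sketched in base R for the two-dimensional case. This is not the approach used in infer_logL_by_Hlscv.diag, which relies on the functions named above; the helper in_hull_2d is hypothetical and works only for d = 2. It relies on the fact that a point lying strictly inside the hull of a point set does not become a hull vertex when it is added to the set.

```r
# Hypothetical helper, 2-D only: is point p inside the convex hull of 'points'?
in_hull_2d <- function(points, p) {
  points <- as.matrix(points)
  stopifnot(ncol(points) == 2L, length(p) == 2L)

  # Append p as the last row and recompute the hull vertices.
  augmented <- rbind(points, as.numeric(p))
  hull_idx <- grDevices::chull(augmented)

  # p is inside if its row index is not among the hull vertices.
  !(nrow(augmented) %in% hull_idx)
}

# Usage with the corners of the unit square as the point set.
square <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1))
inside  <- in_hull_2d(square, c(0.5, 0.5))
outside <- in_hull_2d(square, c(2, 2))
```

Points lying exactly on the boundary, or duplicating an existing vertex, are edge cases that this sketch does not handle carefully.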
infer_logL_by_GLMM fits a binned distribution of summary statistics using a Poisson GLMM with autocorrelated random effects, where the binning is based on a tessellation of a volume containing the whole simulated distribution. Limited experiments so far suggest that the mixture model methods are fast and appropriate (
Rmixmod, being a bit faster, is the default method); that the kernel smoothing method is more erratic and requires additional input from the user, and hence is not really applicable for distributions in dimension d = 4 or above; and that the GLMM method is a very good density estimator for d = 2 but will challenge one's patience for d = 3 and further challenge the computer's memory for d = 4.
infer_logLs returns a data frame containing parameter values and their log likelihoods, together with additional information such as attributes describing the parameter names and statistics names (not detailed here). These attributes are essential for further inferences.
See Details for the required value of the
methods called by
See step (3) of the workflow in the Example on the main
Infusion documentation page.