For each simulated distribution of summary statistics, `infer_logLs` infers a probability density function, and the density of the observed values of the summary statistics is deduced. By default, inference of each density is performed by `infer_logL_by_Rmixmod`, which fits a distribution of summary statistics using procedures from the `Rmixmod` package.

```
infer_logLs(object, stat.obs,
            logLname = Infusion.getOption("logLname"),
            verbose = list(most=interactive(),
                           final=FALSE),
            method="infer_logL_by_Rmixmod",
            ...)
infer_tailp(object, refDensity, stat.obs,
            tailNames=Infusion.getOption("tailNames"),
            verbose=interactive(), method=NULL,...)
infer_logL_by_GLMM(EDF,stat.obs,logLname,verbose)
infer_logL_by_Rmixmod(EDF,stat.obs,logLname,verbose)
infer_logL_by_mclust(EDF,stat.obs,logLname,verbose)
infer_logL_by_Hlscv.diag(EDF,stat.obs,logLname,verbose)
```

`object`
A list of simulated distributions (the return object of

`EDF`
An empirical distribution, with a required

`stat.obs`
Named numeric vector of observed values of summary statistics.

`logLname`
The name to be given to the log Likelihood in the return object, or the root of the latter name in case of conflict with other names in this object.

`tailNames`
Names of “positives” and “negatives” in the binomial response for the inference of tail probabilities.

`refDensity`
An object representing a reference density (such as an

`verbose`
A list as shown by the default, or simply a vector of booleans, indicating respectively whether to display (1) some information about progress; (2) a final summary of the results after all elements of

`method`
A function for density estimation. See Description for the default behaviour and Details for the constraints on input and output of the function.

`...`
Further arguments passed to or from other methods (currently not used).

By default, density estimation is based on `Rmixmod` methods. Other available methods are not routinely used and not all `Infusion` features may work with them. The function `mixmodCluster` is called, with arguments `nbCluster=Infusion.getOption("nbCluster")` and `mixmodGaussianModel=Infusion.getOption("mixmodGaussianModel")`. If `Infusion.getOption("nbCluster")` specifies a sequence of values, then several clusterings are computed and AIC is used to select among them.
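
The selection among several clusterings by AIC can be illustrated with a small, self-contained sketch. This is not what `Rmixmod` does internally: it uses a hand-written EM algorithm for one-dimensional Gaussian mixtures, and the function `fit_mixture_1d`, the simulated data, and the parameter count (3k - 1 for k components) are all assumptions made for the illustration.

```r
# Illustrative only: a hand-written EM for 1-D Gaussian mixtures (not Rmixmod).
# 'fit_mixture_1d' is a hypothetical helper, not part of Infusion or Rmixmod.
fit_mixture_1d <- function(x, k, n_iter = 200) {
  n <- length(x)
  mu <- as.numeric(quantile(x, probs = (seq_len(k) - 0.5) / k))  # spread initial means
  sigma <- rep(sd(x), k)
  w <- rep(1 / k, k)
  weighted_dens <- function() {
    # n x k matrix with entries w_j * N(x_i | mu_j, sigma_j)
    vapply(seq_len(k), function(j) w[j] * dnorm(x, mu[j], sigma[j]), numeric(n))
  }
  for (iter in seq_len(n_iter)) {
    dens <- weighted_dens()
    resp <- dens / rowSums(dens)                 # E-step: responsibilities
    nk <- colSums(resp)                          # M-step: update parameters
    w <- nk / n
    mu <- colSums(resp * x) / nk
    sigma <- sqrt(colSums(resp * outer(x, mu, "-")^2) / nk)
    sigma <- pmax(sigma, 1e-3)                   # guard against collapsing components
  }
  logLik <- sum(log(rowSums(weighted_dens())))
  n_par <- 3 * k - 1                             # k means, k sds, k - 1 free weights
  c(k = k, logLik = logLik, AIC = 2 * n_par - 2 * logLik)
}

set.seed(123)
x <- c(rnorm(150, mean = 0), rnorm(150, mean = 6))  # simulated bimodal sample
candidates <- 1:3                                    # plays the role of a sequence of nbCluster values
fits <- t(sapply(candidates, function(k) fit_mixture_1d(x, k)))
best_k <- candidates[which.min(fits[, "AIC"])]       # lowest AIC wins
```

Here `fits` holds one row per candidate number of components, and `best_k` is the candidate with the lowest AIC.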

`infer_logL_by_GLMM`, `infer_logL_by_Rmixmod`, `infer_logL_by_mclust`, and `infer_logL_by_Hlscv.diag` are examples of functions that may be provided as the `method` for density estimation. Other `method`s may be provided with the same arguments. Their return value must include the element `logL`, an estimate of the log-density of `stat.obs`, and the element `isValid` with values `FALSE`/`TRUE` (or 0/1). The standard format for the return value is `unlist(c(attr(EDF,"par"),logL,isValid=isValid))`.
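
A user-defined `method` following this contract might look like the sketch below. It is only an illustration under stated assumptions: `infer_logL_by_gaussian` is a hypothetical name, the density estimator is a single multivariate Gaussian (much cruder than the methods shipped with the package), the EDF is assumed to be a matrix with one named column per statistic and a `"par"` attribute, naming the log-likelihood element with `logLname` is an assumption, and the low replacement value for invalid estimates is arbitrary.

```r
# Hypothetical user-defined method with the same arguments as the built-in ones.
# It fits a single multivariate Gaussian to the EDF (an assumption for illustration).
infer_logL_by_gaussian <- function(EDF, stat.obs, logLname, verbose) {
  stats <- as.matrix(EDF[, names(stat.obs), drop = FALSE])  # columns in stat.obs order
  mu <- colMeans(stats)
  Sigma <- cov(stats)
  d <- length(stat.obs)
  # Log-density of the fitted Gaussian evaluated at the observed statistics
  maha <- mahalanobis(stat.obs, center = mu, cov = Sigma)
  logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  logL <- -0.5 * (d * log(2 * pi) + logdet + maha)
  isValid <- is.finite(logL)
  if (!isValid) logL <- -1e10   # arbitrary low replacement value (assumption)
  names(logL) <- logLname       # assumed naming convention for the logL element
  if (isTRUE(verbose)) message("logL = ", signif(logL, 6))
  unlist(c(attr(EDF, "par"), logL, isValid = isValid))
}

# Toy empirical distribution for one parameter point (simulated, for illustration only)
set.seed(1)
EDF <- cbind(mean = rnorm(200, mean = 1), var = rnorm(200, mean = 2))
attr(EDF, "par") <- c(mu = 1, s2 = 2)
res <- infer_logL_by_gaussian(EDF, stat.obs = c(mean = 1.1, var = 2.2),
                              logLname = "logL", verbose = FALSE)
```

The result is a named numeric vector holding the parameter values, the estimated log-density, and `isValid` coerced to 0/1.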

`isValid` is primarily intended to indicate whether the log likelihood of `stat.obs` inferred by a given density estimation method is suitable input for inference of the likelihood surface. `isValid` has two effects: it distinguishes points for which `isValid` is `FALSE` in the plot produced by `plot.SLik`; and, more critically, it controls the sampling of new parameter points within `refine`, so that points for which `isValid` is `FALSE` are less likely to be sampled.

Invalid values may for example indicate a likelihood estimated as zero (since log(0) is not suitable input) or, for density estimation methods that may infer erroneously large values when extrapolating, that `stat.obs` lies outside the convex hull of the EDF. In user-defined `method`s, an invalid inferred logL should be replaced by some alternative low estimate, as all methods included in the package do.

The source code of `infer_logL_by_Hlscv.diag` illustrates how to test whether `stat.obs` is within the convex hull of the EDF, using functions `resetCHull` and `isPointInCHull` (exported from the `blackbox` package).
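
For readers who only want the idea of the convex hull test, here is a base-R stand-in restricted to two summary statistics. It does not use `resetCHull` or `isPointInCHull`; instead it relies on `grDevices::chull`, and the helper name `in_hull_2d` is hypothetical. The logic is that a point lying strictly inside the hull of the EDF never becomes a hull vertex when it is appended to the EDF.

```r
# Hypothetical helper: is point 'p' strictly inside the convex hull of 'points'?
# 2-D only; a stand-in for illustration, not the blackbox functions.
in_hull_2d <- function(points, p) {
  all_pts <- rbind(points, p)            # append p as the last row
  hull_idx <- grDevices::chull(all_pts)  # row indices of the hull vertices
  !(nrow(all_pts) %in% hull_idx)         # p is inside if it is not a hull vertex
}

# Toy EDF: the four corners of the unit square
edf <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1))
inside  <- in_hull_2d(edf, c(0.5, 0.5))
outside <- in_hull_2d(edf, c(2, 2))
```

In a user-defined `method`, the result of such a test could feed `isValid`, so that log-density estimates obtained by extrapolation are flagged.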

`infer_logL_by_Rmixmod` calls `mixmodCluster`, `infer_logL_by_mclust` calls `densityMclust`, `infer_logL_by_Hlscv.diag` calls `kde`, and `infer_logL_by_GLMM` fits a binned distribution of summary statistics using a Poisson GLMM with autocorrelated random effects, where the binning is based on a tessellation of a volume containing the whole simulated distribution. Limited experimentation so far suggests that the mixture model methods are fast and appropriate (`Rmixmod`, being a bit faster, is the default method); that the kernel smoothing method is more erratic and moreover requires additional input from the user, hence is not really applicable, for distributions in dimension *d*=4 or above; and that the GLMM method is a very good density estimator for *d*=2 but will challenge one's patience for *d*=3 and further challenge the computer's memory for *d*=4.

For `infer_logLs`, the return value is a data frame containing parameter values and their log likelihoods, with additional information stored as attributes, such as the parameter names and statistics names (not detailed here). These attributes are essential for further inferences.

See Details for the required value of the `method`s called by `infer_logLs`.

See step (3) of the workflow in the Example on the main `Infusion` documentation page.
