View source: R/f_hyperEstimation.R
exploreHypers | R Documentation |
exploreHypers finds hyperparameter estimates using a variety of starting points to examine the consistency of the optimization procedure.
exploreHypers(
  data,
  theta_init,
  squashed = TRUE,
  zeroes = FALSE,
  N_star = 1,
  method = c("nlminb", "nlm", "bfgs"),
  param_limit = 100,
  max_pts = 20000,
  std_errors = FALSE
)
data |
A data frame from |
theta_init |
A data frame of initial hyperparameter guesses with columns ordered as: \alpha_1, \beta_1, \alpha_2, \beta_2, p |
squashed |
A scalar logical (TRUE or FALSE) indicating whether data squashing was used. |
zeroes |
A scalar logical specifying if zero counts are included. |
N_star |
A positive scalar whole number value for the minimum count
size to be used for hyperparameter estimation. If zeroes are used, set
|
method |
A scalar string indicating which optimization procedure is to be used. Choices are "nlminb", "nlm", or "bfgs". |
param_limit |
A scalar numeric value for the largest acceptable value for the \alpha and \beta estimates. |
max_pts |
A scalar whole number for the largest number of data points allowed. Used to help prevent extremely long run times. |
std_errors |
A scalar logical indicating if standard errors should be returned for the hyperparameter estimates. |
The method argument determines which optimization procedure is used. All the options use functions from the stats package:
"nlminb": nlminb
"nlm": nlm
"bfgs": optim (method = "BFGS")
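The three optimizers name their outputs differently, which matters when reading the convergence code. The following is a minimal sketch on a toy objective (toy_obj is an assumption made for illustration and is not the likelihood used by exploreHypers):

```r
# Toy objective (assumption): a smooth bowl with its minimum at c(1, 2).
# It stands in for a negative log-likelihood.
toy_obj <- function(theta) sum((theta - c(1, 2))^2)
start <- c(0, 0)

fit_nlminb <- stats::nlminb(start = start, objective = toy_obj)
fit_nlm    <- stats::nlm(f = toy_obj, p = start)
fit_bfgs   <- stats::optim(par = start, fn = toy_obj, method = "BFGS")

# Collect the estimate, objective value, and convergence code from each fit.
results <- data.frame(
  method  = c("nlminb", "nlm", "bfgs"),
  est1    = c(fit_nlminb$par[1], fit_nlm$estimate[1], fit_bfgs$par[1]),
  est2    = c(fit_nlminb$par[2], fit_nlm$estimate[2], fit_bfgs$par[2]),
  minimum = c(fit_nlminb$objective, fit_nlm$minimum, fit_bfgs$value),
  code    = c(fit_nlminb$convergence, fit_nlm$code, fit_bfgs$convergence)
)
print(results)
```

Note that nlminb and optim report success as 0, while nlm uses codes 1 through 5.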
Since this function runs multiple optimization procedures, it is best to start with 5 or fewer initial starting points (rows in theta_init). If the function runs in a reasonable amount of time, this number can be increased.
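The idea of running one optimization per row of starting values can be sketched with base R alone. The objective and the starts data frame below are made up for illustration. They only mirror the layout of theta_init (one row per starting point) and say nothing about how exploreHypers is implemented internally:

```r
# Toy objective with two local minima, at a = -1 and a = 1 (assumption).
toy_obj <- function(theta) (theta[1]^2 - 1)^2 + (theta[2] - 0.5)^2

# Each row is one starting point.
starts <- data.frame(a = c(-2, -0.5, 0.5, 2), b = c(0, 1, 0, 1))

# Run one optimization per row and keep the estimate, value, and code.
fits <- lapply(seq_len(nrow(starts)), function(i) {
  init <- unlist(starts[i, ], use.names = FALSE)
  fit <- stats::nlminb(start = init, objective = toy_obj)
  data.frame(a_hat = fit$par[1], b_hat = fit$par[2],
             minimum = fit$objective, code = fit$convergence)
})
multi <- do.call(rbind, fits)
print(cbind(starts, multi))
```

Different starting points settle on different local minima here, which is the kind of inconsistency that trying several rows is meant to reveal.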
This function should not be used with very large data sets unless data squashing is used first since each optimization call will take a long time.
It is recommended to use N_star = 1 when practical. Data squashing (see squashData) can be used to reduce the number of data points.
The converge column in the resulting data frame is determined by examining the convergence code of the chosen optimization method. In some instances, the code is somewhat ambiguous, so the determination of converge is intentionally conservative (leaning towards FALSE when questionable). See the documentation for the chosen method for details about code.
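One way a conservative rule could look is sketched below. The helper is_converged is hypothetical and is not the rule used by this function. It simply accepts only the codes that the stats documentation describes as successful or probably a solution:

```r
# Hypothetical helper (assumption): map an optimizer's code to TRUE/FALSE,
# leaning towards FALSE whenever the code is ambiguous.
is_converged <- function(method, code) {
  switch(method,
    nlminb = code == 0,          # nlminb: 0 indicates successful convergence
    nlm    = code %in% c(1, 2),  # nlm: 1 and 2 mean "probably solution"
    bfgs   = code == 0,          # optim: 0 indicates successful completion
    stop("unknown method: ", method)
  )
}

is_converged("nlm", 2)  # accepted
is_converged("nlm", 3)  # ambiguous in nlm's documentation, so rejected
```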
Standard errors, if requested, are calculated using the observed Fisher information matrix as discussed in DuMouchel (1999).
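The general technique can be illustrated on a much simpler model. The sketch below uses a Poisson rate with made-up counts (both are assumptions for illustration, and the details in DuMouchel (1999) may differ). The observed information is the Hessian of the negative log-likelihood at the estimate, and the standard errors are the square roots of the diagonal of its inverse:

```r
# Toy count data (made up for illustration).
x <- c(2, 3, 1, 4, 2, 5, 3, 2, 1, 3)

# Negative log-likelihood of a Poisson rate; a stand-in for the real model.
nll <- function(lambda) -sum(stats::dpois(x, lambda, log = TRUE))

# Minimize, keeping the rate positive.
fit <- stats::nlminb(start = 1, objective = nll, lower = 1e-8)

# Observed Fisher information: numerical Hessian of the negative
# log-likelihood at the estimate. Its inverse approximates the covariance.
obs_info <- stats::optimHess(fit$par, nll)
std_err  <- sqrt(diag(solve(obs_info)))

print(c(estimate = fit$par, std_err = std_err))
```

For this toy model the result can be checked against the closed form sqrt(mean(x) / length(x)).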
A list including the data frame estimates of hyperparameter estimates corresponding to the initial guesses from theta_init (plus convergence results):
code: The convergence code returned by the chosen optimization function (see nlminb, nlm, and optim for details).
converge: A logical indicating whether or not convergence was reached. See "Details" section for more information.
in_bounds: A logical indicating whether or not the estimates were within the bounds of the parameter space (upper bound for \alpha_1, \beta_1, \alpha_2, and \beta_2 was determined by the param_limit argument).
minimum: The negative log-likelihood value corresponding to the estimated optimal value of the hyperparameter.
Also returns the data frame std_errs if standard errors are requested.
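A typical next step is to keep only the rows that converged and stayed in bounds, then compare their negative log-likelihoods. The sketch below works on a mock list whose values are made up. Only the documented columns code, converge, in_bounds, and minimum are included, and the columns holding the estimates themselves are omitted:

```r
# Mock of the documented return shape (all values are made up).
res <- list(estimates = data.frame(
  code      = c(0, 1, 0),
  converge  = c(TRUE, FALSE, TRUE),
  in_bounds = c(TRUE, TRUE, TRUE),
  minimum   = c(4162.3, 4170.9, 4161.8)
))

est <- res$estimates

# Keep rows that converged and stayed inside the parameter space.
usable <- est[est$converge & est$in_bounds, ]

# Among those, the smallest negative log-likelihood is the best fit.
best <- usable[which.min(usable$minimum), ]
print(best)
```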
Make sure to properly specify the squashed, zeroes, and N_star arguments for your data set, since these will determine the appropriate likelihood function. Also, this function will not filter out data points. For instance, if you use N_star = 2 you must filter out the ones and zeroes (if present) from data prior to using this function.
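The filtering step might look like the sketch below. The toy data frame and its count column name N are assumptions made for illustration, so adjust the column name to match your own data:

```r
# Toy data; the count column name N is an assumption.
counts <- data.frame(N = c(0, 1, 1, 2, 3, 5))

# Drop the zeroes and ones before estimating with N_star = 2.
N_star <- 2
filtered <- counts[counts$N >= N_star, , drop = FALSE]
print(filtered)
```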
DuMouchel W (1999). "Bayesian Data Mining in Large Frequency Tables, With an Application to the FDA Spontaneous Reporting System." The American Statistician, 53(3), 177-190.
nlminb, nlm, and optim for optimization details
squashData for data preparation
Other hyperparameter estimation functions: autoHyper(), hyperEM()
data.table::setDTthreads(2)  # only needed for CRAN checks

# Start with 2 or more guesses
theta_init <- data.frame(
  alpha1 = c(0.5, 1),
  beta1  = c(0.5, 1),
  alpha2 = c(2, 3),
  beta2  = c(2, 3),
  p      = c(0.1, 0.2)
)
data(caers)
proc <- processRaw(caers)
squashed <- squashData(proc, bin_size = 300, keep_pts = 10)
squashed <- squashData(squashed, count = 2, bin_size = 13, keep_pts = 10)
suppressWarnings(
  exploreHypers(squashed, theta_init = theta_init)
)