fangs: Feature Allocation Neighborhood Greedy Search


fangs    R Documentation

Feature Allocation Neighborhood Greedy Search

Description

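Implements the feature allocation neighborhood greedy search (FANGS) algorithm, which searches for a feature allocation point estimate with low expected FARO loss based on posterior samples.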

Usage

fangs(
  samples,
  nInit = 16,
  nSweet = 4,
  nIterations = 0,
  maxSeconds = 60,
  a = 1,
  nCores = 0,
  algorithm = "stochastic",
  quiet = FALSE
)

Arguments

samples

An object of class ‘list’ containing posterior samples from a feature allocation distribution. Each list element encodes one feature allocation as a binary matrix, with items in the rows and features in the columns.
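As a minimal sketch of the input layout described above, the following base R code builds a toy list of binary matrices and checks it. The helper isValidSamples and the toy matrices are hypothetical and not part of the package; letting the number of columns differ across samples is an assumption of this sketch.

```r
# Hypothetical helper (not part of fangs): check that the input is a
# non-empty list of binary matrices that all have the same number of rows.
isValidSamples <- function(samples) {
  is.list(samples) && length(samples) > 0 &&
    all(vapply(samples, function(Z) {
      is.matrix(Z) && all(Z %in% c(0, 1))
    }, logical(1))) &&
    length(unique(vapply(samples, nrow, integer(1)))) == 1
}

# Toy feature allocations of 4 items (rows); columns are features.
# matrix() fills column by column, so each line below is one feature.
Z1 <- matrix(c(1, 0, 1, 0,
               0, 1, 1, 0), nrow = 4)   # 4 items, 2 features
Z2 <- matrix(c(1, 0, 1, 0,
               0, 1, 0, 0,
               0, 0, 0, 1), nrow = 4)   # 4 items, 3 features
toySamples <- list(Z1, Z2, Z1)

isValidSamples(toySamples)
```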

nInit

The number of initial feature allocations to obtain using the alignment method. For each initial feature allocation, a baseline feature allocation is selected uniformly at random from the list provided in samples. The samples are aligned to the baseline, proportions are computed for each matrix element, and the initial feature allocation is obtained by thresholding those proportions according to a/2.
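The proportion and thresholding steps can be sketched in base R as follows. This is only an illustration: it assumes the matrices have already been aligned to a baseline and share the same dimensions (the alignment itself is not shown), the helper thresholdAligned is hypothetical, and the use of a strict inequality at a/2 is an assumption.

```r
# Hypothetical helper (not part of fangs): element-wise proportions of
# already-aligned binary matrices, thresholded at a/2.
thresholdAligned <- function(aligned, a = 1) {
  # Average the matrices element by element.
  proportions <- Reduce(`+`, aligned) / length(aligned)
  # Entries whose proportion exceeds a/2 are set to 1, the rest to 0.
  (proportions > a / 2) * 1L
}

# Three toy matrices, assumed to be aligned (3 items, 2 features).
aligned <- list(
  matrix(c(1, 1, 0, 0, 0, 1), nrow = 3),
  matrix(c(1, 0, 0, 0, 1, 1), nrow = 3),
  matrix(c(1, 1, 0, 0, 0, 1), nrow = 3)
)

thresholdAligned(aligned, a = 1)    # threshold of 0.5
thresholdAligned(aligned, a = 1.5)  # stricter threshold of 0.75
```

Under this sketch, a larger value of a raises the threshold, so fewer entries of the initial estimate are set to 1.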

nSweet

The number of feature allocations among nInit which are chosen (by lowest expected loss) to be optimized in the sweetening phase.

nIterations

The number of iterations (i.e., proposed changes) to consider per initial estimate in the stochastic sweetening phase, although the actual number may be less due to the maxSeconds argument. The default value is 0, which sets the number of iterations to the number of items times the number of columns.

maxSeconds

Stop the search and return the current best estimate once the elapsed time exceeds this value.

a

A numeric scalar for the cost parameter of generalized Hamming distance used in FARO loss. The other cost parameter, b, is equal to 2 - a.
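To illustrate how the two cost parameters interact, here is a minimal sketch of a generalized Hamming distance between two binary matrices of equal dimension. The helper generalizedHamming is hypothetical, and the sketch ignores any matching of feature columns that FARO loss may involve. Which kind of disagreement is charged a and which is charged b is an assumption, chosen to be consistent with the a/2 threshold described under nInit.

```r
# Hypothetical helper (not part of fangs): generalized Hamming distance
# between two binary matrices whose columns are assumed to be matched.
generalizedHamming <- function(estimate, truth, a = 1) {
  stopifnot(all(dim(estimate) == dim(truth)))
  b <- 2 - a
  # Assumption: a is charged for a 1 in the estimate where the truth has 0,
  # and b for a 0 in the estimate where the truth has 1.
  a * sum(estimate == 1 & truth == 0) +
    b * sum(estimate == 0 & truth == 1)
}

estimate <- matrix(c(1, 1, 0, 0, 0, 1), nrow = 3)
truth    <- matrix(c(1, 0, 0, 0, 0, 1), nrow = 3)

generalizedHamming(estimate, truth, a = 1)    # both error types cost 1
generalizedHamming(estimate, truth, a = 1.5)  # extra 1s cost more
```

With a = 1 both costs equal 1 and the distance reduces to the ordinary Hamming distance. Under the stated assumption, an entry with posterior proportion p is better set to 1 when a(1 - p) < (2 - a)p, which simplifies to p > a/2.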

nCores

The number of CPU cores to use, i.e., the number of simultaneous calculations at any given time. A value of zero uses all cores on the system.

algorithm

A string indicating the algorithm to use; equal to “stochastic”, “deterministic”, or “draws”. The “stochastic” algorithm is recommended, although the “deterministic” algorithm may provide an improvement at the cost of time.

quiet

If TRUE, intermediate status reporting is suppressed. Otherwise details are provided, especially when algorithm="stochastic".

Value

A list with the following elements:

  • estimate - The feature allocation point estimate in binary matrix form.

  • expectedLoss - The estimated expected FARO loss of the point estimate.

  • iteration - The iteration number (out of nIterations) at which the point estimate was found while sweetening.

  • nIterations - The number of sweetening iterations performed.

  • secondsInitialization - The elapsed time in the initialization phase.

  • secondsSweetening - The elapsed time in the sweetening phase.

  • secondsTotal - The total elapsed time.

  • whichSweet - The proposal number (out of nSweet) from which the point estimate was found.

  • nInit - The original supplied value of nInit.

  • nSweet - The original supplied value of nSweet.

  • a - The original supplied value of a.

References

D. B. Dahl, D. J. Johnson, R. J. Andros (2023), Comparison and Bayesian Estimation of Feature Allocations, Journal of Computational and Graphical Statistics, doi:10.1080/10618600.2023.2204136.

Examples

# To reduce load on CRAN testing servers, limit the number of iterations.
data(samplesFA)
fangs(samplesFA, nIterations=100, nCores=2)


fangs documentation built on April 11, 2025, 5:51 p.m.