checkallmodels: Check all possible models for existence and identifiability

Description Usage Arguments Details Value References Examples

Description

This routine efficiently checks whether every possible model satisfies the conditions for the existence and identifiability of an extended maximum likelihood estimate. For identifiability it is only necessary to check the model containing all two-list effects. For existence, the approach set out in Chan, Silverman and Vincent (2019) is used. This uses the linear programming approach described in a more general and abstract context by Fienberg and Rinaldo (2012).

Usage

1
checkallmodels(zdat, nreport = 1024)

Arguments

zdat

Data matrix with t+1 columns. The first t columns, each corresponding to a particular list, are 0s and 1s defining the capture histories observed. The last column is the count of cases with that particular capture history. List names A, B, ... are constructed if not supplied. Where a capture history is not explicitly listed, it is assumed that it has zero count.

nreport

A message is printed for every nreport models checked. It gives the number of top level models to be checked altogether, and also the number of models found so far for which the estimate does not exist.

Details

If the extended maximum likelihood estimator exists for a particular model, then it will still exist if one or more overlapping pairs are removed from the model. This allows the search to be carried out efficiently, as follows:

1. Search over ‘top-level’ models which contain all overlapping pairs. The number of such models is 2^k, where k is the number of non-overlapping pairs, because there are 2^k possible choices of the set of non-overlapping pairs to include in the model.

2. For each top-level model, if the estimate exists there is no need to consider that model further. If the estimate does not exist, then a branch search is carried out over the tree structure obtained by successively leaving out overlapping pairs.

Value

The routine prints a message if the model with all two-list parameters is not identifiable. As set out above, it gives regular progress reports in the case where there are a large number of models to be checked.

If all models give estimates which exist, then a message is printed to that effect.

If there are models for which the estimate does not exist, the routine reports the number of such models and returns a matrix whose rows give the parameters included in them.

References

Chan, L., Silverman, B. W., and Vincent, K. (2019). Multiple Systems Estimation for Sparse Capture Data: Inferential Challenges when there are Non-Overlapping Lists. Available from https://arxiv.org/abs/1902.05156.

Fienberg, S. E. and Rinaldo, A. (2012). Maximum likelihood estimation in log-linear models. Ann. Statist. 40, 996-1023. Supplementary material: Technical report, Carnegie Mellon University. Available from http://www.stat.cmu.edu/~arinaldo/Fienberg_Rinaldo_Supplementary_Material.pdf.

Examples

1
2
3
4

SparseMSE documentation built on Dec. 26, 2019, 5:06 p.m.