Calculates a 'largest consistent subset' given values and associated uncertainty information.
Vector of observations.
Vector of standard errors or standard uncertainties associated with x.
Significance level at which consistency is tested.
Subset identification method. Currently only 'enum' is supported.
Logical: Controls the level of reporting during the search.
LCS obtains the largest subset(s) of x which pass a chi-squared test for consistency, taking the uncertainties u into account.
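As a rough illustration of such a consistency test, the sketch below assumes the usual weighted-mean form of the chi-squared statistic, compared against a chi-squared distribution with n-1 degrees of freedom. The helper name is_consistent and the default p = 0.05 are hypothetical and are not taken from the package; the exact test used by LCS may differ in detail.

```r
# Hypothetical helper (not part of the documented interface): chi-squared
# consistency test for values x with standard uncertainties u.
# Assumes the inverse-variance weighted mean is the reference value.
is_consistent <- function(x, u, p = 0.05) {
  df <- length(x) - 1
  if (df < 1) return(TRUE)          # a single value is trivially consistent
  w <- 1 / u^2                      # inverse-variance weights
  xbar <- sum(w * x) / sum(w)       # weighted mean
  chisq <- sum(w * (x - xbar)^2)    # observed chi-squared statistic
  # Consistent if the upper-tail probability is at least p
  pchisq(chisq, df, lower.tail = FALSE) >= p
}

is_consistent(c(10.1, 10.0, 9.9),  c(0.1, 0.1, 0.1))  # mutually consistent values
is_consistent(c(10.1, 10.0, 12.0), c(0.1, 0.1, 0.1))  # third value is discrepant
```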
method controls the search method used. Method "enum" uses complete enumeration of all subsets of size n, starting at n==length(x) and decreasing until at least one consistent subset is found. No other method is currently supported; if a different method is specified, LCS provides a warning and continues with "enum".
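The enumeration described above can be sketched in base R as follows. This is an illustrative reimplementation, not the package's own code: the function name lcs_enum is hypothetical, and the consistency check assumes the usual weighted-mean chi-squared statistic with n-1 degrees of freedom.

```r
# Hypothetical sketch of an enumeration search for the largest consistent
# subset(s); returns a matrix of indices with one subset per row.
lcs_enum <- function(x, u, p = 0.05) {
  stopifnot(length(x) >= 1, length(x) == length(u))
  # Assumed consistency test: weighted-mean chi-squared on the subset idx
  consistent <- function(idx) {
    if (length(idx) < 2) return(TRUE)
    w <- 1 / u[idx]^2
    xbar <- sum(w * x[idx]) / sum(w)
    chisq <- sum(w * (x[idx] - xbar)^2)
    pchisq(chisq, length(idx) - 1, lower.tail = FALSE) >= p
  }
  # Start with the full set and shrink until some subset passes
  for (n in seq(length(x), 1)) {
    subsets <- combn(length(x), n)        # one candidate subset per column
    ok <- apply(subsets, 2, consistent)
    if (any(ok)) return(t(subsets[, ok, drop = FALSE]))
  }
}

x <- c(10.1, 10.0, 9.9, 12.0)
u <- rep(0.1, 4)
lcs_enum(x, u)   # the fourth value is excluded from the subset found
```

Because every subset of each size is tested, the cost grows combinatorially with length(x), so a sketch like this is only practical for small data sets.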
There may be more than one consistent subset of size n. If so, LCS returns all such subsets unless simplify is TRUE, in which case LCS prints a short warning and returns the subset with the smallest estimated uncertainty, as estimated for the Graybill-Deal weighted mean assuming large degrees of freedom in u.
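The selection among tied subsets might look like the sketch below. It assumes that, for large degrees of freedom, the uncertainty of the Graybill-Deal weighted mean reduces to the familiar inverse-variance expression 1/sqrt(sum(1/u^2)). The helper name pick_smallest_u and the example data are hypothetical.

```r
# Hypothetical helper: from a matrix of indices (one subset per row), pick
# the subset whose weighted mean has the smallest standard uncertainty.
# Assumes the large-degrees-of-freedom form 1 / sqrt(sum(1 / u^2)).
pick_smallest_u <- function(subsets, u) {
  u_mean <- apply(subsets, 1, function(idx) 1 / sqrt(sum(1 / u[idx]^2)))
  subsets[which.min(u_mean), ]
}

u <- c(0.1, 0.2, 0.3, 0.1)
subsets <- rbind(c(1, 2, 3), c(1, 2, 4))   # two candidate subsets of equal size
pick_smallest_u(subsets, u)                # selects the subset with indices 1, 2, 4
```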
verbose controls the level of reporting. If TRUE, LCS prints the progress of the search.
The general idea of a Largest Consistent Subset as implemented here was suggested by Cox (2007), though at least one other related method has been suggested by Heydorn (2006). It has, however, been criticised as an estimator (Toman and Possolo, 2009); see Warning below.
If there is only one subset of maximum size, or if simplify=TRUE, a vector of indices into x representing the largest consistent subset.
If there is more than one subset of maximum size and simplify=FALSE, a matrix of indices in which the rows contain the indices of each subset.
LCS methods are essentially equivalent to unsupervised outlier rejection. In general, this results in a possibly extremely low estimated variance for an arbitrarily small subset (in the limit of gross inconsistency, LCS will return subsets of size 1). The estimated uncertainty calculated for the Graybill-Deal weighted mean of the subset(s) does not generally take account of the subset selection process or the dispersion of the complete data set, so it is not an estimate of sampling variance.
LCS is therefore not recommended for consensus value estimation. It is, however, quite useful for identifying value/uncertainty outliers.
S. Ellison email@example.com
Cox, M. G. (2007) The evaluation of key comparison data: determining the largest consistent subset. Metrologia 44, 187-200.
Heydorn, K. (2006) The determination of an accepted reference value from proficiency data with stated uncertainties. Accred. Qual. Assur. 10, 479-484.
Toman, B. and Possolo, A. (2009) Laboratory effects models for interlaboratory comparisons. Accred. Qual. Assur. 14, 553-563.