subsDiag: Apply two types of diagnostics to clustered data

In splits: SPecies' LImits by Threshold Statistics


Calculate diagnostics on the subspace identified by cluster analysis

subsDiag(X, ncl, clustMethod = "hc", nSim = 2000, sigLvl = 0.05, status = TRUE)

`X`	The data.
`ncl`	The structure of the data, obtained using a clustering statistic or some other hypothesis, e.g. `ddwtGap`.
`clustMethod`	Whether the cluster definition was obtained using hierarchical clustering ("hc") or k-means ("km"). See the details in `ddwtGap` or the dedicated help pages.
`nSim`	The number of simulations used for Monte Carlo estimates of significance.
`sigLvl`	The significance level for the chi-squared test of whether observations are significantly influential on the structure of the data.
`status`	Should the status of the functions be reported?

Model diagnostics assess the validity of particular assumptions. Applying the model diagnostics requires at least two individuals within each well-separated group, yet the cluster identification algorithms can identify isolated individuals as whole groups. Depending upon the circumstances, it might be reasonable to consider such individuals suspicious. The diagnostics aim to identify individuals that (a) are extreme in measurement and (b) significantly affect the definition of the data structure.
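The requirement of at least two individuals per group can be checked before running the diagnostics. The sketch below is an illustration only: it uses base R's `hclust` and `cutree` as stand-ins for the package's own clustering (an assumption), and `min_size_ok` is a hypothetical name.

```r
# Illustration only: base-R hierarchical clustering as a stand-in for the
# clustering used inside the package (an assumption).
X <- as.matrix(iris[, 1:4])
ncl <- 3

# Assign each observation to one of ncl groups
groups <- cutree(hclust(dist(X)), k = ncl)

# Group sizes; a group of size 1 is an isolated individual
sizes <- table(groups)
singletons <- names(sizes)[sizes < 2]

# TRUE when every group has at least two individuals
min_size_ok <- length(singletons) == 0
```

If `min_size_ok` is `FALSE`, the groups named in `singletons` are the isolated individuals that may deserve a closer look.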

Brooks (1994) calculated the influence of each data point by jack-knifing, i.e. by comparing the dominant eigenvalues of the data with and without a focal observation. A large difference in dominant eigenvalues implies that the focal observation exerts large influence in the sample; the significance of that difference can be assessed using Monte Carlo estimates. If variables are not normally distributed, reference data sets can be generated using singular-value decomposition.
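The jack-knife comparison described above can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes the eigenvalues come from the sample covariance matrix, `dominant_eigenvalue` is a hypothetical helper, and the Monte Carlo significance step is not shown.

```r
# Sketch of leave-one-out influence on the dominant eigenvalue.
# Assumption: eigenvalues are taken from the sample covariance matrix.
dominant_eigenvalue <- function(M) {
  eigen(cov(M), symmetric = TRUE, only.values = TRUE)$values[1]
}

# One group of the iris data, used purely as example input
X <- as.matrix(iris[iris$Species == "setosa", 1:4])
full <- dominant_eigenvalue(X)

# Recompute the dominant eigenvalue with each observation removed in turn
loo <- vapply(seq_len(nrow(X)),
              function(i) dominant_eigenvalue(X[-i, , drop = FALSE]),
              numeric(1))

# Absolute change attributable to each focal observation
influence <- abs(full - loo)

# Observations ordered from most to least influential
head(order(influence, decreasing = TRUE))
```

Larger values of `influence` mark observations whose removal changes the dominant eigenvalue most.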

Fung (1999) devised a method to identify extreme observations outside the expected range of a particular sample.
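One generic way to flag extreme multivariate observations is to compare squared Mahalanobis distances with a chi-squared quantile. The sketch below uses that common approach for illustration; it is not claimed to be Fung's (1999) procedure or the package's implementation, and `alpha` is a hypothetical name.

```r
# One group of the iris data, used purely as example input
X <- as.matrix(iris[iris$Species == "setosa", 1:4])
alpha <- 0.05  # illustrative significance level (an assumption)

# Squared Mahalanobis distance of each observation from the group centroid
d2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))

# Under multivariate normality, d2 is approximately chi-squared with
# ncol(X) degrees of freedom
cutoff <- qchisq(1 - alpha, df = ncol(X))

# Indices of observations beyond the cutoff
extreme <- which(d2 > cutoff)
```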

Note that data need not be questionable or unusual to exert large influence.

A list containing:

`both`	The index of observations that are BOTH influential and extreme.
`influence`	The index of influential observations.
`distance`	The index of extreme observations.

Thomas H.G. Ezard tomezard [at] gmail [dot] com

Brooks, S.P. 1994. Diagnostics for Principal Components: Influence Functions as Diagnostic Tools. The Statistician 43:483-494.
Ezard, T.H.G., Pearson, P.N. & Purvis, A. 2010. Algorithmic Approaches to Delimit Species in Multidimensional Morphospace. BMC Evol. Biol. 10:175, doi:10.1186/1471-2148-10-175.
Fung, W.-K. 1999. Outlier Diagnostics in Several Multivariate Samples. The Statistician 48:73-84.

dimReduct, ddwtGap

## following the example in ddwtGap
library(splits)
data(iris)
subsDiag(as.matrix(iris[, 1:4]), 3)

splits documentation built on July 16, 2021, 3 p.m.
