badMIXTURE-package: Compare admixture estimates from STRUCTURE-like programs to...


Description

badMIXTURE is a package for checking whether a mixture is a good description of data. The approach is framed and described in terms of genetic data, but it is also appropriate for other types of mixture.

The key concept is that we can compare the similarity of a number of objects (individuals) to a number of reference points, here thought of as the mean of clusters of the individuals. It is not important whether these represent anything - they can even be random clusterings - although the better chosen they are, the more powerful the test is.

Given a mixture, we expect the similarities to fit the mixture solution. This is explicitly proven for some special cases in genetics, but is likely to hold for many similarities, for example, covariance between locations in PCA (Principal Components Analysis).
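As a rough illustration of what "fitting the mixture solution" can mean, the base R sketch below predicts similarities from a mixture and inspects the residuals. Everything in it is an assumption made for illustration: the matrices Q (mixture proportions) and P (similarity of each pure source to each cluster) are made-up values, and ordinary least squares stands in for whatever fitting procedure the package actually uses, which may be constrained or otherwise different.

```r
# Hypothetical mixture solution: 5 individuals, 2 sources; each row sums to 1.
Q <- rbind(c(1.0, 0.0),
           c(0.0, 1.0),
           c(0.5, 0.5),
           c(0.8, 0.2),
           c(0.3, 0.7))

# Hypothetical similarity of each pure source to 3 reference clusters.
P <- rbind(c(0.6, 0.3, 0.1),
           c(0.1, 0.2, 0.7))

# If the mixture describes the data well, similarities are close to Q %*% P.
observed <- Q %*% P

# Make individual 3 deviate from what its mixture proportions predict.
observed[3, ] <- c(0.1, 0.8, 0.1)

# Estimate the source profiles by least squares (an illustrative stand-in),
# then compute residuals: observed similarities minus those the mixture predicts.
P_hat <- qr.solve(Q, observed)
resid_mat <- observed - Q %*% P_hat

# Total absolute residual per individual; large values flag a poor fit.
misfit <- rowSums(abs(resid_mat))
```

In this toy case the individual whose similarities were perturbed has the largest total residual, which is the kind of signal a residual-based check looks for.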

We therefore need "data" (the set of similarities to each of the clusters, which can be computed from similarities to all individuals), a "mixture solution" to test, and a record of which individuals make up which clusters.
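The step from similarities between individuals to similarities with clusters can be sketched in base R as follows. The matrix values, the individual names and the cluster labels are all hypothetical, and averaging over cluster members is an assumption: the package may aggregate differently (for example by summing).

```r
# Toy similarity matrix between 6 individuals; all values are made up.
set.seed(1)
n <- 6
ids <- paste0("ind", seq_len(n))
sim <- matrix(runif(n * n), nrow = n, dimnames = list(ids, ids))

# Record of which individuals make up which clusters (hypothetical labels).
clusters <- c(ind1 = "A", ind2 = "A", ind3 = "B",
              ind4 = "B", ind5 = "C", ind6 = "C")

# For each cluster, average every individual's similarity to its members.
# The result has one row per individual and one column per cluster.
members <- split(names(clusters), clusters)
sim_to_clusters <- sapply(members, function(m) rowMeans(sim[, m, drop = FALSE]))
```

The resulting individuals-by-clusters matrix plays the role of the "data", while `clusters` is the record of cluster membership.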

The functions you're most likely to need from badMIXTURE are compareMixtureToData, which performs the calculations, and mixturePlot, which displays the results. The example sections for these functions contain several cases to get you started. See also ?arisim for information about the included data, and ?arisimsmall for a subset of this dataset that is easy to experiment on.

If you want to reorder, look at compareMixtureToDataDirect and the example in it.

Author(s)

Maintainer: Daniel Lawson <dan.lawson@bristol.ac.uk>

References

Falush, Van-Dorp & Lawson http://biorxiv.org/content/early/2016/07/28/066431


danjlawson/badMIXTURE documentation built on Sept. 27, 2019, 9:11 p.m.