knitr::opts_chunk$set(tidy = TRUE, fig.align = 'center', fig.width = 8, fig.height = 6)
In this vignette we use starmie to perform visualisation and inference on a sequence of ADMIXTURE models fit to the sample data supplied by the creators of ADMIXTURE.
To begin, we assume that ADMIXTURE has been run on an example PED/BED file and that there is a directory containing the resulting .P and .Q files, plus the log files if you want to perform inference.
To read a single ADMIXTURE run, supply the names of the .P and .Q files to the loadAdmixture function and save the result as an admix object.
library(starmie)
# get Q and P files for K = 3
p3 <- system.file("extdata/hapmap3_files", "hapmap3.3.P", package = "starmie")
q3 <- system.file("extdata/hapmap3_files", "hapmap3.3.Q", package = "starmie")
k3_admix <- loadAdmixture(q3, p3)
k3_admix
Optionally, if you would like to look at model fit statistics, you can read in a log file as well. The admix object then also contains the estimated log-likelihood and cross-validation error.
k3_log <- system.file("extdata/hapmap3_files", "log3.out", package = "starmie")
k3_admix <- loadAdmixture(q3, p3, k3_log)
k3_admix
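To make concrete what the log file contributes, here is a minimal base R sketch that pulls a log-likelihood and a cross-validation error out of log text. It assumes the log contains lines of the form `Loglikelihood: <value>` and `CV error (K=<k>): <value>`, which is how ADMIXTURE usually reports them. The helper `parse_admixture_log` and the log lines are made up for illustration, and this is not necessarily how starmie reads the file.

```r
# Hypothetical helper (not part of starmie): extract fit statistics
# from ADMIXTURE-style log lines.
parse_admixture_log <- function(log_lines) {
  ll_lines <- grep("^Loglikelihood:", log_lines, value = TRUE)
  cv_lines <- grep("^CV error", log_lines, value = TRUE)

  # Use the last matching line of each kind, i.e. the final reported value
  last_ll <- ll_lines[length(ll_lines)]
  last_cv <- cv_lines[length(cv_lines)]

  data.frame(
    K        = as.integer(sub("^CV error \\(K=([0-9]+)\\).*$", "\\1", last_cv)),
    loglik   = as.numeric(sub("^Loglikelihood:[[:space:]]*", "", last_ll)),
    cv_error = as.numeric(sub("^CV error \\(K=[0-9]+\\):[[:space:]]*", "", last_cv))
  )
}

# Made-up log lines, for illustration only (not real ADMIXTURE output)
toy_log <- c(
  "Summary:",
  "Loglikelihood: -1234.5",
  "CV error (K=3): 0.5"
)
fit <- parse_admixture_log(toy_log)
fit
```

In practice the lines would come from `readLines()` on the log file rather than from a hand-written vector.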
In general, an admix object consists of the following elements describing a single ADMIXTURE run:
list_names <- names(k3_admix)
list_description <- c("K parameter supplied to ADMIXTURE",
                      "Number of samples",
                      "Number of markers",
                      "Individual ancestral probability of membership to cluster",
                      "Estimated ancestral allele frequencies for each cluster",
                      "Model fit statistics")
knitr::kable(data.frame(attributes = list_names, description = list_description))
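As a rough illustration of what the "individual ancestral probability of membership" element holds, the sketch below builds a toy membership matrix in base R. The matrix `Q`, its values, and its row and column names are invented for this example and do not come from the hapmap3 data; the point is only that each row is a set of proportions summing to one, from which a most likely cluster can be read off.

```r
# Toy membership matrix: 4 samples (rows) by K = 3 clusters (columns).
# All values are made up for illustration.
Q <- matrix(c(0.70, 0.20, 0.10,
              0.05, 0.90, 0.05,
              0.25, 0.25, 0.50,
              0.10, 0.30, 0.60),
            ncol = 3, byrow = TRUE,
            dimnames = list(paste0("sample", 1:4), paste0("Cluster", 1:3)))

# Each individual's membership proportions should sum to 1
row_totals <- rowSums(Q)

# Index of the cluster with the largest membership proportion per individual
top_cluster <- apply(Q, 1, which.max)
top_cluster
```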
The plotBar function works on admix objects and produces a faceted barplot in the same manner as for struct objects.
plotBar(k3_admix)
A regular barplot can be constructed by setting facet = FALSE.
plotBar(k3_admix, facet = FALSE)
admixList object
If you have run ADMIXTURE with many different values of $K$, you can use an admixList object to manipulate the results. To construct an admixList object, call loadAdmixture for each pair of Q and P files and then pass the results to the admixList constructor function.
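The construction described above might look roughly like the following sketch. The directory `admix_dir`, the `prefix`, and the range of `k_values` are placeholders to replace with your own, and the file names assume the `<prefix>.<K>.Q` / `<prefix>.<K>.P` pattern seen in the K = 3 example. The sketch also assumes that admixList accepts a plain list of admix objects; check the package documentation before relying on that.

```r
# Placeholder inputs: replace with your own ADMIXTURE output location.
admix_dir <- "path/to/admixture_output"  # hypothetical directory
prefix    <- "hapmap3"                   # file prefix used when running ADMIXTURE
k_values  <- 1:5                         # values of K that were fitted

# One Q file and one P file per value of K
q_files <- file.path(admix_dir, sprintf("%s.%d.Q", prefix, k_values))
p_files <- file.path(admix_dir, sprintf("%s.%d.P", prefix, k_values))

# Only attempt the load when starmie is installed and every file exists
if (requireNamespace("starmie", quietly = TRUE) &&
    all(file.exists(c(q_files, p_files)))) {
  # One admix object per K, with Q first and P second as in the example above
  runs <- Map(starmie::loadAdmixture, q_files, p_files)
  # Assumption: the constructor takes a list of admix objects
  my_admix_list <- starmie::admixList(unname(runs))
}
```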
For example, we have run ADMIXTURE on the sample data set with $K = 5$ using five-fold cross-validation.
admix_multi <- exampleAdmixture()
admix_multi
You can also plot multiple admix objects to see how cluster memberships change for different values of $K$.
plotMultiK(admix_multi)
The bestK method can be used to determine which value of $K$ best explains the observed data using an elbow plot of the cross-validation error.
bestK(admix_multi)
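The idea behind choosing $K$ from the cross-validation error can be sketched in base R as follows. The error values are invented purely to show the shape of such a curve, and picking the minimum is a common rule of thumb for ADMIXTURE rather than a description of what bestK does internally.

```r
# Made-up cross-validation errors for K = 1..5 (illustrative only, not real output)
cv <- data.frame(K = 1:5,
                 cv_error = c(0.60, 0.50, 0.45, 0.46, 0.48))

# Rule of thumb: prefer the K with the lowest cross-validation error
best <- cv$K[which.min(cv$cv_error)]

# Elbow-style plot: error against K, with the chosen K marked by a dashed line
plot(cv$K, cv$cv_error, type = "b",
     xlab = "K", ylab = "Cross-validation error")
abline(v = best, lty = 2)
best
```

When the curve flattens rather than showing a clear minimum, the smallest $K$ near the bend is often the more defensible choice.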
Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
sessionInfo()