tess3 | R Documentation |
tess3 is the main function of the tess3r package. It runs
a graph-based nonnegative matrix factorization algorithm that includes
geographic data in the estimation of spatial population structure.
The function requires individual genotypes and geographic coordinates, and
it can be run for multiple values of the number of ancestral populations.
In addition, the function uses the estimates of ancestry coefficients to
compute an Fst statistic for each locus, and returns significance values
for a test of the null hypothesis of selective neutrality. See the references
for more details.
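To make the link between ancestry coefficients and a per-locus Fst statistic concrete, here is a minimal sketch in base R. It assumes one commonly used definition, Fst = 1 - sum_k q_k f_k (1 - f_k) / (f (1 - f)), where q_k is the mean ancestry coefficient for population k, f_k is the allele frequency in ancestral population k, and f = sum_k q_k f_k. The function name fst_from_ancestry and its inputs are hypothetical; this page does not show how tess3 computes the statistic internally, so treat this as an illustration rather than the package's implementation.

```r
# Hypothetical helper (not part of tess3r): Fst at one locus from
# ancestry coefficients Q (n x K) and ancestral allele frequencies freq (length K).
fst_from_ancestry <- function(Q, freq) {
  q_bar <- colMeans(Q)                       # mean ancestry per population
  f_tot <- sum(q_bar * freq)                 # overall allele frequency
  within <- sum(q_bar * freq * (1 - freq))   # within-population variance term
  total <- f_tot * (1 - f_tot)               # total variance term
  1 - within / total
}

# Toy data: three individuals, two ancestral populations
Q <- matrix(c(0.9, 0.1,
              0.5, 0.5,
              0.1, 0.9), ncol = 2, byrow = TRUE)

fst_equal <- fst_from_ancestry(Q, c(0.3, 0.3))  # same frequency in both populations
fst_diff  <- fst_from_ancestry(Q, c(0.1, 0.9))  # strongly differentiated locus
```

Under this definition a locus with identical ancestral frequencies has Fst equal to zero, and the statistic grows as the ancestral frequencies diverge.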
tess3(X, XProba = NULL, coord, K, ploidy, lambda = 1, rep = 1, W = NULL,
method = "projected.ls", max.iteration = 200, tolerance = 1e-05,
openMP.core.num = 1, Q.init = NULL, mask = 0, algo.copy = TRUE,
keep = "best", verbose = FALSE)
X |
a matrix of individual genotypes. This matrix must
have |
XProba |
A matrix which contains individual genotype likelihoods (probabilities) for each
locus. This matrix must contain |
coord |
a matrix of size |
K |
an integer or a vector of integers corresponding to the number of ancestral populations. |
ploidy |
an integer corresponding to ploidy of the studied organism. Haploids have ploidy = 1, diploids have ploidy = 2, etc. |
lambda |
a numeric value for the spatial regularization parameter. The default value lambda = 1 attributes equal weights to the loss function and to the penalty function. |
rep |
integer. The number of times the algorithm will be repeated for each value of
|
W |
a matrix which corresponds to the graph weightings.
If NULL, W is computed as
|
method |
a character string |
max.iteration |
the maximum number of iterations of the optimization algorithm. |
tolerance |
a numeric value corresponding to the stopping criterion of the optimization algorithm. |
openMP.core.num |
integer representing the number of cores used by the optimization algorithm. It requires the OpenMP library to be installed on your system (it is not available by default on macOS). |
Q.init |
a matrix for initial values of ancestry coefficients for the algorithm. The default value is a random matrix. |
mask |
If not |
algo.copy |
boolean. If TRUE, the data are copied to speed up the algorithm. |
keep |
If |
verbose |
If |
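The roles of lambda and W can be illustrated with a generic graph-regularized least-squares objective: a data-fit loss plus lambda times a penalty that grows when individuals connected in the graph have different ancestry coefficients. The sketch below is an assumption-laden illustration in base R. The function penalized_loss, the Laplacian-based penalty, and the toy matrices are all hypothetical, and the package's actual scaling of lambda and construction of W are not shown on this page and may differ.

```r
# Hypothetical illustration (not the tess3r implementation):
# loss = ||X - Q G^T||^2, penalty = trace(Q^T L Q) with L the graph Laplacian of W.
penalized_loss <- function(X, Q, G, W, lambda = 1) {
  loss <- sum((X - Q %*% t(G))^2)           # least-squares data fit
  L <- diag(rowSums(W)) - W                 # graph Laplacian built from the weights
  penalty <- sum(diag(t(Q) %*% L %*% Q))    # large when neighbors differ in ancestry
  loss + lambda * penalty
}

# Toy data: 3 individuals, 2 loci, 2 ancestral populations
G <- matrix(c(0.2, 0.8,
              0.6, 0.4), nrow = 2, byrow = TRUE)   # loci x populations
Q <- matrix(c(1.0, 0.0,
              0.0, 1.0,
              0.5, 0.5), ncol = 2, byrow = TRUE)   # individuals x populations
X <- Q %*% t(G)                                    # perfectly fitted genotypes
W <- matrix(1, 3, 3) - diag(3)                     # fully connected graph, unit weights

obj0 <- penalized_loss(X, Q, G, W, lambda = 0)     # loss only
obj1 <- penalized_loss(X, Q, G, W, lambda = 1)     # loss + penalty
obj2 <- penalized_loss(X, Q, G, W, lambda = 2)     # penalty weighted twice as much
```

In this toy case the fit is exact, so the objective is driven entirely by the penalty, which makes the effect of changing lambda easy to see.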
An object of class tess3, which is a list of length length(K).
Each element of this list has the following attributes:
the number of ancestral populations
if keep = "best", the tess3Main result with the lowest value of the rmse (loss) function. If keep = "all", a list of tess3Main results for each repetition
root mean squared error between the genotypic matrix XProba and the fitted matrix for each program repetition
cross-entropy between the genotypic matrix XProba and the fitted matrix for each program repetition
root mean squared error between the masked values of the genotypic matrix XProba[masked] and their fitted values for each repetition. If mask is FALSE, then NULL.
cross-entropy between the masked values of the genotypic matrix XProba[masked] and their fitted values for each repetition. If mask is FALSE, then NULL.
Methods available for this class:
plot.tess3
summary.tess3
is.tess3
Gettess3res
qmatrix
pvalue
Caye, K., Deist, T. M., Martins, H., Michel, O., & François, O. (2016). TESS3: fast inference of spatial population structure and genome scans for selection. Molecular Ecology Resources, 16(2), 540-548. http://onlinelibrary.wiley.com/doi/10.1111/1755-0998.12471/full
Caye, K., et al. (2016). Fast inference of individual admixture coefficients using geographic data. bioRxiv. doi:10.1101/080291. http://biorxiv.org/content/early/2016/10/12/080291
Caye, K., Jay, F., Michel, O., & François, O. (2018). Fast inference of individual admixture coefficients using geographic data. The Annals of Applied Statistics, 12(1), 586-608.
tess3Main, plot.tess3Q, barplot.tess3Q
library(tess3r)
# Arabidopsis thaliana data set
data(data.at)
genotype <- data.at$X
coordinates <- data.at$coord
# Running the tess3 function
tess3.obj <- tess3(X = genotype, coord = coordinates, K = 1:4,
method = "projected.ls",
ploidy = 1)
# Plot error
plot(tess3.obj, pch = 19, col = "blue",
xlab = "Number of ancestral populations",
ylab = "Cross-validation score")
# Retrieve the Q-matrix for K = 3 clusters
q.matrix <- qmatrix(tess3.obj, K = 3)
## STRUCTURE-like barplot for the Q-matrix
barplot(q.matrix, border = NA, space = 0,
xlab = "Individuals", ylab = "Ancestry proportions",
main = "Ancestry matrix") -> bp
axis(1, at = 1:nrow(q.matrix), labels = bp$order, las = 3, cex.axis = .4)
## Spatial interpolation of ancestry coefficient
my.colors <- c("tomato", "orange", "lightblue")
my.palette <- CreatePalette(my.colors, 9)
plot(q.matrix, coordinates, method = "map.max", interpol = FieldsKrigModel(10),
main = "Ancestry coefficients",
xlab = "Longitude", ylab = "Latitude",
resolution = c(500,500), cex = .4,
col.palette = my.palette)
## Genome scan p-values for K = 3
p.values <- pvalue(tess3.obj, K = 3)
hist(p.values, col = "lightblue")
## Manhattan plot
plot(p.values, main = "Manhattan plot",
xlab = "Locus id",
ylab = "-log10(P-values)",
cex = .3, col = "grey")
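A common follow-up to a genome scan is to correct the p-values for multiple testing before listing candidate loci. The sketch below uses the Benjamini-Hochberg procedure from base R (p.adjust). The vector p.toy is made-up data standing in for the output of pvalue(tess3.obj, K = 3), and the 0.05 false discovery rate level is only an illustrative choice.

```r
# Toy p-values standing in for the genome scan output (hypothetical data)
p.toy <- c(1e-8, 2e-6, 0.001, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95, 0.99)

# Benjamini-Hochberg adjustment (stats::p.adjust is part of base R)
q.toy <- p.adjust(p.toy, method = "BH")

# Indices of loci retained at an expected false discovery rate of 5%
fdr.level <- 0.05
candidates <- which(q.toy < fdr.level)
candidates
```

With real data, the same three lines after the toy vector apply unchanged once p.toy is replaced by the vector of p-values from the scan.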