This package is devoted to the multivariate analysis of genetic marker
data. These data can be codominant markers (e.g. microsatellites) or
presence/absence data (e.g. AFLP), and have any level of ploidy. 'adegenet'
defines three formal (S4) classes:
- genind: a class for data of individuals ("genind" stands for genotypes-individuals).
- genpop: a class for data of groups of individuals ("genpop" stands for genotypes-populations).
- genlight: a class for genome-wide SNP data.
For more information about these classes, type "class ? genind", "class ?
genpop", or "?genlight".
Essential functionalities of the package are presented throughout 4
tutorials, accessible using
- basics: introduction to the package.
- analysis of spatial genetic patterns.
- dapc: population structure and group assignment using DAPC.
- genomics: introduction to the class genlight for the handling and analysis of genome-wide SNP data.
Note: In older versions of adegenet, these tutorials were available as
vignettes, accessible through the function
Important functions are also summarized below.
=== IMPORTING DATA ===
= TO GENIND OBJECTS =
adegenet imports data to genind objects from the following software:
- STRUCTURE: see
- GENETIX: see
- FSTAT: see
- Genepop: see
To import data from any of these formats, you can also use the general function
In addition, it can extract polymorphic sites from nucleotide and amino-acid
- DNA files: use
read.dna from the ape
package, and then extract SNPs from DNA alignments using
- protein sequence alignments: polymorphic sites can be extracted from
protein sequence alignments in
alignment format (package
as.alignment) using the function
fasta2DNAbin reads fasta files into a
DNAbin object with minimal RAM requirements.
It is also possible to read genotypes coded by character strings from a
data.frame in which genotypes are in rows, markers in columns. For this, use
df2genind. Note that
df2genind can be used for
any level of ploidy.
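As a rough illustration of the data.frame route described above, the sketch below builds a small table of diploid genotypes and converts it with df2genind. The toy genotypes, the "/" allele separator, and the population labels are made up for this example.

```r
library(adegenet)

# Toy data (hypothetical): 4 diploid individuals in rows, 2 markers in columns.
# Alleles are separated by "/" and NA marks a missing genotype.
df <- data.frame(
  locusA = c("11/12", "12/12", "11/11", NA),
  locusB = c("20/21", "20/20", "21/22", "20/22"),
  row.names = c("ind1", "ind2", "ind3", "ind4")
)

# Convert to a genind object; ploidy can be changed for other ploidy levels
obj <- df2genind(df, sep = "/", ploidy = 2,
                 pop = c("p1", "p1", "p2", "p2"))

nInd(obj)  # number of individuals
nLoc(obj)  # number of markers
```

Printing obj gives a summary of the resulting object, which is a quick way to check that the separator and ploidy were interpreted as intended.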
= TO GENLIGHT OBJECTS =
SNP data can be read from the following formats:
- PLINK: see function
- .snp (adegenet's own format): see function
SNPs can also be extracted from aligned DNA sequences in fasta format,
=== EXPORTING DATA ===
adegenet exports data from
Genotypes can also be recoded from a genind object into a
data.frame of character strings, using any separator between alleles. This
covers formats used by many programs, such as GENETIX or STRUCTURE. For this, see
Also note that the
pegas package imports genind objects
using the function
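The recoding of a genind object into a data.frame of character strings can be sketched as follows, using genind2df on the nancycats dataset shipped with the package. The "/" separator is an arbitrary choice for the example; any separator can be used.

```r
library(adegenet)
data(nancycats)

# One row per genotype; alleles within a locus are joined with "/".
# By default the first column holds the population of each individual.
cats_df <- genind2df(nancycats, sep = "/")

dim(cats_df)
head(cats_df[, 1:3])
```

The resulting data.frame can then be written to disk with base R tools such as write.table, in whatever layout the target program expects.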
=== MANIPULATING DATA ===
Several functions allow one to manipulate genind or genpop objects:
genind2genpop: convert a genind object to a
seploc: creates one object per
marker; for genlight objects, creates blocks of SNPs.
seppop: creates one object per population
tab: access the allele data (counts or frequencies) of an object
(genind and genpop)
- x[i,j]: create a new object keeping only genotypes (or populations) indexed by 'i' and the alleles indexed by 'j'.
a table of allelic frequencies from a genpop object.
repool merges genotypes from different gene pools into one
single genind object.
propTyped returns the
proportion of available (typed) data, by individual, population, and/or
selPopSize subsets data, retaining only genotypes
from a population whose sample size is above a given level.
pop sets the population of a set of genotypes.
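A minimal sketch of several of the manipulation functions listed above, applied to the nancycats dataset. The object names and the choice of the first 10 genotypes are arbitrary and only serve the illustration.

```r
library(adegenet)
data(nancycats)

# Aggregate individuals into populations (genind -> genpop)
cats_pop <- genind2genpop(nancycats, quiet = TRUE)

# Split the data: one object per population, one object per marker
by_pop <- seppop(nancycats)
by_loc <- seploc(nancycats)

# Access the table of allele counts
counts <- tab(nancycats)

# Keep only the first 10 genotypes, all alleles
first10 <- nancycats[1:10, ]

# Proportion of typed (non-missing) data per marker
typed <- propTyped(nancycats, by = "loc")

# Population of each retained genotype
pop(first10)
```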
=== ANALYZING DATA ===
Several functions implement both standard and less common analyses:
HWE.test.genind: performs HWE tests for all
combinations of populations and loci
dist.genpop: computes 5
genetic distances among populations.
implementation of the Monmonier algorithm, used to seek genetic boundaries
among individuals or populations. Optimized boundaries can be obtained using
optimize.monmonier. Objects of class
monmonier can be
plotted and printed using the corresponding methods.
spca: implements Jombart et al. (2008) spatial Principal
global.rtest: implements Jombart et
al. (2008) test for global spatial structures
local.rtest: implements Jombart et al. (2008) test for local
propShared: computes the proportion of
shared alleles in a set of genotypes (i.e. from a genind object)
propTyped: function to investigate missing data in several ways
scaleGen: generic method to scale genind or
genpop before a principal component analysis
Hs: computes the average expected heterozygosity by population
in a genpop. Classically used as a measure of genetic
the Discriminant Analysis of Principal Components (DAPC, Jombart et al.,
seqTrack: implements the SeqTrack algorithm for
reconstructing transmission trees of pathogens (Jombart et al., 2010).
glPca: implements PCA for genlight objects.
gengraph: implements some simple graph-based clustering using
genetic data.
visualize the distribution of SNPs on a genetic sequence and test their
adegenetServer: opens up a web interface for
some functionalities of the package (DAPC with cross validation and feature
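A few of the analyses listed above can be chained as in the sketch below, which uses the microbov dataset. The values of n.pca and n.da are arbitrary illustrative choices, set explicitly so that dapc does not prompt for them interactively; they are not recommended settings.

```r
library(adegenet)
data(microbov)

# Aggregate individuals by breed (the populations stored in the object)
bov_pop <- genind2genpop(microbov, quiet = TRUE)

# Average expected heterozygosity, one value per population
het <- Hs(bov_pop)

# Pairwise genetic distances among populations (method = 1: Nei's distance)
d <- dist.genpop(bov_pop, method = 1)

# DAPC using the populations of the object as groups
res <- dapc(microbov, n.pca = 20, n.da = 2)

# Coordinates of the individuals on the discriminant functions
head(res$ind.coord)
```

In practice the number of retained principal components deserves more care than a fixed value; the cross-validation tools mentioned above are intended for that purpose.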
=== GRAPHICS ===
colorplot: plots points with associated
values for up to three variables represented by colors using the RGB system;
useful for spatial mapping of principal components.
loadingplot: plots loadings of variables. Useful for
representing the contribution of alleles to a given principal component in a
scatter.dapc: scatterplots for DAPC
compoplot: plots membership probabilities from a
=== SIMULATING DATA ===
hybridization between two populations.
simulates genealogies of haplotypes, storing full genomes.
glSim: simulates simple genlight objects.
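As a small sketch of simulation followed by analysis, the example below simulates a genlight object with glSim and runs glPca on it. The numbers of individuals and SNPs, the seed, and the choice to switch off linkage disequilibrium and parallel computation are assumptions made to keep the example small and reproducible.

```r
library(adegenet)
set.seed(1)

# 100 diploid individuals, 1000 unstructured SNPs and 100 SNPs
# structured between two groups (all values chosen arbitrarily)
x <- glSim(n.ind = 100, n.snp.nonstruc = 1000, n.snp.struc = 100,
           ploidy = 2, LD = FALSE, parallel = FALSE)

# PCA for genlight objects, retaining 2 principal components
pca <- glPca(x, nf = 2, parallel = FALSE)

# Scores of the individuals on the retained components
head(pca$scores)
```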
=== DATASETS ===
H3N2: Seasonal influenza (H3N2) HA
dapcIllus: Simulated data illustrating the
eHGDP: Extended HGDP-CEPH dataset.
microbov: Microsatellite genotypes of 15 cattle breeds.
nancycats: Microsatellite genotypes of 237 cats from 17
colonies of Nancy (France).
genotypes of 335 chamois (Rupicapra rupicapra) from the Bauges mountains
sim2pop: Simulated genotypes of two
spcaIllus: Simulated data
illustrating the sPCA.
For more information, visit the adegenet website using the function
Tutorials are available via the command
To cite adegenet, please use the reference given by
citation("adegenet") (or see references below).
Thibaut Jombart <firstname.lastname@example.org>
Developers: Zhian N. Kamvar <email@example.com>, Caitlin Collins <firstname.lastname@example.org>, Ismail Ahmed <email@example.com>, Federico Calboli, Tobias Erik Reiners, Peter Solymos, Anne Cori,
Contributed datasets from: Katayoun Moazami-Goudarzi, Denis Laloë, Dominique Pontier, Daniel Maillard, Francois Balloux.
Jombart T. (2008) adegenet: a R package for the multivariate
analysis of genetic markers. Bioinformatics 24: 1403-1405. doi:
Jombart T. and Ahmed I. (2011) adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. doi: 10.1093/bioinformatics/btr521
Jombart T, Devillard S and Balloux F (2010) Discriminant analysis of
principal components: a new method for the analysis of genetically
structured populations. BMC Genetics 11:94. doi:10.1186/1471-2156-11-94
Jombart T, Eggo R, Dodd P, Balloux F (2010) Reconstructing disease outbreaks
from genetic data: a graph approach. Heredity. doi:
Jombart, T., Devillard, S., Dufour, A.-B. and Pontier, D. (2008) Revealing
cryptic spatial patterns in genetic variability by a new multivariate
method. Heredity, 101, 92–103.
See adegenet website: http://adegenet.r-forge.r-project.org/
Please post your questions on 'the adegenet forum': firstname.lastname@example.org
adegenet is related to several packages, in particular:
ade4 for multivariate analysis
pegas for population
ape for phylogenetics and DNA data handling
seqinr for handling nucleic and proteic sequences
for R-based web interfaces