The goal of tidypopgen is to provide a tidy grammar of population genetics,
facilitating the manipulation and analysis of biallelic single nucleotide
polymorphisms (SNPs). tidypopgen scales to very large genetic datasets by
storing genotypes on disk and performing operations on them in chunks, without
ever loading all the data into memory.
You can install the release version of tidypopgen from CRAN:
install.packages("tidypopgen")
You can install the latest development version directly from r-universe (recommended):
install.packages('tidypopgen', repos = c('https://evolecolgroup.r-universe.dev',
'https://cloud.r-project.org'))
Alternatively, you can install tidypopgen using devtools (but you might need to
set up your development environment, which can be a bit more complex):
install.packages("devtools")
devtools::install_github("EvolEcolGroup/tidypopgen")
There are several vignettes designed to teach you how to use tidypopgen.
A short introduction to the package is available in the
'introduction' vignette.
A more detailed and technical description of the grammar of population genetics,
explaining how to manipulate individuals and loci, is available in the
'grammar' vignette.
The 'quality control' vignette illustrates the tidypopgen functions that help
you run a full quality control (QC) of a dataset before analysis.
The 'population genetic analysis' vignette provides a fully annotated example
of how to run various population genetics analyses with tidypopgen.
We also provide a 'PLINK cheatsheet' aimed at translating common tasks
performed in PLINK into tidypopgen commands.
There is also an article, 'aDNA pseudohaploids', showing how to manage ancient
DNA (aDNA) samples that have been coded as pseudohaploids, including how to
project aDNA data onto a PCA fitted to modern data and how to prepare data for
admixtools.
Finally, tidypopgen is fast and can handle large datasets easily. See the
'benchmark' article, which uses the HGDP, a dataset of over 1000 individuals
typed for 650k SNPs. We can load the data, clean it, run imputation, PCA and
pairwise Fst among 51 populations in less than 20 seconds on a powerful desktop
(and less than a minute on a laptop).
If something does not work, check the issues on GitHub to see whether the
problem has already been reported. If not, feel free to create a new issue.
Please make sure you have updated to the latest version of tidypopgen on
r-universe/GitHub, as well as updating all other packages on your system, and
provide a reproducible example for the developers to investigate the problem.
Ideally, try to create a minimal dataset that reproduces the error, as it will
be much easier (and thus faster!) for the developers to track down the problem.
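Before opening an issue, it can help to record which version of tidypopgen and
which R session you are running, so that the details can be pasted into the
report. The following is a minimal sketch that relies only on base R and utils
functions; the `report` list and its field names are illustrative and are not
part of tidypopgen.

```r
# Collect details that help developers reproduce a problem.
# Only base R / utils calls are used here; `report` is a hypothetical
# helper object for illustration, not a tidypopgen function or class.
pkg <- "tidypopgen"

if (requireNamespace(pkg, quietly = TRUE)) {
  # Version of tidypopgen currently installed
  pkg_version <- as.character(utils::packageVersion(pkg))
} else {
  # Package not installed: record a missing value instead of failing
  pkg_version <- NA_character_
}

report <- list(
  package   = pkg,
  version   = pkg_version,
  r_version = R.version.string,
  session   = utils::sessionInfo()
)

# Print the pieces worth pasting into an issue
print(report$version)
print(report$r_version)
print(report$session)
```

Pasting this output alongside your reproducible example shows whether the
problem occurs on the latest version or on an older installation.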