metaVaR is R package developed for to perform robust population genomic analysis based on variants generated directly from metagenomic data without reference genomes or transcriptomes. Its utilization is based on the reference-free variant caller DiscoSnp++ ran on multisamples metagenomic reads. metaVaR clusters the metavariants into species called metavariant species or MVS. metaVaR allows to estimate genomic differentiation through F-statistics and to identify loci under natural selection.
metaVaR is a R package that can be installed directly from github as follow:
install_github("madoui/metaVaR")
Two toysets can be used with the package, the "6bac" dataset and the "MS5" dataset. The first dataset was created from simulated metagenomic data from bacterial genome admixture. The second toyset was created from real metagenomic data. Here is an example to run metaVaR on the "MS5" dataset.
library(metaVaR) # espilon values to test for dbscan e = c(3,5) # minimum points values to test for dbscan p = c(5,10) # Test all dbscan parameter couples MVC = tryParam (e, p, MS5$cov) # Select maximum weigthed independant sets MWIS = getMWIS (MVC) # Filter loci to generate MVS MVS = mvc2mvs(MWIS, freq = MS5$freq)
The results consists in a list of objects of class mvs that contains the allele frequency matrix, global and pairwise F_{ST}, LK and associated corrected p-value.
You first need to run DiscoSnp++ on multisample metagenomic data without references. Then, use the DiscoSnp++ VCF output to generate two matices: one depth of coverage matrix for bialleic loci and one allele frequency matrix. This task can be performed using the perl script metavarFilter.pl designed to parse the DiscoSnp++ output.
perl metavarFilter.pl -i metavariant.vcf -p prefix [-options...]
# run metaVaR e = c(3,5) p = c(5,10) MVC_list = tryParam (e, p, MS5$cov) # store MVC writeMvcList (MVC_list, "output_dir") # import MVC MVC_list = readMvcList ("output_dir") MWIS_list = getMWIS (MVC_list) MVS_list = mvc2mvs(MWIS_list, freq = MS5$freq) # store MVS writeMvsList (MVS_list, "output_dir") # read MVS MVS_list = readMvsList ("output_dir")
There are currently four types of plots that ca be produced on a MVS object. Here an example on the MVS "6_10_1".
# Plot the MVS distribution of allele frequency plotMvs (MVS$`5_5_1`, type = "freq") # Plot the MVS loci depth of coverage plotMvs (MVS$`5_5_1`, type = "cov") # Plot the pairwise $F_{ST}$ matrix plotMvs (MVS$`5_5_1`, type = "heatFst") # Plot the LK distribution plotMvs (MVS$`5_5_1`, type = "LK")
metaVaR: introducing metavariant species models for reference-free metagenomic-based population genomics. Romuald Laso-Jadart, Christophe Ambroise, Pierre Peterlongo, Mohammed-Amin Madoui. (2020) bioRxiv 2020.01.30.924381; doi: https://doi.org/10.1101/2020.01.30.924381
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.