View source: R/functions-read-sync.R
Reads in SNP time series data from a file in .sync format.
sync_to_frequencies(file, base.pops, header, mincov = 15)
file |
the name of the ".sync" file from which the data should be read. Sync
files are specified in Kofler et al. (2011). A sync file contains 3 + n columns:
col 1: chromosome (reference contig), col 2: position (in the reference contig),
col 3: reference allele, col >3: one sync entry per population, giving allele counts
in the form A-count:T-count:C-count:G-count:N-count:deletion-count.
Sync files originally have no header, but a header is accepted when declared
with the header argument. |
base.pops |
logical vector with the same length as the number of libraries present in the sync file. Libraries marked TRUE are used to identify the two main alleles (minor and major allele). Allele frequencies of all libraries are subsequently polarized for the minor allele of this specified subset. |
header |
logical value specifying whether a header is present in the provided sync file. |
mincov |
minimum coverage required to calculate allele frequencies. If the sum of the minor and major allele counts is below this threshold, the respective frequency is encoded as NA (default = 15). |
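As an illustration of the sync layout described for the file argument, the following base R sketch splits one sync line into its fields and per-library counts. The example line and the helper name parse_sync_entry are made up for this sketch and are not part of the package.

```r
# Hypothetical helper: split one sync entry "A:T:C:G:N:del" into named counts.
parse_sync_entry <- function(entry) {
  counts <- as.integer(strsplit(entry, ":", fixed = TRUE)[[1]])
  stopifnot(length(counts) == 6)
  names(counts) <- c("A", "T", "C", "G", "N", "del")
  counts
}

# One made-up sync line: chromosome, position, reference allele, two libraries.
sync_line <- "2L\t5002\tG\t0:0:12:38:0:0\t0:0:30:20:0:0"
fields <- strsplit(sync_line, "\t", fixed = TRUE)[[1]]

chr <- fields[1]
pos <- as.integer(fields[2])
ref <- fields[3]

# Columns after the third hold one sync entry per library.
# The result has one row per library and one column per count type.
lib_counts <- t(sapply(fields[-(1:3)], parse_sync_entry, USE.NAMES = FALSE))
```

Here lib_counts is a matrix with columns A, T, C, G, N and del, so the counts of a given allele across libraries can be read off by column name.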
Time series data are read in from a file in sync format. The sync
format is specified in Kofler et al. 2011 (PoPoolation2: identifying differentiation
between populations using sequencing of pooled DNA samples (Pool-Seq)). Allele counts
are read in for each library and SNP and transformed to allele frequencies. Allele
frequencies are polarized for the minor and major allele of a specified (sub-)set of
libraries, e.g. the libraries of the experimental founder population. Frequencies are
determined based only on the counts of the two most common alleles in the specified
base populations base.pops.
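The polarization step described above can be sketched in base R as follows. This is a minimal illustration on made-up counts for a single SNP, not the package's actual implementation; the variable names (counts, base_pops, freq) are chosen for this example only.

```r
# Made-up counts for one SNP: rows are libraries, columns are the four bases.
counts <- rbind(
  F0_r1  = c(A = 0, T = 0, C = 12, G = 38),
  F0_r2  = c(A = 0, T = 0, C = 10, G = 40),
  F60_r1 = c(A = 0, T = 1, C = 30, G = 20),
  F60_r2 = c(A = 0, T = 0, C = 4,  G = 6)
)
base_pops <- c(TRUE, TRUE, FALSE, FALSE)  # founder libraries
mincov <- 15

# Rank alleles by their summed counts in the base populations only.
base_totals <- colSums(counts[base_pops, , drop = FALSE])
ranked <- names(sort(base_totals, decreasing = TRUE))
majallele <- ranked[1]
minallele <- ranked[2]

# Frequencies use only minor + major counts; coverage below mincov gives NA.
cov <- counts[, minallele] + counts[, majallele]
freq <- ifelse(cov >= mincov, counts[, minallele] / cov, NA_real_)
```

In this sketch the single T read in F60_r1 is ignored because T is not one of the two main alleles in the base populations, and F60_r2 receives NA because its combined coverage of 10 is below mincov.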
Please note: This procedure is not a substitute for proper SNP calling. Provided sync
files are expected to contain only positions of previously called SNPs, and at least
two alleles should be present in the specified base populations.
a data.table with 6 + N columns: col 1: chr (chromosome), col 2: pos (position on the respective chromosome), col 3: ref (reference allele), col 4: minallele (minor allele across all specified base populations), col 5: majallele (major allele across all specified base populations), col 6: weighted mean frequency of all specified base populations, polarized for the minor allele, col >6: allele frequency of the minor allele for each library
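To make the weighted mean in col 6 concrete, here is a small arithmetic sketch. It assumes that each base library is weighted by its coverage (minor plus major allele counts); the documentation above does not state the weights, so this is an assumption, and the counts are made up.

```r
# Made-up minor and major allele counts for two base libraries.
min_counts <- c(lib1 = 12, lib2 = 30)
maj_counts <- c(lib1 = 38, lib2 = 120)

cov <- min_counts + maj_counts   # per-library coverage
freq <- min_counts / cov         # per-library minor allele frequency

# Assumption: each library is weighted by its coverage.
base_freq <- weighted.mean(freq, w = cov)

# Under that assumption the result equals the frequency from pooled counts.
pooled <- sum(min_counts) / sum(cov)
```

With coverage weights, a deeply sequenced library contributes more to the base population frequency than a shallow one, which is the same as pooling the counts before dividing.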
Susanne U. Franssen
Franssen, Barton & Schloetterer 2016, Reconstruction of haplotype-blocks selected during experimental evolution, MBE