Initialization of time series data as input for haplotype reconstruction

Description

This function initializes a genome-wide time series data set that can be used as input for haplotype-block reconstruction.

Usage

initialize_SNP_time_series(chr, pos, base.freq, lib.freqs, pop.ident,
  pop.generation, use.libs, minfreqchange = 0.2, minrepl = 3,
  max.minor.freq = 3/200, winsize = 5e+05, min.minor.freq = 0,
  min.lib.frac = 0.75, win.scale = "bp", pos.cM = NULL)

Arguments

chr

character vector specifying the chromosome name for each genome-wide SNP

pos

numeric vector specifying the chromosomal position for each genome-wide SNP

base.freq

numeric vector specifying, for each genome-wide SNP, the frequency of the minor allele (polarized in the experimental starting population)

lib.freqs

matrix specifying the frequencies of all genome-wide SNPs (rows) for all different libraries (time points and replicates, columns).

pop.ident

numeric vector specifying the identity of each library in terms of replicate ID

pop.generation

numeric vector specifying the time point of the respective library

use.libs

logical vector specifying which libraries should be used for haplotype-block reconstruction. This choice determines SNP filtering, because the parameters 'minfreqchange' and 'minrepl' are evaluated only on the libraries selected here. The remaining libraries are still available for visualization of marker frequencies.

minfreqchange

numeric specifying the minimum frequency change that must be reached in at least 'minrepl' replicates for the SNP to be included in the analysis

minrepl

numeric specifying the number of replicates in which 'minfreqchange' must be reached for the SNP to be included in the analysis

max.minor.freq

numeric specifying the maximum frequency of the minor allele (polarized in the experimental starting population) to be included in the analysis

winsize

numeric specifying the window size, in the units given by 'win.scale', on which to perform the analysis

min.minor.freq

numeric specifying the minimum frequency of the minor allele (polarized in the experimental starting population) to be included in the analysis (default=0).

min.lib.frac

numeric specifying the minimum fraction of non-NA values required for a SNP across libraries, counting only the libraries specified in 'use.libs' (default=0.75).

win.scale

character string specifying which genome-wide distance measure is used for window definition. Options are "bp" (base pairs) or "cM" (centimorgans). cM distances can only be used if genetic positions are provided in 'pos.cM' (default="bp").

pos.cM

numeric vector of genetic positions in cM, corresponding to the SNP positions in col.info. Required only if win.scale="cM" (default=NULL).
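The arguments above have to agree in length: 'chr', 'pos' and 'base.freq' have one entry per SNP (one per row of 'lib.freqs'), while 'pop.ident', 'pop.generation' and 'use.libs' have one entry per library (one per column of 'lib.freqs'). The following is a minimal sketch of such a set of inputs in base R. All values are simulated placeholders, and the number of SNPs, the replicate IDs, the time points and the chromosome name are assumptions made for illustration only.

```r
set.seed(1)

n_snps      <- 6              # hypothetical number of SNPs
replicates  <- 1:3            # hypothetical replicate IDs
generations <- c(0, 10, 20)   # hypothetical sampled time points

# One library per replicate x time point combination
# (generation varies fastest, so libraries are grouped by replicate)
libs <- expand.grid(generation = generations, replicate = replicates)
pop.ident      <- libs$replicate
pop.generation <- libs$generation
use.libs       <- rep(TRUE, nrow(libs))   # use every library for reconstruction

# Per-SNP information
chr       <- rep("2L", n_snps)
pos       <- sort(sample.int(1e6, n_snps))
base.freq <- rep(0.01, n_snps)   # minor allele frequency in the starting population

# SNPs in rows, libraries in columns; random placeholder frequencies
lib.freqs <- matrix(
  runif(n_snps * nrow(libs)),
  nrow = n_snps,
  dimnames = list(NULL, paste0("R", pop.ident, "_F", pop.generation))
)

# Consistency checks between per-SNP and per-library inputs
stopifnot(
  length(chr) == nrow(lib.freqs),
  length(pos) == nrow(lib.freqs),
  length(base.freq) == nrow(lib.freqs),
  length(pop.ident) == ncol(lib.freqs),
  length(pop.generation) == ncol(lib.freqs),
  length(use.libs) == ncol(lib.freqs)
)
```

Objects built this way map one-to-one onto the first seven arguments in the Usage section.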

Details

The function takes as input genome-wide SNP frequencies for multiple time points and replicates, polarized for the minor allele in the experimental starting population. SNPs are filtered for a maximum frequency in the experimental starting population and for a minimum frequency change, reached in at least one time point, in a specified number of replicates. The initialized data is returned as a SNP_time_series object, which is required as input for the function reconstruct_hb to reconstruct unknown haplotype-blocks from the experimental starting population.
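The filtering described above can be illustrated with a short base R sketch. This is not the package's implementation: the helper name sketch_snp_filter is hypothetical, and the sketch assumes that the frequency change is measured as an increase relative to 'base.freq', which the text does not state explicitly.

```r
# Hypothetical helper, not part of the package: returns TRUE for SNPs that
# would pass the filters as described in the Details text.
sketch_snp_filter <- function(base.freq, lib.freqs, pop.ident, use.libs,
                              minfreqchange = 0.2, minrepl = 3,
                              max.minor.freq = 3/200, min.minor.freq = 0,
                              min.lib.frac = 0.75) {
  # Only the libraries selected in use.libs enter the filtering
  freqs <- lib.freqs[, use.libs, drop = FALSE]
  ident <- pop.ident[use.libs]

  # Starting-frequency window for the minor allele
  keep_start <- base.freq >= min.minor.freq & base.freq <= max.minor.freq

  # Minimum fraction of non-NA values across the used libraries
  keep_cov <- rowMeans(!is.na(freqs)) >= min.lib.frac

  # Assumption: change is the increase relative to base.freq
  change <- freqs - base.freq

  # Per replicate: is minfreqchange reached in at least one time point?
  hit <- vapply(unique(ident), function(r) {
    rowSums(change[, ident == r, drop = FALSE] >= minfreqchange,
            na.rm = TRUE) > 0
  }, logical(nrow(freqs)))
  hit <- matrix(hit, nrow = nrow(freqs))

  # The change must be reached in at least minrepl replicates
  keep_change <- rowSums(hit) >= minrepl

  keep_start & keep_cov & keep_change
}

# Toy data: three SNPs, three replicates with two time points each
base.freq <- c(0.01, 0.01, 0.30)
lib.freqs <- rbind(
  c(0.01, 0.30, 0.02, 0.40, 0.01, 0.50),  # rises in all three replicates
  c(0.01, 0.30, 0.02, 0.05, 0.01, 0.03),  # rises in one replicate only
  c(0.30, 0.80, 0.30, 0.90, 0.30, 0.70)   # starting frequency too high
)
pop.ident <- c(1, 1, 2, 2, 3, 3)
use.libs  <- rep(TRUE, 6)

keep <- sketch_snp_filter(base.freq, lib.freqs, pop.ident, use.libs)
print(keep)
```

With the default thresholds, only the first toy SNP passes: the second reaches the required change in a single replicate, and the third exceeds 'max.minor.freq' in the starting population.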

Value

an object of class SNP_time_series

Author(s)

Susanne U. Franssen

References

Franssen, Barton & Schloetterer (2016). Reconstruction of haplotype-blocks selected during experimental evolution. Molecular Biology and Evolution.

See Also

ex_dat, SNP_time_series