Site Frequency Spectrum
site.spectrum computes the (un)folded site frequency spectrum
of a set of aligned DNA sequences.
a set of DNA sequences (as an object of class "DNAbin").
a logical specifying whether to compute the folded site
frequency spectrum (the default), or the unfolded spectrum if folded = FALSE.
a single integer value giving which sequence is
ancestral; ignored if folded = TRUE.
the colour of the barplot (red by default).
a character string for the title of the plot; a generic
title is given by default (use
further arguments passed to barplot.
Under the infinite sites model of mutation, each mutation occurs at a distinct site, so every segregating (polymorphic) site defines a partition of the n sequences into two groups (see Wakeley, 2009). The site frequency spectrum is a series of values where the ith element is the number of segregating sites defining a partition of i and n - i sequences. The unfolded version requires an ancestral state to be defined using an external (outgroup) sequence; in that case i varies between 1 and n - 1. If no ancestral state can be defined, the folded version is computed, so i varies between 1 and n/2 or (n - 1)/2, for n even or odd, respectively.
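The definition above can be sketched in base R on a toy alignment. This is only an illustration under stated assumptions, not the code used by site.spectrum: the matrix aln, the choice of row 1 as the outgroup, and the names sfs_folded and sfs_unfolded are all hypothetical.

```r
# Toy alignment: 5 sequences (rows) by 5 sites (columns).
aln <- rbind(
  s1 = c("a", "c", "g", "t", "a"),
  s2 = c("a", "c", "g", "t", "g"),
  s3 = c("a", "t", "g", "t", "g"),
  s4 = c("a", "t", "g", "c", "g"),
  s5 = c("a", "t", "a", "c", "g")
)
n <- nrow(aln)

# Number of distinct states observed at each site.
n_states <- apply(aln, 2, function(site) length(unique(site)))

# Folded spectrum: at each two-state site, count the rarer state,
# then tabulate those counts into floor(n / 2) classes.
biallelic <- which(n_states == 2)
minor <- vapply(biallelic, function(j) min(table(aln[, j])), integer(1))
sfs_folded <- tabulate(minor, nbins = floor(n / 2))

# Unfolded spectrum (assumption: row 1 carries the ancestral state):
# at each segregating site, count the sequences that differ from the
# outgroup, then tabulate into n - 1 classes.
outgroup <- 1
seg <- which(n_states > 1)
derived <- vapply(seg, function(j) sum(aln[-outgroup, j] != aln[outgroup, j]),
                  integer(1))
sfs_unfolded <- tabulate(derived, nbins = n - 1)

sfs_folded    # 2 2
sfs_unfolded  # 1 1 1 1
```

With n = 5 the folded vector has (n - 1)/2 = 2 elements and the unfolded vector has n - 1 = 4, matching the ranges of i given above.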
If folded = TRUE, sites with more than two states are ignored
and a warning is issued giving how many were found.
If folded = FALSE, sites with an ambiguous state at the
external sequence are ignored and a warning is issued giving how
many were found. Note that, in this case, sites are not checked for
more than two states.
site.spectrum returns an object of class
which is a vector of integers (some values may be equal to zero) with
an attribute "folded" (a logical value) indicating which
version of the spectrum has been computed.
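A short usage sketch may help here. It assumes that the pegas and ape packages are installed and uses the woodmouse alignment shipped with ape as example data; the variable name sp is arbitrary.

```r
library(ape)    # provides the woodmouse example alignment (class "DNAbin")
library(pegas)  # provides site.spectrum()

data(woodmouse)

# Folded spectrum (the default).
sp <- site.spectrum(woodmouse)
sp

# The "folded" attribute records which version was computed.
attr(sp, "folded")

# Draw the spectrum as a barplot.
plot(sp)
```

Passing folded = FALSE instead would request the unfolded spectrum, in which case one sequence of the alignment is taken as ancestral.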
Wakeley, J. (2009) Coalescent Theory: An Introduction. Greenwood Village, CO: Roberts and Company Publishers.
DNAbin for manipulation of DNA sequences in R,