site.spectrum: Site Frequency Spectrum

View source: R/site.spectrum.R

site.spectrumR Documentation

Site Frequency Spectrum

Description

site.spectrum computes the (un)folded site frequency spectrum of a set of aligned DNA sequences or SNPs.

Usage

site.spectrum(x, ...)
## S3 method for class 'DNAbin'
site.spectrum(x, folded = TRUE, outgroup = 1, ...)
## S3 method for class 'loci'
site.spectrum(x, folded = TRUE, ancestral = NULL, ...)
## S3 method for class 'spectrum'
plot(x, col = "red", main = NULL, ...)

Arguments

x

a set of DNA sequences (as an object of class "DNAbin"), or an object of class "spectrum".

folded

a logical specifying whether to compute the folded site frequency spectrum (the default), or the unfolded spectrum if folded = FALSE.

outgroup

a single integer value giving which sequence is ancestral; ignored if folded = TRUE.

ancestral

a vector of ancestral alleles (required if folded = FALSE), typically from an output of VCFloci.

col

the colour of the barplot (red by default).

main

a character string for the title of the plot; a generic title is given by default (use main = "" to have no title).

...

further arguments passed to barplot, or to other mehods.

Details

Under the infinite sites model of mutation, mutations occur on distinct sites, so every segregating (polymorphic) site defines a partition of the n sequences (see Wakeley, 2009). The site frequency spectrum is a series of values where the ith element is the number of segregating sites defining a partition of i and n - i sequences. The unfolded version requires to define an ancestral state with an external (outgroup) sequence, so i varies between 1 and n - 1. If no ancestral state can be defined, the folded version is computed, so i varies between 1 and n/2 or (n - 1)/2, for n even or odd, respectively.

If folded = TRUE, sites with more than two states are ignored and a warning is returned giving how many were found.

If folded = FALSE, sites with an ambiguous state at the external sequence are ignored and a warning is returned giving how many were found. Note that it is not checked if some sites have more than two states.

If x is an object of class "loci", the loci which are not biallelic (e.g., SNPs) are dropped with a warning.

Value

site.spectrum returns an object of class "spectrum" which is a vector of integers (some values may be equal to zero) with the attributes "sample.size" and "folded" (a logical value) indicating which version of the spectrum has been computed.

Author(s)

Emmanuel Paradis

References

Wakeley, J. (2009) Coalescent Theory: An Introduction. Greenwood Village, CO: Roberts and Company Publishers.

See Also

DNAbin for manipulation of DNA sequences in R, haplotype

Examples

require(ape)
data(woodmouse)
(sp <- site.spectrum(woodmouse))
plot(sp)

pegas documentation built on May 29, 2024, 2:27 a.m.