calc_origin_of_expansion: Estimate the origin point of a population expansion using...

View source: R/sfs_functions.R

calc_origin_of_expansionR Documentation

Estimate the origin point of a population expansion using directionality.

Description

Calculates the origin of expansion from directionality indices from pairwise population comparisons according to Peter and Slatkin (2013).

Usage

calc_origin_of_expansion(
  x,
  facet,
  boots = 1000,
  projection = NULL,
  boot_par = FALSE,
  verbose = FALSE,
  update_bib = FALSE
)

Arguments

x

snpRdata object. A snpRdata from which to calculate SFS, directionality indices, and an expansion point-of-origin. SNP metadata columns named "ref" and "anc" containing the identity of the derived and ancestral alleles, respectively, must be present, as must columns named 'x' and 'y' containing WGS84 elipsoid scaled sampling locations for each sample in the sample metadata.

facet

character, default NULL. The sample metadata facet by which to group populations and calculate the expansion origin. Only one facet currently allowed, although it can be complex (e.g. 'fam.pop').

boots

numeric, default 1000. The number of bootstraps used to determine the variance of the directionality index for each pairwise comparison.

projection

numeric, default NULL. The number of gene copies to project to for each facet level. Should be a named numeric vector containing an entry for each facet level.

boot_par

numeric or FALSE, default FALSE. If a number, bootstraps will be processed in parallel using the supplied number of cores.

verbose

Logical, default FALSE. If TRUE, some progress updates will be reported.

update_bib

character or FALSE, default FALSE. If a file path to an existing .bib library or to a valid path for a new one, will update or create a .bib file including any new citations for methods used. Useful given that this function does not return a snpRdata object, so a citations cannot be used to fetch references.

Details

Essentially, the directionality index measures the difference in derived allele frequency between two populations to determine the directionality of population spread between the two. Since the "destination" population is sourced from but experienced more genetic drift than the "source" population, it should have relatively more high-frequency derived alleles after the removal of fixed ancestral alleles. See Peter and Slatkin (2013) for details.

This metric, calculated between multiple populations, can be used to estimate the origin point of a population expansion using a Time Difference Of Arrival approach by scaling allelic differences to geometric space. See Peter and Slatkin (2013) for details. Distances are calculated using the distGeo function assuming a WGS84 ellipsoid, and so provided coordinates must be in that format. The sampling location for each subfacet is derived using the geomean of each.

The TODA method requires an optimization procedure to estimate the point-of- origin. This is done using the optim function with the default parameters for maximization.

Value

A named list containing:

  • opt: A vector with the spatial/genetic distance linking coefficient 'v' as well as the 'x' and 'y' coordinates of the estimated origin of the range expansion.

  • 'pairwise_directionality': A data.frame containing the pairwise directionality estimates, coordinates, and the directionality variance for each pair of populations.

Author(s)

William Hemstrom

References

Peter, B. M., & Slatkin, M. (2013). Detecting range expansions from genetic data. Evolution, 67(11), 3274-3289.

Examples

## Not run: 
# Bootstrapping is slow, so not run.
# set ref and anc--ideally use an outgroup for this
dat <- calc_maf(stickSNPs)
snp.meta(dat)$ref <- paste0("A", get.snpR.stats(dat)$minor, "A") 
snp.meta(dat)$anc <- paste0("A", get.snpR.stats(dat)$major, "A")

# setup x and y coords
long_lat <- data.frame(SMR = c(44.365931, -121.140420), 
                       CLF = c(44.267718, -121.255805), 
                       OPL = c(44.485958, -121.298360), 
                       ASP = c(43.891693, -121.448360), 
                       UPD = c(43.891755, -121.451600), 
                       PAL = c(43.714114, -121.272797))
long_lat <- t(long_lat)
long_lat <- long_lat[match(sample.meta(dat)$pop, rownames(long_lat)),]
colnames(long_lat) <- c("y", "x")
sample.meta(dat) <- cbind(sample.meta(dat), long_lat)

projection <- summarize_facets(dat, facet)[[facet]]
projection <- floor(projection*.8)

# run the calculation
calc_origin_of_expansion(dat, "pop", boots = 100, projection = projection, 
                         boot_par = 6, verbose = TRUE)

## End(Not run)

hemstrow/snpR documentation built on Dec. 13, 2024, 4:27 a.m.