knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)

library(admixturegraph)
suppressPackageStartupMessages(library(dplyr, quietly = TRUE))
suppressPackageStartupMessages(library(ggplot2, quietly = TRUE))
suppressPackageStartupMessages(library(ggthemes, quietly = TRUE))
suppressPackageStartupMessages(library(knitr, quietly = TRUE))
suppressPackageStartupMessages(library(neldermead, quietly = TRUE))

Admixture Graph Manipulation and Fitting

The package provides functionality to analyse and test admixture graphs against the f statistics described in the paper Ancient Admixture in Human History, Patternson et al., Genetics, Vol. 192, 1065--1093, 2012.

The f statistics --- f2, f3, and f4 --- extract information about correlations between gene frequencies in different populations (or single diploid genome samples), which can be informative about patterns of gene flow between these populations in form of admixture events. If a graph is constructed as a hypothesis for the relationship between the populations, equations for the expected values of the f statistics can be extracted, as functions of edge lenghs --- representing genetic drift --- and admixture proportions.

This package provides functions for extracting these equations and for fitting them against computed f statistics. It does not currently provide functions for computing the f statistics --- for that we refer to the ADMIXTOOLS software package.

Example

Below is a quick example of how the package can be used. The example uses data from polar bears and brown bears with a black bear as outgroup and is taken from Genomic evidence of geographically widespread effect of gene flow from polar bears into brown bears.

The BLK sample is the black bear, the PB sample is a polar bear, and the rest are brown bears.

I have taken the $f$ statistics from Table 1 in the paper:

data(bears)
bears

The D column is the f4(W,X;Y,Z) statistic and the Z column is the $Z$-values obtained from a blocked jacknife (see Patterson et al. for details).

From the statistics we can see that the ABC bears (Adm, Bar and Chi) are closer related to the polar bears compared to the other brown bears. The paper explains this with gene flow from polar bears into the ABC bears and going further out from there, but we can also explain this by several waves of admixture from ancestral polar bears into brown bears:

leaves <- c("BLK", "PB",
            "Bar", "Chi1", "Chi2", "Adm1", "Adm2",
            "Denali", "Kenai", "Sweden") 
inner_nodes <- c("R", "PBBB",
                 "Adm", "Chi", "BC", "ABC",
                 "x", "y", "z",
                 "pb_a1", "pb_a2", "pb_a3", "pb_a4",
                 "bc_a1", "abc_a2", "x_a3", "y_a4")

edges <- parent_edges(c(edge("BLK", "R"),
                        edge("PB", "pb_a1"),
                        edge("pb_a1", "pb_a2"),
                        edge("pb_a2", "pb_a3"),
                        edge("pb_a3", "pb_a4"),
                        edge("pb_a4", "PBBB"),

                        edge("Chi1", "Chi"),
                        edge("Chi2", "Chi"),
                        edge("Chi", "BC"),
                        edge("Bar", "BC"),
                        edge("BC", "bc_a1"),

                        edge("Adm1", "Adm"),
                        edge("Adm2", "Adm"),

                        admixture_edge("bc_a1", "pb_a1", "ABC", "a"),
                        edge("Adm", "ABC"),

                        edge("ABC", "abc_a2"),
                        admixture_edge("abc_a2", "pb_a2", "x", "b"),

                        edge("Denali", "x"),
                        edge("x", "x_a3"),
                        admixture_edge("x_a3", "pb_a3", "y", "c"),

                        edge("Kenai", "y"),
                        edge("y", "y_a4"),                        
                        admixture_edge("y_a4", "pb_a4", "z", "d"),

                        edge("Sweden", "z"),

                        edge("z", "PBBB"),
                        edge("PBBB", "R")))

bears_graph <- agraph(leaves, inner_nodes, edges)
plot(bears_graph, show_admixture_labels = TRUE)

Fitting a graph to data

The graph makes predictions on how the f4 statistics should look. The graph parameters can be fit to observed statistics using the fit_graph function:

fit <- fit_graph(bears, bears_graph)
fit

You can get detailsabout the fit by calling the summary.agraph_fit function:

summary(fit)

You can make a plot of the fit against the data by calling the plot.agraph_fit function:

plot(fit)

The plot shows the observed f4 statistics with error bars (in black) plus the predicted values from the graph.

The result of this is a ggplot2 object that you can modify by adding ggplot2 commands in the usual way.

Read the vignette admixturegraph for more examples.



mailund/admixture_graph documentation built on May 21, 2019, 11:06 a.m.