find_graphs: Find well fitting admixture graphs
In uqrmaie1/admixtools: Inferring demographic history from genetic data

Description

This function generates and evaluates admixture graphs over up to `stop_gen` generations (iterations) to find well-fitting admixture graphs.

Usage

find_graphs(
  data,
  numadmix = 0,
  outpop = NULL,
  stop_gen = 100,
  stop_gen2 = 15,
  stop_score = 0,
  stop_sec = NULL,
  initgraph = NULL,
  numgraphs = 10,
  mutfuns = namedList(spr_leaves, spr_all, swap_leaves, move_admixedge_once,
    flipadmix_random, place_root_random, mutate_n),
  opt_worst_residual = FALSE,
  plusminus_generations = 5,
  return_searchtree = FALSE,
  admix_constraints = NULL,
  event_constraints = NULL,
  reject_f4z = 0,
  max_admix = numadmix,
  verbose = TRUE,
  ...
)

Arguments

`data`	Input data in one of three forms:
	1. A 3d array of blocked f2 statistics, output of `f2_from_precomp` or `f2_from_geno`
	2. A directory which contains pre-computed f2-statistics
	3. The prefix of genotype files
`numadmix`	Number of admixture events within each graph. (Only relevant if `initgraph = NULL`)
`outpop`	Name of the outgroup population
`stop_gen`	Total number of generations after which to stop
`stop_gen2`	Number of generations without improvement after which to stop
`stop_score`	Stop once this score has been reached
`stop_sec`	Number of seconds after which to stop
`initgraph`	Graph to start with. If it is specified, `numadmix` and `outpop` will be inferred from this graph.
`numgraphs`	Number of graphs in each generation
`mutfuns`	Functions used to modify graphs. Defaults to the following:
	`spr_leaves`: Subtree prune and regraft leaves. Cuts a leaf node and attaches it to a random other edge in the graph.
	`spr_all`: Subtree prune and regraft. Cuts any edge and attaches the new orphan node to a random other edge in the graph, keeping the number of admixture nodes constant.
	`swap_leaves`: Swaps two leaf nodes.
	`move_admixedge_once`: Moves an admixture edge to a nearby location.
	`flipadmix_random`: Flips the direction of an admixture edge (if possible).
	`mutate_n`: Applies `n` of the mutation functions in this list to a graph (defaults to 2).
`opt_worst_residual`	Optimize for lowest worst residual instead of best score. `FALSE` by default, because the likelihood score is generally a better indicator of the quality of the model fit, and because optimizing for the lowest worst residual is slower (because f4-statistics need to be computed).
`plusminus_generations`	If the best score does not improve after `plusminus_generations` generations, another approach to improving the score will be attempted: A number of graphs with one additional admixture edge will be generated and evaluated. The resulting graph with the best score will be picked, and new graphs will be created by removing any one admixture edge (bringing the number back to what it was originally). The graph with the lowest score will then be selected. This often makes it possible to break out of local optima, but is slower than regular graph modifications. If the current number of admixture events is lower than `max_admix`, the last step (removing an admixture edge) will be skipped.
`return_searchtree`	Return the search tree in addition to the models. Output will be a list with three items: models, search tree, search tree as data frame
`admix_constraints`	A data frame with constraints on the number of admixture events for each population. See `satisfies_numadmix`. As soon as one graph happens to satisfy these constraints, all subsequently generated graphs will be required to also satisfy them.
`event_constraints`	A data frame with constraints on the order of events in an admixture graph. See `satisfies_eventorder`. As soon as one graph happens to satisfy these constraints, all subsequently generated graphs will be required to also satisfy them.
`reject_f4z`	If this is a number greater than zero, all f4-statistics with `abs(z) > reject_f4z` will be used to constrain the search space of admixture graphs: Any graph under which one of these significantly non-zero f4-statistics is expected to be exactly zero will not be evaluated.
`max_admix`	Maximum number of admixture edges. By default, this number is equal to `numadmix`, or to the number of admixture edges in `initgraph`, so the number of admixture edges stays constant. Setting this to a higher number will lead to more admixture edges being added occasionally (see `plusminus_generations`). Graphs with additional admixture edges will only be accepted if they improve the score by 5% or more.
`verbose`	Print progress updates
`...`	Additional arguments passed to `qpgraph`
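
The stopping arguments (`stop_gen`, `stop_gen2`, `stop_score`, `stop_sec`) can be read as independent exit conditions on one search loop. The following is a toy sketch of that reading in base R, not the package's implementation: `toy_search` and `propose_score` are hypothetical names, the order in which the conditions are checked is an assumption, and lower scores are treated as better (consistent with `slice_min(score)` in the Examples).

```r
# Toy illustration of the stopping rules; NOT the admixtools implementation.
# `propose_score` is a hypothetical stand-in for "mutate graphs, evaluate them,
# and return the best score of this generation".
toy_search = function(propose_score, stop_gen = 100, stop_gen2 = 15,
                      stop_score = 0, stop_sec = NULL) {
  best = Inf
  gens_without_improvement = 0
  start = Sys.time()
  reason = 'stop_gen'   # reported if the loop runs through all generations
  gen = 0
  while (gen < stop_gen) {
    gen = gen + 1
    score = propose_score(gen)
    if (score < best) {
      best = score
      gens_without_improvement = 0
    } else {
      gens_without_improvement = gens_without_improvement + 1
    }
    # Assumed order of checks: target score, then stagnation, then wall-clock time
    if (best <= stop_score) { reason = 'stop_score'; break }
    if (gens_without_improvement >= stop_gen2) { reason = 'stop_gen2'; break }
    elapsed = as.numeric(difftime(Sys.time(), start, units = 'secs'))
    if (!is.null(stop_sec) && elapsed >= stop_sec) { reason = 'stop_sec'; break }
  }
  list(best = best, generations = gen, reason = reason)
}

# Scores improve for five generations, then plateau at 10
res = toy_search(function(gen) max(10, 15 - gen), stop_gen2 = 3)
str(res)
```

In this toy run the search stops after 8 generations because of `stop_gen2`: the last improvement happens in generation 5, and three generations without improvement follow. Whichever condition is met first ends the search, so a small `stop_gen2` can end a run long before `stop_gen` is reached.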

Value

A nested data frame with one model per row

Examples

## Not run: 
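library(admixtools)
library(dplyr)  # provides %>% and slice_min()
# Search graphs with two admixture events, then show the best (lowest) score
res = find_graphs(example_f2_blocks, numadmix = 2)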
res %>% slice_min(score)

## End(Not run)
## Not run: 
# Start with a graph with 0 admixture events, increase up to 3, and stop after 10 generations of no improvement
pops = dimnames(example_f2_blocks)[[1]]
initgraph = random_admixturegraph(pops, 0, outpop = 'Chimp.REF')
res = find_graphs(example_f2_blocks, initgraph = initgraph, stop_gen2 = 10, max_admix = 3)
res %>% slice_min(score)

## End(Not run)
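
A further sketch shows how `admix_constraints` might be combined with the stopping arguments. The column names `pop`, `min` and `max` are an assumption about what `satisfies_numadmix` expects and should be checked against its help page; `make_admix_constraints` is a hypothetical helper, not part of admixtools. The `find_graphs` call is guarded so the block also runs where the package is not installed.

```r
# Hypothetical helper (not part of admixtools): require that each population in
# `pops` has between `min` and `max` admixture events in its history.
# ASSUMPTION: columns `pop`, `min`, `max`; confirm with ?satisfies_numadmix
make_admix_constraints = function(pops, min = 0, max = 0) {
  data.frame(pop = pops, min = min, max = max, stringsAsFactors = FALSE)
}

# 'Chimp.REF' is the outgroup used in the example above; replace it with
# the population(s) you actually want to constrain
constraints = make_admix_constraints('Chimp.REF', min = 0, max = 0)
print(constraints)

if (requireNamespace('admixtools', quietly = TRUE)) {
  library(admixtools)
  # Stop after 20 generations or 60 seconds, whichever comes first
  res = find_graphs(example_f2_blocks, numadmix = 2, outpop = 'Chimp.REF',
                    admix_constraints = constraints,
                    stop_gen = 20, stop_sec = 60, verbose = FALSE)
  print(res[which.min(res$score), ])
}
```

Because the constraints only become binding once some graph happens to satisfy them, a short run may still return models that violate them; checking the result with `satisfies_numadmix` afterwards is a reasonable precaution.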

uqrmaie1/admixtools documentation built on July 16, 2025, 4:01 p.m.