fast_fit: A fast version of graph fitting.
In mailund/admixture_graph: Admixture Graph Manipulation and Fitting


Given a table of observed f statistics and a graph, uses the Nelder-Mead algorithm to find the graph parameters (edge lengths and admixture proportions) that minimize the value of cost_function, i.e. maximize the likelihood of a graph with parameters given the observed data. Like fit_graph, but drops most of the analysis of the result. Intended for use in big iteration loops.
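The idea of minimizing a cost over admixture proportions with Nelder-Mead can be sketched in base R. This is only a toy illustration, not the package's implementation: `predicted`, `toy_cost`, the observed values and the weights are all invented for the example, and base R's `optim` stands in for whatever optimiser the package calls internally. The bounds on the proportions are not enforced here.

```r
# Invented observations and weights (assumption: a diagonal weighting matrix,
# playing the role that the concentration matrix plays in the real cost).
observed <- c(0.12, 0.05, 0.22)
weights <- diag(c(10, 20, 5))

# Hypothetical model: predicted statistics as a function of two
# admixture proportions p[1] and p[2].
predicted <- function(p) {
  c(0.2 * p[1], 0.1 * p[2], 0.2 * p[1] + 0.2 * p[2])
}

# Toy cost: squared length of the weighted residual vector.
toy_cost <- function(p) {
  residual <- weights %*% (observed - predicted(p))
  sum(residual^2)
}

# Derivative-free minimisation starting from the middle of the unit square.
result <- optim(c(0.5, 0.5), toy_cost, method = "Nelder-Mead")

best_fit <- result$par      # proportions where the minimum was found
best_error <- result$value  # minimal value of the toy cost
```

The names `best_fit` and `best_error` are chosen to mirror the elements that `fast_fit` returns, but the numbers here describe the toy problem only.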

fast_fit(data, graph, point = list(rep(1e-05,
  length(extract_graph_parameters(graph)$admix_prop)), rep(1 - 1e-05,
  length(extract_graph_parameters(graph)$admix_prop))), Z.value = TRUE,
  concentration = calculate_concentration(data, Z.value),
  optimisation_options = NULL, parameters = extract_graph_parameters(graph),
  iteration_multiplier = 3)

`data`	The data table, must contain columns `W`, `X`, `Y`, `Z` for sample names and `D` for the observed f_4(W, X; Y, Z). May contain an optional column `Z.value` for the Z scores (the f statistics divided by the standard deviations).
`graph`	The admixture graph (an `agraph` object).
`point`	Use this to restrict the admixture proportions, for example to fix some of them. A list of two vectors: the lower and the upper bounds. By default the bounds are just a little more than zero and a little less than one, because the infimum of the cost function is sometimes at a point of discontinuity, and zero and one are prone to being problematic values in this respect.
`Z.value`	Whether we calculate the default concentration from Z scores (the default option `TRUE`) or just use the identity matrix.
`concentration`	The Cholesky decomposition of the inverted covariance matrix. Default matrix determined by the parameter `Z.value`.
`optimisation_options`	Options to the Nelder-Mead algorithm.
`parameters`	In case one wants to tweak something in the graph.
`iteration_multiplier`	Given to `mynonneg`.
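As a rough illustration of the `data`, `concentration` and `point` arguments, the base R sketch below builds a small table in the described layout, derives a concentration matrix from the Z scores, and constructs a pair of bound vectors. The numbers are invented, the diagonal covariance is an assumption (it treats the statistics as independent, which may differ from what `calculate_concentration` actually does), and fixing a proportion by giving it equal lower and upper bounds is one reading of the `point` description rather than confirmed behaviour.

```r
# Toy table with the required columns (values are made up for illustration).
toy_data <- data.frame(
  W = c("BLK", "BLK", "BLK"),
  X = c("PB", "PB", "Sweden"),
  Y = c("Sweden", "Kenai", "Kenai"),
  Z = c("Adm1", "Denali", "Denali"),
  D = c(0.10, -0.04, 0.02),
  Z.value = c(5, -2, 4),
  stringsAsFactors = FALSE
)

# A Z score is the statistic divided by its standard deviation,
# so the standard deviation can be recovered as D / Z.value.
std_dev <- toy_data$D / toy_data$Z.value

# Assumption: independent statistics, hence a diagonal covariance matrix.
covariance <- diag(std_dev^2)

# Cholesky decomposition of the inverted covariance matrix.
concentration_sketch <- chol(solve(covariance))

# Bounds for two admixture proportions, starting from the documented defaults.
n_admix <- 2
lower <- rep(1e-05, n_admix)
upper <- rep(1 - 1e-05, n_admix)

# Assumed way of fixing the second proportion at 0.5: make its bounds coincide.
lower[2] <- 0.5
upper[2] <- 0.5
point_sketch <- list(lower, upper)
```

Under the diagonal assumption the Cholesky factor is itself diagonal, with the reciprocals of the standard deviations on the diagonal.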

A list containing only the essentials about the fit: `graph` is the graph input, `best_error` is the minimal value of `cost_function`, obtained when the admixture proportions are `best_fit`.

cost_function

agraph

calculate_concentration

optimset

fit_graph

# For example, let's fit the following admixture graph to example data on bears:

data(bears)
print(bears)

leaves <- c("BLK", "PB", "Bar", "Chi1", "Chi2", "Adm1", "Adm2", "Denali", "Kenai", "Sweden") 
inner_nodes <- c("R", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "M", "N")
edges <- parent_edges(c(edge("BLK", "R"),
                        edge("PB", "v"),
                        edge("Bar", "x"),
                        edge("Chi1", "y"),
                        edge("Chi2", "y"),
                        edge("Adm1", "z"),
                        edge("Adm2", "z"),
                        edge("Denali", "t"),
                        edge("Kenai", "s"),
                        edge("Sweden", "r"),
                        edge("q", "R"),
                        edge("r", "q"),
                        edge("s", "r"),
                        edge("t", "s"),
                        edge("u", "q"),
                        edge("v", "u"),
                        edge("w", "M"),
                        edge("x", "N"),
                        edge("y", "x"),
                        edge("z", "w"),
                        admixture_edge("M", "u", "t"),
                        admixture_edge("N", "v", "w")))
admixtures <- admixture_proportions(c(admix_props("M", "u", "t", "a"),
                                      admix_props("N", "v", "w", "b")))
bears_graph <- agraph(leaves, inner_nodes, edges, admixtures)
plot(bears_graph, show_admixture_labels = TRUE)

fit <- fast_fit(bears, bears_graph)
print(fit$best_error)

# The result is just the minimal value of the cost function and the values of admixture proportions
# where it's obtained, no deeper analysis of the fit.