library(nodeSub)
library(ggplot2)
knitr::opts_chunk$set(echo = TRUE)

plot_phyDat <- function(phyDat_alignment) {
  vx <- as.data.frame(phyDat_alignment)
  num_entries <- nrow(vx) * ncol(vx)
  to_plot <- matrix(nrow = num_entries, ncol = 3)
  cnt <- 1
  for (i in 1:nrow(vx)) {
    for (j in 1:ncol(vx)) {
      to_plot[cnt, ] <- c(i, j, vx[i, j])
      cnt <- cnt + 1
    }
  }
  colnames(to_plot) <- c("x", "y", "base")
  to_plot <- tibble::as_tibble(to_plot)
  to_plot$x <- as.numeric(to_plot$x)
  to_plot$y <- as.numeric(to_plot$y)
  ggplot(to_plot, aes(x = x, y = y, fill = base)) +
    geom_tile()
}
nodeSub includes several functions to generate alignments; this file provides an overview of the different models. The standard alignment function is 'sim_normal', which is based on the alignment simulation functions in the package phangorn.
seq_length <- 30
sub_rate <- 1 / seq_length
input_tree <- TreeSim::sim.bd.taxa(n = 10,
                                   numbsim = 1,
                                   lambda = 1,
                                   mu = 0.1,
                                   complete = TRUE)[[1]]
normal_alignment <- sim_normal(input_tree,
                               l = seq_length,
                               rate = sub_rate)
plot_phyDat(normal_alignment$alignment)
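For comparison, the phangorn simulator that 'sim_normal' builds on can also be called directly. The sketch below uses `phangorn::simSeq` on a random coalescent tree from `ape::rcoal`. The choice of tree and the assumption that 'sim_normal' passes its `l` and `rate` arguments on in this way are illustrative and are not taken from the nodeSub source.

```r
library(ape)
library(phangorn)

set.seed(42)

# A random 10-tip tree, used only as a stand-in for the tree simulated above.
toy_tree <- ape::rcoal(10)

# Simulate a DNA alignment of 30 sites; `rate` scales the branch lengths.
toy_length <- 30
toy_alignment <- phangorn::simSeq(toy_tree,
                                  l = toy_length,
                                  type = "DNA",
                                  rate = 1 / toy_length)

# The result is a phyDat object with one sequence per tip.
n_sequences <- length(toy_alignment)
n_sites <- sum(attr(toy_alignment, "weight"))
```

The resulting object has the same class as the `$alignment` element plotted above, so it can be passed to `plot_phyDat` in the same way.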
In addition, two node substitution models are available: the unlinked and the linked model. In the unlinked model, both daughter branches accumulate substitutions independently of each other during speciation. In the linked model, the substitutions in the daughter branches are conditional on each other, such that a substitution accumulated in one daughter cannot also be accumulated in the other daughter. For both models we need to specify the node time (tau), passed as node_time. For the linked model, rates are specified slightly differently: the substitution rate (rate) reflects the rate at which one of the daughters accumulates a substitution, and node_mut_rate_double reflects the rate at which both daughters accumulate a (different) substitution.
unlinked_alignment <- sim_unlinked(input_tree,
                                   rate1 = sub_rate,
                                   rate2 = sub_rate,
                                   l = seq_length,
                                   node_time = 0.5)
plot_phyDat(unlinked_alignment$alignment)

linked_alignment <- sim_linked(input_tree,
                               rate = sub_rate,
                               node_mut_rate_double = sub_rate * sub_rate,
                               node_time = 0.5,
                               l = seq_length)
plot_phyDat(linked_alignment$alignment)
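The difference between the two node models can be illustrated with a toy sketch in base R. This is not nodeSub's internal algorithm: the helper names and the per-site probabilities `p`, `p_single` and `p_double` are hypothetical stand-ins for the rates and node time, and each site is allowed at most one change per daughter. The sketch only shows the qualitative contrast: in the unlinked case the daughters change independently, whereas in the linked case a double event always gives the two daughters different bases.

```r
set.seed(42)
bases <- c("a", "c", "g", "t")

# Draw a base that differs from `from` (and, optionally, from `avoid`).
draw_other <- function(from, avoid = NULL) {
  sample(setdiff(bases, c(from, avoid)), 1)
}

# Unlinked (toy): each daughter changes independently with probability p.
node_unlinked <- function(parent, p) {
  d1 <- parent
  d2 <- parent
  for (i in seq_along(parent)) {
    if (runif(1) < p) d1[i] <- draw_other(parent[i])
    if (runif(1) < p) d2[i] <- draw_other(parent[i])
  }
  list(d1 = d1, d2 = d2)
}

# Linked (toy): a site has either a double event (both daughters change,
# to different bases), a single event (one daughter changes), or no event.
node_linked <- function(parent, p_single, p_double) {
  d1 <- parent
  d2 <- parent
  for (i in seq_along(parent)) {
    u <- runif(1)
    if (u < p_double) {
      d1[i] <- draw_other(parent[i])
      d2[i] <- draw_other(parent[i], avoid = d1[i])
    } else if (u < p_double + p_single) {
      if (runif(1) < 0.5) {
        d1[i] <- draw_other(parent[i])
      } else {
        d2[i] <- draw_other(parent[i])
      }
    }
  }
  list(d1 = d1, d2 = d2)
}

parent <- sample(bases, 1000, replace = TRUE)
unlinked <- node_unlinked(parent, p = 0.3)
linked <- node_linked(parent, p_single = 0.3, p_double = 0.2)

# Sites where both daughters differ from the parent.
both_linked <- linked$d1 != parent & linked$d2 != parent
both_unlinked <- unlinked$d1 != parent & unlinked$d2 != parent

# In the linked toy model the daughters never share a new base at such sites.
shared_linked <- sum(linked$d1[both_linked] == linked$d2[both_linked])
shared_unlinked <- sum(unlinked$d1[both_unlinked] == unlinked$d2[both_unlinked])
```

Here `shared_linked` is zero by construction, while `shared_unlinked` counts the sites at which two independent daughters happened to pick the same new base.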
The linked and unlinked alignment simulators use Markov-chain calculations to obtain the expected number of substitutions along a branch. This yields the correct observable mutations along a branch, but neglects any 'reverse' mutations (substitutions that are later overwritten at the same site), as these are masked. If you need to simulate the mutational process more explicitly, we have provided explicit functions for both the normal and the unlinked model:
unlinked_explicit <- sim_unlinked_explicit(input_tree,
                                           rate1 = sub_rate,
                                           rate2 = sub_rate,
                                           l = seq_length,
                                           node_time = 0.5)
plot_phyDat(unlinked_explicit$alignment)

normal_explicit <- sim_normal_explicit(input_tree,
                                       l = seq_length,
                                       rate = sub_rate)
plot_phyDat(normal_explicit$alignment)
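To see why masked mutations matter, the following base R sketch simulates substitution events explicitly at single sites and compares the number of events with the fraction of sites that end up different from their starting base. It assumes a Jukes-Cantor model (all changes equally likely), which is a simplification chosen for this illustration and not necessarily the model used by nodeSub. The function name `simulate_site` and the value of `d` are hypothetical.

```r
set.seed(1)
bases <- c("a", "c", "g", "t")

# Simulate one site explicitly: a Poisson number of events with mean d,
# each event replacing the current base by one of the three other bases.
simulate_site <- function(d) {
  state <- "a"
  n_events <- rpois(1, d)
  for (k in seq_len(n_events)) {
    state <- sample(setdiff(bases, state), 1)
  }
  c(events = n_events, changed = as.numeric(state != "a"))
}

d <- 1  # expected number of substitution events per site (assumed value)
sims <- t(replicate(20000, simulate_site(d)))

# Average number of events that actually happened per site.
mean_events <- mean(sims[, "events"])

# Fraction of sites that differ from the starting base at the end.
observed_diff <- mean(sims[, "changed"])

# Jukes-Cantor expectation for the fraction of differing sites.
expected_diff <- 0.75 * (1 - exp(-4 * d / 3))
```

The simulated `observed_diff` should be close to `expected_diff` and clearly smaller than `mean_events`: some events overwrite or revert earlier ones at the same site, and these are invisible in the final alignment.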