msprime_genome: Simulate an admixture graph in msprime v1.x.

msprime_genome R Documentation

Simulate an admixture graph in msprime v1.x.

Description

This function generates an msprime simulation script and optionally executes it in Python. Unlike msprime_sim, it can simulate continuous sequence (rather than independent SNPs) and multiple chromosomes.

Usage

msprime_genome(
  graph,
  outpref = "msprime_sim",
  neff = 1000,
  ind_per_pop = 1,
  mutation_rate = 1.25e-08,
  time = 1000,
  fix_leaf = FALSE,
  nchr = 1,
  recomb_rate_chr = 2e-08,
  seq_length = 1000,
  admix_default = 0.5,
  run = FALSE,
  ghost_lineages = FALSE,
  shorten_admixed_leaves = FALSE
)

Arguments

graph

A graph as an igraph object or an edge list with columns 'from' and 'to'. If it is an edge list with a column 'weight' (for example, taken from a fitted graph), those admixture weights will be used. Otherwise, all admixture edges will have a weight of 0.5.
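As a minimal sketch of the edge-list form, the following base R snippet builds a hypothetical graph with one admixture event. The node names (R, O, n1, n2, n3, adm, A, B, C) are invented for illustration, and it assumes that a plain data.frame is accepted wherever an edge list is expected. The call to msprime_genome is left commented out so that the snippet runs without admixtools installed.

```r
# Hypothetical admixture graph: population C descends from node "adm",
# which has two parents (n2 and n3), i.e. one admixture event.
edges <- data.frame(
  from = c("R", "R",  "n1", "n1", "n2", "n2",  "n3", "n3",  "adm"),
  to   = c("O", "n1", "n2", "n3", "A",  "adm", "B",  "adm", "C"),
  stringsAsFactors = FALSE
)

# Nodes with more than one incoming edge are admixture nodes.
indegree <- table(edges$to)
admixture_nodes <- names(indegree[indegree > 1])

# Leaves are nodes that never appear in the 'from' column.
leaves <- setdiff(edges$to, edges$from)

# Assumed usage (requires admixtools). There is no 'weight' column here,
# so both admixture edges into "adm" would get a weight of 0.5:
# admixtools::msprime_genome(edges)
```

Checking the in-degree before simulating is a cheap way to confirm that the graph contains the admixture events you intended.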

outpref

A prefix of output files.

neff

Effective population size (in diploid individuals). If a scalar value, it will be constant across all populations. Alternatively, it can be a named vector with a different value for each population (e.g., c('R'=100, 'A'=50, 'B'=50)).

ind_per_pop

The number of diploid individuals to simulate for each population. If a scalar value, it will be constant across all populations. Alternatively, it can be a named vector with a different value for each population (e.g., c('A'=10, 'B'=20) to sample 10 and 20 diploid individuals from populations A and B, respectively).
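The named-vector forms of neff and ind_per_pop can be sketched in base R as follows. The values are the ones used in the argument descriptions above; the haploid-genome count is plain arithmetic on diploid sample sizes, not a statement about how msprime_genome labels its output.

```r
# Values taken from the argument descriptions above.
neff <- c(R = 100, A = 50, B = 50)   # diploid effective population sizes
ind_per_pop <- c(A = 10, B = 20)     # diploid individuals sampled per population

# Each diploid individual contributes two haploid genomes.
haploid_genomes <- 2 * ind_per_pop
total_genomes <- sum(haploid_genomes)

# Assumed usage (requires admixtools; 'graph' is a hypothetical graph
# whose nodes include R, A and B):
# admixtools::msprime_genome(graph, neff = neff, ind_per_pop = ind_per_pop)
```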

mutation_rate

Mutation rate per site per generation. The default is 1.25e-8 per base pair per generation.

time

Either a scalar value (1000 generations by default), in which case the node dates are generated by pseudo_dates, or a named vector with a date (in generations) for each graph node.
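A named time vector needs one date per graph node, and a parent node should be older than its children. The sketch below builds such a vector for a hypothetical graph and checks both conditions in base R. It assumes that dates count generations before the present, so leaves sampled today have date 0. The node names and dates are invented, and the checks are illustrative rather than a description of what msprime_genome validates internally.

```r
# Hypothetical graph with one admixture node ("adm").
edges <- data.frame(
  from = c("R", "R",  "n1", "n1", "n2", "n2",  "n3", "n3",  "adm"),
  to   = c("O", "n1", "n2", "n3", "A",  "adm", "B",  "adm", "C")
)

# One date (in generations before present, by assumption) per node.
time <- c(R = 4000, n1 = 3000, n2 = 2000, n3 = 2000, adm = 1000,
          O = 0, A = 0, B = 0, C = 0)

# Every node in the graph has a date, and no extra names are present.
nodes <- union(edges$from, edges$to)
all_dated <- setequal(names(time), nodes)

# Every parent is strictly older than its child.
parents_older <- all(time[edges$from] > time[edges$to])

# Assumed usage (requires admixtools):
# admixtools::msprime_genome(edges, time = time)
```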

fix_leaf

A boolean value specifying if the dates of the leaf nodes will be fixed at time 0. If TRUE, all samples will be drawn at the end of the simulation (i.e., from “today”).

nchr

Number of chromosomes to simulate.

recomb_rate_chr

A float value specifying recombination rate along the chromosomes. The default is 2e-8 per base pair per generation.

seq_length

The sequence length of the chromosomes. If it is a scalar value, the sequence length will be constant for all chromosomes. Alternatively, it can be a vector with a length equal to the number of chromosomes (e.g., c(100, 50) to simulate 2 chromosomes with lengths of 100 and 50 base pairs).
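The relationship between nchr, seq_length and mutation_rate can be illustrated with a short base R calculation. The length check and the variable names chrom_lengths, total_bp and expected_mut are assumptions made for this sketch; they are not taken from the function's source.

```r
nchr <- 3
seq_length <- c(50, 100, 100)   # base pairs per chromosome
mutation_rate <- 1.25e-08       # default: per base pair per generation

# Sanity check for this sketch: a scalar, or one length per chromosome.
stopifnot(length(seq_length) %in% c(1, nchr))

# Expand a scalar to one value per chromosome; a full vector is unchanged.
chrom_lengths <- rep_len(seq_length, nchr)
total_bp <- sum(chrom_lengths)

# Expected number of new mutations per haploid genome per generation.
expected_mut <- mutation_rate * total_bp
```

With sequences this short, very few variable sites are expected, so realistic simulations usually need much larger seq_length values.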

admix_default

A float value specifying the default admixture proportion for all admixture nodes. The default is 0.5. If another value between 0 and 1 is specified, the admixture weights for each admixture event will be (value, 1-value).

run

If FALSE, the function will terminate after writing the msprime script. If TRUE, it will try to execute the msprime script with the default Python installation. If you want to use some other Python installation, you can set run = "/my/python".
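One way to choose a value for run is to look up a Python interpreter on the search path and fall back to writing the script only. This is a sketch: the executable name "python3" is an assumption about the local system, and the chosen interpreter must have msprime installed for the script to execute.

```r
# Look up an interpreter on the PATH; Sys.which() returns "" if none is found.
py <- unname(Sys.which("python3"))

# Use the interpreter path if available, otherwise only write the script.
run_arg <- if (nzchar(py)) py else FALSE

# Assumed usage (requires admixtools; 'graph' is a hypothetical graph):
# admixtools::msprime_genome(graph, run = run_arg)
```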

ghost_lineages

A boolean value specifying whether ghost lineages will be allowed. If TRUE, admixture happens at the time points given by the y-axis coordinates that plot_graph generates when plotting the graph. If FALSE (default), admixture occurs at the time of the previous split event.

shorten_admixed_leaves

If TRUE, simulate the behavior of TreeMix, where drift after admixture is not allowed.

Value

The file name and path of the simulation script.

Examples

results = qpgraph(example_f2_blocks, example_graph)
# Simulate 3 chromosomes whose lengths are 50, 100 and 100
msprime_genome(results$edges, nchr=3, seq_length=c(50, 100, 100))

uqrmaie1/admixtools documentation built on Nov. 3, 2024, 12:56 a.m.