README.md
In ellisztamas/simmiad: Simulations of Emmer wheat at the Ammiad kibbutz

simmiad

R package to simulate populations of wild Emmer wheat from the Kibbutz Ammiad

Introduction
Installation
Dependencies
How simulations work
Usage
Author and license information

An R package for simulating a population of wild Emmer wheat to ask whether the amount of spatial clustering of unique genotypes and the stability of that clustering through time can be explained by purely neutral forces. The idea is to simulate a population of plants evolving under seed dispersal and limited, random outcrossing only, then to sample plants along a transect in the same way that the real population is sampled.

Installation is easiest straight from GitHub using the package devtools from within R. If necessary, install this with

install.packages("devtools")

Then you can install with

devtools::install_github("ellisztamas/simmiad")

In most cases, simmiad uses base R functions only. One experimental feature uses mvtnorm to generate samples from a multivariate normal distribution, but this is probably not needed.

The simulation places a population of plants at a given density in a square habitat whose sides are twice the length of the transect.
Population size is determined as the number of plants needed to fill the habitat given its area and population density.
Supply a one-dimensional vector of genotypes. This is copied up and down to create bands of identical genotypes perpendicular to the direction in which transects are sampled.
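
As a rough illustration of the initial layout, here is a minimal sketch in base R. It is not the package's own code: `init_population` is a hypothetical helper, and it assumes that plants are placed uniformly at random, that the transect runs along the x-axis, and that the genotype bands have equal width.

```r
# Hypothetical helper, not part of simmiad: lay out a starting population.
init_population <- function(genotypes, transect_length, density) {
  habitat_size <- 2 * transect_length          # square habitat, twice the transect length
  n_plants <- round(density * habitat_size^2)  # plants needed to fill the habitat
  # Assumption: plants are scattered uniformly at random over the square
  x <- runif(n_plants, 0, habitat_size)
  y <- runif(n_plants, 0, habitat_size)
  # Split the x-axis into equal-width bands, one per genotype. Every plant in
  # a band shares that genotype, whatever its y coordinate.
  band <- pmin(floor(x / habitat_size * length(genotypes)) + 1, length(genotypes))
  data.frame(x = x, y = y, genotype = genotypes[band], stringsAsFactors = FALSE)
}

set.seed(1)
pop <- init_population(paste0("g", 1:10), transect_length = 10, density = 1)
nrow(pop)  # 400 plants: density 1 in a 20 m x 20 m habitat
```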

In the next generation each plant has the chance to produce seeds. Offspring numbers are drawn from a multinomial distribution of size N, with probability 1/N for each mother, where N is population size. Thus, each plant generates an average of one seed.
Each seed disperses in a random direction at a distance drawn from an exponential distribution.
Each seed has some probability of having been the product of an outcrossing event. If so, it is assigned a new unique genotype. If not, it is assumed to have been selfed and shares the genotype of its mother.
This is repeated for many generations.
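
The steps for a single generation can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: `next_generation` is a hypothetical helper, seeds that leave the habitat are wrapped around to the opposite edge (the text above does not say how edges are handled), and the label format for outcrossed seeds only imitates the one shown under Usage.

```r
# Hypothetical helper, not part of simmiad: advance a population one generation.
next_generation <- function(pop, mean_dispersal_distance, outcrossing_rate,
                            habitat_size, generation) {
  n <- nrow(pop)
  # Seeds per mother: one multinomial draw of size N with probability 1/N each
  n_seeds <- as.vector(rmultinom(1, size = n, prob = rep(1 / n, n)))
  mother <- rep(seq_len(n), times = n_seeds)
  # Dispersal: uniform random direction, exponentially distributed distance
  angle <- runif(n, 0, 2 * pi)
  dist <- rexp(n, rate = 1 / mean_dispersal_distance)
  # Assumption: positions wrap around the habitat edges (a torus)
  x <- (pop$x[mother] + dist * cos(angle)) %% habitat_size
  y <- (pop$y[mother] + dist * sin(angle)) %% habitat_size
  # Selfed seeds inherit the maternal genotype; outcrossed seeds get a new label
  genotype <- pop$genotype[mother]
  crossed <- which(runif(n) < outcrossing_rate)
  if (length(crossed) > 0) {
    genotype[crossed] <- paste0(genotype[crossed], "_", generation, ".",
                                seq_along(crossed))
  }
  data.frame(x = x, y = y, genotype = genotype, stringsAsFactors = FALSE)
}

set.seed(1)
pop <- data.frame(x = runif(50, 0, 10), y = runif(50, 0, 10),
                  genotype = rep(paste0("g", 1:5), each = 10),
                  stringsAsFactors = FALSE)
pop <- next_generation(pop, mean_dispersal_distance = 0.5,
                       outcrossing_rate = 0.01, habitat_size = 10,
                       generation = 1)
nrow(pop)  # population size stays at 50
```

Because the multinomial draw has size N, the population size is constant from one generation to the next, while individual mothers may leave zero, one or several seeds.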

At the end of the simulation, plants are sampled along a transect over multiple generations.
A transect is drawn through the middle of the population with evenly spaced sampling points.
At each sampling point, we sample the plant closest to the sampling point. If no plant is within 1m of the sampling point, then no plant is recorded.
This is repeated for some number of years back into the past from the final generation.
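
Sampling a single generation along the transect might look like the sketch below. `sample_transect` is a hypothetical helper, not a simmiad function, and it assumes a non-empty population and a transect centred in the habitat, running parallel to the x-axis.

```r
# Hypothetical helper, not part of simmiad: sample a transect from a population.
sample_transect <- function(pop, n_sample_points, sample_spacing,
                            habitat_size, max_distance = 1) {
  # Centre the transect in the habitat, parallel to the x-axis
  transect_length <- (n_sample_points - 1) * sample_spacing
  start <- (habitat_size - transect_length) / 2
  px <- start + (seq_len(n_sample_points) - 1) * sample_spacing
  py <- habitat_size / 2
  vapply(px, function(p) {
    d <- sqrt((pop$x - p)^2 + (pop$y - py)^2)  # distance to every plant
    nearest <- which.min(d)
    # Record the nearest plant only if it is within max_distance metres
    if (d[nearest] <= max_distance) pop$genotype[nearest] else NA_character_
  }, character(1))
}

pop <- data.frame(x = c(3.5, 6.2, 9), y = c(5, 5.5, 9),
                  genotype = c("g1", "g2", "g3"), stringsAsFactors = FALSE)
transect <- sample_transect(pop, n_sample_points = 3, sample_spacing = 2,
                            habitat_size = 10)
transect  # "g1" NA "g2"
```

The middle sampling point returns NA because its nearest plant is about 1.3 m away, beyond the 1 m limit.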

This makes certain assumptions that it is good to be explicit about:

There are no differences in fitness between genotypes, or genotype-by-environment interactions for fitness with a heterogeneous landscape.
Population density is even across the landscape except for random fluctuations.
Seed dispersal distances are exponentially distributed. I am hoping dispersal is primarily through gravity and is fairly short scale. If there is something more complicated happening, for example additional longer-distance dispersal by rodents, this could be modelled with some kind of mixture of distributions, but this would complicate things.
Outcrossing is random. In reality there will be some kind of pollen dispersal kernel shape, but I have no idea how that should look here.
Seed dispersal and outcrossing rates/distances do not change through time.
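
To make the dispersal assumption concrete, the sketch below shows what relaxing it with a mixture of two exponential distributions could look like. This is purely illustrative and is not something simmiad does: `rdispersal_mixture` and its parameters `mean_short`, `mean_long` and `p_long` are hypothetical names chosen for this example.

```r
# Hypothetical helper, not part of simmiad: draw dispersal distances from a
# mixture of a short-range and a long-range exponential distribution.
rdispersal_mixture <- function(n, mean_short, mean_long, p_long) {
  long <- runif(n) < p_long                   # which seeds disperse long-range
  rexp(n, rate = 1 / ifelse(long, mean_long, mean_short))
}

set.seed(1)
d <- rdispersal_mixture(1e5, mean_short = 0.5, mean_long = 20, p_long = 0.05)
mean(d)  # expected value is 0.95 * 0.5 + 0.05 * 20 = 1.475
```

Setting `p_long = 0` recovers the single exponential distribution assumed above.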

Functions in simmiad simulate populations given a set of input parameters:

mean_dispersal_distance: Mean seed dispersal distance in metres.
outcrossing_rate: Probability that an individual is outcrossed.
n_generations: Number of generations to run the simulations.
n_starting_genotypes: Number of initial genotypes to start with.
density: Average density of plants per square metre.

To simulate a single population you can use sim_population. For example, this runs a single simulation for 10 generations, starting from 10 genotypes at a density of one plant per square metre, with a mean dispersal distance of 0.5 m and an outcrossing rate of 1%. The transect has 30 sampling points spaced 5 m apart.

library('simmiad')
set.seed(124) # so you get the same answer as me

# Set input parameters
mean_dispersal_distance <- 0.5
outcrossing_rate <- 0.01
n_generations <- 10
n_starting_genotypes <- 10
density <- 1
how_far_back <- n_generations # defined here, but not passed to sim_population() below
n_sample_points <- 30
sample_spacing <- 5

# Run a single simulation and sample the transect
sm <- sim_population(
  mean_dispersal_distance = mean_dispersal_distance,
  outcrossing_rate = outcrossing_rate,
  n_generations = n_generations,
  n_starting_genotypes = n_starting_genotypes,
  density = density,
  n_sample_points = n_sample_points,
  sample_spacing = sample_spacing
)

This returns a list of genotypes in each generation. The final generation looks like this:

 [1] NA         NA         NA         "g2"       NA         "g8"       "g8"       NA         "g1_3.363"
[10] "g8"       "g5"       "g7"       "g1"       "g10"      "g1"       "g4"       "g7"       "g3"      
[19] NA         "g2_5.77"  "g4"       "g9"       "g4"       "g3"       "g6"       "g5"       "g1"      
[28] NA         "g3"       "g1"

'g' stands for genotype, and is followed by a number between 1 and 10 indicating the id of the initial genotype.
Individuals 9 and 20 show what happens when outcrossing occurs: the generation in which outcrossing occurred and a unique integer within that generation are appended to the genotype label. This ensures that every outcrossed genotype receives a new, unique label. Note that if outcrossed genotypes outcross again, the names keep getting longer (and messier!).
The NA entries are sampling points where no plant could be sampled (i.e. there was no plant within one metre of the sampling point).
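
The labels can be picked apart with base R string functions. The sketch below copies the final generation printed above into a vector by hand (the name `final` is only for this example) and then counts empty sampling points, flags outcrossed genotypes and recovers the founding genotype of each plant.

```r
# The final generation shown above, typed in by hand
final <- c(NA, NA, NA, "g2", NA, "g8", "g8", NA, "g1_3.363",
           "g8", "g5", "g7", "g1", "g10", "g1", "g4", "g7", "g3",
           NA, "g2_5.77", "g4", "g9", "g4", "g3", "g6", "g5", "g1",
           NA, "g3", "g1")

sum(is.na(final))                  # 7 empty sampling points
outcrossed <- grepl("_", final)    # labels carrying an outcrossing suffix
which(outcrossed)                  # 9 20
length(unique(na.omit(final)))     # 12 distinct genotypes in the transect
founder <- sub("_.*$", "", final)  # strip suffixes to get the initial genotype
table(founder)                     # counts per founding genotype (NAs excluded)
```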

Most of the time you will want to simulate multiple replicate populations with a set of input parameters. This can be done with the function simmiad using similar input parameters as before.

rs <- simmiad(
  mean_dispersal_distance = 0.5,
  outcrossing_rate = 0.001,
  n_generations = 12,
  n_starting_genotypes = 10,
  density = 3,
  n_sample_points = 5,
  sample_spacing = 2,
  nsims = 3,
  how_far_back = 9
)

This function simulates multiple individual populations through time, and returns a list of different data:

parameters: A data.frame giving input parameters.
clustering: The covariance between distance along the transect and the frequency of identical genotypes.
matching_pairs: The number of pairs of identical genotypes in the transect.
count_NA: The number of empty sampling points.
n_genotypes: The number of unique genotypes sampled in the transect (note that this will be different from what you gave as n_starting_genotypes, because the latter reflects genotypes in the whole population, not just in the transect).
stability: How often individual sampling points are occupied by the same genotype in the final generation and 1, 2, ..., n generations back.
distance_identity: Probabilities of finding identical genotypes in pairs of sampling points at all possible distances between sampling points. For example, if there are five evenly spaced sampling points as in the example above, there are four possible distances between sampling points. Rows indicate replicate simulations.

In points 2 to 6 above, rows show replicate simulations and columns show generations.
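
To give a feel for what matching_pairs and distance_identity measure, here is a minimal sketch that computes comparable quantities for a single transect. It is an assumption about how such statistics could be calculated, not the package's own code. The helper `transect_identity` is hypothetical, and pairs that include an empty sampling point are simply ignored.

```r
# Hypothetical helper, not part of simmiad: pairwise identity along a transect.
transect_identity <- function(transect) {
  pairs <- t(combn(length(transect), 2))  # all pairs of sampling points
  lag <- pairs[, 2] - pairs[, 1]          # separation, in units of sample spacing
  # TRUE/FALSE for each pair; NA if either sampling point is empty
  same <- transect[pairs[, 1]] == transect[pairs[, 2]]
  list(
    matching_pairs = sum(same, na.rm = TRUE),
    distance_identity = tapply(same, lag, mean, na.rm = TRUE)
  )
}

res <- transect_identity(c("g1", "g1", NA, "g2", "g1"))
res$matching_pairs     # 3
res$distance_identity  # 0.5, 0, 0.5 and 1 for separations 1 to 4
```

With five sampling points there are four possible separations, so the second element has four entries.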

Tom Ellis (thomas.ellis@gmi.oeaw.ac.at)

simmiad is available under the MIT license. See LICENSE for more information.

ellisztamas/simmiad documentation built on Dec. 12, 2023, 5:32 a.m.
