simfixoutbreak: Simulate evolutionary dynamics on a given transmission tree

Description Usage Arguments Details Value Examples

View source: R/simfixoutbreak.r

Description

Simulate within-host evolutionary dynamics on top of an existing transmission tree and generate genomic samples.

Usage

1
2
3
4
simfixoutbreak(ID,inf.times, rec.times, inf.source, mut.rate, equi.pop=10000, shape=flat,
           inoc.size=1, imp.var=25, samples.per.time=1, samp.schedule="random", 
           samp.freq=500, full=FALSE, feedback=500, glen=100000, 
           ref.strain=NULL, ...)

Arguments

ID

Vector of unique IDs.

inf.times

Vector of (integer) infection times.

rec.times

Vector of (integer) removal times.

inf.source

Vector of infection sources. The ith entry corresponds to the ID of the source of infection. For importations, the source should be 0.

mut.rate

Mutation rate (per genome per generation).

equi.pop

Equilibrium effective population size of pathogens within-host.

shape

Function describing the population growth dynamics. See Details.

inoc.size

Size of pathogen inoculum at transmission.

imp.var

The expected number of mutations separating unconnected importations.

samples.per.time

Number of samples taken at sampling times.

samp.schedule

How should sampling be conducted? Accepted values are: "calendar" - samples are taken from all current infectives every samp.freq generations; "individual" - samples are taken from each infective at intervals of samp.freq after infection; "random" - samples are taken at one time between infection and removal for each infective.

full

Should ‘full’ genomic sampling be returned? That is, should a vector of genotypes and their respective frequencies be stored from each individual's sampling times?

samp.freq

Number of generations between each sampling time (see samp.schedule).

feedback

Number of generations between simulation updates returned to R interface.

glen

Length of genome.

ref.strain

Initial sequence. By default, a random sequence of length glen.

...

Additional arguments to be passed to the shape function.

Details

Population growth dynamics are defined by the function called by 'shape'. This function returns the expected population size at each time step, given the total simulation time. By default, the population is expected to grow exponentially until reaching an equilibrium level, specified by equi.pop (flat). Alternatively, the population can follow a sinusoidal growth curve, peaking at runtime/2 (hump). User-defined functions should be of the form function(time,span,equi.pop,...), where span is equal to the duration of infection in this setting.

Value

Returns a list of outbreak data:

epidata

A matrix of epidemiological data with columns: person ID, infection time, removal time, source of infection.

sampledata

A matrix of genome samples with columns: person ID, sampling time, genome ID.

libr

A list with an entry for each unique genotype observed. Each entry is a vector of mutation positions relative to the reference genome.

nuc

A list with an entry for each unique genotype observed. Each entry is a vector of nucleotide types (integer between 1 and 4).

librstrains

A vector of unique genotype IDs corresponding to the libr object.

endtime

End time of the outbreak.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
# Simulate a transmission chain
inf.times <- (0:20)*100
rec.times <- inf.times + 100 + rpois(21,50)
inf.source <- 0:20
inf.source[c(3,11)] <- 0 # Two importations
mut.rate <- 0.001

# Now simulate evolutionary dynamics and samples on top of this tree
W <- simfixoutbreak(ID=1:21, inf.times, rec.times, inf.source, mut.rate, equi.pop=1000, shape=flat,
                    inoc.size=10, imp.var=25, samples.per.time=5, samp.schedule="random", 
                    samp.freq=500, full=FALSE, feedback=100, glen=100000, 
                    ref.strain=NULL)

sampledata <- W$sampledata
epidata <- W$epidata

# Calculate distance matrix for observed samples
distmat <- gd(sampledata[,3], W$libr, W$nuc, W$librstrains)

# Now pick colors for sampled isolates
colvec <- rainbow(1200)[1:1000] # Color palette
refnode <- 1 # Compare distance to which isolate?
colv <- NULL # Vector of colors for samples
maxD <- max(distmat[,refnode])

for (i in 1:nrow(sampledata)) {
  colv <- c(colv, 
            colvec[floor((length(colvec)-1)*(distmat[refnode,i])/maxD)+1])
}


plotoutbreak(epidata, sampledata, col=colv, stack=TRUE, arr.len=0.1, 
             blockheight=0.5, hspace=60, label.pos="left", block.col="grey", 
             jitter=0.004, xlab="Time", pch=1) 

seedy documentation built on May 29, 2017, 10:58 a.m.