knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )

`nosoiSim`

objectThe output of a `nosoi`

simulation is an object of class `nosoiSim`

that contains all the data generated during the simulation into the following slots:

`total.time`

is the number of time steps (an integer) during which the simulation has run.`type`

is the simulation type that was ran (either "single" or "dual" host).`host.info.A`

is an object of the class`nosoiSimOne`

, itself a list of elements concerning host type A (or the unique host type in case of a single-host simulation), with the following slots:`N.infected`

is the final number (an integer) of infected hosts.`table.hosts`

is a table containing the results of the simulations focusing on each individual host. For more details on its structure, see below.`table.state`

is a table containing the results of the simulations concerning the movement history of the infected hosts. For more details on its structure, see below.`prefix.host`

is the string containing the prefix used to name the hosts (user-defined`prefix.host`

in the`nosoiSim`

function).`popStructure`

is the population structure used (either "none", "discrete" or "continuous").`host.info.B`

is the same as above, but for host type B (but is set to`NA`

if a single-host was run).

Host-specific elements (i.e. in `host.info.A`

or `host.info.B`

) in this object can be easily extracted using the function `getHostData`

:

```
getHostData(nosoi.output, what, pop)
```

This function takes as its arguments your nosoi output object (of class `nosoiSim`

), `what`

you want to extract (either `N.infected`

, `table.hosts`

, `table.state`

or `prefix.host`

), and which host type (here `pop`

) you are interested in (`A`

by default, but can be `B`

in case of dual-host simulations).

`table.hosts`

`table.hosts`

, a `data.table`

object, contains information about each host that has been simulated. In this object, one row corresponds to one host. This object can be directly extracted using the function `getTableHosts`

:

```
getTableHosts(nosoi.output, pop)
```

This function takes as its arguments your nosoi output object (of class `nosoiSim`

) and which host type (here `pop`

) you are interested in (`A`

by default but can be `B`

in case of dual-host simulations).

The structure of the table is the following:

`hosts.ID`

is the unique identifier for the host (a string of characters), based on a user-defined prefix and an integer.`inf.by`

is the unique identifier of the host that infected the current one (stands for "infected by").`inf.in`

(only if structure is present) is the (discrete) location (e.g. province, country, ...) or coordinates (in that case`inf.in.x`

and`inf.in.y`

) in which the host was initially infected (stands for "infected in").`current.in`

(only if structure is present) is the (discrete) location (e.g. province, country, ...) or coordinates (in that case`current.in.x`

and`current.in.y`

) in which the host is at the end of the simulation (stands for "currently in"). A complete history of hosts movements is stored in the other table`table.state`

(see following section).`current.env.value`

(only if continuous structure is present) is the environmental value (i.e. raster cell value) in which the host is at the end of the simulation (stands for "current environmental value"). A complete history of the environmental value encountered by the host is stored in the other table`table.state`

(see following section).`current.cell.raster`

(only if continuous structure is present) is the raster cell numeric identifier in which the host is at the end of the simulation. Here also, a complete history of raster cell occupied during the simulation is stored in the other table`table.state`

(see following section).`host.count`

(only if structure is present) is the host count in the current state or raster cell. This value is only updated if the host count is actually used as a parameter in the simulation.`inf.time`

is the time (in time steps) at which the host entered the simulation (i.e. got infected) (stands for "infection time").`out.time`

is the time (in time steps) at which the host exited the simulation.`active`

is a number (either 0 or 1) indicating if the host is still active at the end of the simulation (1 for yes, 0 for no).- the rest stores the user-defined individual-specific parameters used by the functions (i.e. specified by the
`param`

arguments in the simulator's functions).

`table.state`

`table.state`

, a `data.table`

object, contains information about the history of movement of each host that has been simulated. This table is available only if a structured population was simulated (either discrete or continuous). It can be directly extracted using the function `getTableState`

:

```
getTableState(nosoi.output, pop)
```

This function takes as its arguments your nosoi output object (of class `nosoiSim`

) and which host type (here `pop`

) you are interested in (`A`

by default but can be `B`

in case of dual-host simulations).

The structure of the table is the following:

`hosts.ID`

is the unique identifier for the host (a string of characters), based on a user-defined prefix and an integer.`state`

is the (discrete) location (e.g. province, country, ...) or coordinates (in that case`state.x`

and`state.y`

) in which the host is located during the time interval.`current.env.value`

(only if continuous structure is present) is the environmental value (i.e. raster cell value) in which the host is at the end of the simulation (stands for "current environmental value").`current.cell.raster`

(only if continuous structure is present) is the raster cell numeric identifier in which the host is at the end of the simulation.`time.from`

is the time (in time steps) at which the host entered (i.e. got infected or moved) into the state.`time.to`

is the time (in time steps) at which the host left (i.e. excited or moved) out of the state.

By simulating transmission chains, `nosoi`

also simulates an epidemic process. `nosoi`

provides some solutions to easily explore the dynamics of this epidemic process by following through time the number of active infected hosts as well as the cumulative number of infected hosts. It also allows to compute the exact basic reproduction number $R_0$, defined as the average number of cases one case generates. Since all of the data generated are stored in the output `nosoiSim`

object, more advanced exploration could also be done.

`summary`

is a turnkey solution that computes both the dynamics of the epidemic and $R_0$. Its only argument is your nosoi output object (of class `nosoiSim`

):

```
summary(nosoi.output)
```

It returns a list containing the slots:

`R0`

, a sub-list with number of inactive hosts at the end of the simulation`N.inactive`

on which the calculation is done, mean $R_0$`R0.mean`

, and $R_0$ distribution`R0.dist`

. For more details on the computation behind, see below.`dynamics`

is a`data.table`

object with the count of the sum of infected and active host at each time of the simulation (by location/state if this simulation has been done in a discrete structured host population). See below for more details.`cumulative`

is a`data.table`

object with the cumulative sum of infected hosts at each time step of the simulation. See below for more details.

Both the `cumulative`

and `dynamics`

tables can be directly extracted using the functions `getCumulative`

and `getDynamic`

respectively. Both functions yield a `data.table`

object, but their structures vary slightly. `cumulative`

has the following structure:

`t`

is the time step (an integer).`Count`

is the cumulative number of infected hosts at given time step (an integer).`type`

is the host type, identified by its user-defined prefix (string of characters).

`dynamics`

has the following structure:

`state`

(only in the case of a discrete structure) is the state (a string of characters, usually).`Count`

is the number of active infected hosts at given time step (an integer) by state (in the case of a discrete structure).`type`

is the host type, identified by its user-defined prefix (string of characters).`t`

is the time step (an integer).

These can be used to plot the epidemiological dynamics of the simulated transmission chain. Here for example, we simulate a single introduction of a single-host pathogen between 3 different locations/states (named "A", "B", and "C"), with constant exit and move probabilities as well as a number of contacts dependent on the location and host count in each location:

library(nosoi) t_incub_fct <- function(x){rnorm(x,mean = 5,sd=1)} p_max_fct <- function(x){rbeta(x,shape1 = 5,shape2=2)} p_Move_fct <- function(t){return(0.1)} p_Exit_fct <- function(t){return(0.05)} proba <- function(t,p_max,t_incub){ if(t <= t_incub){p=0} if(t >= t_incub){p=p_max} return(p) } time_contact <- function(t, current.in, host.count){ temp.val = 30 - host.count if(temp.val <= 0) { return(0) } if(temp.val >= 0) { if(current.in=="A"){ return(round((temp.val/30)*rnorm(1, 3, 1), 0))} if(current.in=="B"){return(0)} if(current.in=="C"){ return(round((temp.val/30)*rnorm(1, 6, 1), 0))} } } transition.matrix = matrix(c(0,0.2,0.4,0.5,0,0.6,0.5,0.8,0),nrow = 3, ncol = 3,dimnames=list(c("A","B","C"),c("A","B","C"))) set.seed(1050) test.nosoiA <- nosoiSim(type="single", popStructure="discrete", length=100, max.infected=200, init.individuals=1, init.structure="A", structure.matrix=transition.matrix, pMove=p_Move_fct, param.pMove=NA, diff.nContact=TRUE, hostCount.nContact=TRUE, nContact=time_contact, param.nContact=NA, pTrans = proba, param.pTrans = list(p_max=p_max_fct, t_incub=t_incub_fct), pExit=p_Exit_fct, param.pExit=NA )

library(ggplot2) cumulative.table <- getCumulative(test.nosoiA) dynamics.table <- getDynamic(test.nosoiA) ggplot(data=cumulative.table, aes(x=t, y=Count)) + geom_line() + theme_minimal() + labs(x="Time (t)",y="Cumulative count of infected hosts")

if (!requireNamespace("ggplot2", quietly = TRUE)) { message("Package 'ggplot2' is needed for plotting this figure.") } else { library(ggplot2) cumulative.table <- getCumulative(test.nosoiA) dynamics.table <- getDynamic(test.nosoiA) ggplot(data=cumulative.table, aes(x=t, y=Count)) + geom_line() + theme_minimal() + labs(x="Time (t)",y="Cumulative count of infected hosts") }

ggplot(data=dynamics.table, aes(x=t, y=Count, color=state)) + geom_line() + theme_minimal() + labs(x="Time (t)",y="Number of active infected hosts")

if (!requireNamespace("ggplot2", quietly = TRUE)) { message("Package 'ggplot2' is needed for plotting this figure.") } else { ggplot(data=dynamics.table, aes(x=t, y=Count, color=state)) + geom_line() + theme_minimal() + labs(x="Time (t)",y="Number of active infected hosts") }

The output `nosoiSim`

object can be used to compute the "real" $R_0$, defined as the average number of cases one case generates, often estimated in epidemiological studies. The function `getRO`

can be used directly, with the `nosoiSim`

output as its unique argument, to generate a list containing:

`N.inactive`

, the number of inactive hosts at the end of the simulation on which the $R_0$ calculation is made. This is to avoid bias introduced by hosts that have not accomplished their full transmission potential (i.e. are still active).`R0.mean`

is the $R_0$ estimate, mean of the following`R0.dist`

.`R0.dist`

is the full distribution (one number is one host) of $R_0$ estimates per host.

Here for example, we simulate a single introduction of a single-host pathogen between 3 different states (named "A", "B", and "C"), with a constant exit and move probability as well as a number of contacts dependent of the location and host count in each location:

library(nosoi) t_incub_fct <- function(x){rnorm(x,mean = 5,sd=1)} p_max_fct <- function(x){rbeta(x,shape1 = 5,shape2=2)} p_Move_fct <- function(t){return(0.1)} p_Exit_fct <- function(t){return(0.05)} proba <- function(t,p_max,t_incub){ if(t <= t_incub){p=0} if(t >= t_incub){p=p_max} return(p) } time_contact <- function(t, current.in, host.count){ temp.val = 30 - host.count if(temp.val <= 0) { return(0) } if(temp.val >= 0) { if(current.in=="A"){ return(round((temp.val/30)*rnorm(1, 3, 1), 0))} if(current.in=="B"){return(0)} if(current.in=="C"){ return(round((temp.val/30)*rnorm(1, 6, 1), 0))} } } transition.matrix = matrix(c(0,0.2,0.4,0.5,0,0.6,0.5,0.8,0),nrow = 3, ncol = 3,dimnames=list(c("A","B","C"),c("A","B","C"))) set.seed(1050) test.nosoiA <- nosoiSim(type="single", popStructure="discrete", length=100, max.infected=200, init.individuals=1, init.structure="A", structure.matrix=transition.matrix, pMove=p_Move_fct, param.pMove=NA, diff.nContact=TRUE, hostCount.nContact=TRUE, nContact=time_contact, param.nContact=NA, pTrans = proba, param.pTrans = list(p_max=p_max_fct, t_incub=t_incub_fct), pExit=p_Exit_fct, param.pExit=NA )

```
getR0(test.nosoiA)
```

As you can see, out of the 117 inactive hosts, the mean $R_0$ is 1.068376. This does not of course reflect the distribution of $R_0$ where most hosts actually never transmitted the infection:

data = data.frame(R0=getR0(test.nosoiA)$R0.dist) ggplot(data=data, aes(x=R0)) + geom_histogram() + theme_minimal()

data = data.frame(R0=getR0(test.nosoiA)$R0.dist) if (!requireNamespace("ggplot2", quietly = TRUE)) { message("Package 'ggplot2' is needed for plotting this figure.") } else { ggplot(data=data, aes(x=R0)) + geom_histogram() + theme_minimal() }

The transmission chain is the main product of a `nosoi`

simulation.
It can be extracted or visualized as such using the `table.hosts`

table that links hosts in time.
It can also be extracted in a form mimicking a phylogenetic tree,
and visualized or saved as such using available tools for phylogenetic trees
such as `ape`

and `tidytree`

.
To do so, you can use the `getTransmissionTree()`

function,
which arguments are the `nosoi`

simulation output and which host type would be
the tips of the tree ("A" by default, or "B" in case a dual-host simulation).

Formally, the transmission tree is extracted as a dated phylogenetic tree where:

- each branch represents the life span of an host;
- each internal node represents a transmission event;
- each tip represents the exiting point of an host;
- the root represents the entering point of the first host.

Such a tree is binary, and has as many tips as the total number of infected hosts, and as many nodes as the number of transmission events. It spans a time going from the first entry of the first host (usually, by convention, 0), to the exiting of the last host.

It can be seen as a proxy to represent the molecular evolution of the pathogen infecting each of the hosts.

`getTransmissionTree()`

extracts the full transmission tree. In case of a big simulated epidemic, this can take some time. It can also be very complicated to plot/visualize.

As an example, we simulate a single introduction of a single-host pathogen between 3 different states (named "A", "B", and "C"), with a constant exit and move probability as well as a number of contacts dependent of the location and host count in each location:

library(nosoi) t_incub_fct <- function(x){rnorm(x,mean = 5,sd=1)} p_max_fct <- function(x){rbeta(x,shape1 = 5,shape2=2)} p_Move_fct <- function(t){return(0.1)} p_Exit_fct <- function(t){return(0.05)} proba <- function(t,p_max,t_incub){ if(t <= t_incub){p=0} if(t >= t_incub){p=p_max} return(p) } time_contact <- function(t, current.in, host.count){ temp.val = 30 - host.count if(temp.val <= 0) { return(0) } if(temp.val >= 0) { if(current.in=="A"){ return(round((temp.val/30)*rnorm(1, 3, 1), 0))} if(current.in=="B"){return(0)} if(current.in=="C"){ return(round((temp.val/30)*rnorm(1, 6, 1), 0))} } } transition.matrix <- matrix(c(0,0.2,0.4,0.5,0,0.6,0.5,0.8,0),nrow = 3, ncol = 3,dimnames=list(c("A","B","C"),c("A","B","C"))) set.seed(1050) test.nosoiA <- nosoiSim(type="single", popStructure="discrete", length=100, max.infected=200, init.individuals=1, init.structure="A", structure.matrix=transition.matrix, pMove=p_Move_fct, param.pMove=NA, diff.nContact=TRUE, hostCount.nContact=TRUE, nContact=time_contact, param.nContact=NA, pTrans = proba, param.pTrans = list(p_max=p_max_fct, t_incub=t_incub_fct), pExit=p_Exit_fct, param.pExit=NA )

The simulation runs for `r test.nosoiA$total.time`

time steps
and infected `r getHostData(test.nosoiA, "N.infected")`

hosts.
The following transmission tree can thus be extracted
(as a `tidytree::treedata`

object) and plotted using
`ggtree`

:

test.nosoiA.tree <- getTransmissionTree(test.nosoiA) library(ggplot2) library(ggtree) ggtree(test.nosoiA.tree, color = "gray30") + geom_nodepoint(aes(color=state)) + geom_tippoint(aes(color=state)) + theme_tree2() + xlab("Time (t)") + theme(legend.position = c(0,0.8), legend.title = element_blank(), legend.key = element_blank())

if (!requireNamespace("ape", quietly = TRUE) || !requireNamespace("tidytree", quietly = TRUE) || !requireNamespace("treeio", quietly = TRUE)) { message("Packages 'ape', 'tidytree' and 'treeio' are needed for transmission tree generation.") } else { test.nosoiA.tree <- getTransmissionTree(test.nosoiA) } if (!requireNamespace("ggplot2", quietly = TRUE) || !requireNamespace("ggtree", quietly = TRUE) || (utils::packageVersion("ggtree") < "3.1.3")) { message("Packages 'ggplot2' and 'ggtree (> 3.1.3)' are needed for plotting this figure.") } else { library(ggplot2) library(ggtree) ggtree(test.nosoiA.tree, color = "gray30") + geom_nodepoint(aes(color=state)) + geom_tippoint(aes(color=state)) + theme_tree2() + xlab("Time (t)") + theme(legend.position = c(0,0.8), legend.title = element_blank(), legend.key = element_blank()) }

Each color point represents the location/state, either at transmission (a node)
or a host when it exits the simulation (or the end point of the simulation; a tip).
This transmission tree is timed (time steps of the simulation).
As you can see, no transmission ever occurs at location "B" (green);
this is coherent with the chosen `nContact`

function, where no infectious contact
occurs when the host is in "B".

Usually, it is unlikely that the whole transmission chain (i.e. every host) will be sampled during surveillance of an epidemic outbreak or endemic transmission. The functions `sampleTransmissionTree()`

and `sampleTransmissionTreeFromExiting()`

can both be used to sample hosts from this transmission chain and construct the new tree based on the existing hosts (tips).

`sampleTransmissionTree()`

needs the following arguments:

`nosoiInf`

, the output of simulation (`nosoiSim`

object)`tree`

, a`treedata`

object created by the function`getTransmissionTree()`

(see above).`samples`

, a`data.table`

object with the following entries:`hosts`

, unique`Host.ID`

of the individuals to be sampled.`times`

, time at which hosts are sampled.`labels`

, name of the corresponding tip in the tree

In the example above, we want to sample the following 20 hosts:

if (!requireNamespace("ape", quietly = TRUE) || !requireNamespace("tidytree", quietly = TRUE) || !requireNamespace("treeio", quietly = TRUE)) { message("Packages 'ape', 'tidytree' and 'treeio' are needed for transmission tree generation.") } else { library(dplyr) table.hosts.testA <- getTableHosts(test.nosoiA) setkey(table.hosts.testA, "hosts.ID") set.seed(30082019) sampled.hosts <- sample(table.hosts.testA$hosts.ID, 20) samples.data.table <- table.hosts.testA[sampled.hosts] %>% group_by(hosts.ID) %>% mutate(Sampled.time = round(runif(1, min=inf.time,max=out.time), digits = 0), labels = paste(hosts.ID,"sampled",sep="-")) samples.data.table <- samples.data.table[c("hosts.ID","Sampled.time","labels")] colnames(samples.data.table) <- c("hosts", "times", "labels") samples.data.table }

test.nosoiA.tree.sampled <- sampleTransmissionTree(test.nosoiA, test.nosoiA.tree, samples.data.table)

if (!requireNamespace("ape", quietly = TRUE) || !requireNamespace("tidytree", quietly = TRUE) || !requireNamespace("treeio", quietly = TRUE)) { message("Packages 'ape', 'tidytree' and 'treeio' are needed for transmission tree generation.") } else { test.nosoiA.tree.sampled <- sampleTransmissionTree(test.nosoiA, test.nosoiA.tree, samples.data.table) }

As before, the tree obtained is a `treedata`

object, and can be plotted:

ggtree(test.nosoiA.tree.sampled, color = "gray30") + geom_nodepoint(aes(color=state)) + geom_tippoint(aes(color=state)) + geom_tiplab(aes(label=host)) + theme_tree2() + xlab("Time (t)") + theme(legend.position = c(0,0.8), legend.title = element_blank(), legend.key = element_blank())

if (!requireNamespace("ggtree", quietly = TRUE) || (utils::packageVersion("ggtree") < "3.1.3")) { message("Package 'ggtree' is needed for plotting this figure.") } else { ggtree(test.nosoiA.tree.sampled, color = "gray30") + geom_nodepoint(aes(color=state)) + geom_tippoint(aes(color=state)) + geom_tiplab(aes(label=host)) + theme_tree2() + xlab("Time (t)") + theme(legend.position = c(0,0.8), legend.title = element_blank(), legend.key = element_blank()) }

Alternatively, you can sample among exited hosts (i.e. no longer active at the end of the simulation) using the function `sampleTransmissionTreeFromExiting()`

, mimicking a sampling procedure that is destructive or cuts the hosts from the population. Be careful however, as it does not influence the epidemiological process: the hosts are only sampled when exiting the simulation.

`sampleTransmissionTreeFromExiting()`

needs the following arguments:

`tree`

, a`treedata`

object created by the function`getTransmissionTree()`

(see above).`hosts`

, a vector with the hosts.ID of the exited hosts to sample.

In our example, we want to sample these samples:

if (!requireNamespace("ape", quietly = TRUE) || !requireNamespace("tidytree", quietly = TRUE) || !requireNamespace("treeio", quietly = TRUE)) { message("Packages 'ape', 'tidytree' and 'treeio' are needed for transmission tree generation.") } else { set.seed(5905950) sampled.hosts2 = subset(table.hosts.testA, active == 0)$hosts.ID sampled.hosts <- sample(sampled.hosts2, 20) sampled.hosts }

test.nosoiA.tree.sampled.exiting <- sampleTransmissionTreeFromExiting(test.nosoiA.tree, sampled.hosts)

if (!requireNamespace("ape", quietly = TRUE) || !requireNamespace("tidytree", quietly = TRUE) || !requireNamespace("treeio", quietly = TRUE)) { message("Packages 'ape', 'tidytree' and 'treeio' are needed for transmission tree generation.") } else { test.nosoiA.tree.sampled.exiting <- sampleTransmissionTreeFromExiting(test.nosoiA.tree, sampled.hosts) }

As before, the tree obtained is a `treedata`

object, and can be plotted:

ggtree(test.nosoiA.tree.sampled.exiting, color = "gray30") + geom_nodepoint(aes(color=state)) + geom_tippoint(aes(color=state)) + geom_tiplab(aes(label=host)) + theme_tree2() + xlab("Time (t)") + theme(legend.position = c(0,0.8), legend.title = element_blank(), legend.key = element_blank())

if (!requireNamespace("ape", quietly = TRUE) || !requireNamespace("tidytree", quietly = TRUE) || !requireNamespace("treeio", quietly = TRUE) || !requireNamespace("ggtree", quietly = TRUE) || (utils::packageVersion("ggtree") < "3.1.3")) { message("Packages 'ape', 'tidytree', 'treeio' and 'ggtree' are needed for transmission tree generation.") } else { ggtree(test.nosoiA.tree.sampled.exiting, color = "gray30") + geom_nodepoint(aes(color=state)) + geom_tippoint(aes(color=state)) + geom_tiplab(aes(label=host)) + theme_tree2() + xlab("Time (t)") + theme(legend.position = c(0,0.8), legend.title = element_blank(), legend.key = element_blank()) }

All the functions mentioned above produce a `tidytree::treedata`

object.
It is a phylogenetic tree, with nodes and tips annotated with all the characteristics
of the epidemics, including the geographical location when applicable.
This format is described in details by his developer in
this ebook.

This makes it easy to export the generated data to other software for downstream
analyzes, thanks to the package treeio.
For instance, the tree can be written in a BEAST
compatible format thanks to function `treeio::write.beast()`

:

treeio::write.beast(test.nosoiA.tree.sampled.exiting)

**Any scripts or data that you put into this service are public.**

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.