In AndySouth/resistance: an insecticide resistance population genetics model with 2 loci and 2 insecticides

Notes on insecticide resistance model for liverpool. andy south 27/1/15

Timelines and deliverables Days 1 to 12: Read and assimilate the Curtis paper cited above. Run the LSTM code written by Levick, replicate the results from the Curtis paper, and prepare the code for sensitivity analysis by firstly refactoring into an R package and creating a Github repository. Identify appropriate parameter ranges and any rules that should be applied while sampling (e.g. male exposure to insecticide within homes should not be higher than for the more endophilic females). Run sensitivity analysis and present the results as classification trees showing what parameter combinations favour which deployment strategy.

Deliverables: the provision of sensitivity analyses and classification trees that will allow us to publish the model in a peer-reviewed journal.

Days 13 to 25. This component is to use the full model rather than the sub-model used to replicate the Curtis results. Liaise with IVCC to identify the policy question they consider the most pressing and the likely end users of the simulation package. Perform sensitivity analysis on the full model as above. Attempt to put a user-friendly interface onto the simulation package using the “Shiny R” package.

Deliverables: sensitivity analysis of the complete model and a reasoned consideration (or delivery) of the feasibility of distributing the package with a graphical interface for use by users with no computing experience.

27/1/15 7.5hrs

#working version of Beths code
#run it outside of this project because it changes wd
source("C:\\Dropbox\\Ian and Andy\\andy\\malaria\\Beth code\\malaria_code_beth.r")

Editing malaria_code_andy.r, indenting functions etc.

Much of the script from line 464 is setting up parameters for example runs.

input[a,b] : a=parameter number and b=scenario number

464 : set input params
1058-2716 : run model
2718 : actions needing full results.list (e.g. curtis plots use multiple scenarios)

edited bookmarks to navigate around the file in RStudio.

9/3/15 4-5 1hr

10/3/15 4.15 - 7.15 3hrs

listFunctions <- function(filename) {
  temp.env <- new.env()
  sys.source(filename, envir = temp.env)
  functions <- lsf.str(envir=temp.env)
  rm(temp.env)
  return(functions)
}

#listFunctions("C:\\Dropbox\\Ian and Andy\\andy\\malaria\\Beth code\\malaria_code_beth.r")

So there are only 11 functions.

allele.freq : function (mat)
curtis_f1 : function (nrelaxmat, relaxmat, gencol, r1col)
curtis_f2 : function (combmat, bmat, amat, gencol, r1col, r2col)
curtis_ld : function (resultsmat, relaxedmat, gencol, ldcol)
haplotype : function (mat)
HW : function (P, mat)
linkage : function (mat)
make.genotypemat : function (P_1, P_2)
make.matrix : function (mat, rnames)
singlealleleFrequency : function (locus, max_gen, results.list, input)
timetoFifty : function (locus, max_gen, results.list, input)

I can create an initial github repo for the files from Beth (& this notes file). That means I will have version control for my initial restructuring. Later I may want to create a new repo for the reformatted code as a package.

What to call the first repo ? resistance

created repo on github
created RStudio project from github repo
To get ssh push working.
RStudio Tools, Shell
git remote set-url origin git@github.com:AndySouth/resistance.git

11/3/15 9.30-12.30 3hrs 2.15-6.45 7.5hrs

moved my dropbox folder back to C.

Plan
1. move Beths functions out into a single file for each in an R folder
2. edit Beths comments into roxygen format
3. only later think about changing functions and documentation to follow best practice

Changed plan slightly by appending plot onto start of plot functions.

To source files in the R folder :
lapply(dir("R", full.names=TRUE), source)

12/3/15 9.30-14 4.5hrs 3-5 6-7 7.5hrs

The model generates 3 matrices : results, genotype, fitness
1. Results : freq of R allele at each locus in each sex and linkage disequilibrium of R allele in each sex, per generation
2. Genotype : frequencies of each of the ten genotypes, per generation
3. Fitness : fitness scores of each genotype/niche combination (table 4. of Main Document)

When a file is input with multiple scenarios, each of the three matrices is stored in a list, where scenario number gives position of the matrix in the list.

results.list : fitness.list : genotype.list :

Can I put these into a single list so that they can be returned from a single function ?

done ~ put input object creation into a function But there is still a fair bit of code that reads parameter values out of the input matrix and into named variables.

I might be able to restructure these collections of single variables into arrays. e.g. for the exposure levels of m&f

#to create an array a[sex][locus1][locus2]
sex <- c("F","M")
locus1 <- c("0","A","a")
locus2 <- c("0","B","b")
dimnames1 <- list( sex=sex, locus1=locus1, locus2=locus2 )
dim1 <- sapply(dimnames1, function(x) length(x))
a <- array(0,dim=dim1, dimnames=dimnames1)
a
length(a)
#to access one element
a['M','a','b']
#to check that male exposures total 1
sum(a['M',,])

  ## Exposure levels of males and females to each insecticide niche ##
  # males
  a.m_00 <- input[8,i]

  a.m_a0 <- input[9,i]
  a.m_A0 <- input[10,i]

  a.m_0b <- input[11,i]
  a.m_0B <- input[12,i]

  a.m_ab <- input[13,i]
  a.m_AB <- input[14,i]

  a.m_Ab <- input[15,i]
  a.m_aB <- input[16,i]

  #a.m <- sum(a.m_00, a.m_a0, a.m_A0, a.m_0b, a.m_0B, a.m_ab, a.m_AB, a.m_Ab, a.m_aB)
  #if ( a.m != 1 ){      
  # print( paste("Error in male exposures: must total one: ", a.m) )

  # }else{
  #     print( paste( "Male exposures total 1: ", a.m ))
  #     }


  # females
  a.f_00 <- input[17,i]

  a.f_a0 <- input[18,i]
  a.f_A0 <- input[19,i]

  a.f_0b <- input[20,i]
  a.f_0B <- input[21,i]

  a.f_ab <- input[22,i]
  a.f_AB <- input[23,i]

  a.f_Ab <- input[24,i]
  a.f_aB <- input[25,i]
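
A minimal sketch of how the two listings above might be combined, so that the 18 separate exposure variables become one named array. It assumes the row layout shown (rows 8 to 16 for males, 17 to 25 for females, niches in the order 00, a0, A0, 0b, 0B, ab, AB, Ab, aB). The `input` matrix here is made up for illustration and the name `exposure` is hypothetical.

```r
# niche order follows rows 8:16 (males) and 17:25 (females) of the input matrix
niches <- c("00", "a0", "A0", "0b", "0B", "ab", "AB", "Ab", "aB")

# made-up input for illustration: 52 parameters (rows) by 1 scenario (column)
input <- matrix(0, nrow = 52, ncol = 1)
input[c(8, 14), 1]  <- c(0.5, 0.5)   # males: half unexposed, half in niche AB
input[c(17, 23), 1] <- c(0.1, 0.9)   # females: 0.1 unexposed, 0.9 in niche AB

i <- 1  # scenario number
exposure <- array(0, dim = c(2, length(niches)),
                  dimnames = list(sex = c("M", "F"), niche = niches))
exposure["M", ] <- input[8:16, i]
exposure["F", ] <- input[17:25, i]

# check that exposures total 1 within each sex (tolerant of rounding error)
totals <- rowSums(exposure)
if (!isTRUE(all.equal(unname(totals), c(1, 1)))) {
  stop("exposures must total one within each sex: ", paste(totals, collapse = ", "))
}
exposure["M", "AB"]
```

Using `all.equal()` rather than `!=` avoids false alarms when the nine values sum to one only up to floating point error.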

I should check that what we are doing fits in with what others are doing. We may be able to capitalise on existing tools.

popgen on CRAN seems a bit old and doesn't do a huge amount, no vignette.

gstudio looks more promising. e.g. it defines locus objects (see the gstudio documentation). A locus object is set like this, but I'm not sure if this would help us. It does have some stuff on drift at end of help.

require(gstudio)
loc <- locus(c("C", "A"))
loc

pegas provides functions for the analysis of allelic data and of haplotype data from DNA sequences. It requires and complements two other R packages: ape and adegenet. The pegas data structures include a class called loci: an object of class "loci" is a data frame where rows represent individuals and columns are loci and optional additional variables. Again I can't quite see how it would be useful for us.

There is a HardyWeinberg package on CRAN. "The HardyWeinberg package consists of a set of tools for analyzing diallelic genetic markers, and is particularly focused on the graphical representation of their (dis)equilibrium condition in various ways."

There is a CRAN genetics task view.

The genetics package which also has a locus class, seems to be more detailed, about chromosomes etc. and probably not of use to us.

Structure of the results matrices

max_gen <- 2 #just as example

# Set up results matrix - prints overall freq of R and S allele per locus per sex, LD and overall allele freq (i.e. 1)
results <- matrix ( nrow = max_gen, ncol = 11 )
colnames( results ) <- c( "Gen", "m.R1", "m.R2", "m.LD","f.R1", "f.R2", "f.LD", "M", "F", "dprime", "r2" )

# set up fitness by niche matrix - records fitness scores for each niche for each genotype
fitness <- matrix ( nrow = 10, ncol = 9, c(rep(0,90)))
colnames(fitness) <- c( "-,-", "a,-", "A,-", "b,-", "B,-", "a,b", "A,B", "A,b", "a,B" )
rownames(fitness) <- c( "SS1SS2", "SS1RS2", "SS1RR2", 
                        "RS1SS2", "RS1RS2_cis", "RS1RS2_trans", "RS1RR2",
                        "RR1SS2", "RR1RS2", "RR1RR2")

# set up genotype matrix - records frequencies of each of the 10 two locus genotypes each generation
genotype <- matrix( nrow=max_gen, ncol=11 )
colnames(genotype) <- c("gen", "SS1SS2", "SS1RS2", "SS1RR2", 
                        "RS1SS2", "RS1RS2_cis", "RS1RS2_trans", "RS1RR2",
                        "RR1SS2", "RR1RS2", "RR1RR2")

Somehow it seems I've broken it, I'm getting NAs in results matrix.

The input files are different (but same structure) between Beths version that works and my new version ... calibration is 100 in my failing version and 1012 in Beths, i think this is because I'm reading in the csv when I shouldn't.

I seem not to get to line 889 in my version # male f.m.SS1SS2 <- genotype.freq[1,]

Then suddenly it seemed to start working when I put a browser in. It's something tricky around line 711, if it doesn't get to 889 it doesn't fill in the results matrix ??

Aha! It seems to have been this, without the comment characters before the tildes, that stopped it from working

  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~       
  ### Loop to run the model from the initial conditions generated above ####
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

13/3/15 9.15-12.15 3hrs

A reproducible version of yesterdays bug.

x<-1
~
for(i in 1:10) x<-x+1
cat(x)
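
A possible explanation, based on how the R parser treats operators rather than on anything in the bug report: a lone `~` is an incomplete expression, so R reads on to the next line and wraps the `for` loop in a one-sided formula. A formula is stored unevaluated, so the loop never runs. A minimal check using `parse()`:

```r
code <- "x <- 1\n~\nfor(i in 1:10) x <- x + 1\ncat(x)"
exprs <- parse(text = code)

length(exprs)    # 3 expressions, not 4: the '~' line has merged with the loop
exprs[[2]]       # the merged expression

x <- 1
res <- eval(exprs[[2]])
class(res)       # "formula"
x                # still 1, because the loop was captured rather than run
```

On this reading it is documented parser behaviour rather than a bug in R, which would also explain why restoring the `#` before the tildes fixed the model loop.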

R bug reporting

can I create a list of the 3 lists for results ? Yes. Replace this
results.list <- list()
fitness.list <- list()
genotype.list <- list()
with
listOut <- list( results=list(), fitness=list(), genotype=list() )

Then find & replace
results.list with listOut$results
fitness.list with listOut$fitness
genotype.list with listOut$genotype
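
A minimal sketch of why the single list helps: a function can then return all three outputs at once. `runScenarios()` is a hypothetical stand-in for the model, filled with placeholder matrices of the dimensions noted earlier.

```r
# hypothetical stand-in for the model: returns all three outputs in one list
runScenarios <- function(nScenarios, max_gen = 2) {
  listOut <- list(results = list(), fitness = list(), genotype = list())
  for (i in seq_len(nScenarios)) {
    # placeholder matrices with the dimensions used in the notes above
    listOut$results[[i]]  <- matrix(NA_real_, nrow = max_gen, ncol = 11)
    listOut$fitness[[i]]  <- matrix(0, nrow = 10, ncol = 9)
    listOut$genotype[[i]] <- matrix(NA_real_, nrow = max_gen, ncol = 11)
  }
  listOut
}

listOut <- runScenarios(3)
names(listOut)
dim(listOut$results[[2]])   # results matrix for scenario 2
```

Scenario number still gives the position within each sub-list, so existing code that indexed `results.list[[i]]` only needs the name changed.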

16/3/15 mon 18.30-19.30 1hr

cut out lines 92 to 1740 into a runModel() function, later can subdivide it up

17/3/15 tues to liverpool 8.30-5.30 9hrs

Assuming that resistance to each insecticide was encoded at one locus respectively, and that these can either have a homozygous susceptible (SS), heterozygous (RS) or homozygous resistant (RR) genotypes at each locus, there are then ten possible genotypes (including cis & trans forms of the double heterozygous).

We consider two insecticides (A and B) at three possible levels: absent, low (ab) and high (AB), resulting in 9 niches.

Locus1 relates to resistance for insecticide A, and locus2 for B. Locus2 does not affect fitness under exposure to A.
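
The counts of 9 niches and ten genotypes can be checked by enumeration. This is only a sketch; the labels follow the naming used elsewhere in these notes.

```r
# 3 levels of insecticide A crossed with 3 levels of insecticide B
niches <- expand.grid(A = c("-", "a", "A"), B = c("-", "b", "B"),
                      stringsAsFactors = FALSE)
nrow(niches)   # 9

# 3 genotypes at locus 1 crossed with 3 at locus 2 ...
genotypes <- as.vector(outer(c("SS1", "RS1", "RR1"), c("SS2", "RS2", "RR2"), paste0))
# ... with the double heterozygote split into cis and trans forms
genotypes <- c(setdiff(genotypes, "RS1RS2"), "RS1RS2_cis", "RS1RS2_trans")
length(genotypes)   # 10
```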

Males and females can be exposed differently.

Curtis states that if resistance to one insecticide is present at very low levels in the population, then resistance will rise slower if this insecticide is used on its own, rather than in a combination with a more established insecticide.

This is the important Curtis statement “the use of a mixture where the initial gene frequencies are unequal leads to more rapid increase in the frequency of the rarer of the genes”

A key aspect of Curtis’ prediction was that this relationship would be determined by an increasing association between the presence of the two resistance alleles, measured as high levels of linkage disequilibrium.

The results given above seem to be in contrast with Curtis’ assumption, and suggest that in fact using the insecticides in combination can slow the spread of resistance.

My summary : Curtis suggests resistance spreads more quickly with 2 insecticides, due to increasing association between the alleles. Beths work suggests that combination can slow spread of resistance.

Good figure 1 in manuscript : The rate of spread of IR depends on whether the resistance mutation is dominant, semi-dominant, or recessive.

aha! I've only just understood this bit : At high concentrations only the RR individuals survive so resistance is recessive; as concentrations decline some RS mosquitoes survive making resistance semi-dominant; at low concentration both RR and RS mosquitoes survive, making resistance dominant. So the main problems of resistance arise when insecticide concentrations decline.
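
A numerical sketch of that argument, assuming one common parameterisation in which heterozygote fitness is w_RS = w_SS + h * (w_RR - w_SS), with h the dominance coefficient. This form and the survival values below are illustrative assumptions, not taken from the model code.

```r
# effective dominance implied by the survival of the three genotypes
dominance <- function(w_SS, w_RS, w_RR) (w_RS - w_SS) / (w_RR - w_SS)

# made-up survival values at three insecticide concentrations
conc <- data.frame(
  level = c("high", "medium", "low"),
  w_SS  = c(0, 0, 0),
  w_RS  = c(0, 0.5, 1),   # RS survival rises as the concentration declines
  w_RR  = c(1, 1, 1)
)
conc$h <- with(conc, dominance(w_SS, w_RS, w_RR))
conc   # h moves from 0 (recessive) to 1 (dominant)
```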

From manuscript about scenarios to run (the row in params csv is given at the end of each line):

Insecticide coverage for females: 0.1, 0.2, 0.6, 0.8, 0.9 (n=5); rows 9 & 15
Insecticide coverage for males: as for females, or half that of females (n=2); rows 18 & 24
‘New’ insecticide resistance starting freq: 0.0001, 0.001, 0.01 (n=3); row 6
‘Old’ insecticide resistance starting freq: 0.05, 0.1, 0.2, 0.5, 0.8 (n=5); row 7
Resistance heterozygous fitness: 0.5, 0.75, 1 (n=3 for each insecticide); rows ?28 & 30
Sensitive heterozygous fitness: 0, 0.2, 0.4 (n=3 for each insecticide); rows ?27 & 29
Resistance allele dominance: 0, 0.2, 0.5, 0.9, 1 for each locus (n=5 for each locus); rows 32, 35 (do to start)
Fitness costs: 0, 0.05, 0.1 (n=3); rows 43 & 44
Fitness dominance: 0, 0.5, 1 (n=3); rows 33 & 36

BUT now we are going to do it by sampling from a uniform distribution before doing decision trees. Then you can progressively add results to a file.

I could start by creating a document with the results at the extremes of the parameter space.

Send Ian & Beth devtools installation instructions and a document.

Approx. 60,000 combinations. Too many???
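
The figure can be reproduced by multiplying the number of values per parameter in the list above. This counts each "for each insecticide" or "for each locus" parameter once, which appears to be the reading that gives roughly 60,000; varying them independently per locus would give far more. The names are just labels for this sketch.

```r
# number of values per parameter, in the order of the list above
nValues <- c(coverageFemale = 5, coverageMale = 2, startFreqNew = 3,
             startFreqOld = 5, resistHetFitness = 3, sensHetFitness = 3,
             dominance = 5, fitnessCost = 3, fitnessDominance = 3)
prod(nValues)   # 60750
```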

would Latin hypercube sampling be better?

The simulation output was the ratio of the time (in generations) for resistance to the ‘new’ insecticide to reach a given allele threshold frequency when the ‘new’ insecticide was deployed as a combination, compared to the time taken to reach the same threshold if deployed on its own; a ratio >1 indicates that combination is better. Three threshold frequencies were investigated: 0.1, 0.25, 0.5.
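
A minimal sketch of how that output could be computed. `timeToThreshold()` is a hypothetical helper (a generalisation of the idea behind timetoFifty, not the package function) and the two trajectories are made-up logistic curves, not model output.

```r
# first generation at which an allele frequency exceeds a threshold
# (NA if it never does, avoiding the Inf that min(which(...)) would give)
timeToThreshold <- function(freq, threshold = 0.5) {
  hits <- which(freq > threshold)
  if (length(hits) == 0) NA_integer_ else hits[1]
}

# made-up logistic trajectories of the 'new' resistance allele
gens  <- 1:200
alone <- plogis((gens - 60) / 8)    # 'new' insecticide deployed on its own
combo <- plogis((gens - 90) / 8)    # deployed in combination

thresholds <- c(0.1, 0.25, 0.5)
ratio <- sapply(thresholds, function(th)
  timeToThreshold(combo, th) / timeToThreshold(alone, th))
names(ratio) <- thresholds
ratio   # values > 1 mean the threshold is reached later in combination
```

Returning `NA` when the threshold is never reached keeps such scenarios visible in the output instead of silently producing infinite ratios.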

These ratios were then analysed by regression to find out how the input parameters alter the ratio, and hence under what circumstances combination would be best/worst. Classification trees can also be done.

The intention was not to consider particular plausible deployment situations, but was to systematically investigate a wide range of parameter space to determine whether CD was better than single-insecticide deployment at reducing the rate at which the minor resistance form spreads.

Show my progress :

github repo
move functions to their own files and structure documentation
move parts of main file into functions
combine 3 outputs to a single list so that can be returned from a function
nearly able to create a package with structured documentation etc.

Questions for Beth & Ian

Curtis Table1 vi that shows resistance spreading faster with mixture.
is coverage same as exposure ? yes
what subset of parameters would they like me to vary to start
locations of sensitivity params in params.csv
what outputs are wanted for sensitivity scenarios ? num generations until resistance reaches threshold (0.1, 0.25 & 0.5) Curtis just used 0.5.

Talking with Ian & Beth

Sequential use.

For some scenarios, we'll need to run with one insecticide and then stop and start other.

When does mixture fail ? Use it until one resistance allele reaches critical point, then continue using 2nd insecticide until resistance for that one reaches critical point too.

Could create a version of timetoFifty that accepts threshold param.

Ian has created list of defaults in most recent manuscript.

The Curtis version is going to be simpler. Because not all param variations are included.

18/3/15 weds in liverpool 8.30-6 9.5hrs

Ian created suggested_sensitivity_analysis.doc

I created a variables table in there with Ians parameter ranges.

My poor paraphrasing of what I think Ian said about the sequential scenarios. The starting conditions from after one insecticide has 'failed' are the same as starting from scratch for the other insecticide because the insecticides do not affect selection on the other locus.

moving malaria_code_andy.r into a function resistanceMaster() exposed a couple of variable scoping problems.

plus now the graphs look different and I get : Warning message: In data.matrix(mat) : NAs introduced by coercion

this is from createInputMatrix() input <- make.matrix(input, input$Input)

this might be something to do with the location of "input.parameters.csv" which I need to sort ...

devtools commands for setting up a package : create requires that the directory doesn't exist yet; it will be created. setup assumes an existing directory from which it will infer the package name.

probably need to set rstudio arg to false because mine is already an rs project.

setup(rstudio=FALSE)

Then in RStudio I had to set project options, build tools to package (so maybe I should have just done devtools::setup())

but on build : ERROR: The build directory does not contain a DESCRIPTION file so cannot be built as a package. Build directory: C:/rsprojects/resistance

but there is a DESCRIPTION there ???

devtools::load_all(".")
Error: Line starting 'person("Bethany", "L ...' is malformed!
sorted, just needed tab before 2ry authors

package does build now

potential refactoring of code, if i think carefully, using arrays I should be able to replace this :

    if(calibration==103){       ## no selection calibration
        ## male
        # SS1
        fs.m.SS1SS2 <- f.m.SS1SS2
        fs.m.SS1RS2 <- f.m.SS1RS2
        fs.m.SS1RR2 <- f.m.SS1RR2
        # RS1
        fs.m.RS1SS2 <- f.m.RS1SS2 
        fs.m.RS1RS2_cis <- f.m.RS1RS2_cis
        fs.m.RS1RS2_trans <- f.m.RS1RS2_trans
        fs.m.RS1RR2 <- f.m.RS1RR2
        # RR1
        fs.m.RR1SS2 <- f.m.RR1SS2
        fs.m.RR1RS2 <- f.m.RR1RS2
        fs.m.RR1RR2 <- f.m.RR1RR2

        ## female
        # SS1
        fs.f.SS1SS2 <- f.f.SS1SS2
        fs.f.SS1RS2 <- f.f.SS1RS2
        fs.f.SS1RR2 <- f.f.SS1RR2
        # RS1
        fs.f.RS1SS2 <- f.f.RS1SS2 
        fs.f.RS1RS2_cis <- f.f.RS1RS2_cis
        fs.f.RS1RS2_trans <- f.f.RS1RS2_trans
        fs.f.RS1RR2 <- f.f.RS1RR2
        # RR1
        fs.f.RR1SS2 <- f.f.RR1SS2
        fs.f.RR1RS2 <- f.f.RR1RS2
        fs.f.RR1RR2 <- f.f.RR1RR2

with something like : fs <- f

where arrays are set up as :

#to create an array a[sex][locus1][locus2]
namesLoci <- c('SS','RS','RR')
locus1 <- paste0(namesLoci,'1')
locus2 <- paste0(namesLoci,'2')
sex <- c("F","M")

dimnames1 <- list( sex=sex, locus1=locus1, locus2=locus2 )
dim1 <- sapply(dimnames1, function(x) length(x))
a <- array(0,dim=dim1, dimnames=dimnames1)
a
length(a)
#to access one element
a['M','SS1','RR2']
#to check that males total 1
sum(a['M',,])

Just need to work out how best to deal with cis & trans. This would be a less pleasing alternative

namesLoci <- c('SS1SS2','SS1RS2','SS1RR2',
               'RS1SS2','RS1RS2','RS1RR2',
               'RR1SS2','RR1RS2','RR1RR2')

#adding cis & trans
namesLoci <- c('SS1SS2','SS1RS2','SS1RR2',
               'RS1SS2','RS1RS2cis','RS1RS2trans','RS1RR2',
               'RR1SS2','RR1RS2','RR1RR2')

sex <- c("F","M")

dimnames1 <- list( sex=sex, namesLoci=namesLoci )

dim1 <- sapply(dimnames1, function(x) length(x))
a <- array(0,dim=dim1, dimnames=dimnames1)
a
length(a)
#to access one element
a['M','SS1SS2']
#to check that males total 1
sum(a['M',])
str(a)

19/3/15 fri 9.45-14 4.25hrs 3.30-7 3.5hrs 7.75hrs

checking whether I can run without the input file. Fixed now.

#require(devtools)    
#install_github('AndySouth/resistance') 
library(resistance)
resistanceMaster(params.csv = FALSE)

This is what the fitness scores that can be saved for each scenario as *two-locus_fitness-scores.csv look like :

        -,-  a,-  A,-  -,b  -,B  a,b  A,B       A,b  a,B
SS1SS2  1    0    0    0    0    0    0         0    0
SS1RS2  1    0    0    0    0    0    1.86E-05  0    0
SS1RR2  1    0    0    0    0    0    0.1161    0    0
RS1SS2  1    0    0    0    0    0    0         0    0
RS1RS2  1    0    0    0    0    0    2.13E-05  0    0
RS1RR2  1    0    0    0    0    0    0.132913  0    0
RR1SS2  1    0    0    0    0    0    0         0    0
RR1RS2  1    0    0    0    0    0    3.44E-05  0    0
RR1RR2  1    0    0    0    0    0    0.215     0    0

Starting resistSimple() to be able to run single simple Scenarios.

20/3/15 sat 10.15-2.15 4hrs

recent popgen hack Doesn't seem that useful for this project.

Created setInputOneScenario to enable setting inputs for a single scenario
+ sensible defaults
+ later will allow customised scenarios e.g. setInputCurtisFig1
+ will reduce code volume and repetition in createInputMatrix

Good progress : check out e.g. resistSimple(P_1=0.01,P_2=0.3)

23/3/15 mon 9.45-11.45 2hrs 5-8 5hrs

created a modified plot allele freq function to be able to see overlapping lines

Default settings :

resistSimple()

Reducing dominance coefficient of locus1 from 1 shifts curves to right & separates locus 1 & 2

resistSimple(h.RS1_A0=0.17)

But then setting dominance coefficient of locus2 to anything less than 0.7 grounds the curves (actually by pushing them further to right).

resistSimple(h.RS1_A0=0.17,h.RS2_0B=0.7)

Reducing the selection coefficient for locus1 in A (from 1) pushes curves to right, but strangely doesn't separate locus 1 & 2 ??

resistSimple(s.RR1_A0=0.17)

Put variable names into the suggested sensitivity analysis table.

Thinking about creating a shiny app(s) within the resistance package. Advice from RStudio

Put your Shiny application directory under the package’s inst directory, then create and export a function that contains something like this:

shiny::runApp(system.file('appdir', package='packagename'))

I might be able to use something like this to put inputs in columns.

Or perhaps better to keep it to a single column to start ?

Starting frequency of resistance Locus1 Locus2

Exposure to insecticide (same for M&F & in Curtis for both insecticides)

Fitness of susceptibles in presence of insecticide. Insecticide1 Insecticide2

Dominance of resistance Locus1 Locus2

Selective advantage of resistance Locus1 Locus2

It could perhaps be 2 columns, Locus1 & Locus2

Or to fit better with widget titles, 2 rows (l1&2) with 5 columns.

24/3/15 tue 9.45-2.15 4.5hrs

created function to run shiny UI from package, and add it to readme

there was a bug that dominance had no effect, because I was setting h.RS1_00 rather than h.RS1_A0 & h.RS1_0B

Cool! Seems to be working well now.

Sent message to Ian & Beth to test installing from Github and the shinyUI.

25/3/15 weds 12.30-13.45 1.25 2.30-4 1.5 2.75

created a time record

Barbosa paper methods: The analysis was performed using R & package lhs. It does not allow for the specification of each variable distribution beforehand, so sampling was performed assuming a uniform distribution. Once the sample was generated, the uniform sample from a column (variable) could be transformed to the required distribution (Table 3) by using quantile functions (using the qtriangle command in R).

A data set of 3,000 replications was generated, with random parameter sets. Ten replicates of this procedure were performed as suggested in [24] to investigate the predictive precision of model using LHS as the sampling method. This was achieved by analysing each replicate separately and verifying that results were consistent across ten replicates.
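
To see what the Latin hypercube step involves, here is a base R sketch that does not use the lhs package. Each parameter's [0,1] range is cut into n equal strata, one point is drawn per stratum, and the strata are shuffled independently for each parameter. The uniform sample is then mapped to the required range with a quantile function (`qunif` here, where the paper used triangular quantiles). `latinHypercube()` is a hypothetical helper and the parameter ranges are placeholders.

```r
# n points in k dimensions, one point per stratum in every dimension
latinHypercube <- function(n, k) {
  sapply(seq_len(k), function(j) (sample(n) - runif(n)) / n)
}

set.seed(1)
n <- 10
u <- latinHypercube(n, k = 2)
colnames(u) <- c("P_1", "h.RS1_A0")

# transform the uniform sample to each parameter's range via quantile functions
samples <- data.frame(
  P_1      = qunif(u[, "P_1"], min = 0.0001, max = 0.01),
  h.RS1_A0 = qunif(u[, "h.RS1_A0"], min = 0, max = 1)
)

# each of the n strata is hit exactly once per parameter
table(ceiling(u[, "P_1"] * n))
```

Compared with plain `runif()` sampling, this guarantees even coverage of every parameter's range for a given number of runs.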

...

Results included a counter-intuitive outcome that the inclusion of a synergist could lead to an increase in the rate of the spread of resistance (i.e. y > 1). Further investigation of this result was pursued by performing a logistic regression with a binary dependent variable (1 if y > 1 and 0 if y < 1), therefore quantifying how changes in the parameter values affect the odds of getting the unexpected outcome y > 1.

...

Classification trees are used to predict membership of cases in the classes of a categorical dependent variable (1 if y > 1 or 0 if y < 1) from their input parameters and were implemented using an algorithm that grows a binary tree [27]. At each internal node in the tree, a test is applied to the input parameters to identify the binary distinction which gives the most information about the class membership.

These are the references that refer to the classification trees :
27. Breiman L: Classification and Regression Trees. Belmont (Calif.): Wadsworth International Group; 1984: 358.
28. Therneau T, Atkinson EJ: An introduction to recursive partitioning using the RPART routines 2011:1–67. [r.789695.n4.nabble.com/attachment/3209029/0/zed.pdf].

seems that recursive partitioning using rpart is the R software used.

rpart package Recursive partitioning for classification, regression and survival trees. An implementation of most of the functionality of the 1984 book by Breiman et al.

rpart intro vignette updated version of the reference.

First example from the vignette (p10). This works on a binary response yes/no. method = 'class' is for categorical.

library(rpart)
#uses stagec, a dataframe
str(stagec)
#creates a factor for response var to improve labelling
progstat <- factor(stagec$pgstat, levels = 0:1, labels = c("No", "Prog"))
#fit the model by adding the columns
cfit <- rpart(progstat ~ age + eet + g2 + grade + gleason + ploidy, data = stagec, method = 'class')
plot(cfit) #plots tree
text(cfit) #labels tree

So that seems to suggest that I'll need to output a dataframe with one row per simulation, columns for outputs (generations until resistance reached) and inputs. Will need to run it by Ian exactly what the outputs are.

p25 in vignette has an example with multiple categories. Seems there are methods that work on continuous data too, which might be our case ...
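
A sketch of what that might look like for the sensitivity analysis, with one row per simulation. The data are simulated purely for illustration (the column names `exposure`, `dominance` and `ratio` are placeholders, and the relationship between them is invented). `method = "class"` fits the binary outcome and `method = "anova"` fits the continuous ratio directly.

```r
library(rpart)

# made-up sensitivity-analysis output: one row per simulation,
# input parameters plus the output ratio (combination time / sole-use time)
set.seed(1)
n <- 200
sims <- data.frame(exposure = runif(n), dominance = runif(n))
sims$ratio <- 0.5 + sims$dominance + rnorm(n, sd = 0.05)

# categorical response: is combination better (ratio > 1)?
sims$better <- factor(sims$ratio > 1, levels = c(FALSE, TRUE), labels = c("No", "Yes"))
treeClass <- rpart(better ~ exposure + dominance, data = sims, method = "class")

# continuous response: model the ratio itself with a regression tree
treeReg <- rpart(ratio ~ exposure + dominance, data = sims, method = "anova")

plot(treeClass); text(treeClass)
```

Because the invented ratio depends only on `dominance`, the fitted trees should split on that column, which is a quick sanity check that the data frame layout works with rpart.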

30/3/15 mon 10.30 - 11.30 1hr

RSTMH digital methods conference. KDR insecticide resistance gene. Turning 'big' data about how resistance spreads into epidemiological models and then into DSS about control and eradication is very complex. Malaria genomics: tracking a diverse and evolving parasite population Professor Dominic Kwiatkowski, University of Oxford and Wellcome Trust Sanger Institute Paul Ben Allistere, Panoptes web app https://www.malariagen.net/apps/ag1000g/phase1-AR2/index.html#genomebrowser

2/4/15 1hr

emails from Ian & Marlize Coleman about insecticide resistance gaming.

Ian : Marlize Coleman has funding from B&MGF to build computer games of insecticide deployment policies and wants to tie them to the outputs of various IR models.

Marlize : At the moment we need someone to look at current mathematical models that have any relevance to insecticide resistance management and determine if there is any potential to use them, or just components of the model/s, to create scenarios for an IR game. We are planning to start development of the game in July so we need to get a game scope sorted pretty soon. It will also be important for this person to communicate and discuss these scenarios/simulations with the game development team to ensure that they have a good grasp of the backend mathematical detail.

We want to expand on this version by adding a bit more complexity in the backend and we’d like to create a bit more flexibility e.g. choice of vectors, disease prevalence/incidence, interventions, specific learning outcomes of interest etc.

If you are interested, we can discuss more detail. Probably 3 months full time or 6 months 2-3 days a week. You might be better placed to tell me how much effort will be required. Perhaps more time in the first few months and less during game development for assistance to developers.

Me to Ian : Will be good to talk to you about potential IR models after I've spoken to Marlize. Do you imagine that a simplified version of Curtis might be useful for her ?

Ian : I definitely think the 2locus Curtis model should be included. I don't think it necessarily needs to be simple. We may restrict the user to certain parameter values then preload the results from all possible permutations. We can discuss later. BW Ian

13/4/15 mon 9.30 - 13.30 4hrs 2.30-6 3.5hrs 7.5hrs

looking at new serious gaming project ... & can offset these hours on that

publish shinyCurtis1 to shinyapps.io to enable sharing with Marlize. Using the publish button from RStudio worked well. It gives this warning on installation, but then does work from web.
Warning message: In FUN(c("shiny", "resistance")[[2L]], ...) : Package 'resistance' not available in repository or locally

https://andysouth.shinyapps.io/shinyCurtis1/

Options :
1. run R code from within the game
2. translate model modules into the game language
3. save model outputs as input files that can modify game behaviour

Points to discuss with Marlize :

I am mostly committed until end of June, but might be able to shuffle. If we know what they want from the design, we could work on the model behind to produce this after the game design has been started.
might they want to run actual model code or translate it ?
I don't know how much info there is out there on the effect on resistance of different insecticide applications, but am happy to research in discussion with Ian, and produce working versions.
I would be a good intermediary between the biologists and the coders, as I'm somewhere in between myself !

Imperial malaria models :
http://www1.imperial.ac.uk/resources/27F31FE6-4CA6-4C97-85BD-58B342BC2CDA/usingthemalariatoolssoftwareforesp.pdf
http://www1.imperial.ac.uk/malariamodelling/toolsdata/tools/
On download I got an error msg saying : "The program can't start because VCOMP120.DLL is missing from your computer. Try reinstalling the program to fix"

http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1000324 Reducing Plasmodium falciparum Malaria Transmission in Africa: A Model-Based Evaluation of Intervention Strategies

emailed them to see about the error

14/4/15 tue 11.45 - 14.15 2.5hrs 3-5.30 2.5 5hrs

can I create a plotting method that can accept multiple scenarios ?
initially it could just plot all scenarios as additional lines in the existing colours for locus & sex
use plotallele.freq.andy, might need to modify to accept more than one matrix of results outputs
I thought tricky bit might be to set plot bounds based on multiple lines (but actually the y is already hardcoded from 0-1 for allele frequency). Might need to be a bit careful with x because different scenarios can have different numbers of generations.

done : fixed why resistanceMaster() currently fails. I suspect due to input scenario calibration combination, it used to work ??

Error in 1:ddt_cutoff : result would be too long a vector
In addition: Warning messages:
1: In data.matrix(mat) : NAs introduced by coercion
2: In min(which((amat[, r1col]) > 0.5)) : no non-missing arguments to min; returning Inf

the error comes from plotcurtis_f2(). I suspect it's because the input.parameters.csv isn't the one compatible with producing the curtis plot. plotcurtis_f2() needs results matrices from 3 scenarios: 1 hch, 2 ddt, 3 combined. Yet the current csv I have in the package seems to have 7 scenarios :
Input,Curtis Fig 2 Comb - calibration,DDT high,DDT high,DDT high,HCH high,HCH high,HCH high
Why did it work before then ?? Aha! it's because in createInputMatrix() this sets up the 3 Curtis scenarios if an input file is not specified :
if( !params.csv && calibration == 1012 ){
  input <- matrix( ncol=3, nrow=52 )
  colnames(input) <- c("B (HCH)", "A (DDT)","Combination")
So maybe I should have no input csv as the default.

setting params.csv=FALSE as the default for resistanceMaster() sorted it.

Back to plotallele.freq.andy, I might be able easily to modify it to accept a list of matrices for multiple scenarios, as well as its current single scenario mode.

this gives me the number of generations per scenario
sapply(listOut$results,nrow)
70 70 160
max_gen <- max( sapply(listOut$results,nrow) )

done : create a plotting method that can accept multiple scenarios
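
A minimal sketch of the idea, not the package's plotallele.freq.andy: set the x axis from the longest scenario, keep y fixed at 0 to 1, then add one line per scenario. The results matrices are made up (logistic curves in an `m.R1` column) and `makeResults()` is a hypothetical helper.

```r
# made-up results matrices with different numbers of generations per scenario
makeResults <- function(max_gen, mid) {
  cbind(Gen = 1:max_gen, m.R1 = plogis((1:max_gen - mid) / 5))
}
resultsList <- list(makeResults(70, 30), makeResults(70, 40), makeResults(160, 100))

# x axis must span the longest scenario; y is fixed at 0-1 for allele frequency
max_gen <- max(sapply(resultsList, nrow))
plot(NULL, xlim = c(1, max_gen), ylim = c(0, 1),
     xlab = "Generation", ylab = "Allele frequency")
for (res in resultsList) {
  lines(res[, "Gen"], res[, "m.R1"], col = "blue")
}
```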

now try to use it to plot a parameter range. To run multiple scenarios one way would be to call setInputOneScenario() repeatedly to create single scenarios & then cbind these together (one column per scenario) into an input object that I can pass to runModel()

input <- NULL
vals <- seq(0,1,0.2) #create a range of input values
for(i in vals)
{
 inputOneScenario <- setInputOneScenario( h.RS1_A0=i )  
 input <- cbind(input, inputOneScenario)
}

listOut <- runModel(input)
plotallele.freq.andy(listOut)

This is getting close to allowing the user to specify an input parameter and what range of that parameter you want to plot. I could then potentially create a UI allowing the user to do that. Would it be useful to put the above code into a function ? Actually the above code is very close to what I would need for the sensitivity analysis.

I could create a setInputMultiScenario() function that accepts a vector of values for a param, and creates an input object with all other params set to their default. It could maybe even accept multiple args, but then would need to return the n*n scenarios. Also this is going away from the way that Ian suggested doing it by sampling from a distribution, maybe I should try that route first ?

In that case do I want to randomly change all params at once ? Also might be good to get it to output a file at the end of each 100 runs, so that if it does crash during a long run, everything is not lost. These could just be RDA files saved from the listOut object. Do I want it to output listOut() each time or will I process so that the outputs are smaller ? (Might be better to keep the raw outputs to save any confusion later.)
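A minimal sketch of the "save every 100 runs" idea. It uses saveRDS() rather than RDA files (a swapped-in choice, one object per file). runOneScenario() is a hypothetical stand-in for a real model run, and the chunk size, folder and file names are assumptions.

```r
# hypothetical stand-in for one model run, returns a small results table
runOneScenario <- function(i) {
  data.frame(gen = 1:3, freq = i * c(0.001, 0.002, 0.003))
}

# write checkpoints into a fresh temporary folder (assumed location)
outDir <- tempfile("sensiCheckpoints")
dir.create(outDir)

nRuns <- 250
chunk <- 100
results <- list()

for (i in seq_len(nRuns)) {
  results[[length(results) + 1]] <- runOneScenario(i)
  # save and clear the buffer every 'chunk' runs, and at the very end
  if (i %% chunk == 0 || i == nRuns) {
    saveRDS(results, file.path(outDir, sprintf("runs_upto_%04d.rds", i)))
    results <- list()
  }
}

# later: read all checkpoint files back into one list of raw outputs
files <- list.files(outDir, pattern = "\\.rds$", full.names = TRUE)
allResults <- do.call(c, lapply(files, readRDS))
length(allResults)
```

If a long run crashes part way through, only the runs since the last checkpoint would be lost.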

The sensitivity analysis function may be called a bit like this :

#specify ranges for parameters, if a range is not specified the default will be used
#could offer option to set other params at diff fixed values
#this would only accommodate uniform distributions, OK for now
#think about random seeds later
#be careful about some params which must total one
input <- setInputSensiScenarios( n=100, h.RS1_A0=c(0,1) )

Am I reinventing the wheel here, can anything else help me with this ?
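A rough sketch of how such a function might sample uniform ranges passed through `...`, leaving every other parameter at its default. sampleScenariosSketch() and the four default values are hypothetical (the real input has 52 parameters), and it does nothing yet about params that must total one.

```r
# hypothetical defaults, just enough parameters to illustrate
defaultsSketch <- c(P_1 = 0.001, P_2 = 0.001, h.RS1_A0 = 1, s.RR1_A0 = 0.5)

sampleScenariosSketch <- function(nScenarios, ..., defaults) {
  ranges <- list(...)  # each named arg is c(min, max)
  unknown <- setdiff(names(ranges), names(defaults))
  if (length(unknown) > 0) {
    stop("unknown parameter(s): ", paste(unknown, collapse = ", "))
  }
  # one column per scenario, every column starts as the defaults
  input <- matrix(defaults, nrow = length(defaults), ncol = nScenarios,
                  dimnames = list(names(defaults), NULL))
  # overwrite the rows for which a range was given
  for (p in names(ranges)) {
    input[p, ] <- runif(nScenarios, min = ranges[[p]][1], max = ranges[[p]][2])
  }
  input
}

set.seed(42)
inputSketch <- sampleScenariosSketch(5, h.RS1_A0 = c(0, 1), defaults = defaultsSketch)
dim(inputSketch)
```

Parameters without a range never appear in `...`, so there is no need to give them defaults in the function signature.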

15/4/15 wed 1.30-2.30 1hrs 3.45-6.15 2.5 3.5hrs

setInputSensiScenarios function to create sensitivity analysis scenarios, probably by creating a large input object.

16/4/15 thurs 10-14 4hrs 2.30-5 2.5 6.5hrs

To get input params into 1 row per scenario I can transpose, but the result is a matrix and has no names. But I need to find a way of getting column names set.

input <- setInputSensiScenarios( nScenarios=3, P_1=c(0,0.5), P_2=c(0,0.5) )
tinput <- t(input)
str(tinput)
#num [1:3, 1:52] 100 100 100 100 100 100 1 1 1 0 ...

putting names into setInputOneScenario() fixed this.

Having a problem with sensiAnalysis1() Error in eval(..1) : ..1 used in an incorrect context, no ... to look in

This may be because setInputSensiScenarios() expects a range for each argument, whereas setInputOneScenario expects a single value for the same named args. (but I don't really think it is this)

Also stackoverflow pointed out that using c for a varname (and there is one in resistance) is dangerous because R can expect the function.

replaced c with recomb_rate for "Recombination Rate" to avoid R conflicts with c() previous ... error still remained

Aha! i think problem is that I specify defaults in setInputSensiScenarios when I don't need to (and that stops them making it into the ...)

cool yes, getting there.

22/4/15 jury service 11.30 - 12.30 1 hr

out <- sensiAn1( nScenarios = 10, h.RS1_A0=c(0.1,1))

24/4/15 4-5 1hr

27/4/15 9.15-13.15 4hrs 1.45-2.45 5hrs

trying to test rpart in sensiAn1 getting : cfit <- rpart(res ~ h.RS1_A0 + h.RS2_0B, data = forCT, method = 'class') Error in cbind(yval2, yprob, nodeprob) : number of rows of matrices must match (see arg 2)

This was because there were some NAs in the results that came from where a resistance frequency of 0.5 was not reached, so I temporarily converted them to 999

Now get : Error in plot.rpart(cfit) : fit is not a tree, just a root

Is this because I haven't used enough samples or because it doesn't work with just 2 predictors ?

Trying with more samples, still get same error :

inputAndResults <- sensiAn1( 100, h.RS1_A0=c(0.1,1), h.RS2_0B=c(0.1,1) )

more samples and more inputs I do get a tree inputAndResults <- sensiAn1(500, h.RS1_A0=c(0.1,1), h.RS2_0B=c(0.1,1), s.RR1_A0=c(0.2,1), s.RR2_0B=c(0.2,1))

Some of Susana's code for finessing the tree :

    tree <- rpart(outcome ~ h_n+ h_o+h_I+s_o+s_I+φ_o+φ_I+α_o+α_I+α_fn+α_mo+α_mI+α_mn+z+beta_f+β_m, method='class', data=working_example, control=rpart.control(minsplit=50))

control=rpart.control(minsplit=50)

minsplit the minimum number of observations that must exist in a node in order for a split to be attempted.

then prune the tree to avoid overfitting the data

pruned_tree<- prune(tree, cp=tree$cptable[which.min(tree$cptable[,"xerror"]),"CP"])
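A small self-contained sketch of the fit-then-prune steps on made-up data, assuming the rpart package is installed. The data frame, the predictor names h1 and h2, and the rule that generates the outcome are all invented for illustration. It also checks for the "just a root" case before plotting.

```r
library(rpart)

set.seed(1)
n <- 500
# invented predictors sampled from the same kind of range as above
forCTsketch <- data.frame(h1 = runif(n, 0.1, 1), h2 = runif(n, 0.1, 1))
# invented outcome that depends on h1 only, so a split should be found
forCTsketch$res <- factor(ifelse(forCTsketch$h1 > 0.5, "combination", "single"))

cfit <- rpart(res ~ h1 + h2, data = forCTsketch, method = "class",
              control = rpart.control(minsplit = 50))

# prune back to the complexity parameter with the lowest cross-validated error
best <- which.min(cfit$cptable[, "xerror"])
prunedSketch <- prune(cfit, cp = cfit$cptable[best, "CP"])

# a root-only tree has a single row in $frame, plot() would fail on it
rootOnly <- nrow(prunedSketch$frame) == 1
rootOnly
```

When rootOnly is FALSE the tree can be drawn with plot(prunedSketch) followed by text(prunedSketch).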

28/4/15 10-14.30 4.5hrs 3-6 7.5hrs

reading Ians recent manuscript update. I think this from p13 is key on what I need to do with the analysis :

The simulation output was the ratio of the time (in generation) for resistance to the ‘new’ insecticide to reach a given allele threshold frequency when the ‘new’ insecticide was deployed as combination, compared to the time taken to reach the same threshold if deployed on its own; a ratio >1 indicates that combination is better. Five threshold frequencies were investigated: 0.05, 0.1, 0.2, 0.5, 0.8.

In order to do that I'd need to repeat each scenario for a single & combined insecticide treatment. Single is just the new rarer insecticide, Combined also has an older insecticide to which initial resistance is more common.
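A sketch of how that ratio might be calculated from two frequency trajectories. timeToThreshold() is a hypothetical helper, the logistic curves are invented stand-ins for model output, and I assume element 1 of each vector is generation 1. It returns NA when the threshold is never reached, rather than the Inf that min(which()) gives.

```r
# first generation at which freq exceeds threshold, NA if never reached
timeToThreshold <- function(freq, threshold) {
  hit <- which(freq > threshold)
  if (length(hit) == 0) return(NA_integer_)
  hit[1]
}

gens <- 1:200
# invented trajectories: resistance rises later under the combination
freqSole <- plogis((gens - 40) / 5)
freqComb <- plogis((gens - 90) / 5)

thresholds <- c(0.05, 0.1, 0.2, 0.5, 0.8)
tSole <- sapply(thresholds, function(th) timeToThreshold(freqSole, th))
tComb <- sapply(thresholds, function(th) timeToThreshold(freqComb, th))

# ratio > 1 means combination took longer to reach the threshold
ratioSketch <- tComb / tSole
names(ratioSketch) <- thresholds
ratioSketch
```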

Searching for the Curtis paper. Cited by 181 papers according to Google. https://www.google.co.uk/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=%22theoretical%20models%20of%20the%20use%20of%20insecticide%20mixtures%22

https://books.google.co.uk/books?id=ooHjBwAAQBAJ&pg=PA177&lpg=PA177&dq=%22theoretical+models+of+the+use+of+insecticide+mixtures%22&source=bl&ots=b3CPTQc5LL&sig=FCB5MCvX1xUjtfgog1z_dq1pH1Y&hl=en&sa=X&ei=bXw_VbriFoWpsAGnvYG4Ag&ved=0CEEQ6AEwBA#v=onepage&q=Curtis&f=false

Pesticide Resistance in Arthropods (1990) edited by Richard Roush, Bruce E. Tabashnik

these 2 chapters look useful : Roush, R. T., and J. C. Daly. 1990. The role of population genetics in resistance research and management, pp. 97-152. In: Pesticide resistance in arthropods. R. T. Roush and B. E. Tabashnik (Eds.) Chapman and Hall, New York.

Tabashnik, B. E. 1990. Modeling and evaluation of resistance management tactics, pp. 153-182. In: Pesticide resistance in arthropods. R. T. Roush and B. E. Tabashnik (Eds.) Chapman and Hall, New York.

page 131 talks about insecticide mixtures. Refers to some modelling by Roush (1989a)

Roush, R. T. 1989. Designing resistance management programs: How can you choose? Pestic. Sci. 26:423-441.

p467 in Insect Resistance Management: Biology, Economics, and Prediction edited by David W. Onstad (2013) seems to have some simulations addressing similar issues to us.

https://books.google.co.uk/books?id=6hp384ZH0_kC&pg=PA453&dq=%22Modeling+and+evaluation+of+resistance+management+tactics%22&source=gbs_toc_r&cad=4#v=onepage&q=%22Modeling%20and%20evaluation%20of%20resistance%20management%20tactics%22&f=false

Now have sorted passing user args to rpart in sensiAn1(). It does produce reasonable looking trees just by e.g.

inputAndResults <- sensiAn1(100, h.RS1_A0=c(0.1,1), h.RS2_0B=c(0.1,1), s.RR1_A0=c(0.2,1), s.RR2_0B=c(0.2,1))

29/4/15 10.30-12.30 2hrs 3.15-5.15 4hrs

e.g. W doesn't have cis/trans I could refactor

        # relaxed selection fitnesses
        ## Males
        W.m.SS1SS2 <- 0.1 
        W.m.SS1RS2 <- 0.1
        W.m.SS1RR2 <- 0.1

        W.m.RS1SS2 <- 0.1
        W.m.RS1RS2 <- 0.1  
        W.m.RS1RR2 <- 0.1  

        W.m.RR1SS2 <- 0.1 
        W.m.RR1RS2 <- 0.1 
        W.m.RR1RR2 <- 0.1  

        ## Female
        W.f.SS1SS2 <- 0.1
        W.f.SS1RS2 <- 0.1
        W.f.SS1RR2 <- 0.1

        W.f.RS1SS2 <- 0.1
        W.f.RS1RS2 <- 0.1 
        W.f.RS1RR2 <- 0.1 

        W.f.RR1SS2 <- 0.1
        W.f.RR1RS2 <- 0.1 
        W.f.RR1RR2 <- 0.1
# to :
#to create an array a[sex][locus1][locus2]
namesLoci <- c('SS','RS','RR')
locus1 <- paste0(namesLoci,'1')
locus2 <- paste0(namesLoci,'2')
sex <- c("f","m")

dimnames1 <- list( sex=sex, locus1=locus1, locus2=locus2 )
dim1 <- sapply(dimnames1, function(x) length(x))
W <- array(0,dim=dim1, dimnames=dimnames1)

#to fill all elements as above
W[] <- 0.1
#check length
length(W)
#to access one element
W['m','SS1','RR2']
#to check that males total 1
sum(W['m',,])

working on refactoring variables to arrays, made flexible createArray() function. good progress started runModel2() to contain the refactoring

30/4/15 9.45-13.30 3.75hrs 2.15-5.30 3.25hrs 6-7 8hrs

Need to be careful about different W fitness variables W.RR1_00 : [locus,niche] single locus (Wlocus) W.RR1SS2_0b : [locus1, locus2, niche] niche (Wniche) W.m.SS1SS2 : [sex, locus1, locus2] sex (Windiv)

Once I have array refactoring working with these arrays :

    a <- createArray( sex=c('m','f'), niche1=c('0','a','A'), niche2=c('0','b','B') )
    Wniche <- createArray( locus1 = c('SS1','RS1','RR1'), locus2 = c('SS2','RS2','RR2'), niche1=c('0','a','A'), niche2=c('0','b','B') )    
    Windiv <- createArray( sex=c('m','f'), locus1 = c('SS1','RS1','RR1'), locus2 = c('SS2','RS2','RR2') )

I should be able to replace

    W.m.SS1SS2 <- (a.m_00 * W.SS1SS2_00) + 
      (a.m_a0 * W.SS1SS2_a0) + (a.m_A0 * W.SS1SS2_A0) + 
      (a.m_0b * W.SS1SS2_0b) + (a.m_0B * W.SS1SS2_0B) + 
      (a.m_ab * W.SS1SS2_ab) + (a.m_AB * W.SS1SS2_AB) + 
      (a.m_Ab * W.SS1SS2_Ab) + (a.m_aB * W.SS1SS2_aB)

with

    Windiv['m','SS1','SS2'] <- sum( a['m',,] * Wniche['SS1','SS2',,])

To prove that the above works

    #set constant
    a['m',,] <- 2 
    #set variable
    Wniche['SS1','SS2',,] <- 1:9
    Wniche['SS1','SS2',,]

    #showing that each element gets multiplied by 2
    a['m',,] * Wniche['SS1','SS2',,]

    #show result of sum
    sum( a['m',,] * Wniche['SS1','SS2',,])

    Windiv['m','SS1','SS2'] <- sum( a['m',,] * Wniche['SS1','SS2',,])

Gives a result of 90 for just mSS1SS2 which is correct

, , locus2 = SS2

   locus1
sex SS1 RS1 RR1
  m  90   0   0
  f   0   0   0

, , locus2 = RS2

   locus1
sex SS1 RS1 RR1
  m   0   0   0
  f   0   0   0

, , locus2 = RR2

   locus1
sex SS1 RS1 RR1
  m   0   0   0
  f   0   0   0

Or indeed might be able to replace the whole 120 lines of code from 510-630 with : No these give:

#Error in a[, , ] * Wniche[, , , ] : non-conformable arrays
    Windiv['m',,] <- sum( a['m',,] * Wniche[,,,])
    Windiv['f',,] <- sum( a['f',,] * Wniche[,,,])
    #or even just
    Windiv[,,] <- sum( a[,,] * Wniche[,,,])

Just needs a little more thought ... It's because a needs to be applied repeatedly to each genotype. Maybe worry about that later ..
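One possible way round the non-conformable error, as a sketch only: use apply() over the two locus dimensions so that the 3x3 exposure array for one sex gets applied to each genotype's 3x3 slice in turn. The arrays are built here with base array() rather than createArray(), and the test values are the same as in the check above.

```r
sex    <- c('m', 'f')
locus1 <- c('SS1', 'RS1', 'RR1')
locus2 <- c('SS2', 'RS2', 'RR2')
niche1 <- c('0', 'a', 'A')
niche2 <- c('0', 'b', 'B')

a <- array(0, dim = c(2, 3, 3),
           dimnames = list(sex = sex, niche1 = niche1, niche2 = niche2))
Wniche <- array(0, dim = c(3, 3, 3, 3),
                dimnames = list(locus1 = locus1, locus2 = locus2,
                                niche1 = niche1, niche2 = niche2))
Windiv <- array(0, dim = c(2, 3, 3),
                dimnames = list(sex = sex, locus1 = locus1, locus2 = locus2))

# same test values as before
a['m', , ] <- 2
Wniche['SS1', 'SS2', , ] <- 1:9

# for each sex, sum exposure * fitness over niches for every genotype
for (s in sex) {
  Windiv[s, , ] <- apply(Wniche, c(1, 2), function(w) sum(a[s, , ] * w))
}

Windiv['m', 'SS1', 'SS2']
```

This gives the same 90 for m SS1SS2 as the single element version, with all other elements 0.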

Want to develop a sequential testing process, so that I can compare runModel() and runModel2(). Maybe I could use testthat for it ?

Just need to do this to start : devtools::use_testthat()

see testRefactoring() which does things like this :

input <- setInputOneScenario()
tst <- runModel(input)
tst2 <- runModel2(input)
input <- setInputSensiScenarios(20, h.RS1_A0=c(0.1,1), h.RS2_0B=c(0.1,1), s.RR1_A0=c(0.2,1), s.RR2_0B=c(0.2,1))
tst <- runModel2(input, produce.plots=FALSE)
tst2 <- runModel(input, produce.plots=FALSE)
identical(tst,tst2)

So now I can keep developing runModel2() and use Ctrl T to test I'm not breaking it.

Think, does my Wniche contain too many elements at 81 ? i.e. does it contain impossible combinations ? I think I'm OK Beths code fills 9*9 groups of individual variables.

W.RR1SS2_0b ... * 81 (3*3*3*3) becomes Wniche[locus1, locus2, niche1, niche2]

I may need to add a genotype option to createArray() to cope with f that needs cis & trans

If I convert niche_0B to niche[niche1,niche2] I can simplify further the calculation of Two genotype fitnesses in two insecticide Niche by having a loop that goes through each niche.

Coolio! reduced 250 lines of code to ~ 20 with this

    ## Two genotype fitnesses in two insecticide Niche ##

    #!r to replace 250+ lines below
    for( niche1 in dimnames(Wniche)$niche1)
    {
      for( niche2 in dimnames(Wniche)$niche2)
      {
        #if this niche toggled off set fitness to 0
        if (niche[niche1,niche2] == 0)
        {
          Wniche[,,niche1,niche2] <- 0
        } else{
          #otherwise set fitness to product of the 2 loci
          for( locus1 in dimnames(Wniche)$locus1)
          {
            for( locus2 in dimnames(Wniche)$locus2)
            {    
              Wniche[locus1,locus2,niche1,niche2] <- Wloci[locus1,niche1,niche2] * Wloci[locus2,niche1,niche2]
            }
          }          
        }
      }
    }

Trying to get fitnesses from my array into the existing output object fbn at the end of runModel2

           -,- a,- A,- -,b -,B a,b A,B A,b a,B
    SS1SS2   0   0   0   0   0   0   0   0   0
    SS1RS2  NA  NA  NA  NA  NA  NA  NA  NA  NA
    SS1RR2  NA  NA  NA  NA  NA  NA  NA  NA  NA
    RS1SS2  NA  NA  NA  NA  NA  NA  NA  NA  NA
    RS1RS2  NA  NA  NA  NA  NA  NA  NA  NA  NA
    RS1RR2  NA  NA  NA  NA  NA  NA  NA  NA  NA
    RR1SS2  NA  NA  NA  NA  NA  NA  NA  NA  NA
    RR1RS2  NA  NA  NA  NA  NA  NA  NA  NA  NA
    RR1RR2  NA  NA  NA  NA  NA  NA  NA  NA  NA

Still not quite working ...

1/5/15 9-14.30 5.5hrs 3.15-6 2.75 8.25hrs

Fitness matrix different in my refactored version. But do I know which is correct ?

    Browse[2]> fbn
           -,- a,- A,- -,b -,B a,b A,B A,b a,B
    SS1SS2   1   0   0   0   0   0   0   0   0
    SS1RS2   1   0   0   0   0   0   0   0   0
    SS1RR2   1   0   0   0   0   0   0   0   0
    RS1SS2   1   0   0   0   0   0   0   0   0
    RS1RS2   1   0   0   0   0   0   1   0   0
    RS1RR2   1   0   0   0   0   0   1   0   0
    RR1SS2   1   0   0   0   0   0   0   0   0
    RR1RS2   1   0   0   0   0   0   1   0   0
    RR1RR2   1   0   0   0   0   0   1   0   0
    Browse[2]> fbn2
           -,- a,- A,- -,b -,B a,b A,B A,b a,B
    SS1SS2   0   0   0   0   0   0   0   0   0
    SS1RS2   0   0   0   0   0   0   0   0   0
    SS1RR2   0   0   0   0   0   0   0   0   0
    RS1SS2   0   0   0   0   0   0   0   0   0
    RS1RS2   0   0   0   0   0   0   0   0   0
    RS1RR2   0   0   0   0   0   0   0   0   0
    RR1SS2   0   0   0   0   0   0   0   0   0
    RR1RS2   0   0   0   0   0   0   0   0   0
    RR1RR2   0   0   0   0   0   0   0   0   0

Aha difference is because Wniche just has 0s in, may be a problem earlier in the program. Aha issue is that Wloci is not read in yet.

Reading in of Wloci can also be refactored, but I first need to read in h,z,phi & s as arrays.

Then I can try to refactor this :

    ## Calculated fitnesses ####

    # absence of insecticide
    ## fitness of SS in absence of insecticide is entered above as a parameter
    W.RS1_00 <- 1 - (h.RS1_00 * z.RR1_00)
    W.RR1_00 <- 1 - z.RR1_00

    W.RS2_00 <- 1 - (h.RS2_00 * z.RR2_00)
    W.RR2_00 <- 1 - z.RR2_00

    # low levels of insecticide a
    W.SS1_a0 <- 1 - phi.SS1_a0
    W.RS1_a0 <- W.SS1_a0 + (h.RS1_a0 * s.RR1_a0)
    W.RR1_a0 <- W.SS1_a0 + s.RR1_a0

    # high levels of insecticide A
    W.SS1_A0 <- 1 - phi.SS1_A0
    W.RS1_A0 <- W.SS1_A0 + (h.RS1_A0 * s.RR1_A0)
    W.RR1_A0 <- W.SS1_A0 + s.RR1_A0

    # low levels of insecticide b
    W.SS2_0b <- 1 - phi.SS2_0b
    W.RS2_0b <- W.SS2_0b + (h.RS2_0b * s.RR2_0b)
    W.RR2_0b <- W.SS2_0b + s.RR2_0b

    # high levels of insecticide B
    W.SS2_0B <- 1 - phi.SS2_0B
    W.RS2_0B <- W.SS2_0B + (h.RS2_0B * s.RR2_0B)
    W.RR2_0B <- W.SS2_0B + s.RR2_0B

h only has 6 values 2*3. Because it's only for heterozygous loci & niches.

    # h = dominance coefficient
    h.RS1_00 <- input[32,i]
    h.RS1_a0 <- input[33,i]
    h.RS1_A0 <- input[34,i]

    h.RS2_00 <- input[35,i]
    h.RS2_0b <- input[36,i]
    h.RS2_0B <- input[37,i]

Perhaps best to store as h[locusNum,niche1,niche2]

This nearly works, but includes un-needed homozygous niches aa etc createArray( loci=c('RS1','RS2'), niche1=c('0','a','A'), niche2=c('0','b','B') )

I could add a niches arg to createArray() createArray( loci=c('RS1','RS2'), niches=c('00','a0','A0') )

BUT no even that doesn't work. Because it only needs to contain the insecticide niche associated with the locus.

Actually it just needs : h[locusNum, nicheLevel] : where insecticide is no, lo, hi ... seeing how that works.

and what about z - fitness cost of resistance allele in no insecticide. z only has 2 values

    z.RR1_00 <- input[42,i]
    z.RR2_00 <- input[43,i]

z[locusNum] Because by definition it is the fitness cost of one allele in absence of the corresponding insecticide. (so RR1_00 is a little misleading, z.R1_0 more like it)

phi - sets fitness of SS in each insecticide/concentration (W.SS = 1 - phi), just 4 values

    phi.SS1_a0 <- input[26,i]
    phi.SS1_A0 <- input[27,i]

    phi.SS2_0b <- input[28,i]
    phi.SS2_0B <- input[29,i]

phi[locusNum, nicheLevel] where nicheLevel is lo, hi

I have Wloci being 54

    str(Wloci)
     num [1:6, 1:3, 1:3] 0 0 0 0 0 0 0 0 0 0 ...
     - attr(*, "dimnames")=List of 3
      ..$ loci  : chr [1:6] "SS1" "RS1" "RR1" "SS2" ...
      ..$ niche1: chr [1:3] "0" "a" "A"
      ..$ niche2: chr [1:3] "0" "b" "B"
    length(Wloci)
    [1] 54

but actually I think it only needs 1 niche that corresponds to the chosen locus. for locus1 it only has as and locus2 it only has bs

look at how Wloci is used to think about how best to structure

This is where it is used

    Wniche[locus1,locus2,niche1,niche2] <- Wloci[locus1,niche1,niche2] * Wloci[locus2,niche1,niche2]

I think I need to change from

    Wloci[locus1,niche1,niche2] * Wloci[locus2,niche1,niche2]

to

    Wloci[locus1,niche1] * Wloci[locus2,niche2]

I might want to change niche names to no,lo,hi throughout. Having them different for each locus (0aA & 0bB) causes the code difficulty.

But that is a bigger change making it more difficult to check that the code logic remains the same.

A temporary solution could be to have a lookup from nicheLevel to niche. e.g. no,lo,hi to 0,a,A & 0,b,B for locus1 & 2 respectively

Later maybe Wloci[(locusNum),allele=(SS,RS,RR),exposure=(no,lo,hi)], for now Wloci[loci=(SS1,RS1,RR1,SS2,RS2,RR2),exposure=(no,lo,hi)]

s selection coefficient : just 4 values

    # s = selection coefficient
    s.RR1_a0 <- input[38,i]
    s.RR1_A0 <- input[39,i]

    s.RR2_0b <- input[40,i]
    s.RR2_0B <- input[41,i]

It's the selection coefficient of resistance in presence of the corresponding insecticide (at lo or hi) s[locusNum,exposure] but exposure is only lo or hi. Would a -ve value here for no be the same as z (the fitness cost of resistance in absence of insecticide ?)
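A sketch of how the calculated fitnesses above might look once h, z, phi and s are stored by locusNum and exposure. All parameter values are invented, SS fitness with no insecticide is assumed to be 1 (in the model it is an input parameter), and the niche on/off toggle is left out.

```r
locusNum <- c('1', '2')
exposure <- c('no', 'lo', 'hi')
dnLoHi   <- list(locusNum = locusNum, exposure = c('lo', 'hi'))

# invented parameter values, for illustration only
h   <- matrix(0.5, 2, 3, dimnames = list(locusNum = locusNum, exposure = exposure))
z   <- c('1' = 0.1, '2' = 0.2)
phi <- matrix(c(0.4, 0.5, 0.8, 0.9), 2, 2, dimnames = dnLoHi)
s   <- matrix(c(0.3, 0.4, 0.6, 0.7), 2, 2, dimnames = dnLoHi)

loci  <- c('SS1', 'RS1', 'RR1', 'SS2', 'RS2', 'RR2')
Wloci <- matrix(NA_real_, 6, 3, dimnames = list(loci = loci, exposure = exposure))

for (n in locusNum) {
  SS <- paste0('SS', n); RS <- paste0('RS', n); RR <- paste0('RR', n)
  # no insecticide (SS fitness of 1 is an assumption here)
  Wloci[SS, 'no'] <- 1
  Wloci[RS, 'no'] <- 1 - h[n, 'no'] * z[[n]]
  Wloci[RR, 'no'] <- 1 - z[[n]]
  # low and high exposure to the insecticide for this locus
  for (e in c('lo', 'hi')) {
    Wloci[SS, e] <- 1 - phi[n, e]
    Wloci[RS, e] <- Wloci[SS, e] + h[n, e] * s[n, e]
    Wloci[RR, e] <- Wloci[SS, e] + s[n, e]
  }
}

# two-locus fitness in each niche is the product of the two single loci
l1 <- c('SS1', 'RS1', 'RR1')
l2 <- c('SS2', 'RS2', 'RR2')
WnicheSketch <- array(NA_real_, dim = c(3, 3, 3, 3),
                      dimnames = list(locus1 = l1, locus2 = l2,
                                      exposure1 = exposure, exposure2 = exposure))
for (e1 in exposure) {
  for (e2 in exposure) {
    WnicheSketch[, , e1, e2] <- outer(Wloci[l1, e1], Wloci[l2, e2])
  }
}

WnicheSketch['RR1', 'RR2', 'no', 'no']
```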

cool I think I have generic createArray2 function working

I'm getting closer to fbn & fbn2 being the same

Differences in fbn2: 1) The 00 column has 0s, in fbn it is all 1s 2) Column aB has the values that should be in AB

e.g. SS1RR2 W.SS1RR2_00 Wniche['SS1','RR2','0','0']

Wloci[locus2,exposure2] == 0 but W.RR2_00 == 1

aha think due to typo here : Wloci[ paste0('RR',locusNum), 'no'] <- 1 - z[locusNum]

hurrah fixed first problem, now on to 2)

Wniche looks correct. So I think it may be due to ordering of columns in fbn, which goes AB, Ab, aB, where I think I assume aB, Ab, AB

    Wniche['RR1','RR2',,]
          niche2
    niche1 0 b B
         0 1 0 0
         a 0 0 0
         A 0 0 1

    as.vector( Wniche['RR1','RR2',,])
    [1] 1 0 0 0 0 0 0 0 1

It may be tricky for me to get output in exactly the same order as Beths. How can I use the niche names to name the output matrix ?

Difference in orders

    colnames(fbn) <- c("-,-", "a,-", "A,-", "-,b", "-,B", "a,b", "A,B", "A,b", "a,B")

my order is 00 a0 A0 0b ab Ab 0B aB AB

which is 1,2,3,4,6,8,5,9,7
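A sketch showing how the two orders relate using match(), and how the array values could be put into the fbn column order by name rather than by hand. Niche labels are written here as 00, a0 etc. for both orders, which is my own shorthand for Beth's "-,-", "a,-" column names.

```r
niche1 <- c('0', 'a', 'A')
niche2 <- c('0', 'b', 'B')

# order that as.vector() gives for an array [niche1, niche2]
myOrder <- as.vector(outer(niche1, niche2, paste0))
myOrder

# order of the columns in fbn
fbnOrder <- c('00', 'a0', 'A0', '0b', '0B', 'ab', 'AB', 'Ab', 'aB')

# position in fbn of each of my elements
match(myOrder, fbnOrder)

# index to apply to my vector so that it comes out in fbn order
toFbn <- match(fbnOrder, myOrder)

# example with the RR1RR2 slice from above (1 in niche 00 and niche AB)
slice <- matrix(c(1, 0, 0, 0, 0, 0, 0, 0, 1), 3, 3,
                dimnames = list(niche1 = niche1, niche2 = niche2))
rowForFbn <- setNames(as.vector(slice)[toFbn], fbnOrder)
rowForFbn
```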

I might just need to allow that the fitness list is different in my refactor test ?

Still how can I name the output columns from the data ? Fixed !!

Then I started to delete some of the old code, and found that it broke some things I didn't want to break yet.

Particularly this which suggests I need to get into the cis/trans issue soon:

    W.bar.m <- Windiv['m']
    W.bar.m <- (f.m.SS1SS2 * W.m.SS1SS2)

revert changes for now and come back to

5/5/15 10.30 - 14 3.5hrs 2.45-6.15 7hrs

removed refactored fbn code

sent update email to Ian & Beth

starting to look at cis/trans this is the first place they appear in the generation loop

# set genotype frequencies as variables
# from genotype frequency matrix generated above from initial value of P (freq. of R allele)
# f = frequency before selection

# male
f.m.SS1SS2 <- genotype.freq[1,]
f.m.SS1RS2 <- genotype.freq[2,]
f.m.SS1RR2 <- genotype.freq[3,]
f.m.RS1SS2 <- genotype.freq[4,]
f.m.RS1RS2_cis <- genotype.freq[5,]     ### cis
f.m.RS1RS2_trans <- genotype.freq[6,]   ### trans

How is genotype.freq created ?

# set up genotype matrix - records frequencies of each of the 9 two locus genotypes each generation
genotype <- matrix( nrow=max_gen, ncol=11 )
colnames(genotype) <- c("gen", "SS1SS2", "SS1RS2", "SS1RR2", 
                        "RS1SS2", "RS1RS2_cis", "RS1RS2_trans", "RS1RR2",
                        "RR1SS2", "RR1RS2", "RR1RR2")
## make.genotypemat function will use this data and make a matrix of the genotype frequencies
## frequencies of genotypes before selection - in HW equilibrium and same in male and female
## needs name of matrix and takes corresponding frequency of resistant allele in function call
genotype.freq <- make.genotypemat ( P_1, P_2 )

Within make.genotypemat, currently seems that cis & trans are always set to the same ??

## two forms of RS1RS2
mat[5,] <- ( loc1[2,]*loc2[2,] ) * 0.5 #cis
mat[6,] <- ( loc1[2,]*loc2[2,] ) * 0.5 #trans

#this is what genotype.freq looks like
                      Freq.
SS1SS2       9.960060e-01
SS1RS2       1.994006e-03
SS1RR2       9.980010e-07
RS1SS2       1.994006e-03
RS1RS2_cis   1.996002e-06
RS1RS2_trans 1.996002e-06
RS1RR2       1.998000e-09
RR1SS2       9.980010e-07
RR1RS2       1.998000e-09
RR1RR2       1.000000e-12
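The printout above is consistent with Hardy-Weinberg frequencies for P_1 = P_2 = 0.001, with the double heterozygote split equally between cis and trans. A sketch that reproduces those numbers; makeGenotypeFreqSketch() is a hypothetical stand-in, not the package's make.genotypemat().

```r
# Hardy-Weinberg genotype frequencies at one locus, P = freq of R allele
hwFreq <- function(P) c(SS = (1 - P)^2, RS = 2 * P * (1 - P), RR = P^2)

makeGenotypeFreqSketch <- function(P_1, P_2) {
  loc1 <- hwFreq(P_1)
  loc2 <- hwFreq(P_2)
  c(SS1SS2       = loc1[["SS"]] * loc2[["SS"]],
    SS1RS2       = loc1[["SS"]] * loc2[["RS"]],
    SS1RR2       = loc1[["SS"]] * loc2[["RR"]],
    RS1SS2       = loc1[["RS"]] * loc2[["SS"]],
    RS1RS2_cis   = loc1[["RS"]] * loc2[["RS"]] * 0.5,  # half cis
    RS1RS2_trans = loc1[["RS"]] * loc2[["RS"]] * 0.5,  # half trans
    RS1RR2       = loc1[["RS"]] * loc2[["RR"]],
    RR1SS2       = loc1[["RR"]] * loc2[["SS"]],
    RR1RS2       = loc1[["RR"]] * loc2[["RS"]],
    RR1RR2       = loc1[["RR"]] * loc2[["RR"]])
}

freqSketch <- makeGenotypeFreqSketch(P_1 = 0.001, P_2 = 0.001)
freqSketch
sum(freqSketch)
```

Splitting cis and trans equally corresponds to starting with no linkage disequilibrium between the two loci.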

aarg createArray2() fails if you pass named variables to it, e.g. sex=sex rather than sex=c('m','f'). Seems like mget might solve, but :

    mget('sex', inherits=TRUE)
    Error: value for ‘sex’ not found

may be able to use this from SO

    get_args <- function () {
      as.list( match.call( def = sys.function( -1 ), call = sys.call(-1)) )[-1]
    }

BUT from the commandLine it does seem to work ??

> sex2 = c('f','m')
> createArray2( sex = sex2 )
sex
f m 
0 0

Ahhh so somehow it does work if the variable is in the Global environment.

eventually by trial & error found this solution :

    #fails
    #dimnames1 <- lapply(listArgs,function(x){eval.parent(x,n=1)})
    #works
    #eval.parent(listArgs$sex)
    #[1] "f" "m"
    #fails
    #dimnames1 <- lapply(listArgs,eval.parent)
    #works!! n=3 but I'm not sure why !!!
    dimnames1 <- lapply(listArgs,function(x){eval.parent(x,n=3)})
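A likely reason for n=3: the anonymous function is called by lapply(), which is called by createArray2(), so the caller's variables are three frames up. A possibly simpler alternative, sketched below, is to let R evaluate the arguments itself with list(...), which happens in the caller's environment without any eval.parent(). createArraySketch() is a hypothetical simplified version, not the package function.

```r
# build a zero-filled array from named dimension vectors
createArraySketch <- function(...) {
  dimnames1 <- list(...)  # args are evaluated where the function was called
  dim1 <- vapply(dimnames1, length, integer(1), USE.NAMES = FALSE)
  array(0, dim = dim1, dimnames = dimnames1)
}

# works with variables that are local to another function
wrapperSketch <- function() {
  sex  <- c('f', 'm')
  loci <- c('SS1', 'RS1', 'RR1')
  createArraySketch(sex = sex, loci = loci)
}

arrSketch <- wrapperSketch()
arrSketch
```

This avoids depending on how many function calls sit between the user and the evaluation.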

11/5/15 1.30-5.45 4.25hrs
Checking current testthat difference: changing from expect_identical to expect_equal sorted it. Refactoring gamete calculations into createGametes() function.

12/5/15 9.45-14 4.25hrs 2.30-5.45 3.25 7.5hrs

sorting random mating & maybe getting to nub of cis/trans ?

Beths random mating code :

      f.m.SS1SS2 <- 0
      f.m.SS1RS2 <- 0
      f.m.SS1RR2 <- 0

      f.m.RS1SS2 <- 0
      f.m.RS1RS2_cis <- 0           #RS1RS2
      f.m.RS1RS2_trans <- 0     #RS1SR2
      f.m.RS1RR2 <- 0

      f.m.RR1SS2 <- 0
      f.m.RR1RS2 <- 0 
      f.m.RR1RR2 <- 0

      # SS male with SS female
      f.m.SS1SS2 <- f.m.SS1SS2 + ( G.m.S1.S2 * G.f.S1.S2 )
      # SS male with SR female
      f.m.SS1RS2 <- f.m.SS1RS2 + ( G.m.S1.S2 * G.f.S1.R2 )
      # SS male with RS female
      f.m.RS1SS2 <- f.m.RS1SS2 + ( G.m.S1.S2 * G.f.R1.S2 )
      # SS male with RR female
      f.m.RS1RS2_cis <- f.m.RS1RS2_cis + ( G.m.S1.S2 * G.f.R1.R2 )

      # SR male with SS female
      f.m.SS1RS2 <- f.m.SS1RS2 + ( G.m.S1.R2 * G.f.S1.S2 )
      # SR male with SR female
      f.m.SS1RR2 <- f.m.SS1RR2 + ( G.m.S1.R2 * G.f.S1.R2 )
      # SR male with RS female
      f.m.RS1RS2_trans <- f.m.RS1RS2_trans + ( G.m.S1.R2 * G.f.R1.S2 )
      # SR male with RR female
      f.m.RS1RR2 <- f.m.RS1RR2 + ( G.m.S1.R2 * G.f.R1.R2 )

      # RS male with SS female
      f.m.RS1SS2 <- f.m.RS1SS2 + ( G.m.R1.S2 * G.f.S1.S2 )
      # RS male with SR female
      f.m.RS1RS2_trans <- f.m.RS1RS2_trans + ( G.m.R1.S2 * G.f.S1.R2 )
      # RS male with RS female
      f.m.RR1SS2 <- f.m.RR1SS2 + ( G.m.R1.S2 * G.f.R1.S2 )
      # RS male with RR female
      f.m.RR1RS2 <- f.m.RR1RS2 + ( G.m.R1.S2 * G.f.R1.R2 )

      # RR male with SS female
      f.m.RS1RS2_cis <- f.m.RS1RS2_cis + ( G.m.R1.R2 * G.f.S1.S2 ) 
      # RR male with SR female
      f.m.RS1RR2 <- f.m.RS1RR2 + ( G.m.R1.R2 * G.f.S1.R2 )
      # RR male with RS female
      f.m.RR1RS2 <- f.m.RR1RS2 + ( G.m.R1.R2 * G.f.R1.S2 )
      # RR male with RR female
      f.m.RR1RR2 <- f.m.RR1RR2 + ( G.m.R1.R2 * G.f.R1.R2 )

Beth only has 10 combinations (including cis/trans), this is because RS is usually same as SR, and in all the names R goes first.

genotypes <- c( "SS1SS2", "SS1RS2", "SS1RR2", 
                "RS1SS2", "RS1RS2_cis", "RS1RS2_trans", "RS1RR2",
                "RR1SS2", "RR1RS2", "RR1RR2")

for randomMating this loop creates 16 combinations including cis/trans

  counter <- 0

  for( m2 in c('S2','R2'))
  {
    for( m1 in c('S1','R1'))
    {
      for( f2 in c('S2','R2'))
      {
        for( f1 in c('S1','R1'))
        {
          counter <- counter+1
          cat(paste(counter, m1,f1,m2,f2,"\n"))
        }
      }
    }
  }

The loop output (m1 f1 m2 f2) and, reformatted slightly, the genotype at each locus :

        m1 f1 m2 f2    locus1 locus2
     1  S1 S1 S2 S2    SS1    SS2
     2  S1 R1 S2 S2    SR1    SS2
     3  S1 S1 S2 R2    SS1    SR2
     4  S1 R1 S2 R2    SR1    SR2
     5  R1 S1 S2 S2    RS1    SS2
     6  R1 R1 S2 S2    RR1    SS2
     7  R1 S1 S2 R2    RS1    SR2
     8  R1 R1 S2 R2    RR1    SR2
     9  S1 S1 R2 S2    SS1    RS2
    10  S1 R1 R2 S2    SR1    RS2
    11  S1 S1 R2 R2    SS1    RR2
    12  S1 R1 R2 R2    SR1    RR2
    13  R1 S1 R2 S2    RS1    RS2
    14  R1 R1 R2 S2    RR1    RS2
    15  R1 S1 R2 R2    RS1    RR2
    16  R1 R1 R2 R2    RR1    RR2

Perhaps I can do the calculations on these 16 & then aggregate to produce the 10 ?

#these are other ways of creating the 16

#as an array
arr <- createArray2(loc1all1=c('S','R'),loc1all2=c('S','R'),loc2all1=c('S','R'),loc2all2=c('S','R'))

#as a dataframe
dF <- expand.grid(loc1all1=c('S','R'),loc1all2=c('S','R'),loc2all1=c('S','R'),loc2all2=c('S','R'))
> dF
   loc1all1 loc1all2 loc2all1 loc2all2
1         S        S        S        S
2         R        S        S        S
3         S        R        S        S
4         R        R        S        S
5         S        S        R        S
6         R        S        R        S
7         S        R        R        S
8         R        R        R        S
9         S        S        S        R
10        R        S        S        R
11        S        R        S        R
12        R        R        S        R
13        S        S        R        R
14        R        S        R        R
15        S        R        R        R
16        R        R        R        R

genotypes <- paste0(dF[,1],dF[,2]," ",dF[,3],dF[,4])
#[1] "SS SS" "RS SS" "SR SS" "RR SS" "SS RS" "RS RS" "SR RS" "RR RS" "SS SR" "RS SR" "SR SR" "RR SR" "SS RR" "RS RR" "SR RR" "RR RR"

> length(genotypes)
[1] 16

#or could miss out the space

l1a1 <- substr(genotypes,1,1)
l1a2 <- substr(genotypes,2,2)
l2a1 <- substr(genotypes,4,4)
l2a2 <- substr(genotypes,5,5)

#these are homozygous at both loci and have 1:1 between expanded & contracted genotypes
which(l1a1==l1a2 & l2a1==l2a2)
[1]  1  4 13 16

#this gets the cis/trans
which(!(l1a1==l1a2 | l2a1==l2a2))
[1]  6  7 10 11

But I might not need to be this fancy. If I do all the working in the expanded genotype storage, and then just have a single function to do the conversion ??

genotypesContract()
or genotypesLong2Short()

done : convert from the expanded genotype storage to the contracted one.
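A sketch of one way the 16 expanded genotypes could be collapsed to the 10, not the actual genotypesLong2Short(). It assumes allele 1 at each locus came from one parent's gamete and allele 2 from the other's, so a double heterozygote is cis when the two R alleles arrived in the same gamete. The equal frequencies are invented test values.

```r
dF <- expand.grid(loc1all1 = c('S', 'R'), loc1all2 = c('S', 'R'),
                  loc2all1 = c('S', 'R'), loc2all2 = c('S', 'R'),
                  stringsAsFactors = FALSE)

# number of R alleles at each locus: 0, 1 or 2
nR1 <- (dF$loc1all1 == 'R') + (dF$loc1all2 == 'R')
nR2 <- (dF$loc2all1 == 'R') + (dF$loc2all2 == 'R')

zyg <- c('SS', 'RS', 'RR')
shortName <- paste0(zyg[nR1 + 1], '1', zyg[nR2 + 1], '2')

# double heterozygotes need the cis/trans label
doubleHet <- nR1 == 1 & nR2 == 1
# cis (assumed): same allele at both loci within gamete 1, so R1 and R2 travel together
cis <- doubleHet & (dF$loc1all1 == dF$loc2all1)
shortName[cis] <- paste0(shortName[cis], '_cis')
shortName[doubleHet & !cis] <- paste0(shortName[doubleHet & !cis], '_trans')

# invented frequencies for the 16 expanded genotypes
fLong <- rep(1 / 16, 16)

# aggregate the 16 into the 10 contracted genotypes
fShort <- vapply(split(fLong, shortName), sum, numeric(1))
fShort
```

The double heterozygotes come out at rows 6, 7, 10 and 11, the same as the which() check above.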

Beware I do seem to have occasional small difference between runModel & runModel2. I set the random seed so that the difference is consistent.

It now seems only to turn up if I have >20 scenarios.

It's not to do with randomMating() or genotypesLong2Short because they are not fully implemented yet.

    Failure(@testRefactoring.r#21): refactoring named variables with arrays doesn't change results
    runModel2(input, produce.plots = FALSE) not equal to runModel(input, produce.plots = FALSE)
    Component "results": Component 25: Mean relative difference: 6.391493e-08

It does show some very small differences, but I suspect they are just rounding errors, e.g. this is how they start out. (and the M,F columns just contain 1 despite showing this difference here)

listOut2$results[[25]] - listOut$results[[25]]

       Gen          m.R1          m.R2          m.LD          f.R1          f.R2          f.LD             M             F
[1,]   0  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
  [2,]   0  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
  [3,]   0  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00 -2.220446e-16 -2.220446e-16
  [4,]   0  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
  [5,]   0 -2.168404e-19 -2.168404e-19  0.000000e+00 -2.168404e-19 -2.168404e-19  0.000000e+00 -1.110223e-16 -1.110223e-16
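A small base R illustration of the difference between exact and tolerant comparison, which is what the switch from expect_identical to expect_equal relies on. all.equal() has a default tolerance of about 1.5e-8, so differences of 1e-16 pass but a mean relative difference of 6.4e-08 like the one reported above would still be flagged. The numbers are invented to show the behaviour.

```r
x <- 0.1 + 0.2
y <- 0.3

# exact comparison fails because of floating point rounding
identical(x, y)

# tolerant comparison passes, the difference is far below the tolerance
isTRUE(all.equal(x, y))

# default tolerance used by all.equal for numbers
sqrt(.Machine$double.eps)

# a relative difference of 6.4e-08 is above the default tolerance
isTRUE(all.equal(1, 1 + 6.4e-8))

# it passes only if the tolerance is loosened explicitly
isTRUE(all.equal(1, 1 + 6.4e-8, tolerance = 1e-6))
```

So the failure for component 25 may be more than pure rounding in the final digits, even if it starts out that small.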

tested that my new W.bar calc works :

    W.bar
            m         f
    0.1000036 0.1000036
    Browse[2]> W.bar.m
       a.m_00
    0.1000036
    Browse[2]> W.bar.f
       a.f_00
    0.1000036

13/5/15 10-14 2

dealing with cis/trans :

namesLoci <- c('SS1SS2','SS1RS2','SS1RR2',
               'RS1SS2','RS1RS2cis','RS1RS2trans','RS1RR2',
               'RR1SS2','RR1RS2','RR1RR2')
#or
namesLoci <- rownames( genotype.freq )
sex <- c("F","M")
f <- createArray2( sex=sex, loci=namesLoci )
#i then may be able to refactor more later

Plus see randomMating() which uses something like below, and then genotypesLong2Short.r to convert from the expanded genotype format.

  counter <- 0

  for( m2 in c('S2','R2'))
  {
    for( m1 in c('S1','R1'))
    {
      for( f2 in c('S2','R2'))
      {
        for( f1 in c('S1','R1'))
        {
          counter <- counter+1
          cat(paste(counter, m1,f1,m2,f2,"\n"))
        }
      }
    }
  }

deleting old code from runModel2()

done to f[m,loci] : sort storage of f.m.RS1RS2_cis
done : refactor gamete calculations : G.m.R1.R2 <- G.m.R1.R2 + f.m.RS1RS2_cis ...
done : sort calc of W.bar.m from Windiv
done : remove reliance on W.m.SS1SS2 etc. (Windiv)
done : remove reliance on a.m_00 etc.

So far down from 1590 lines to 685, with only a bit in new functions

20/5/15 11-4 in liverpool meeting with Beth and train back

New bit from the manuscript by Ian about including sex-linked genes by a bit of trickery.

It is also possible to allow one of the loci to be sex-linked; here we assume that sex linkage is at Locus 1, and ‘conventional’ sex-determination i.e. that the female is the homogametic sex (XX), the male is XY, and that the gene exists only on the X chromosome. This option is important as Anopheles gambiae has only 2 autosomes and one sex chromosome; approximately ??% of the genes are on the X in this species.

The female genotype frequencies are as given on Table S1 and as described in Equation 4 and Equation 5 because females have two copies of the X chromosome. Locus 1 is homozygous in the male so heterozygotes are impossible at this locus; in this case the allele inherited by males at locus 1 is the maternal-derived one (because they get their X chromosome from their mother and the Y from the father). Hence the male genotypes can be derived by a process similar to that of females and autosomes as shown on Table S2. The males will be simulated as “RR” or “SS” at the locus even though, in reality, they will be either “R-” or “S-”. An inherent assumption is therefore that the hemizygous males “R-” and “S-” have the same fitness as the equivalent “RR” and “SS” females; this seems reasonable given dosage compensation and is an assumption commonly made.

The male “diploid” genotypes are derived by a similar process to that above and the genotypes are as given on Table S2. The calculation become slightly more complicated to those given above but the same process occurs i.e. Equation 4 becomes

Equation 6

The next genotype can be produced by two combinations so Equation 5 becomes

Equation 7 ...

IH thinks this short-cut will work because selection will be OK, and the males will also correctly pass on their maternal X-linked allele (Table 4). We need to discuss this.

It just means we need to edit the code slightly. i.e. calculate the female genotypes then if locus 1 is autosomal just set the male frequencies equal to the females (as we do already), else re-calculate the male frequencies according to Table S2.

End of bit from manuscript. I think this is the important part for implementation :

we assume that sex linkage is at Locus 1, females XX, the male XY, and that the gene exists only on the X chromosome.

If sex-linked, Locus 1 is homozygous in the male so heterozygotes are impossible at this locus. The allele inherited by males at locus 1 is the maternal-derived one (because they get their X chromosome from their mother and the Y from the father). Males will be simulated as RR or SS at the locus even though in reality they will be either R- or S-.

I think this change needs to be made in the randomMating() function. Currently i think randomMating() doesn't differentiate between the sex of the offspring.

Does sex linkage have further implications in terms of the generation of gametes ? maybe not

22/5/15 To stop things being tracked by git that you have just added to gitignore

git rm --cached `git ls-files -i -X .gitignore`

Then I just did commit from RStudio.

somehow I've broken the check

    Failure(@testRefactoring.r#12): refactoring named variables with arrays doesn't change results
    runModel2(input, produce.plots = FALSE) not equal to runModel(input, produce.plots = FALSE)
    Component "results": Component 1: 'is.NA' value mismatch: 186 in current 0 in target

It's because of warning messages like this:

    1: In runModel2(input) : Male frequencies before selection total != 1 0.997966163681722

Perhaps I need to make the checks less sensitive ?? Is it because I re-enabled checks that had been disabled by Beth ? I reduced the tolerance in checks, runs OK, but tests still fail.

Points to some difference having crept in ...

difference is in genotype outputs in generations after 1, in all those starting with RS1

to do with object genotype .... Aha! problems were due to my new sex-linked changes that were operating without me realising. So I should probably remove the tolerances I put in, and work out how to reassign the impossible genotype freqs. done

4/6/15 thurs 12.15-

checking on implementation of sexLinkage in randomMating()

sexLinked probably needs to be included as an extra parameter in input (but should make it robust to cope if an input file doesn't have the sex-linked option)

be careful because this could break previous model runs

genotype.freq mostly removed; it was the same as f['m',]. Creation of f & fs moved to before the generations loop. This was prompted because previously genotype.freq was the same for males and females and it adds an unnecessary extra layer of potential confusion.

added calculating genotypes of both m&f in runModel2() to allow sex linkage.

bug in random mating with sex linkage : genotypes aren't summing to 1

fGenotypeExpanded for non sex-linked looks like the top one. In the lower one the R1 homozygotes should sum to 1 too

         l1a2
    l1a1     S1     R1
      S1 0.0625 0.0625
      R1 0.0625 0.0625

         l1a2
    l1a1    S1     R1
      S1 0.125 0.0000
      R1 0.000 0.0625

Fixed now : with this bit of code :

#heterozygotes at locus1 are impossible so add to the homozygotes instead
if(f1!=m1)
{
  #add to the homozygotes : f1,f1
  fGenotypeExpanded[f1,f1,f2,m2] <- fGenotypeExpanded[f1,f1,f2,m2] + fThisGenotype
} else #i.e. if not heterozygous at locus 1
{
  #just add to this genotype : f1,m1
  #have to add rather than set in case this is a homozygote that has already been added to by previous condition
  fGenotypeExpanded[f1,m1,f2,m2] <- fGenotypeExpanded[f1,m1,f2,m2] + fThisGenotype
}
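A minimal, self-contained sketch of the reassignment above, assuming equal gamete frequencies of 0.25 and a hypothetical helper expandGenotypes() (not a function in the package). With sex linkage the locus 1 heterozygote cells stay at zero, their frequency goes to the homozygote carrying the maternal allele, and the total still sums to 1.

```r
# Illustrative only: allele names and equal gamete frequencies are assumptions
alleles1 <- c("S1", "R1")
alleles2 <- c("S2", "R2")
gametes <- matrix(0.25, 2, 2, dimnames = list(alleles1, alleles2))

# hypothetical helper mirroring the if/else shown above
expandGenotypes <- function(fGametes, mGametes, sexLinked) {
  out <- array(0, dim = c(2, 2, 2, 2),
               dimnames = list(l1a1 = alleles1, l1a2 = alleles1,
                               l2a1 = alleles2, l2a2 = alleles2))
  for (f1 in alleles1) {
    for (m1 in alleles1) {
      for (f2 in alleles2) {
        for (m2 in alleles2) {
          fThisGenotype <- fGametes[f1, f2] * mGametes[m1, m2]
          if (sexLinked && f1 != m1) {
            # impossible heterozygote: add to the maternal homozygote f1,f1
            out[f1, f1, f2, m2] <- out[f1, f1, f2, m2] + fThisGenotype
          } else {
            out[f1, m1, f2, m2] <- out[f1, m1, f2, m2] + fThisGenotype
          }
        }
      }
    }
  }
  out
}

g <- expandGenotypes(gametes, gametes, sexLinked = TRUE)
sum(g)                # total frequency
g[, , "S2", "S2"]     # locus 1 slice: heterozygote cells are zero
```

In this sketch both locus 1 homozygotes receive the heterozygote frequency, which is the behaviour the fix was aiming for.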

Created a new shiny app in shinyCurtis2 and run from runUI2.R to look at graphs with & without sex linkage side by side. Initially these gave identical results, which was because resistSimple() called the old runModel() which doesn't allow sex linkage. Try it again now.

Cool, now I am getting a difference between locus1 & locus2. However males & females still have the same results, is that what would be expected ?

Maybe it's because of the way that results are output from Beths one sex system ? Where does the graph get the data from ?

genplot <- plotallele.freq.andy( listOut$results[[1]] )

    # Males
    lines( mat[,1], mat[,2], col="darkblue", lwd=2, lty=13 )
    lines( mat[,1], mat[,3], col="green", lwd=2, lty=14 )
    # Females
    lines( mat[,1], mat[,5], col="red", lwd=2, lty=15 )
    lines( mat[,1], mat[,6], col="orange", lwd=2, lty=16 )

so its columns 2,3,5 & 6 I'm interested in

so where is listOut$results[[1]] filled ? in runModel2() listOut$results[[i]] <- results & further back :

      ## frequency of resistance alleles
      m.R1 <- sum(f['m',grep("RR1",colnames(f))]) + ( 0.5 * sum(f['m',grep("RS1",colnames(f))]))
      m.R2 <- sum(f['m',grep("RR2",colnames(f))]) + ( 0.5 * sum(f['m',grep("RS2",colnames(f))]))
      f.R1 <- sum(f['f',grep("RR1",colnames(f))]) + ( 0.5 * sum(f['f',grep("RS1",colnames(f))]))
      f.R2 <- sum(f['f',grep("RR2",colnames(f))]) + ( 0.5 * sum(f['f',grep("RS2",colnames(f))]))   
      results[k,2] <- m.R1
      results[k,3] <- m.R2
      results[k,5] <- f.R1
      results[k,6] <- f.R2

Seems that this should be doing the right thing. Maybe the allele frequencies overall don't change, because for the males they just redistributed from heterozygotes to homozygotes.
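A quick arithmetic check of that reasoning, as a sketch only. It assumes random mating with the same R1 frequency p in maternal and paternal gametes; the variable names are illustrative and not taken from runModel2().

```r
p <- 0.3                      # assumed R1 frequency in both maternal and paternal gametes

RR      <- p * p
RS_matR <- p * (1 - p)        # heterozygotes whose maternal allele is R
RS_matS <- (1 - p) * p        # heterozygotes whose maternal allele is S
SS      <- (1 - p)^2

# autosomal calculation, as in m.R1 above: homozygotes + half the heterozygotes
freqAutosomal <- RR + 0.5 * (RS_matR + RS_matS)

# sex-linked males: each heterozygote is reassigned to the homozygote
# of its maternal allele, so no heterozygotes remain
RR_male <- RR + RS_matR
SS_male <- SS + RS_matS
freqSexLinked <- RR_male

all.equal(freqAutosomal, freqSexLinked)
```

Under this assumption the redistribution leaves the male allele frequency unchanged within a generation, which is consistent with the identical male and female lines in the plot.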

5/6/15

Instructions emailed to Ian & Beth to run UI2

    require(devtools)
    install_github('AndySouth/resistance')
    require(resistance)
    runUI2()

Ians feedback on UI2 (sex-linkage)

The R script runs fine. The results are as I would have expected given that locus 1 is sex-linked and locus 2 is autosomal i.e. resistance evolves faster at the sex-linked locus.

The dynamics at locus 2 appear unchanged...I therefore assume the drugs were deployed individually i.e. not as a combination?

But I thought they were applied as a combination. The exposure comes from here in the UI :

    h5("Exposure to each insecticide"),
    numericInput("a.m_AB", "same for both insecticides in Curtis", 0.5, min = 0.1, max = 0.9, step = 0.1)

and is set here in server :

    a.m_AB = input$a.m_AB,
    a.f_AB = input$a.m_AB, #set f to same as m

are these the correct parameters to set ?? yes i think so.

Thanks Ian, In relation to your 2nd comment, from the UI, both a.m_AB and a.f_AB are set from the box with "Exposure to each insecticide".

It's my understanding that should result in exposure to high levels of both insecticides (i.e. in combination).

Is that correct Beth ?

Didn't get a reply from Beth yet. From Ian : OK, I was just speculating that if resistance at locus 1 was spreading faster due to sex-linkage then so might resistance at locus 2 through genic linkage and hitch-hiking. I'll give it some more thought over the weekend.

be careful that runModel & runModel2 should no longer generate same results when sex linkage is set (but i don't think this should break current testthat tests)

8/6/15 on train to liverpool 4.30-9.30 The plan for running the sensitivity analysis for the first paper

Edited from Ians text in the paper.

For each parameter combination 2 strategies :

A) single ‘new’ insecticide alone

B) ‘new’ insecticide in combination with ‘older’ insecticide

The simulation output was the ratio of number of generations for resistance to the ‘new’ insecticide to reach a given allele frequency when in combination compared to on its own.

A ratio >1 indicates that combination is better.

Five threshold frequencies were investigated: 0.05, 0.1, 0.2, 0.5, 0.8.

But the document "Suggested sensitivity analysis.docx" talks about sequential and 2 mixture options.

Sequential use:, use one insecticide until resistance frequency reaches critical point, then switch insecticide until second resistance allele frequency reaches the critical point. Record time in total generations for the two insecticides.

Mixture option 1. Deploy mixture and record time in generations until EITHER frequency reaches the critical point

Mixture, option 2: deploy mixture until one resistance allele reaches the critical point, then stop using that insecticide in the mixture. Record time until both resistance allele frequencies reach critical point.

Easy by post-processing : Mixture 1 : set exposure to both insecticides

Trickier : Sequential

Trickiest : Mixture 2

Mixture 1 : set exposure to both insecticides and post-process to get time until the critical point for EITHER insecticide is reached.

Sequential : set calibration to 1013, add bit to runModel2() to switch from insecticide1 to insecticide2 when critical point for the first insecticide is reached. Post-process results file to find the time that the 2nd critical point is reached.

Tried adding a sequential insecticide scenario as an example to runModel2(), doesn't seem to be working yet, or maybe I just haven't passed it the correct args.

aha, I think probably because the niche toggles were not switched on. Can I set all of these to on by default so that this problem doesn't happen again ?? done

resistance to 2nd insecticide still seems not to rise.

It may be because I need to reset the fitnesses (which happens before gen loop):

    Windiv[sex,locus1,locus2] <- sum( a[sex,,] * Wniche[locus1,locus2,,])

and could put this into a function called setFitness() or the like.

9/6/15 4-5

Chatting with Ian & working out that both sequential and mixture2 scenarios can also be done by post-processing of the output file.
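A rough sketch of what that post-processing could look like for the sequential and mixture 1 strategies. The trajectories are made-up logistic curves standing in for model output, firstGenAbove() is a hypothetical helper (not the package's findResistancePoints()), and using '>' rather than '>=' for the critical point is an assumption. Mixture option 2 is not shown.

```r
# first generation at which a frequency trajectory exceeds the critical point,
# NA if it is never reached
firstGenAbove <- function(freq, criticalPoint) {
  idx <- which(freq > criticalPoint)
  if (length(idx) == 0) NA_integer_ else idx[1]
}

# made-up resistance allele trajectories, one value per generation
gens     <- 1:200
r1_alone <- plogis((gens - 40) / 8)     # insecticide 1 used alone
r2_alone <- plogis((gens - 60) / 8)     # insecticide 2 used alone
r1_mix   <- plogis((gens - 90) / 10)    # locus 1 under the mixture
r2_mix   <- plogis((gens - 120) / 10)   # locus 2 under the mixture

cP <- 0.5

# sequential: total generations for the two insecticides used one after the other
timeSeq <- firstGenAbove(r1_alone, cP) + firstGenAbove(r2_alone, cP)

# mixture option 1: generations until EITHER frequency reaches the critical point
timeMix1 <- min(firstGenAbove(r1_mix, cP), firstGenAbove(r2_mix, cP))

c(sequential = timeSeq, mixture1 = timeMix1)
```

Adding the two single-insecticide times treats the second insecticide as starting from its initial resistance frequency once the switch is made, which is a simplification.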

Now looking at setting up sensiAnPaper1() to run the sensitivity analysis for the paper. Realising that I might not be able to call sensiAn1 and setInputSensiScenarios() because of some of the dependencies between parameters.

Particularly that P_2 needs to be set as a proportion of P_1.

Actually sensiAnPaper1() can be much more straightforward if it has hardcoded args and doesn't need to be so flexible.

10/6/15 train back from liverpool

modifying sensiAnPaper1() to run the different insecticideUsed scenarios

started sensiPostProc() to do post-processing

24/6/15 in liverpool before talking to Ian

checking where I have got to.

I'm going to rename sensiAnPaper1 to sensiAnPaperPart() then create a sensiAnPaperAll() to run everything (although in practice unlikely to run it all at once)

29/6/15 little on sensiPostProc()

30/6/15 moved sensiAnPaper1All() into an rmarkdown file - because in reality I'm not likely to run it all at once.

Beths work just uses male allele frequencies to assess criticalPoints. m.R1 & f.R1 are the same for now, but it might be safer for me to use an average.

~ added the input matrix onto the listOut output object in runModel2() to help with post processing

Slight issue that this has now broken my testthat tests !!

How can m.R2 and f.R2 be greater than 1 (e.g. they start at 1.7455 in first run) ? Is this because the frequency of R2 is set too high at start ?

m.R2 is calculated at line 801 in the older runModel.r and line 376 in runModel2

Frequency of resistance allele to the 2nd insecticide is starting above 1.

m.R2 <- ( f.m.SS1RR2 + f.m.RS1RR2 + f.m.RR1RR2 ) + ( 0.5 * (f.m.SS1RS2 + f.m.RS1RS2_cis + f.m.RS1RS2_trans + f.m.RR1RS2 ) )

refactored

m.R2 <- sum(f['m',grep("RR2",colnames(f))]) + ( 0.5 * sum(f['m',grep("RS2",colnames(f))]))

Can I create a reproducible example ? or should I just ask first ?

What is the starting value of P_2, the starting frequency of resistance to insecticide2 ? For the first dodgy run

    listOutMix$input['P_2',1]
        P_2
    1.74551

Aha yes, so maybe i should put something in to stop it going above 1 ??

I think problem was because I set the upper limit on P_2 to 100 rather than 1 (based upon Ians table).

From Ian : A tricky one that..

I think my first instinct was to not restrict one insecticide resistance frequency to be always less than the other. But we would have to censor the inputs because if frequency at locus 1 is 0.1 then the frequency at locus 2 cannot be 100 times higher.

I think we should go with your suggestion i.e. recognise that frequency at one locus will always be less than at the other and the lower frequency occurs at Locus 2. I would maybe go from 0.0001 to 1 under these circumstances so we get a wide range of starting frequency ratios.
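One way the agreed rule could be sampled, as a sketch only. The range used for P_1 and the choice to sample the ratio on a log scale are assumptions made for illustration; only the ratio limits (0.0001 to 1) come from Ian's suggestion.

```r
set.seed(1)
nScenarios <- 1000

# assumed range for the starting frequency at locus 1 (illustration only)
P_1 <- runif(nScenarios, min = 0.0001, max = 0.1)

# ratio P_2 / P_1 between 0.0001 and 1; log-uniform sampling is an assumption
# chosen so that small ratios are as well represented as large ones
ratio <- 10 ^ runif(nScenarios, min = -4, max = 0)

P_2 <- P_1 * ratio

# P_2 can never exceed P_1, so it can never go above 1 either
stopifnot(all(P_2 <= P_1), all(P_2 <= 1))
summary(P_2 / P_1)
```

Setting P_2 as a proportion of P_1 avoids having to censor impossible inputs after sampling.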

Cool findResistancePoints() seems to be working

    findResistancePoints(listOut, locus=1)
                [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19]
    gen_cP0.1      9    10     5    36     4     3     5    19    48    77    12    10     3     2     4     3    31    36     5
    gen_cP0.25    20    20    27    70    15    16    11    41    72   120    19    16    17     5     8     9    54    71    20
    gen_cP0.5     27    29    39    89    28    25    18    56    97   146    25    23    36     9    12    16    73    95    39

    findResistancePoints(listOut, locus=2)
                [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19]
    gen_cP0.1      4    31    21    76    25    14     7    30    23   301    59     8    34     2    25    91    19    26    97
    gen_cP0.25     8    48    45   128    56    26    16    40    35   334    79    12    42     5    30   118    29    53   134
    gen_cP0.5     13    61    62   154    77    36    24    51    48   357    90    17    51     9    34   133    39    81   166

    findResistancePoints(listOut, locus='either')
                [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19]
    gen_cP0.1      4    10     5    36     4     3     5    19    23    77    12     8     3     2     4     3    19    26     5
    gen_cP0.25     8    20    27    70    15    16    11    40    35   120    19    12    17     5     8     9    29    53    20
    gen_cP0.5     13    29    39    89    28    25    18    51    48   146    25    17    36     9    12    16    39    81    39

    findResistancePoints(listOut, locus='both')
                [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19]
    gen_cP0.1      9    31    21    76    25    14     7    30    48   301    59    10    34     2    25    91    31    36    97
    gen_cP0.25    20    48    45   128    56    26    16    41    72   334    79    16    42     5    30   118    54    71   134
    gen_cP0.5     27    61    62   154    77    36    24    56    97   357    90    23    51     9    34   133    73    95   166

but the 2nd mixture scenario is the tricky bit because

need to know which of the 2 insecticides reached the critical point first, and the resistance level of the other. I think I'll need to create a special function.

findResistancePointsMixResponsive()

31/6/15 Quick review of what needs to go into the final decision tree.

slight rethink : I want the input files to be mostly the same for I1,I2 & mixture, so that they can be added together in combined scenarios.

I added a set.seed into sensiAnPaperPart(). This shows the input files for different insecticide strategies are now the same.

tst <- listOutMix$input[c('P_1','P_2','a.f_AB','phi.SS1_A0','phi.SS2_0B','h.RS1_A0','h.RS2_0B','s.RR1_A0','s.RR2_0B'),]

note a.f_A0 rather than a.f_AB

tst2 <- listOutI1$input[c('P_1','P_2','a.f_A0','phi.SS1_A0','phi.SS2_0B','h.RS1_A0','h.RS2_0B','s.RR1_A0','s.RR2_0B'),]

identical(as.numeric(tst),as.numeric(tst2))

6/7/15 plotcurtis_f2_generic : a function allowing mixture & individual scenarios to be compared, similar to Curtis Fig2.

In creating a shiny app to display this, I could allow user to select parameters, but then the run might take a little while because it has to do 3 runs (I1,I2,mix) & post processing. Quicker may be to display my sensitivity analysis results & allow user to step through scenarios. + include printing out the input values used (& maybe even their relative position within ranges).

7/7/15 resistanceMaster() does reproduce curtis fig2 so the parameters in system.file("extdata","input.parameters.csv", package="resistance") must work

what is different about the input params in the sensitivity analysis ?

resistanceMaster() still calls runModel() if I change it to call runModel2() does it still work ??

No I get this : Error in runModel2(input, calibration) : Error in male exposures: must total one: 1.9

but in the input files, looks like it is -- 0.1 & AB 0.9 for all.

No not in the hardcoded, curtis fig2 input which comes from the program. It has a 0.9 for ab which slips in too.

                                                 B (HCH)   A (DDT) Combination
    Calibration (100 default)                  1012.0000 1012.0000   1.012e+03
    Number of generations                        70.0000   70.0000   1.600e+02
    Collect fitness scores in matrix (1/0)        1.0000    1.0000   1.000e+00
    Export fitness scores matrix to .csv (1/0)    0.0000    0.0000   0.000e+00
    Frequency of R at locus 1                     0.0100    0.0100   1.000e-02
    Frequency of R at locus 2                     0.0100    0.0100   1.000e-02
    Recombination Rate                            0.5000    0.5000   5.000e-01
    Exposure Males -,-                            0.1000    0.1000   1.000e-01
    Exposure Males a,-                            0.0000    0.0000   0.000e+00
    Exposure Males A,-                            0.0000    0.9000   0.000e+00
    Exposure Males -,b                            0.0000    0.0000   0.000e+00
    Exposure Males -,B                            0.9000    0.0000   0.000e+00
    Exposure males a,b                            0.0000    0.0000   9.000e-01
    Exposure Males A,B                            0.0000    0.0000   9.000e-01
    ...

exactly where is that input object created ? createInputMatrix()

fixed bug, but it didn't affect the output

can I compare the inputs for the 3 scenarios by Beth to those in my first scenario ? in a csv.

This gets the file to create Curtis Fig2.

inputCurtis <- createInputMatrix(FALSE)

Can I paste the input for the first scenario onto it ? Or might be better just to output all 100 scenarios.

see :

seems that the dominance coefficient of locus2 in -B is much lower in Beths than in the sensi scenarios.

Dominance coefficient L2 in -,B 0.0016 0.0016 0.00016 0.944675269 0.497699242

First see if the curtis fig2 persists when dominance for the mixture is reduced.

Good news that this correction seemed to make the graph closer to Curtis.

    #input[37,3] <- 0.00016 #Dominance coefficient in B
    #? should this be the same as the value in ,2 & ,1
    input[37,3] <- 0.0016 #Dominance coefficient in B

7/7/15 new version of UI comparing scenarios with greater differences in dominance. Still doesn't produce something similar to curtis fig2 where mixture does much better.

created runcurtis_f2() to do runs to recreate curtis fig2. recreates it fine.

created shinyFig2Curtis, that calls runcurtis_f2()

for some reason it currently doesn't recreate the curtis plot. what inputs are wrong in shinyFig2Curtis ?

aha! the difference was just P_1 & P_2 which were set to 0.1 rather than 0.01

aha2, even after that was fixed, something was still not quite right

this was because phi.SS2_0B, which is 1 in the Curtis default, was set to 0 instead. This is the fitness of exposed susceptibles. (1 suggests that the susceptible is completely fit in presence of the insecticide which doesn't seem right).

yes phi.SS2_0B seems key : compare these 2:

#curtis default phi.SS2_0B = 1 : mixture better than sequence
runcurtis_f2( P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 )
#reduce phi.SS2_0B to 0.6 : mixture same as sequence
runcurtis_f2( P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 0.6 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 )

How does that relate to this comment from the paper ?

In particular, Curtis [1] allowed the resistant and sensitive homozygotes to survive DDT exposure with fitnesses of 50% and 27% respectively (caption to his Figure 2).

packageDescription('resistance')

GithubSHA1: 7858f1edc0022233c0cd26f84be64e1a9bd0cd04 7858f1edc0022233c0cd26f84be64e1a9bd0cd04

This shows that the installed package is the latest version from github.

problem was because I hadn't updated the NAMESPACE (Ctrl Shift D)

reran model with new ranges for phi vals (0.6 to 1) (but still need to sort finite 'xlim' values errors)

e.g. scenario 2

Email from Ian, showing section vi of sensitivity doc.

“We have to ensure the fitness of the RS and RR genotypes does not exceed 1.0 as this is the reference (maximum value of the SS genotype in the absence of insecticide). The fitness of the SS homozygote is determined by the ϕ values in Table 3 i.e. as 1 - ϕ. Hence the values of s must be less than ϕ. I suggested we set s = x ϕ where x comes from a uniform distribution between 0.2 and 1.0”

This relates to this bit of code :

  for( exposure in c('lo','hi') )
  {
    Wloci[ paste0('SS',locusNum), exposure] <-  1 - phi[locusNum, exposure]

    Wloci[ paste0('RS',locusNum), exposure] <- (1 - phi[locusNum, exposure]) + 
                                               (h[locusNum, exposure] * s[locusNum, exposure])

    Wloci[ paste0('RR',locusNum), exposure] <- (1 - phi[locusNum, exposure]) + 
                                               (s[locusNum, exposure])
  }

Wloci[RR] <- (1-phi) + s

so s must be < phi to keep Wloci[RR] below 1
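A small sketch of that constraint in action. It samples phi from the 0.6 to 1 range mentioned above and x from 0.2 to 1.0 as Ian suggested; the range for the dominance coefficient h and the variable names are assumptions for illustration.

```r
set.seed(1)
n <- 1000

phi <- runif(n, 0.6, 1)    # phi range from the rerun noted above
x   <- runif(n, 0.2, 1)    # Ian's suggested multiplier
s   <- x * phi             # guarantees s does not exceed phi
h   <- runif(n, 0, 1)      # assumed range for dominance

# same fitness formulae as the Wloci code above
W_SS <- 1 - phi
W_RS <- (1 - phi) + h * s
W_RR <- (1 - phi) + s

# none of the fitnesses leave the 0 to 1 interval
stopifnot(all(W_SS >= 0), all(W_RS <= W_RR), all(W_RR <= 1))
range(W_RR)
```

Sampling s directly from an independent range would instead need rejected draws whenever s exceeded phi.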

added in checks for fitnesses >1 or <0 in runModel2

initial tree looks promising : for mix1 better than sequential:

phi.SS2_0B < 0.66

phi.SS1_A0 < 0.68

P_2 > 0.033 P_1

10/7/15 installed rpart.plot for prp function used by susana

created loop to do all plots, with titles and save

repeated runs for 500 scenarios, took about an hour.

11/7/15 running 1000 scenarios started at 10.15, check timestamp on outputs to see how long it took

13/7/15

done created findResistancePointsMixResponsive() for mix2

analysing 1000 runs

    #1000runs
    #betterMix1Seq_cP0.1 betterMix1Seq_cP0.25 betterMix1Seq_cP0.5
    #              0.178                0.232               0.238
    #betterMix2Seq_cP0.1 betterMix2Seq_cP0.25 betterMix2Seq_cP0.5
    #              0.471                0.568               0.551
    #betterMix3Seq_cP0.1 betterMix3Seq_cP0.25 betterMix3Seq_cP0.5
    #              0.794                0.726               0.673

rounded

    #betterMix1Seq_cP0.1 betterMix1Seq_cP0.25 betterMix1Seq_cP0.5
    #               0.18                 0.23                0.24
    #betterMix2Seq_cP0.1 betterMix2Seq_cP0.25 betterMix2Seq_cP0.5
    #               0.47                 0.57                0.55
    #betterMix3Seq_cP0.1 betterMix3Seq_cP0.25 betterMix3Seq_cP0.5
    #               0.79                 0.73                0.67

essentially mix1 : 0.2 mix2 : 0.5 mix3 : 0.7

alas the trees are more complicated, but that might be fixed by pruning them ? maybe not ... does work well for : betterMix1Seq_cP0.1

BUT weirdly it seems that plots look better (i.e. far fewer branches) when done individually than when saved in the loop.

is it something to do with interaction between prp and savePlot() ?

Now I seem to be getting a different result each time ?????

14/7/15

putting results document together.

Just checking that the results file for the 1000 runs looks like the figures & UI for the 100 runs

run3: sequential best (seq 28+29, mix3 50)

run4: mixture best (seq 93+32, mix3 172)

run5: resistance to one in mixture develops more slowly than both in sequence (seq 29+103, mix1 152)

    resistPointsI1[,1:5]
               [,1] [,2] [,3] [,4] [,5]
    gen_cP0.1    38  999   11   57    7
    gen_cP0.25   48  999   21   82   22
    gen_cP0.5    53  999   28   93   29

    resistPointsI2[,1:5]
               [,1] [,2] [,3] [,4] [,5]
    gen_cP0.1    64  210    7    4   91
    gen_cP0.25   90  250   21   21   99
    gen_cP0.5   102  268   29   32  103

    resistPointsMix1[,1:5]
               [,1] [,2] [,3] [,4] [,5]
    gen_cP0.1    75  999    9   10   34
    gen_cP0.25   93  999   30   57  116
    gen_cP0.5   102  999   41   86  152

    resistPointsMix3[,1:5]
               [,1] [,2] [,3] [,4] [,5]
    gen_cP0.1   168  999   21  129  191
    gen_cP0.25  199  999   38  159  202
    gen_cP0.5   213  999   50  172  207

Now can I do the 20% better results ?

    betterMix1Seq20_cP0.1 betterMix1Seq20_cP0.25 betterMix1Seq20_cP0.5
                    0.136                  0.175                 0.173
    betterMix2Seq20_cP0.1 betterMix2Seq20_cP0.25 betterMix2Seq20_cP0.5
                    0.241                  0.326                 0.313
    betterMix3Seq20_cP0.1 betterMix3Seq20_cP0.25 betterMix3Seq20_cP0.5
                    0.672                  0.560                 0.488

rounded

    betterMix1Seq20_cP0.1 betterMix1Seq20_cP0.25 betterMix1Seq20_cP0.5
                     0.14                   0.18                  0.17
    betterMix2Seq20_cP0.1 betterMix2Seq20_cP0.25 betterMix2Seq20_cP0.5
                     0.24                   0.33                  0.31
    betterMix3Seq20_cP0.1 betterMix3Seq20_cP0.25 betterMix3Seq20_cP0.5
                     0.67                   0.56                  0.49
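A sketch of how these proportions could be calculated, using only the gen_cP0.5 values for runs 1 to 5 from the resistPoints matrices above. It assumes "better" means the mixture lasts more generations than the two insecticides in sequence, and "20% better" means more than 1.2 times as long. 999 appears to mark runs where the critical point was not reached; how the real post-processing treats those is not shown in these notes.

```r
# generations to reach critical point 0.5, runs 1 to 5
I1   <- c(53, 999, 28, 93, 29)
I2   <- c(102, 268, 29, 32, 103)
mix3 <- c(213, 999, 50, 172, 207)

# sequential use: insecticide 1 then insecticide 2
seqTime <- I1 + I2

# proportion of runs in which the mixture lasts longer
betterMix3Seq   <- mean(mix3 > seqTime)

# proportion in which the mixture lasts more than 20% longer
betterMix3Seq20 <- mean(mix3 > 1.2 * seqTime)

c(better = betterMix3Seq, better20 = betterMix3Seq20)
```

For runs 3 and 4 this agrees with the notes above: sequential wins in run 3 (57 vs 50) and the mixture wins in run 4 (172 vs 125).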

Low phi values lead to sequential being better. e.g. https://andysouth.shinyapps.io/shinyMixSeqCompare1 runs

3: phi 0.53, 0.79

19: phi 0.65, 0.44

25: phi 0.60, 0.56

15/7/15

installed triangle package to generate random deviates from a triangular distribution : rtriangle(n, a=0, b=1, c=(a+b)/2)

changed label 'combination' to 'mixture' in my generic curtis fig2 plot

16/7/15 reran 1000 sensitivity scenarios with new agreed input ranges & reran 100 scenarios and put them into shinyMixSeqCompare1

21/7/15 check that the example runs still support the args the way they did before

the 3 example plots that I pasted into the word doc seem to be identical to what they were before

ARSE is it possible I didn't reload the package so that it used the old function version with the old input ranges ?

yes I think that's what I did !

rereran 100 runs and put them into shinyMixSeqCompare1

sorted order of insecticides for sequential figures

run shinyMixSeqCompare1 in a window rather than the browser to enable copy and pasting figures into word doc

copied 3 figs into the word doc. Note that run 9 is interesting. Mixture does much better than sequential.

I'm a bit worried that the post-processed results for the new100 runs seem not to have changed even though

22/7/15 analysing 1000 rereruns

    betterMix1Seq_cP0.1 betterMix1Seq_cP0.25 betterMix1Seq_cP0.5
                  0.223                0.245               0.243
    betterMix2Seq_cP0.1 betterMix2Seq_cP0.25 betterMix2Seq_cP0.5
                  0.558                0.563               0.534
    betterMix3Seq_cP0.1 betterMix3Seq_cP0.25 betterMix3Seq_cP0.5
                  0.820                0.747               0.677

rounded

    betterMix1Seq_cP0.1 betterMix1Seq_cP0.25 betterMix1Seq_cP0.5
                   0.22                 0.25                0.24
    betterMix2Seq_cP0.1 betterMix2Seq_cP0.25 betterMix2Seq_cP0.5
                   0.56                 0.56                0.53
    betterMix3Seq_cP0.1 betterMix3Seq_cP0.25 betterMix3Seq_cP0.5
                   0.82                 0.75                0.68

20% better :

    betterMix1Seq20_cP0.1 betterMix1Seq20_cP0.25 betterMix1Seq20_cP0.5
                    0.163                  0.178                 0.182
    betterMix2Seq20_cP0.1 betterMix2Seq20_cP0.25 betterMix2Seq20_cP0.5
                    0.283                  0.302                 0.283
    betterMix3Seq20_cP0.1 betterMix3Seq20_cP0.25 betterMix3Seq20_cP0.5
                    0.622                  0.536                 0.458

rounded

    betterMix1Seq20_cP0.1 betterMix1Seq20_cP0.25 betterMix1Seq20_cP0.5
                     0.16                   0.18                  0.18
    betterMix2Seq20_cP0.1 betterMix2Seq20_cP0.25 betterMix2Seq20_cP0.5
                     0.28                   0.30                  0.28
    betterMix3Seq20_cP0.1 betterMix3Seq20_cP0.25 betterMix3Seq20_cP0.5
                     0.62                   0.54                  0.46

reran trees

Best ones :

to try to find an approximate general rule

Favouring mixtures over sequential use :

insecticide effectiveness (phi) > 0.7

exposure < 0.7

e.g. see treebetterMix2Seq_cP0.5.jpg

I haven't done 20% better trees yet. Should I ? done now

23/7/15 prettifying trees changed response from 0/1 to mixture/sequence

seems that it may have made the trees bigger, ARSE.

is it to do with pruning ?

tree$cptable[which.min(tree$cptable[,"xerror"]),"CP"]

30/7/15

checking on trees, and why they seem to change when I tweak formatting options add a check into sensiAnPaper1All.Rmd

The tree is the same at the top (if reversed) but has more branches when 0/1 was changed to sequence/mixture.

treeResponse <- "betterMix2Seq_cP0.5"

tree$cptable[which.min(tree$cptable[,"xerror"]),"CP"]

[1] 0.01

What was this error before I made the conversion to text ?

I tried converting back to numeric and I got the same big tree ...

but the value of the error is slightly different :

    tree$cptable[which.min(tree$cptable[,"xerror"]),"CP"]
    [1] 0.01072961

This is what the cptable looks like for the numeric version

    tree$cptable
              CP nsplit rel error    xerror       xstd
    1 0.34549356      0 1.0000000 1.0000000 0.03385148
    2 0.12875536      1 0.6545064 0.8047210 0.03285260
    3 0.05150215      2 0.5257511 0.6931330 0.03173289
    4 0.04184549      3 0.4742489 0.5107296 0.02889882
    5 0.01824034      5 0.3905579 0.4527897 0.02768814
    6 0.01287554      7 0.3540773 0.4163090 0.02683382
    7 0.01180258      8 0.3412017 0.4098712 0.02667504
    8 0.01072961     12 0.2939914 0.4012876 0.02645941
    9 0.01000000     13 0.2832618 0.4098712 0.02667504

mixture/sequence

    tree$cptable
              CP nsplit rel error    xerror       xstd
    1 0.34549356      0 1.0000000 1.0000000 0.03385148
    2 0.12875536      1 0.6545064 0.7532189 0.03238844
    3 0.05150215      2 0.5257511 0.6287554 0.03088571
    4 0.04184549      3 0.4742489 0.4957082 0.02860113
    5 0.01824034      5 0.3905579 0.4699571 0.02806478
    6 0.01287554      7 0.3540773 0.4527897 0.02768814
    7 0.01180258      8 0.3412017 0.4420601 0.02744467
    8 0.01072961     12 0.2939914 0.4377682 0.02734549
    9 0.01000000     13 0.2832618 0.4442060 0.02749387

Aha! when I repeat identical model, cptable is different

    tree$cptable
              CP nsplit rel error    xerror       xstd
    1 0.34549356      0 1.0000000 1.0000000 0.03385148
    2 0.12875536      1 0.6545064 0.7618026 0.03247193
    3 0.05150215      2 0.5257511 0.6351931 0.03097749
    4 0.04184549      3 0.4742489 0.5085837 0.02885696
    5 0.01824034      5 0.3905579 0.4635193 0.02792537
    6 0.01287554      7 0.3540773 0.4248927 0.02704169
    7 0.01180258      8 0.3412017 0.4163090 0.02683382
    8 0.01072961     12 0.2939914 0.4098712 0.02667504
    9 0.01000000     13 0.2832618 0.4098712 0.02667504
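The xerror and xstd columns come from rpart's cross-validation, which assigns observations to folds at random, so they change between otherwise identical fits while CP, nsplit and rel error stay the same. A minimal sketch of making this repeatable by fixing the seed before each fit, using the kyphosis data shipped with rpart as a stand-in for the sensitivity analysis results (fitTree() is a hypothetical wrapper):

```r
library(rpart)

# hypothetical wrapper: fix the seed, then fit a classification tree
fitTree <- function(seed) {
  set.seed(seed)   # fixes the random cross-validation folds
  rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")
}

treeA <- fitTree(1)
treeB <- fitTree(1)

# same seed: the whole cptable, including xerror and xstd, is reproduced
identical(treeA$cptable, treeB$cptable)
```

Any pruning that depends on xerror, such as picking the CP at the minimum xerror, would then give the same tree on every run.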

Can I get back to the nice smaller, neater tree ??

http://stackoverflow.com/questions/29197213/what-is-the-difference-between-rel-error-and-x-error-in-a-rpart-decision-tree

A rule of thumb is to choose the lowest level where the rel_error + xstd < xerror.

http://stackoverflow.com/questions/15318409/how-to-prune-a-tree-in-r

This does that :

nsplit <- tree$cptable[ tree$cptable[,"rel error"] + tree$cptable[,"xstd"] < tree$cptable[,"xerror"],"nsplit"][1]
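To see what these rules select, here is a sketch applying them to the numeric-version cptable pasted above, rebuilt by hand as a matrix so that no model fit is needed. The third calculation is the usual "within 1 s.e. of the minimum xerror" rule, included for comparison; it is not the line of code above.

```r
cptable <- matrix(c(
  0.34549356,  0, 1.0000000, 1.0000000, 0.03385148,
  0.12875536,  1, 0.6545064, 0.8047210, 0.03285260,
  0.05150215,  2, 0.5257511, 0.6931330, 0.03173289,
  0.04184549,  3, 0.4742489, 0.5107296, 0.02889882,
  0.01824034,  5, 0.3905579, 0.4527897, 0.02768814,
  0.01287554,  7, 0.3540773, 0.4163090, 0.02683382,
  0.01180258,  8, 0.3412017, 0.4098712, 0.02667504,
  0.01072961, 12, 0.2939914, 0.4012876, 0.02645941,
  0.01000000, 13, 0.2832618, 0.4098712, 0.02667504),
  ncol = 5, byrow = TRUE,
  dimnames = list(as.character(1:9),
                  c("CP", "nsplit", "rel error", "xerror", "xstd")))

# rule of thumb quoted above: first row where rel error + xstd < xerror
ok <- cptable[, "rel error"] + cptable[, "xstd"] < cptable[, "xerror"]
nsplitRule <- cptable[ok, "nsplit"][1]

# CP at the minimum cross-validated error, as used earlier in these notes
iMin <- which.min(cptable[, "xerror"])
cpMinXerror <- cptable[iMin, "CP"]

# for comparison: smallest tree with xerror within 1 s.e. of the minimum
limit <- cptable[iMin, "xerror"] + cptable[iMin, "xstd"]
nsplit1se <- cptable[cptable[, "xerror"] <= limit, "nsplit"][1]

c(ruleOfThumb = unname(nsplitRule), oneSE = unname(nsplit1se))
```

With these numbers the rule of thumb as coded stops at a single split, while the 1 s.e. rule keeps 7 splits, so the two can give very different tree sizes.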

31/7/15 give more sensible error than "need finite 'xlim' values" from shiny curtis fig2 app

set starting freq of resistance to 0.0001 for each to generate the error.

this is when resistance is not reached within max generations.

runcurtis_f2()

plotcurtis_f2_generic()

    Error in 1:ddt_cutoff : result would be too long a vector

fixed by adding in this code to plotcurtis_f2_generic()

    # check if resistance has not been reached
    # and prevent an error, by setting to max generations from input
    # beware this may have other unintended consequences
    if (is.infinite(hch_cutoff)) hch_cutoff <- nrow(amat)
    if (is.infinite(ddt_cutoff)) ddt_cutoff <- nrow(bmat)
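A self-contained sketch of how an infinite cutoff can arise and how the guard avoids the error. It assumes the cutoff is found with something like min(which(...)), which is a guess at what plotcurtis_f2_generic() does; the trajectory and names are made up.

```r
# made-up resistance trajectory that never reaches the 0.5 threshold
freq <- seq(0.0001, 0.2, length.out = 50)

# min() of an empty vector returns Inf (with a warning)
cutoff <- suppressWarnings(min(which(freq > 0.5)))
is.infinite(cutoff)

# 1:cutoff would now stop with "result would be too long a vector",
# so fall back to the number of generations that were simulated
if (is.infinite(cutoff)) cutoff <- length(freq)

plotRange <- 1:cutoff
length(plotRange)
```

Falling back to the last generation keeps the plot working, but the plotted cutoff then no longer marks a point where resistance was actually reached.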

uploaded new version to shinyapps

26/8/15 submit invoice : from contract : maximum of 25 days at 217 per day.

I spent 30.6 days (can claim some against tsetse work, or from future contract ...)

6/10/15 Chatting to Joe Lines at OpenMalaria TAG

~ liked the shiny front end

~ asked when he would be able to play with the model

~ would like to see plots of parameter space, 2D & 3D, when mixture better than sequential

~ suggested that Curtis does not say mixtures are bad (& that Ian may have slightly skewed impression)

~~ I said that sentence in GPIRM might suggest that mixtures not as good

~ he said with classification trees he would like to see an option to influence which branches are done first based upon the operational ability to alter them

to do re Joe

~ send Joe link to the other shiny app

~ send him the sentence from GPIRM

~ ask Joe for a paper copy of GPIRM

Pre-existing resistance to one insecticide in the mixture could accelerate the development of double resistant vector populations because of linkage disequilibrium (Figure A9.1).

GPIRM p121 "theoretical models suggest that mixtures might delay resistance longer than rotations or broad mosaics (1)."

p122 "there is general agreement that LLIN mixture products, when developed, could still contain pyrethroids, since the pyrethroid may still contribute to the effectiveness of the net despite resistance (this needs to be confirmed). Some agree that the rationale applies to IRS, and that pyrethroid mixtures could be considered in areas of pyrethroid resistance; others believe that this is inadvisable and likely to lead to rapid evolution of double resistance."

p122 Pre-existing resistance to one insecticide in the mixture could accelerate the development of double resistant vector populations because of linkage disequilibrium (Figure A9.1).

At current prices, rotations and mosaics are estimated to have the lowest cost effect among the IRS-based IRM interventions. They would increase the annual cost by 20–50%, whereas use of mixtures for IRS once developed, are projected to increase the cost by 30–100%, depending on the duration of the transmission season. Because mixtures have not yet been developed or marketed, the final cost could be higher or lower than current projections.

World Health Organization. The technical basis for coordinated action against insecticide resistance: preserving the effectiveness of modern malaria vector control. Geneva, 2011.

Curtis is ref. 7. Perhaps I hadn't remembered very well what it said. Or maybe I was remembering more what is in the WHO 2011 publication ? Curtis paper not cited in WHO 2011.

7/12/15

email from Ian :

I had a major session on the 2-locus insecticide-resistance manuscript over the weekend (in between DIY; what a dull life I live!). Anyway I am now fairly happy with it . I plan to submit a IR-modelling grant application on 12th January so would like to have the paper submitted by then. I am not sure if I will manage but at least it is a target... I assume you can see the current version in Dropbox.

Andy, can you commit a couple of days to it, or however long it takes, charging to the malaria grant (i.e. the one working with Swiss TPH). Here is a shopping list, but I think #3 is the only one that may require a significant amount of coding

As I note in the cover page: The big change I made was that calculating fitness (equations 1 and 2) was previously described as occurring within the generations. In fact it does not depend on current genotype frequencies so only needs to be done once prior to each deployment scenario i.e. before the simulation is iterated over the generations. Check the code...if it is still calculated each generation, it can be moved out, calculated once prior to the iterations, and computational speed may increase substantially. Andy, you might like to check that it does not currently depend on current genotype frequencies in the code.

The classification trees are currently based on 1,000 samples. Can you increase them to 5,000. The tree for "combinations only better if last >20% longer" needs to be re-drawn with more transparent parameter names. I have updated Table 5 with some suggested variable names for the trees... feel free to change and/or shorten them if you wish. I assume it is trivial to rename the variables in the tree graphing package. I think we need the pruned trees (i.e. within 1 s.e.) for the main manuscript and the full, unpruned trees for the supplementary material.

At the moment we simply re-create the results from Curtis. I think a slight extension will greatly improve the ms by demonstrating the flexibility of the methodology. I have proposed two small extensions: as described on Page 14/15. One is for biological realism i.e. male Anopheles mosquitoes are less likely to enter the house so will typically have lower exposure to the insecticide than females. The second is operational: even if we propose to deploy a mixture then various factors (theft, sloth, stock-outs etc) mean that one insecticide will often be absent from the mixture so the other insecticide is sprayed alone. As before, a sensitivity analysis with 5,000 runs. I think we just use these as examples then wrap up the ms by arguing that many other examples could be explored but we lack time/space so have focussed on describing the basic methodology to be used in future more policy-orientated publications

That should keep you (or more accurately your computer) busy over Christmas. We can phone and discuss if you want, just nominate a time. I am away in Geneva Wed to Friday this week inclusive so could not talk then. Sorry I can't be here on Thursday for a pint but I'm sure you will manage.

Best wishes, Ian

Other bits from the manuscript itself :

Jo Lines/Andy conversation at TAG. Suggest a few graphs where just change one parameter at a time. Need to resurrect our default parameters and easy to do….include them on Table 5 along with Curtis default parameters in other columns.

Andy. Can we include the synergism/antagonism/cross-resistance factors Λ in the code; see calibration section of methods, page 6 and Table S1

I think the classification trees each need 10,000 rather than 1,000 runs. Don’t start clogging up your computer until we decide the final format.

Recreate the “mixtures are only better if last 20% longer” classification tree for Figure 3. I know you have already done it but the nodes all had technical symbols that are difficult to interpret. They need simple names like in Figure 3A. I assume its just a question of changing the parameter names in a “R” plotting script.

Run a new set of simulations using the “Curtis extended scenario” described on Page 16 to produce new classification trees again using the two criteria used previously i.e. (a) mixtures better if last >1 generation more than sequential (b) mixtures only better if last >20% longer

page 8 : Our simulations will assume that the effects of insecticides in a mixture are multiplicative e.g. if the probability of surviving exposure to insecticide A alone is 0.3 and of surviving insecticide B alone is 0.2, then the probability of surviving exposure to a mixture of A and B is 0.3*0.2=0.06. In other words there are no synergistic or antagonistic interactions between the insecticides. We do build flexibility into our methodology by scaling the fitnesses in niches where mosquitoes encounter both insecticides by a factor ΛAB, ΛAb, ΛaB or Λab (Table S1). If insecticides are synergistic then Λ<1, but if insecticides are antagonistic, for example because they share the same target site or share a similar mechanism of (cross)resistance, then Λ>1. This is a flexible approach but other strategies are also possible, for example if the insecticides are highly antagonistic then the fitness may be defined as the locus with the higher individual fitness factor, w.

Where would this 'synergism/antagonism/cross-resistance' bit need to go in code ? scaling the fitnesses in niches where mosquitoes encounter both insecticides by a factor ΛAB, ΛAb, ΛaB or Λab (Table S1). If insecticides are synergistic then Λ<1, but if insecticides are antagonistic, for example because they share the same target site or share a similar mechanism of (cross)resistance, then Λ>1

in runModel2.r, probably Wniche[locus1, locus2, niche1, niche2] Around line 300, just after this : ## Two locus fitnesses in two insecticide Niche

for( nicheNum1 in 1:3 ) #todo get this 1:3 from somewhere
{
  for( nicheNum2 in 1:3 ) #todo get this 1:3 from somewhere
  { 
    #temporary solution
    #to get both niche (one of 0aAbB)
    #and exposure (one of no,lo,hi)
    niche1 <- dimnames(Wniche)$niche1[ nicheNum1 ]
    niche2 <- dimnames(Wniche)$niche2[ nicheNum2 ]
    exposure1 <- dimnames(Wloci)$exposure[ nicheNum1 ]
    exposure2 <- dimnames(Wloci)$exposure[ nicheNum2 ]

    #if this niche toggled off set fitness to 0
    if (niche[niche1,niche2] == 0)
    {
      Wniche[,,niche1,niche2] <- 0
    } else{
      #otherwise set fitness to product of the 2 loci
      for( locus1 in dimnames(Wniche)$locus1)
      {
        for( locus2 in dimnames(Wniche)$locus2)
        {   
          #6/1/16 i think ians new insecticide interaction parameter can just go here
          #does in need to be just one param or 4 ?
          #ΛAB, ΛAb, ΛaB or Λab 
          #Wniche[locus1,locus2,niche1,niche2] <- interaction * Wloci[locus1,exposure1] * Wloci[locus2,exposure2]
          Wniche[locus1,locus2,niche1,niche2] <- Wloci[locus1,exposure1] * Wloci[locus2,exposure2]
        }
      }          
    }
  }
}
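A minimal sketch of how the Λ factor could slot into that product, written as a note to self rather than as the code in runModel2(). Everything here is a simplified stand-in: the fitness numbers are made up, `lambda` is a hypothetical matrix holding one factor per pair of exposure levels (so ΛAB is `lambda["hi","hi"]`), and the niche toggling is left out. Niches that involve "no" exposure keep a factor of 1.

```r
# Hypothetical, simplified stand-ins for the arrays used in runModel2()
exposures <- c("no", "lo", "hi")
loc1 <- c("SS1", "RS1", "RR1")
loc2 <- c("SS2", "RS2", "RR2")

# single-locus fitnesses by genotype and exposure (made-up numbers)
Wloci <- matrix(c(1, 0.6, 0.3,   # SS1
                  1, 0.8, 0.5,   # RS1
                  1, 0.9, 0.7,   # RR1
                  1, 0.7, 0.2,   # SS2
                  1, 0.8, 0.4,   # RS2
                  1, 0.9, 0.6),  # RR2
                nrow = 6, byrow = TRUE,
                dimnames = list(locus = c(loc1, loc2), exposure = exposures))

# one interaction factor per pair of exposure levels; 1 means no interaction
lambda <- matrix(1, nrow = 3, ncol = 3,
                 dimnames = list(exposure1 = exposures, exposure2 = exposures))
lambda["hi", "hi"] <- 0.8   # Lambda_AB < 1 : synergism (made-up value)

Wniche <- array(NA_real_, dim = c(3, 3, 3, 3),
                dimnames = list(locus1 = loc1, locus2 = loc2,
                                exposure1 = exposures, exposure2 = exposures))

for (e1 in exposures) {
  for (e2 in exposures) {
    # outer() gives the product of the two locus fitnesses for every genotype pair
    Wniche[, , e1, e2] <- lambda[e1, e2] * outer(Wloci[loc1, e1], Wloci[loc2, e2])
  }
}

Wniche["SS1", "SS2", "hi", "hi"]   # 0.8 * 0.3 * 0.2
```

One thing to watch: with Λ>1 (antagonism) the scaled fitness can go above 1, so it would probably need a check or a cap.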

18/12/15 ~ what do I need to do to get labels correct in all trees ?

Looking at the requests from Ian. I think it would be better if he & I sat down to go through it before I spend time doing 1000s more runs.

6/1/16 on train to liverpool

Questions for Ian : ~ I recommend a few days, maybe a week, finishing the restructuring of the model. It still doesn't conform well to good practice. done ~ we could go through the model itself, checking Ians code points. hat ~ What can we call the insecticide interaction parameter see 290 in runModel2() ? 4 ~ Is it just 1 param or do we need 4 ? no ~ should I remove W.bar around line 460 in runModel2() ~ can we go over the curtis extended scenario p16 ?

Linkage dis is bad in the context of mixtures because mosquitos are resistant to both insecticides more often than would be expected by chance. So problem is when the mixture is in use. LD decreases by half each generation under free recombination (recomb rate 0.5) . As genes get closer on chromosome recomb rate decreases.
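The halving can be checked with the standard decay formula D_t = D_0 (1 - r)^t. This is a small sketch that ignores selection (which is what builds LD up under mixtures); `ldDecay` is a hypothetical helper name and the starting D is made up.

```r
# LD after t generations of random mating with recombination rate r,
# ignoring selection: D_t = D_0 * (1 - r)^t
ldDecay <- function(D0, r, generations) {
  D0 * (1 - r)^generations
}

gens <- 0:5
free   <- ldDecay(D0 = 0.1, r = 0.5,  generations = gens)  # unlinked loci: halves each generation
linked <- ldDecay(D0 = 0.1, r = 0.05, generations = gens)  # closely linked loci: decays slowly

round(rbind(free, linked), 4)
```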

Extended Curtis requests from Ian :

1. 'Male exposure' as a proportion of female (0.5 to 1). Exposure is held in array a, read from input file by sex, line 90.
a['m','0','0'] <- input[8,i]
I can add a parameter to the setExposure() function which is called from sensiAnPaperPart.r

2. 'correct deployment' for mixtures (0.5 to 1) : the proportion correctly exposed to the mixture; the rest are divided equally between exposure to high concentrations of each individual insecticide, e.g. exposure 0.8 and correct deployment 0.7 : 0.8x0.7=0.56 are exposed to niche AB, while [0.8x0.3]/2=0.12 are exposed to niche A and 0.12 are exposed to niche B. Again this can be added as a parameter to setExposure()

Main actions required from Ians last email :

Can we include the synergism/antagonism/cross-resistance factors Λ in the code; see calibration section of methods and Table S1. This is about how effect of >1 insecticide are combined. see my notes above for where this is in R code.
Run a new set of simulations using the “Curtis extended scenario” described on Page 16 to produce new classification trees again using the two criteria used previously i.e. (a) mixtures better if last >1 generation more than sequential (b) mixtures only better if last >20% longer
Graphs where just change one param at a time. Jo Lines/Andy conversation at TAG.
Recreate the “mixtures are only better if last 20% longer” classification tree for Figure 3. Replace technical symbols with simple names like in Figure 3A. (see sensiAnPaper1All.Rmd rownames(treeInput)[rownames(treeInput)=="a.f_AB"] <- "exposure")
once we have decided what to do increase sensitivity analysis to 5000 or 10000 runs & run new trees

We then run a “Curtis extended scenario” to further illustrate the processes. The simulations have two additional parameters in addition to those described above and on Table 5 i.e.

• A “male exposure” parameter which defines the proportion of males exposed to an insecticide niche as a percentage of the females; it is assumed to be the same for each insecticide niche and varies uniformly from 50% to 100%. So, for example, suppose females are 20% unexposed, 56% exposed to ‘AB’ and 12% exposed to each of niches ‘A’ and ‘B’. If male exposure is 60% of that of females, their exposure to niches ‘AB’, ‘A’ and ‘B’ is 0.56 x 0.6 = 33.6%, 0.12 x 0.6 = 7.2% and 0.12 x 0.6 = 7.2% respectively, and the proportion unexposed rises to 0.2 + (0.8 x 0.40) = 52%.

• A “correct deployment” parameter in the simulations examining the use of mixtures. Correct deployment is often not achieved for a variety of operational reasons and exposure to insecticides may not be as anticipated. In particular, mixtures may be mandated but not achieved for various reasons (such as stock-outs) such that the ‘mixtures’ are often replaced by the single insecticide that is available. We vary this factor uniformly between 50% and 100% and the incorrectly exposed mosquitoes are equally divided into exposure to the single insecticides. So, for example, if exposure is 80% and “correct deployment” is 70% then within that 80% exposure to insecticides 0.8x0.7 = 56% are exposed to niche AB, while [0.8x0.3]/2 = 12% are exposed to niche A and [0.8x0.3]/2 = 12% are exposed to niche B. In reality, poor deployment will result in exposure to lots of other niches i.e. a, b, Ab, aB, ab but we only examine high levels of insecticide here to avoid having to additionally define survival probabilities to low insecticide levels ‘a’ and ‘b’.
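The arithmetic in the two worked examples above can be reproduced with a short sketch. `nicheExposure` is a hypothetical helper written for this note, not the package's setExposure(); it only covers the high-concentration niches AB, A and B, and the argument names simply follow the parameter names used elsewhere in these notes.

```r
# Hypothetical helper (not setExposure()): proportion of each sex in each niche
nicheExposure <- function(exposure, correctMixDeployProp = 1, maleExposureProp = 1) {
  # females: correctly deployed share goes to AB, the rest is split between A and B
  f <- c(AB = exposure * correctMixDeployProp,
         A  = exposure * (1 - correctMixDeployProp) / 2,
         B  = exposure * (1 - correctMixDeployProp) / 2)
  # males: each niche scaled down by the same proportion
  m <- f * maleExposureProp
  rbind(f = c(f, unexposed = 1 - sum(f)),
        m = c(m, unexposed = 1 - sum(m)))
}

# exposure 80%, correct deployment 70%, male exposure 60% of female
nicheExposure(exposure = 0.8, correctMixDeployProp = 0.7, maleExposureProp = 0.6)
```

With these inputs the female row should match the 56% / 12% / 12% / 20% example and the male row the 33.6% / 7.2% / 7.2% / 52% example.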

9/1/16 on train back from liverpool game rollout workshop

modifying setExposure() function to cope with Ians request for lower male exposure I can add it in as a param with a default of 1.

13/1/16 starting on the new 1 parameter at a time sensi analysis in sensiAnPaperPartOneAtATime.r

14/1/16 first see if I can make any informative plots with the existing sensitivity analysis data

These outputs are the frequencies by generation of the resistance alleles for the different insecticides (1&2) Gen, m.R1, m.R2, f.R1, f.R2

I want to get to a single file that has inputs & time_to_resistance thresholds for each scenario.

I must have something nearly like this already from sensiAnPaper1All.Rmd

Yes : resistPointsI1,2,Mix etc. For each scenario these have gen_cP0.1,0.25,0.5 which is the number of generations taken to reach those resistance thresholds.
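For reference, a sketch of how generations-to-threshold values of that kind can be pulled out of a frequency-by-generation series. `genToThreshold` is a hypothetical helper, not the function used in sensiAnPaper1All.Rmd, and it assumes that "reached" means frequency >= threshold and that element i of the vector is generation i.

```r
# Hypothetical helper: first generation at which an allele frequency reaches each threshold
genToThreshold <- function(freq, thresholds = c(0.1, 0.25, 0.5)) {
  out <- sapply(thresholds, function(th) {
    hit <- which(freq >= th)
    # NA when the threshold is never reached within the run
    if (length(hit) == 0) NA_integer_ else hit[1]
  })
  names(out) <- paste0("gen_cP", thresholds)
  out
}

# made-up trajectory of a resistance allele frequency over 100 generations
freq <- 1 / (1 + exp(-(1:100 - 50) / 8))
genToThreshold(freq)
```

Returning NA for runs that never reach the threshold makes it easy to drop those runs from plots later.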

The next chunk works out which strategy is better than which.

Maybe I want to bind all the resistPoints together into one file and bind onto the inputs, so that we can look at how the inputs affect the time to resistance for the different strategies.

I want input on the x axis & time to resistance on the y. (but there will be the issue that the y values will be affected by the changing values of other parameters too) Cross that bridge when I come to it.

how to get data in format for ggplot2 check the diamonds dataset in ggplot2

so I want to get as a dataframe with scenarios in rows, & columns for inputs & outputs

so I can do something like : ggplot(diamonds, aes(x=carat, y=price, color=cut)) + geom_point()

If I have a column that is strategy (insecticide1, insecticide2, sequential, mixture) Then I would be able to facet by these later.

ggplot(diamonds, aes(x=carat, y=price, color=cut)) + geom_point() + facet_wrap( ~ cut)

This could be good as I can see the different strategies side by side.

So I want columns : input1, input2 ... strategy, outputs : gen_cP0.1, 0.25, 0.5

so the dataframe will have length=num strategies*num scenarios

I'll need to rbind the strategies on top of each other.
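A sketch of the long format described above, with made-up inputs and placeholder outputs (none of these numbers come from the model). The column names `exposure`, `effectiveness` and `gen_cP0.25` are just illustrative, and the ggplot2 part is only run if the package is installed.

```r
set.seed(1)
nScen <- 5

# made-up inputs, one row per scenario
inputs <- data.frame(scenario = 1:nScen,
                     exposure = runif(nScen, 0.1, 0.9),
                     effectiveness = runif(nScen, 0.4, 1))

strategies <- c("insecticide1", "insecticide2", "sequential", "mixture")

# one block of rows per strategy, each with the same inputs, stacked with rbind
longList <- lapply(strategies, function(s) {
  cbind(inputs,
        strategy = s,
        gen_cP0.25 = round(runif(nScen, 10, 200)))  # placeholder outputs
})
dfLong <- do.call(rbind, longList)
dfLong$strategy <- factor(dfLong$strategy, levels = strategies)

nrow(dfLong)  # num strategies * num scenarios

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # input on x, time to resistance on y, one panel per strategy
  p <- ggplot(dfLong, aes(x = exposure, y = gen_cP0.25, color = strategy)) +
    geom_point() +
    facet_wrap(~ strategy)
  # print(p) to draw the plot
}
```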

Some success with ggplot plots.

15/1/16 sent first version of ggplot doc (effectOfIndividualInputs.pdf) to Ian & Beth

prettifying parameter names

20/1/16 on train to liverpool

removed from plots runs where resistance threshold not reached

22/1/16 with Ian in Liverpool I said realistically it would be a month before I could get started on the

conventional wisdom of poor spraying increasing resistance is because poor spray thought not to kill heterozygotes. we see some of this effect in the dominance param. Later we can look at it by the low exposure niches e.g. a vs A. We can look at that in the next paper.

Ian plans to submit to PLOS Computational Biology with the aim that it's a flexible model that can address a bunch of situations. Do we want to be slightly careful to protect our IP ? e.g. in the first version we might not want to promote use by others (we'd need to do some more work to make it properly usable). I can discuss this with Ian when we are closer to publication.

~ I did think that correctDeploymentProp only affects mixtures, which would constrain tree analyses for whether mix better than seq, but actually analysis can still be done, you would just expect mix to be closer to seq.

~ added maleExposureProp & correctMixDeployProp into tree & plot analyses, by adding on to the end of the input object. inputs are got from treeInput <- listOutMix$input which is created by runModel2() just from the input object that is passed to it.

~ added as an extra argument to sensiAnPaperPart() that would make less code repetition

25/1/16 added to effectOfIndividualInputs.Rmd ~~ new predictors for males & deploy ~~ calc of new ratio, copy from sensiAnPaperAll

13.40 started 1000 reps of the 'extended' experiment < 50 mins to run

This is what Susanan did for her 2012 model :

Simulations to understand the influence of each parameter on the outcome variables (mean fitness and change in resistance allele frequency) were performed using latin hypercube sampling (LHS) to generate a data set and partial rank correlation coefficients (PRCC) calculated to provide a quantitative measure of the impact of each parameter [24]. LHS techniques were first developed to explore the behavior of complex models in economics, engineering, chemistry and physics and have been used in models predicting the impact of insecticide-treated nets on malaria transmission [25].

The analysis was performed using R software [26] and implementation of LHS using package lhs. It does not allow for the specification of each variable distribution beforehand, so sampling was performed assuming a uniform distribution. Once the sample was generated, the uniform sample from a column (variable) could be transformed to the required distribution (Table 3) by using quantile functions (using the qtriangle command in R).

A data set of 3,000 replications was generated, with random parameter sets and the corresponding values of the outcome variables using equations 8, 9, 12 and 13. Ten replicates of this procedure were performed as suggested in [24] to investigate the predictive precision of model using LHS as the sampling method. This was achieved by analysing each replicate separately and verifying that results were consistent across ten replicates.
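To make the two steps concrete (stratified uniform sample, then transform with quantile functions), here is a base-R sketch. It is not the lhs package and not what was done in the 2012 model: `simpleLHS` is a hypothetical helper, qunif stands in for whatever quantile function a parameter needs, and the exposure range is made up.

```r
# Base-R Latin hypercube on (0,1): each column gets exactly one draw
# from each of n equal-width strata, in a random order
simpleLHS <- function(n, k) {
  sapply(seq_len(k), function(j) (sample(n) - runif(n)) / n)
}

set.seed(42)
n <- 10
u <- simpleLHS(n = n, k = 2)

# transform each uniform column to the required distribution with a quantile function
samp <- data.frame(effectiveness = qunif(u[, 1], min = 0.4, max = 1),
                   exposure      = qunif(u[, 2], min = 0.1, max = 0.9))

# check the stratification: every stratum 1..n appears once per column
table(ceiling(u[, 1] * n))
```

Plain Monte Carlo sampling (runif for each parameter) gives no such guarantee of one sample per stratum, which is the main difference between the two approaches.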

26/1/16 looking into prcc analysis of model outputs & whether I might want to use lhs.

~ what is difference between lhs and what I have done ~ check lhs package ~ lhs would ensure a more uniform distribution of samples across parameter space, i would think results are unlikely to be different to our monte-carlo approach. ~ it might be tricky for the params where we haven't used a uniform distribution (although we probably could get around it) ~ some advice on using lhs package http://r.789695.n4.nabble.com/latin-hypercube-sampling-td4659028.html

~ add PRCC code This provides some (dated) advice on running a sensitivity analysis including LHS & prcc https://cran.r-project.org/web/packages/pse/vignettes/pse_tutorial.pdf

pcc function in package sensitivity may do what we want : https://cran.r-project.org/web/packages/sensitivity/sensitivity.pdf

pcc computes the Partial Correlation Coefficients (PCC), or Partial Rank Correlation Coefficients (PRCC), which are sensitivity indices based on linear (resp. monotonic) assumptions, in the case of (linearly) correlated factors.

~ or could use epiR package epi.prcc(dat) # where final column in dat is the output, all other columns are inputs

I got sensitivity::pcc() working for a single strategy.
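To remind myself what a PRCC is actually measuring, a base-R sketch of the usual definition: rank all variables, regress the other inputs out of both the input of interest and the output, then correlate the residuals. `prccSketch` is a hypothetical helper written for this note, not the sensitivity package's implementation, and the toy data are made up (it needs at least two input columns).

```r
# Partial rank correlation of each column of x with y (needs >= 2 columns in x)
prccSketch <- function(x, y) {
  rx <- as.data.frame(lapply(x, rank))
  ry <- rank(y)
  sapply(names(rx), function(v) {
    others <- rx[, setdiff(names(rx), v), drop = FALSE]
    # residuals after removing the (rank-)linear effect of the other inputs
    resX <- resid(lm(rx[[v]] ~ ., data = others))
    resY <- resid(lm(ry ~ ., data = others))
    cor(resX, resY)
  })
}

# toy data: output falls with exposure, rises with effectiveness, ignores 'unused'
set.seed(1)
n <- 500
x <- data.frame(exposure = runif(n), effectiveness = runif(n), unused = runif(n))
y <- 100 * exp(-2 * x$exposure) + 20 * x$effectiveness + rnorm(n)

prccSketch(x, y)
```

Because every coefficient is conditional on the other columns, the value for one input can change when predictors are added or removed.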

28/1/16 ~ fixed diagonal axis labels. ~ get sensitivity::pcc() working for all strategies & putting data into format for faceted plots.

29/1/16

~ why does correctMixDeployProp seem to have an effect in the PRCC of insecticide1 when it shouldn't have any effect outside of mixtures ?

In the plots against time_to_resistance there is only a bit of a pattern for Mixture1 (time_to_R increases between 0.9 & 1 correct deployment). Points to it being a problem in the PRCC itself.

~~ in setExposure() looks fine, its not applied to insecticide1 or 2

other things it could be : ~~ switched with another var in the call to setExposure ~~ something about how runs are done in sensiAnPaperPart.r ~~ rows getting reordered within effectOfIndividualInputs.Rmd ?

it is +vely correlated with time to resistance, so better application leads to longer times for resistance to be reached (not sure whether that is what I would expect ?)

aha PRCC results are influenced by the presence of the other vars, if just do for exposure & deployProp deployProp goes down to 0.2.

x2 <- x[c('correctMixDeployProp','exposure')]
pcc_res2 <- pcc(x2, y, rank=TRUE)
pcc_res2$PRCC
                       original
correctMixDeployProp  0.2167706
exposure             -0.5248086

I still don't quite understand why an essentially random var is correlated with the output ??

This shows that there's very little correlation between the in & out, for insecticide1

cor(xy$time_to_resistance0.25,xy$correctMixDeployProp)
[1] 0.003849606

1/2/16

Checking again on whats happening with correctMixDeployProp

is the correlation similar for mixture1 as it is for insecticide1 ?

could problem be that the inputs are got from a mix run ? shouldn't matter treeInput <- listOutMix$input

could it be something to do with the random number seed being disrupted by the extra params ? shouldn't happen

i think the inputs should be identical.

I think it may be a correlation in the analysis.

What I have done is generate inputs for 'male proportion' and 'correct mix deploy'. Correct mix deployment is not used in the individual insecticide runs, but it is still included as a potential predictor in the PRCC analysis.

In the mixture scenarios 'correct mix deploy' affects exposure ...

exposure is not a single variable in the simulation, but is it used as a single value as a predictor in the sensitivity & prcc analyses ?

aha the exposure predictor is taken from the female AB (actually I think that is OK ??) although maybe not because it will be reduced by correct deploy and maybe by male (or maybe not ???)

rownames(treeInput)[rownames(treeInput)=="a.f_AB"] <- "exposure"

correctDeploy has a strong interaction with exposure in mixtures.

When correct deploy is low in mixture runs exposure to AB is much lower because of corresponding exposure to A0 and 0B.

But i think this shouldn't affect single insecticide runs.

Am I sure I'm not mixing up rows somewhere in effectOfIndividualInputs.Rmd ??

I could try setting the values of results for I1 and see if it propagates through.

Instead I saved the original random value of exposure and used that in PRCC.

I think what was happening was that correctDeployProp influenced the AB exposure value for mixtures and that was the predictor I took for exposure. Instead I want the intended exposure value.

I still don't quite understand how this affects the results in the single insecticide runs.

but with my 100 test runs what I'm seeing in the PRCC is no effect of anything for insect1 & 2. and a similar pattern as before for mixture1.

This may point me towards another problem that is causing this. I may need to exclude correctDeploy as a predictor from those runs in which it wasn't used ??

2/2/16

~ ran 10000 runs over night ~ run effectOfIndividualInputs ~ now it's fine !!! PRCC looking good.

New trees from 10,000 : Predictors :

effectiveness_insecticide1 : 9
effectiveness_insecticide2 : 9
exposure : 7

dominance_allele2 : 3
dominance_allele1 : 4
resist_start_1_div_2 : 1
start_freq_allele2 : 1
start_freq_allele1 : 2

Best tree to show (if indeed trees are the best) : Can't find a best one ! I don't like these trees !! Try to find a better method for Ian. Could it be PRCC.

Violin plots. look good. They are very similar for time to resistance 0.1, 0.25 or 0.5. ~ need a plot of proportion of runs in which mixture or sequential better. Maybe I could do a violin plot of the time to threshold, with diff strategies on x axis.

~ faceting them by exposure is cool ~ faceting by effectivity shows very little

5/2/16

trying to get plots working

GOOD RESULTS ! :

1 effectiveness < 0.6 sequential best. > 0.6 mix best.

Plot of time_to_resistance against effectiveness with 4 strategies on the same plot is key.

It shows the line for mix2 crossing that of sequential. Effectiveness 0.4-0.6 resistance arises faster for mix2. > 0.6 resistance arises faster for sequential.

I should adapt this plot perhaps just to show mix2 and sequential.

2 selection coeff < 0.2 sequential best. > 0.2 mix best.

selection coefficient curves also cross for mix2 and sequential. < ~0.2 sequential is better. > 0.2 mixture is better. But check that this is multiplied as shown below. So beware could this be caused by the effectiveness result ?

##effectiveness
phi.SS1_A0 <- runif(1, min=0.4, max=1)
phi.SS2_0B <- runif(1, min=0.4, max=1)

# Ian suggested this should be dependent on effectiveness to ensure fitness of RR stays below 1 
s.RR1_A0 <- runif(1, min=0.2, max=1) * phi.SS1_A0
s.RR2_0B <- runif(1, min=0.2, max=1) * phi.SS2_0B

I think perhaps I should remove multiplication of selection coefficient by effectiveness ?

3

Effect of exposure is greater overall on times to resistance, but it has a more similar effect on mixtures & sequential thus it is of less importance when considering the differences between the two strategies.

In the majority of cases the point where resistance to just one of the insecticides in a mixture is reached comes before that for both insecticides in sequential use. [this seems obvious] Also it always takes longer to reach resistance for both insecticides when starting with both in a mixture. Is this obvious, or a meaningful result or an error in our methodology ?

I think for the paper we should focus on mix2 versus sequential. We could have that as what we call mixture, and then refer in smaller analyses to the other mix scenarios. Maybe I should have a quick look at curtis paper to remind myself ?

Tried removing multiplication of selection coefficient by effectiveness. But got warnings about fitnesses > 1.

6/2/16

problem is with locus fitness values going over 1

This is the problem that causes it :

Wloci[ paste0('RR',locusNum), exposure] <- (1 - phi[locusNum, exposure]) + (s[locusNum, exposure])

so it just occurs when s > phi, selection coeff > effectiveness

current range : effectiveness : 0.4 - 1 selection coeff : 0.2 - 0.5

Options : 1) discard sims in which selection > effectiveness 2) set selection range to 0.1-0.45 and effectiveness from 0.45-1
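A quick sketch to see how often the problem bites with the current ranges, and what the two options look like. The RR fitness line is the one quoted above; the sample size and variable names are just for illustration.

```r
set.seed(1)
n <- 1000

# current ranges
phi <- runif(n, min = 0.4, max = 1)    # effectiveness
s   <- runif(n, min = 0.2, max = 0.5)  # selection coefficient
wRR <- (1 - phi) + s                   # RR fitness, as in the line above

mean(wRR > 1)   # share of draws where fitness goes over 1 (needs s > phi)

# option 1 : discard sims in which selection > effectiveness
keep <- s <= phi
sum(keep)

# option 2 : ranges that cannot overlap
phi2 <- runif(n, min = 0.45, max = 1)
s2   <- runif(n, min = 0.1,  max = 0.45)
max((1 - phi2) + s2)   # stays at or below 1
```

Option 1 throws away a small share of runs but also thins out the low-effectiveness corner of parameter space, so option 2 keeps the sampling more even.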

Doing 2) might be good to check whether the line crossing still goes on ...

Yes saving 2) as listOutMix_ex2_1000.rda etc.

Done in effectOfIndividualInputs

7/2/16 with new runs

Selection coefficient curves for mix2 & sequential no longer cross. Seems likely that this previous pattern was generated by correlation with effectiveness.
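The induced correlation is easy to demonstrate with the sampling lines themselves. A sketch, using the two sampling schemes noted above with an arbitrary sample size; `sDependent` and `sIndependent` are names made up for this check.

```r
set.seed(1)
n <- 1000
phi <- runif(n, min = 0.4, max = 1)                # effectiveness

# old scheme : selection coefficient multiplied by effectiveness
sDependent   <- runif(n, min = 0.2, max = 1) * phi
# new scheme : sampled independently from a non-overlapping range
sIndependent <- runif(n, min = 0.1, max = 0.45)

cor(phi, sDependent)     # clearly positive
cor(phi, sIndependent)   # close to zero
```

So under the old scheme any plot against selection coefficient was partly a plot against effectiveness.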

9/2/16 sent new figs of PRCC of diff between seq & mix2 to Ian

10/2/16

~ reran trees they show that effectiveness > ~ 0.7 & exposure < 0.7 favours mixture2 over sequential ~ this is consistent with my graphs of difference vs effect & exposure (therefore are the trees really necessary ?)

mix2 can already be displayed in plotcurtis_f2_generic() using addCombinedStrategy I had just set default to F because the plot gets even more confusing.

Maybe add option to just plot mix2 ?

Also offer option to plot from : runcurtis_f2()

This, which simply reduces effectiveness of insecticide2 from 1 to 0.85, makes seq better than mix (where mix was better for curtis).

runcurtis_f2( P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 0.85 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 )

11/2/16 ~ added default curtis values to UI labels in server.r ~ fixed plotting of combined strategy, i think I may not be doing it right

12/2/16 ~ added position of Curtis inputs to the figures

18/2/2016 Response email from Ian :

Anyway I think the plots all look very nice, and give intuitive answers (but see point #1 below). My comments are

(1) It is very strange that “correct deployment” parameter has no impact on time until resistance in mixtures.

(2) Re your question at the end of the document i.e. “In mixtures what causes resistance to the 2nd insecticide to arise more slowly when the first insecticide is removed from the mixture ?”. Isn’t it the opposite i.e. if we compare Mixture 2 (where the first insecticide is removed) then doesn’t resistance spread more rapidly compared to when the first insecticide is retained (as in mixture 3)??

(3) This question stimulated me to look again at your definitions of mixtures etc on the front page of the document. I know its only a draft and we both know what it means but I took the liberty of expanding the definition as in the attachment. Can you check they are correct and hence that I have correctly understood your question addressed in the previous point (and, if so, you may want to cut-and-paste them into the document). I also suggested a potential change (i.e. that mixture 1 and 3 be combined because they only differ in their output measure). This would clarify the captions on the plots but it just depends how easily you could cut-and-paste such changes.

One other thing that has just occurred to me. One option not considered by Curtis, but commonly proposed now, is to deploy “mosaics” where insecticides are deployed singly but at the same time…so some fields/huts may have insecticide “A” while some will have insecticide “B”. Could we simulate these mosaics simply by setting the “correct deployment” parameter to zero?? It would be another good selling point for the manuscript. If it is straightforward to set up, I would do it, If not we could just note how we could do it in the ms. It really follows on from point #1 above… if correct deployment has no effect even when it reaches zero then it implies mosaics are as good as mixtures.

Ians recommendations for captions : Single use of either insecticide. The outcome measure is time for resistance allele frequency to reach the user-defined threshold value (0.25 in this case)

Sequential : sole use of one insecticide, switch to other when threshold reached. Outcome measure is the time until frequency of resistance to the second insecticide reaches the threshold value.

Mixture 1: Insecticides always used as a mixture. Outcome measure is the time until threshold reached for either insecticide i.e. the frequency of resistance alleles exceeds 0.25 at either locus

Mixture 2 : insecticides initially deployed as mixtures until the threshold is reached for either insecticide, then switch to sole use of the other insecticide. Outcome measure is time until resistance allele frequencies exceed the threshold value for the second insecticide

Mixture 3 : Insecticides always used as a mixture. The outcome measure is the time until the threshold allele frequency exceeds the threshold value at both loci.

Actually, it would be much better if we combined mixture 1 and 3 and just note the difference in outcome measure. Mixture 3 could be renamed “Adaptive mixture”. In the figures the key could then be

Single Sequential Mixture (either AI)* Mixture (both AIs) Adaptive mixture

I assume you know this but AI is short for “active ingredient” which the commercial developers much prefer to “insecticide” for reasons that elude me

Which would be much easier for readers (and reviewers) to remember

26/2/16

look at Ians point : Re your question at the end of the document i.e. “In mixtures what causes resistance to the 2nd insecticide to arise more slowly when the first insecticide is removed from the mixture ?”. Isn’t it the opposite i.e. if we compare Mixture 2 (where the first insecticide is removed) then doesn’t resistance spread more rapidly compared to when the first insecticide is retained (as in mixture 3)??

Resistance spreads more rapidly when the 2nd insecticide is removed from a mixture. See Fig 1.

Perhaps I should rename all references to mix 1,2 & 3 to avoid confusion of getting tangled up with this again.

Sequential or seq
Mixture (either AI) [mixture1] or 1st in mix or mix1
Mixture (both AIs) [mixture3] or 2nd in mix or mix2
Adaptive mixture [mixture2] or Adaptive mix or comb

Hi Ian,

Firstly, yes you are right. Resistance spreads more rapidly in the responsive mixture strategy than when the first insecticide is retained.

I agree that we need to rationalise the names to try to minimise this confusion. I would prefer to not use AI in labels if possible, but we can refer to in text. (but we can discuss).

If we could agree consistent long, medium and short names as started below that would be very helpful. I can then go through my scripts and rename. I'll have to be very careful because I will need to rename mix2 to mixA & mix3 to mix2. So if I miss any of the first mix2s we'll get in a right pickle.

Also can we agree on what to call the strategy that changes from mix to sole ? My fault that I've been variously calling it adaptive, responsive, combination. Is there a precedent ? Adaptive seems potentially confusing as it could be confused with being selected for. I have a slight preference for responsive.

Feel free to edit these below. Once we are agreed I'll print out and bluetac to my wall so that I stick to it !!

Long Names (in text where space not an issue)
Mixture (1st insecticide)
Mixture (2nd insecticide)
Adaptive mixture (2nd insecticide)
Sequential (2nd insecticide)

Medium Names (plot labels where medium space)
mix 1st
mix 2nd
mix adaptive
sequential 2nd

Short Names (plot labels where v little space)
mix1
mix2
mixA
seq

cheers, Andy

29/2/2016

checking on correctMixDeployProp after Ian commented :

(1) It is very strange that “correct deployment” parameter has no impact on time until resistance in mixtures.

In the plots the only effect was on mix1 that seemed to rise slightly.

in setExposure() this is how it is done :

if ( correctMixDeployProp < 1 )
{
  a['f','A','0'] <- (1 - correctMixDeployProp)/2 * exposure

  a['m','A','0'] <- a['f','A','0'] * maleExposureProp

  #m&f together
  a[,'0','B'] <- a[,'A','0']    
  a[,'0','0'] <- 1 - (a[,'A','B'] + a[,'A','0'] + a[,'0','B'])

}

Code seems as expected.

Could anything about how the sequential & adaptive mixture scenarios are patched together influence this ? I don't think so.

#' @param correctMixDeployProp proportion of times that mixture is deployed correctly,
#' assumes that when not deployed correctly the single insecticides are used instead

~ Thinking about correctMixDeployProp, how do we choose which of the 2 insecticides is used singly ? It is divided between them. Maybe this is the issue ? Perhaps splitting between the 2 single applications in the same simulation has a similar effect as the 'mixture' ?

~ maybe write a function to produce a graphical output from setExposure(), could put it into a UI to send to Ian.
~~ effectively it just needs to visualise the array a.
~~ I could firstly vis as a table rather than a graph
~~ outputs : 2 tables side by side, a["f",,] & a["m",,]
https://andysouth.shinyapps.io/shinyNiche/
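A minimal sketch of that table view, assuming the array a is indexed [sex, insecticide A niche, insecticide B niche] as in the setExposure() snippet above. sketch_exposure() is a hypothetical stand-in, not the package function, and the lines filling a[,'A','B'] are my assumption (the notes only show the A0, 0B and 00 lines).

```r
# hypothetical stand-in for setExposure(), only to visualise the array a
sketch_exposure <- function( exposure = 0.9,
                             maleExposureProp = 1,
                             correctMixDeployProp = 1 )
{
  a <- array( 0, dim = c(2,2,2),
              dimnames = list( sex = c('f','m'),
                               niche1 = c('0','A'),
                               niche2 = c('0','B') ) )

  # ASSUMPTION: a correctly deployed mixture exposes to both insecticides
  a['f','A','B'] <- correctMixDeployProp * exposure
  a['m','A','B'] <- a['f','A','B'] * maleExposureProp

  # as in the notes: the remainder is split between the 2 single insecticides
  if ( correctMixDeployProp < 1 )
  {
    a['f','A','0'] <- (1 - correctMixDeployProp)/2 * exposure
    a['m','A','0'] <- a['f','A','0'] * maleExposureProp
    #m&f together
    a[,'0','B'] <- a[,'A','0']
  }
  a[,'0','0'] <- 1 - (a[,'A','B'] + a[,'A','0'] + a[,'0','B'])
  a
}

a <- sketch_exposure( exposure = 0.9, correctMixDeployProp = 0.5 )
# 2 tables, one per sex
print( a['f',,] )
print( a['m',,] )
```

Each table should sum to 1, which is a quick check that no exposure has gone missing.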

~ checking on correctMixDeployProp
~ added correctMixDeployProp into curtis shiny UI to enable checking mechanism
~~ first added correctMixDeployProp into runcurtis_f2

~ carefully implement strategy name changes
Find & replace todo : in sensiAnPaper1All.Rmd & paper1_results_figs.Rmd

resistPointsMix2 - resistPointsMix_A
resistPointsMix3 - resistPointsMix_2
resistBetterMix2SeqBoolean - resistBetterMix_ASeqBoolean
resistBetterMix2Seq - resistBetterMix_ASeq

replacements done :

Mix1 - Mix_1
Mix2 - Mix_A
Mix3 - Mix_2
"Mixture 1" - "Mix either"
"Mixture 2" - "Mix adaptive"
"Mixture 3" - "Mix both"
done mix2_minus_seq0.25 - mixA_minus_seq0.25
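A small sketch of why the order of the short-name replacements matters (mix2 has to become mixA before mix3 becomes mix2). rename_strategies() and the example labels are hypothetical, just to test the idea before running find & replace on the Rmd files.

```r
# ordered find & replace: mix2 must become mixA before mix3 becomes mix2
rename_strategies <- function( txt )
{
  # order matters, doing mix3 -> mix2 first would turn everything into mixA
  replacements <- c( "mix2" = "mixA",
                     "mix3" = "mix2" )
  for( old in names(replacements) )
  {
    txt <- gsub( old, replacements[[old]], txt, fixed = TRUE )
  }
  txt
}

# made-up labels using the old names
old_names <- c( "mix1", "mix2_minus_seq0.25", "mix3_minus_seq0.25" )
new_names <- rename_strategies( old_names )
print( new_names )
```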

2/3/2016

done ~ record time spent & when was the last time I claimed on the previous contract ?
Previous contract claimed 15 Sept 2015.
submit invoice : from contract : maximum of 25 days at 217 per day
I spent 30.6 days (can claim some against tsetse work, or from future contract ...)
So time I spent in January on toggl should be on new contract.
To the end of Feb :
Hours 105:45:00
Hours rounded 105.00
Days (/7.5) 14.00
Days remaining (from 85) 71.00

~ modifying PRCC multiplots to get just one axis

3/3/2016

msg from Ian : Aha! I've understood what he means by setting correct_deploy to 0, to study mosaics.
~ changed order of strategies in legend
~ created new killer figure with added effectivenesses on the y & exposure on x.

8/3/2016
~ add Curtis fig2 params to killer figure
~ tricky adding to legend or labelling
~ can I get legend for killer figure beneath ?
~ can I derive a line for the border between when seq & mix better by fitting a line to the 0 values ?
~ get the classification tree that matches my killer figure

9/3/2016
~ creating 20% tree & plot
~ got the 0 line onto the first facet of the 20% plot

10/3/2016 ~ looking at the manuscript

I think the manuscript can be about describing development and results of a model that is a more generic version of Curtis. But we do not need to make the model useable by others at this stage.

11/3/2016 manuscript

current wordcount 9,300

Details from Plos Computational Biol : Software Submissions PLOS Computational Biology publishes articles describing outstanding open source software of exceptional importance that has been shown to provide new biological insights, either as a part of the software article, or published elsewhere.

The software must already be widely adopted, or have the promise of wide adoption by a broad community of users. Enhancements to existing published open source software will only be considered if those enhancements bring exceptional new capabilities. The software must be downloadable anonymously in source code form and licensed under an Open Source Initiative (OSI) compliant license. The source code must be accompanied with documentation on building and installing the software from source, as well as for using the software, including instructions on how a user can test the software on supplied test data. Software articles require a presubmission inquiry that includes explanations on how the above criteria are met.

Format Articles should be concise (less than 3500 words, not including supplementary material)

Sentence on how the decision tree level decided.

A rule of thumb is to choose the lowest level where rel_error + xstd < xerror
(xerror is the cross-validation error)

To avoid overfitting the number of levels was restricted to the minimum where the relative error plus the standard error was less than the cross-validation error.
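A sketch of that rule of thumb applied to a table laid out like the cptable of an rpart tree (columns CP, nsplit, rel error, xerror, xstd). The values are made up and pick_tree_level() is a hypothetical helper, not something from the analysis scripts.

```r
# columns follow the layout of an rpart cptable; the values are made up
cptable <- cbind( CP          = c(0.40, 0.15, 0.05, 0.01),
                  nsplit      = c(0, 1, 2, 4),
                  "rel error" = c(1.00, 0.60, 0.45, 0.35),
                  xerror      = c(1.02, 0.63, 0.52, 0.50),
                  xstd        = c(0.05, 0.04, 0.04, 0.04) )

pick_tree_level <- function( cptable )
{
  ok <- cptable[,"rel error"] + cptable[,"xstd"] < cptable[,"xerror"]
  # rule never satisfied
  if ( !any(ok) ) return( NA_integer_ )
  # lowest level (first row) satisfying the rule
  as.integer( min( which(ok) ) )
}

level <- pick_tree_level( cptable )   # row 3 for these made-up values
cptable[ level, ]
```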

added mix both to paper1_results_figs.Rmd (because of my slight uncertainty in mix adaptive method) repeated all figs for 50% resistance in paper1_results_figs_slimmed_50.Rmd

14/3/16

Curtis contentious statement p261 : "Example vi shows the, at first sight unexpected, point that the use of a mixture where the initial gene frequencies are unequal leads to a more rapid increase in the rarer of the genes."

vi params ? exposure 0.9 strt freq loc1 0.01 strt freq loc2 0.001

modify fig2 by setting freq1 from 0.01 to 0.001

runcurtis_f2( P_1 = 0.001 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 , correctMixDeployProp = 1 )

when I change just freq loc1 to 0.001 I get a warning :
Warning in min(which((combmat[, r1col]) > max(criticalPoints))) : no non-missing arguments to min; returning Inf

Trying to work it out with this run : runcurtis_f2( P_1 = 0.01 , P_2 = 0.003 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 , correctMixDeployProp = 1 )

in plotcurtis_f2_generic problem is here :

cutoff_i1_mix <- min(which((combmat[,r1col])>max(criticalPoints)))
cutoff_i2_mix <- min(which((combmat[,r2col])>max(criticalPoints)))

if (cutoff_i1_mix == Inf | cutoff_i2_mix == Inf)
{
  warning("critical point not reached for i1 or i2: cutoff_i1_mix=",cutoff_i1_mix," cutoff_i2_mix=",cutoff_i2_mix)
  #don't add the combined(adaptive) strategy to the graph
  addCombinedStrategy <- FALSE
}
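A possible alternative to testing for Inf after min(which()), sketched with a hypothetical helper first_above() and made-up frequencies. It returns NA when the threshold is never reached, so there is no warning from min() on an empty vector.

```r
first_above <- function( x, threshold )
{
  i <- which( x > threshold )
  # NA instead of min() on an empty vector, which returns Inf with a warning
  if ( length(i) == 0 ) return( NA_integer_ )
  i[1]
}

freq <- c(0.01, 0.02, 0.08, 0.3, 0.6)   # made-up resistance frequencies

cutoff_reached <- first_above( freq, 0.5 )   # 5
cutoff_missed  <- first_above( freq, 0.9 )   # NA

# only add the combined (adaptive) strategy when both cutoffs exist
addCombinedStrategy <- !is.na(cutoff_reached) & !is.na(cutoff_missed)
```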

in plotcurtis_f2_generic this calculation at the start seems to produce different results from what appears in the plot :
maxGensMix <- findResistancePoints(combmat, locus='both', criticalPoints = criticalPoints)

maxGensMix
           [,1]
gen_cP0.1   294
gen_cP0.25  999
gen_cP0.5   999

problem partly caused by max_gen default in runcurtis_f2 being 300

runcurtis_f2 <- function( max_gen = 300,

whereas in sensiAnPaperPart max_gen is set to 500

probably I should set max_gen in runcurtis_f2 to 500 too so it is the same.

yes, now this works, but P_2 = 0.001 still fails to reach threshold
runcurtis_f2( P_1 = 0.01 , P_2 = 0.002 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 , correctMixDeployProp = 1 )

15/3/16 on train to liverpool

Looking at the form of the results from modified curtis fig2 plots :
~ in mixtures resistance in the final insecticide only rises after the 'first' one has become ineffective ~50%

~ in curtis scenario reducing start freq of I2 from 0.01 to 0.002 makes it take longer but doesn't really alter the form of curves
~ but then changing effectiveness of I2 from 1 to 0.95 makes a big difference.
~ aha! seems that an effectiveness of 1 in one insecticide stops resistance rising in the other one until ~50% resistance. Reducing effectiveness to even 0.95 (or just 0.99) seems to remove this effect. At effectiveness I2 0.99 resistance to I1 rises slowly until 50% is reached then it accelerates.

~ increasing dominance from 0.0016 to 0.02 doesn't radically alter curtis2 pattern

16/3/16 meeting Ian in liverpool

paper todo
~ rename fig x1 y axis to "time in generations" or time (generations)
~ rename y axes in fig x3 "time in generations" or time (generations)
~ rename maleExposureProp to male_exposure
~ rename correctMixDeployProp to correct_mix_deploy
~ set variable names to be the same as above in table 5 of manuscript
~ check again that correctMixDeploy is doing what we expect. Bit strange that in the plot against time-to-r the curves are completely level when in my curtis simulation it shows that decreasing the correct deploy does decrease time to resistance e.g. for mix2 from
done ~ change fig x3 to just exposure, effectiveness and dominance
~ add diff between seq and standard mixture to fig x4
~ Ian will send me the params to rerun the line from table vi (which is just for 1 generation)
~ find the equation line of zero difference for the 0.5 threshold (and maybe for mix2 vs seq too ? )

~ for the 20% diff plot add the line that is exactly 20% diff and keep colour scheme from killer plot
~ see df_20pc
~ how to make sure that the same points appear in each plot ?? maybe remove the facet column.

Qs for Ian
~ is effectiveness of 1 possible/likely ? The pattern in the curtis plot relies quite heavily on it. Reducing it to just 0.98 changes pattern considerably.
~ are we confident enough about the mix adaptive strategy to focus the paper on it ? Ian said yes. He has said before that it just takes a couple of generations for frequencies to return to Hardy Weinberg.

Remember that effectiveness is proportion of the SS genotypes that die when exposed.

Ian said, science is unsure what the effectiveness rates of insecticides are under field conditions. All sensitive mosquitos are expected to die in a bottle assay. But under field conditions where they briefly land on a net this may not be the case. experimental huts.

IVCC are apparently asking if we could model 3 insecticides.

Ian keen to keep sex-linkage & niches in. Say we have developed a general model and say we will explore more behaviour in future ...

~ asked Ian where could I submit a paper explaining this resistance thing in simpler terms for dummies. If I want to get into anything sciency a first author paper would really help me out. It could concentrate just on the mechanism of how the model works rather than on the sensitivity outputs. It could have plot combinations, showing how the curves change in response to 1 param changing. I could start writing it now. It could go to ecological modelling or similar epidemiology journal (maybe ask Jo Lines).
~~ concise
~~ focus on mechanisms
~~ for non expert audience
~~ dominance, exposure
~~ draft something
~~ ian suggested malaria journal
~~ paper2_resistance_mechanisms_mixtures.Rmd

22/3/16 comments back from Ian on paper1

One other thing for your "to do" list. I have made the point that Fig 4 (your x2) analyses the data summarised in Fig 3 (your x1).. which helps the flow of the paper. But we only present PRCC for 4 of the data shown in Figure 3 so it may be worthwhile doing PRCC for the missing one i.e. time to resistance to first insecticide in sequential use. I assume that would be relatively easy to do and automate in the file?

Fig x1
I did Time to resistance (gens) ~ Y axis label change to "Time (gens)"
done ~ X axis labels as in the figure legend i.e. "Mix 1st" "Mix both", "Seq 1st" and "Seq both". I think "both" is better than "second" as it avoids any potential confusion between insecticide labelled 1 and 2.
done ~ I think we decided to re-label "sole use" to "Seq 1st" for clarity
done ~ You might also like to change "Adaptive mix" to just "Adaptive" to try and save some space which I think will become important later as graphs are scaled down to their published size.

done ~ Fig x2.. can we make the labels identical to those above i.e., in order "Seq both" "Mix 1st" "Adaptive both " "Mix both".... A reviewer will wonder why we don't do all 5 outputs described in Fig x1 i.e. why we omit "Seq 1st". We can make it clear we are looking at the most likely application strategies so we only show those when resistance reached for both. (largely it's a space thing and the others don't show anything different).

Fig x3...Y axis label change to "Time (gens)".

Fig x5 and x6.... Can we change the Y axis label to just "Effectiveness of insecticides"... drop the word "added" as it is a bit ugly, makes the label longer and we now say it is the sum of effectiveness in the figure legend. Can we change the X axis label to just "exposure"... do we need to say it is for "both insecticides"????

23/3/16 email from Ian

We extensively look at how best to deploy 2 new insecticides, but not how to deploy one new insecticide when several others are in use and there is resistance to them….

This Curtis argument was not about sequential vs mixtures (i.e. where we start off with 2 insecticides and decide how best to deploy them). It is simply this question: “If I have a single new insecticide where resistance is presumably present at a very low frequency, should I deploy it on its own (sole use), or in combination with an existing insecticide whose resistance levels are high”;

Curtis defines allele B as the one encoding resistance to the new insecticide and which therefore has a lower starting frequency and asserted that, under these circumstances, deploying the new insecticide as a mixture would be counterproductive for allele B i.e. would drive resistance more quickly than when insecticide B was deployed alone.

Can I show this by modifying the fig2 plot ? Although probably we would need to rerun a sensitivity type analysis if we wanted to address this properly ...

curtis f2
runcurtis_f2( P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 )

curtis f2 with i1 set to high starting freq (0.5 breaks plot slightly, but still seems to show that resistance for i2 takes longer in mixture)
runcurtis_f2( P_1 = 0.5 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 )

curtis_f2 with P_1 changed from 0.01 to 0.45 : mix slower than seq
runcurtis_f2( P_1 = 0.45 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43, addCombinedStrategy=FALSE )

curtis_f2 with P_2 changed from 0.01 to 0.45 : mix slower than seq
runcurtis_f2( P_1 = 0.01 , P_2 = 0.45 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43, addCombinedStrategy=FALSE )

similar with P_2 0.3
runcurtis_f2( P_1 = 0.01 , P_2 = 0.3 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43, addCombinedStrategy=FALSE )

but what about the params for curtis_tab1_ex_iv ? Beths analysis table uses the same params I think.

Hi Ian,

As a first step I've added an extra fig x8 onto the end of the figures document attached and in dropbox (I started to address figure tweaks, but have held fire on that while we sort this). Fig x8 addresses whether to keep using an old insecticide when a new one is introduced. It just increases the starting frequency of each resistance in turn to just below the threshold (0.49). I think this accomplishes the same as Beths table 4.

Would I be able to alter the parameters to recreate Beths table3 and Curtis table 1 example vi in a similar way ? I couldn't quite work out which inputs to change.

How would we investigate this in a sensitivity analysis ? Essentially we could repeat what we've done already but with unequal starting frequencies. We had the issue before of not wanting one frequency to be dependent on the other. So we could simply set P_1 to a low range as in current analysis, and then have P_2 at a higher range (e.g. 0.3-0.5). Or perhaps we don't need a range for the 'old' insecticide we could just set at 0.5 ?

I'm slightly nervous about running more analyses but I could investigate this with a couple of days work. Then we would need to decide how to squeeze into the paper.

Happy to chat,

Andy

previous current state of play 23/3/16

sensiAnPaper1All.Rmd : code to run whole sensitivity analysis & produce plots
sensiAnPaperPart.r : sets up param values and runs for one treatment (e.g. mixture, insecticide1)
runModel2.r : runs model for the passed scenarios
paper1_results_figs_slimmed50.Rmd : final figures for paper

What would I need to do ?

sensiAnPaperPart.r : set new param ranges for P_2, perhaps add an argument to enable running them
sensiAnPaper1All.Rmd : rerun calling new version of sensiAnPaperPart, save results to something like : listOutMix_unequal_10000.rda
paper1_results_figs_slimmed50.Rmd : save with new name, just rerun using the new results files

perhaps do for 500 runs first ?

reran new unequal scenarios with start_freq set to constant 0.5. first plots OK, but then get a weird thing where mixa_minus_seq0.5 is always -2. So mix adaptive is always taking 2 generations less than sequential. I suspect this may be an artefact of detecting resistance thresholds when the threshold is already reached for i2.

~ created pap1_figs_slimmed_50_unequal.Rmd ~ all changes marked with #4unequal

Seems that resistance is slower for mixtures in all replicates. For the 500 runs at least.

23/3/16 email from Ian

I have been racking my brain to think if we can address this question using data that have already been generated (and hence avoiding having to explain how we generated new data) . Rather than trying to explain it informally in an Email, then write it up, I went directly to step 2 and wrote the following as a draft paragraph for the ms…. Does it make sense???

“We can analyse a subset of the sensitivity analysis to investigate this specific question of whether single use of a new insecticide is better than deploying it as a mixture with a pre-existing insecticide which already has some resistance present in the population. We selected runs where the starting frequency of resistance to insecticide 2 was > 10 fold higher than frequency of resistance to insecticide 1. We then compared the time for insecticide 1 resistance to reach 50% when deployed on its own, compared to time for resistance to insecticide 1 to reach 50% when deployed as a mixture with the pre-existing insecticide. As in the full analysis, we compute the ratio mixture/single for each run and generate a classification tree to see what circumstances favour what policy and, in particular whether the ratio of resistance starting frequencies has a large impact”

I am assuming insecticide #1 is always deployed first in the sequential runs so the time to its resistance reaching 50% is already recorded. I am also assuming that in the mixture runs you have recorded the time until resistance to each insecticide reaches 50% so you always have the data #1.

this is the important bit from Ian : We selected runs where the starting frequency of resistance to insecticide 2 was > 10 fold higher than frequency of resistance to insecticide 1. We then compared the time for insecticide 1 resistance to reach 50% when deployed on its own, compared to time for resistance to insecticide 1 to reach 50% when deployed as a mixture with the pre-existing insecticide.

I think it would be easier to do this by new runs, which is close to what I've done in pap1_figs_slimmed_50_unequal. (although I might want to change the freq of the 'old' insecticide from a constant 0.5 to a range e.g. 0.2 - 0.5, but then would).

Could I practically do what Ian has suggested ? How many runs are there where the start freq of i2 is > 10 * i1 ? Where resist_start_1_div_2 < 0.1

see in paper1_results_figs.rmd

note my hi_div_lo was actually lo_div_hi, corrected in the figs_slimmed_50.rmd

resist_start_hi_div_lo <- ifelse( treeInput['P_1',] < treeInput['P_2',],
                                  treeInput['P_1',]/treeInput['P_2',],
                                  treeInput['P_2',]/treeInput['P_1',] )
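To keep the two ratios straight, a sketch with a made-up treeInput (rows named as above, one column per run). The names resist_start_lo_div_hi and indices_i2_old are mine, for illustration only.

```r
# made-up start frequencies, one column per run, rows named as in treeInput
treeInput <- rbind( P_1 = c(0.001, 0.05, 0.02, 0.0005),
                    P_2 = c(0.05, 0.001, 0.03, 0.02) )

# smaller start freq divided by larger, always <= 1
# (same values as the ifelse() above, but named for what it computes)
resist_start_lo_div_hi <- pmin( treeInput['P_1',], treeInput['P_2',] ) /
                          pmax( treeInput['P_1',], treeInput['P_2',] )

# I1 over I2, < 0.1 when start freq of i2 is > 10 * i1
resist_start_1_div_2 <- treeInput['P_1',] / treeInput['P_2',]
indices_i2_old <- which( resist_start_1_div_2 < 0.1 )

# how many runs qualify
length( indices_i2_old )
```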

aha! had a thought. Can I highlight those runs in the killer plot where resist_start_1_div_2 < 0.1 ? probably need to do it for the mix2 graph rather than mixadaptive.

I tried and it didn't show the split I expected perhaps because the higher start resist isn't high enough.

29/3/16

spoke to Ian on the phone. He's still keener on reanalysing the current sensi results. Because he said its more relevant where the existing insecticide has a resistance level of 5-10% (or even less).

probably similar to : findResistancePointsMixResponsive()

Effectively I think this is just comparing mixAdaptive with mixBoth for the subset of scenarios.

Aha no actually isn't it comparing sole use with mixture 'new' insecticide, the different bit will be that its not quite the same as the current mix1 & mix2. Because I don't think we care about the resistance level of the 'old' insecticide.

Maybe
1 create a function to calc when resistance is reached for new+old
2 create a 2nd func to calc when resistance reached for new on its own

1) findResistancePointsOldPlusNew <- function( listOutMix, listOutI1, listOutI2 )

~ find whether I1 or I2 has lower starting freq, this is the 'new' insecticide
~ if new==I1 get from mix time til resistance for I1
~ if new==I2 get from mix time til resistance for I2

2) findResistancePointsNewOnly <- function( listOutI1, listOutI2 )

~ find whether I1 or I2 has lower starting freq, this is the 'new' insecticide
~ if new==I1 get from I1 time til resistance for I1
~ if new==I2 get from I2 time til resistance for I2

For each get the outputs as

then create a dataframe : dif_mixnew_sole

30/3/16

~ rather than doing as above I can use the existing files for I1 & I2
resistPointsI1 <- findResistancePoints(listOutI1, locus=1)
resistPointsI2 <- findResistancePoints(listOutI2, locus=2)
~ & then calc
resistPointsMixI1 <- findResistancePoints(listOutMix, locus=1)
resistPointsMixI2 <- findResistancePoints(listOutMix, locus=2)

Then post-process from these files for when starting freq of I2 >10*I1

A future thought. If I bit the bullet and allowed the simulation to represent adaptive strategies things might be much easier ? Also in future could I allow the effectiveness of insecticides to change over time, similar to how we have done in the game ?

doing this in paper1_results_figs.Rmd

now have : dif_oldnew_newonly["mix_minus_sole0.5"]

BUT this is still for all runs (i.e. not just the ones where start freq of the old is > 10 * the new)
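A sketch of the remaining post-processing step, restricting to runs where the start freq of the old is > 10 * the new. All values are made up and the object names (time_mix, time_sole, dif_sketch) are hypothetical, not the ones in paper1_results_figs.Rmd.

```r
# made-up times-to-resistance (generations) for the 'new' insecticide, one per run
time_mix  <- c(180, 95, 210, 60)   # new insecticide in mixture with old
time_sole <- c( 48, 90, 150, 75)   # new insecticide on its own
P_new <- c(0.001, 0.01, 0.002, 0.01)  # start freq of resistance to new
P_old <- c(0.05,  0.02, 0.03,  0.05)  # start freq of resistance to old

dif_sketch <- data.frame( mix_minus_sole0.5 = time_mix - time_sole,
                          old_gt_10x_new = P_old > 10 * P_new )

# restrict to the runs Ian described
dif_subset <- dif_sketch[ dif_sketch$old_gt_10x_new, ]

# proportion of those runs where resistance takes longer in the mixture
prop_mix_better <- mean( dif_subset$mix_minus_sole0.5 > 0 )
```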

11/4/2016

I think I want to plot this against each input variable to check whats going on, i.e. where is the mix better than sole use.

dif_oldnew_newonly["mix_minus_sole0.5"]

see Fig xx3

In most (70%) runs mixtures are better, but that does leave 30% where sole better.

Eyeballing the plots suggests that exposure is still the most influential input. At high exposures the benefit of the mixture is less.

May be something weird going on with start_freq_allele2, noticeable lack of runs where mix-sole < 0 as start_freq_allele2 gets higher ...

Are the inputs relating to allele1 & 2 still meaningful with the post-processing that I do ?

No. aha! problem I was having is that I1 & I2 inputs are not swapped around.

#might be better just to restrict to the runs in which I2 < I1
resistPointsNewOnly <- resistPointsI2_T[indicesI2_lessthan_I1, ]
resistPointsOldPlusNew <- resistPointsMixI2_T[indicesI2_lessthan_I1, ]

when I decided to restrict to runs where startfreqI2 < I1 mix seems to be better all the time ??

? am I doing something wrong ? seems not

Eyeballing fig xx3

when I1 is the old :
Most important inputs & their effect on the benefit of mixture are :
exposure : reduces benefit of mixture
effectiveness i1,old : increases benefit of mixture
dominance i2,new : reduces benefit of the mixture
selection coefficient i2 : reduces benefit of the mixture

My thoughts on mechanisms.
The mixture does better because the old insecticide kills some mosquitoes that have developed resistance to the new, thus reducing rate of increase in resistance to the new.
Exposure. Low exposure reduces the benefits of mixtures. This could be because the beneficial effect of the old insecticide is reduced.
Effectiveness of the old insecticide. High effectiveness increases the benefit of the mixture because more of the mosquitoes that have developed resistance to the new insecticide are killed.
Dominance. High dominance of the new insecticide allele decreases the benefit of the mixture. High dominance in resistance to the new insecticide should allow more mosquitoes to escape the new insecticide therefore I would expect the benefit of the mixture to be greater. But it's the other way around.

I probably need to get straighter in my head how dominance works. Does dominance affect the relative performance of the 2 insecticides where mosquitos are exposed to both. i.e. Does high dominance of the new allele mean that a mosquito is more likely to survive the presence of the old insecticide ?

To understand the mechanisms more I could look at the effects of the inputs on each strategy on its own.

These results are consistent with previous ones.

~ ran PRCC on the new results (confirmed same important params)

copied new bits into keep_old_insecticide.Rmd and sent the pdf to Ian

12/04/2016

Ian was surprised that mixtures better in all runs.

He suggested that The Curtis eg where sole better might be strange and not fall within our parameter space.

~ check whether I can recreate some results from the new analysis by putting the params into the UI, or better by calling runcurtis_f2()

e.g. first run in dif_oldnew_newonly
mix : 180
sole : 48
difference : 132

a previous example

runcurtis_f2( P_1 = 0.01 , P_2 = 0.3 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43, addCombinedStrategy=FALSE )

putting params of 1st run in

runcurtis_f2( P_1 = 0.0159762192 , P_2 = 0.0076494164 , h.RS1_A0 = 0.660797792 , h.RS2_0B = 0.629114044 , exposure = 0.5582827 , phi.SS1_A0 = 0.9441143 , phi.SS2_0B = 0.9695714 , s.RR1_A0 = 0.1216252 , s.RR2_0B = 0.1720911, addCombinedStrategy=FALSE )

gives warning : critical point not reached for i1 or i2: cutoff_i1_mix=Inf cutoff_i2_mix=Inf

is this because ?? :
1. I've pasted wrong no
2. an indexing issue means that inputs and results in the table are misaligned
3. what I'm running in these new runs is different from what's done in the plot : don't think so
4. extra defaults not passed to the plot are different from those used in sensi runs ***aha! could be maleExposureProp and correctMixDeployProp

Looking at the plot sole use does look like it takes about 48 as in the results, but for I2 in the mix looks like it should take ~600 rather than 180 as suggested in the results.

New corrected version :

YES this does now match. Hurrah! (also shows that maleExposureProp and/or correctMixDeployProp in isolation can have a big effect that I could explore in the mechanisms paper).
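A sketch of a check that would have caught this sooner: compare the inputs of a sensitivity run with the args actually passed by hand. The first 9 values are the ones pasted above; the maleExposureProp and correctMixDeployProp values are made up, and run_stub() is a hypothetical stand-in for runcurtis_f2().

```r
# inputs of the 1st run as pasted above; the last 2 values are made up
run_inputs <- c( P_1 = 0.0159762192, P_2 = 0.0076494164,
                 h.RS1_A0 = 0.660797792, h.RS2_0B = 0.629114044,
                 exposure = 0.5582827,
                 phi.SS1_A0 = 0.9441143, phi.SS2_0B = 0.9695714,
                 s.RR1_A0 = 0.1216252, s.RR2_0B = 0.1720911,
                 maleExposureProp = 0.8, correctMixDeployProp = 0.7 )

# args I passed by hand in the call above
passed <- c( "P_1", "P_2", "h.RS1_A0", "h.RS2_0B", "exposure",
             "phi.SS1_A0", "phi.SS2_0B", "s.RR1_A0", "s.RR2_0B" )

# inputs silently left at their defaults
not_passed <- setdiff( names(run_inputs), passed )
print( not_passed )

# hypothetical stand-in for runcurtis_f2(), just echoes what it receives
run_stub <- function( ... ) list( ... )

# passing the whole row with do.call() avoids missing any
args_used <- do.call( run_stub, as.list(run_inputs) )
```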

Should I do for another result too, just to confirm ?

Hi Ian, I would like to see if I can recreate Curtis Table 1, (vi).

It makes my head hurt trying to work out what our inputs should be based on his table.

Could you help me out by filling in the below * ? (or is it not possible and I'm missing something?).

start frequencies of each resistance allele

P_1 = *, P_2 = * ,

dominances

h.RS1_A0 = *, h.RS2_0B = * ,

exposures

exposure = *,

effectivenesses

phi.SS1_A0 = *, phi.SS2_0B = * ,

selection coefficients

s.RR1_A0 = *, s.RR2_0B = *

Here are the equivalent values used to recreate his figure 2.

start frequencies of each resistance allele

P_1 = 0.01 , P_2 = 0.01 ,

dominances

h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 ,

exposures

exposure = 0.9 ,

effectivenesses

phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 ,

selection coefficients

s.RR1_A0 = 0.23 , s.RR2_0B = 0.43

cheers, Andy

Back from Ian :

P_1 = 0.01, P_2 = 0.001,

dominances

h.RS1_A0 = 1, h.RS2_0B = 1,

exposures

exposure = 0.9,

effectivenesses

phi.SS1_A0 = 1, phi.SS2_0B = 1,

selection coefficients

s.RR1_A0 = 1, s.RR2_0B = 1 )

runcurtis_f2( P_1 = 0.01, P_2 = 0.001,
              #dominances
              h.RS1_A0 = 1, h.RS2_0B = 1,
              #exposures
              exposure = 0.9,
              #effectivenesses
              phi.SS1_A0 = 1, phi.SS2_0B = 1,
              #selection coefficients
              s.RR1_A0 = 1, s.RR2_0B = 1, addCombinedStrategy=FALSE )

This seems to generate a very rapid rise in resistance.

and generates this error in the graph :
Error in xy.coords(x, y) : 'x' and 'y' lengths differ
addCombinedStrategy=FALSE fixes the error

I think I should switch i1 & i2 frequencies

runcurtis_f2( P_1 = 0.001, P_2 = 0.01,
              #dominances
              h.RS1_A0 = 1, h.RS2_0B = 1,
              #exposures
              exposure = 0.9,
              #effectivenesses
              phi.SS1_A0 = 1, phi.SS2_0B = 1,
              #selection coefficients
              s.RR1_A0 = 1, s.RR2_0B = 1, addCombinedStrategy=FALSE, addStrategyLabels = FALSE )

This seems to show that resistance to the new insecticide (red) rises faster when alone (dotted) than when in combination (solid) with an old insecticide at a higher frequency. This is contrary to what Curtis said.

Also note that resistance arises much faster than in our sensitivity runs. I think this is because the selection coefficients (1) are much higher than the range we use (0.1-0.45).

15/4/16

from Ian : I have re-read Curtis again and its that paragraph on page 261 that is causing all the problems i.e.

“Example vi shows the, at first unexpected……”

I guess the key phrase is what he means by “more rapid”…I assume he means more rapid than using it on its own. But that doesn’t tally with your figure. His example is only a 1-generation argument so I should be able to do some simple arithmetic. Can you tell me what the allele frequencies were at generation 2 of your plot and I’ll check against my arithmetic.

to get allele freqs at gen 2 :

tst$results[[2]][1:8,]
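For a rough cross-check of the sole-use case only, a sketch of one generation of selection at a single locus. This is NOT the package code: it assumes sexes are treated alike, that fitness when exposed is SS = 1 - phi, RS = 1 - phi + h*s, RR = 1 - phi + s, and that unexposed fitness is 1. The mixture case needs both loci so is not covered here. next_freq() is a hypothetical helper.

```r
# one generation of selection at a single locus, sexes treated alike
# ASSUMPTION: fitness when exposed is SS = 1 - phi, RS = 1 - phi + h*s, RR = 1 - phi + s
# and fitness when not exposed is 1 for all genotypes
next_freq <- function( p, exposure, phi, h, s )
{
  q <- 1 - p
  w_exposed <- c( SS = 1 - phi, RS = 1 - phi + h*s, RR = 1 - phi + s )
  # average over exposed and unexposed niches
  w <- exposure * w_exposed + (1 - exposure) * 1
  # Hardy-Weinberg genotype frequencies before selection
  geno <- c( SS = q^2, RS = 2*p*q, RR = p^2 )
  after <- geno * w / sum( geno * w )
  # frequency of the R allele after selection
  unname( after['RR'] + after['RS']/2 )
}

# Ian's table vi params for the rarer allele, insecticide used alone
p_gen2 <- next_freq( p = 0.001, exposure = 0.9, phi = 1, h = 1, s = 1 )
print( p_gen2 )
```

With no exposure the function returns the starting frequency unchanged, which is a useful sanity check on the arithmetic.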

18/4/16

email from Ian Morning Andy,

here is my logic, honed by long walks along the beach

(1) According to your analysis of a subset of the sensitivity analysis i.e. where the “new” insecticide is present at lower frequency…….it always lasts longer under a mixture (presumably this time is defined as that taken for the resistance allele to reach 50%)

(2) But in some of the runs in the sensitivity analysis, sequential use is favoured as it maximises the time for both resistance alleles to reach 50%. Given the result in (1) this can only occur if, in some runs, the “old” insecticide with higher frequency of resistance spreads faster under a mixture.

Does this make sense? Can you check if it is true?

Ian

~ in runs where resistance arises slowest for sequential does resistance to the old (higher freq) insecticide arise faster in the presence of the new insecticide than without it ?

~ look at an example of a run where sequential is 'best'

Hi Ian, Good question. I'm somewhat jet-lagged and a little hungover in a seminar room In Seattle so not thinking at my best but I'll try a quick shot which may be wrong.

The example below is the first one I could find where resistance arises slower for sequential. (it happens to be where the starting frequencies are equal).

In this case it seems that sequential is slower because resistance to insecticide2 (blue) only starts increasing when resistance to the first insecticide has been reached. The curve itself is steeper in the absence of the other insecticide (compare blue dashed to solid) but because it starts later it gets to the threshold later.

I'll try to look for another example.

Andy

sequential best

runcurtis_f2( P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.63 , phi.SS2_0B = 0.85 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 )

relative to curtis fig2

runcurtis_f2( P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.73 , phi.SS2_0B = 1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 )

sequential best. effectivenesses both reduced by 0.2 from curtis fig2

runcurtis_f2( P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 0.9 , phi.SS1_A0 = 0.53 , phi.SS2_0B = 0.8 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43, addCombinedStrategy=FALSE )

~ looking at my killer plot gave me a clue how to set the parameter ranges for when sequential should be better. Low effectiveness and high exposure.

trying to make sequential better, high exposure, low phi

runcurtis_f2( P_1 = 0.05 , P_2 = 0.01 , h.RS1_A0 = 0.17 , h.RS2_0B = 0.0016 , exposure = 1 , phi.SS1_A0 = 0.1 , phi.SS2_0B = 0.1 , s.RR1_A0 = 0.23 , s.RR2_0B = 0.43 )

From Ian : web version gives an error with certain param combinations

runcurtis_f2( P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 1 , h.RS2_0B = 1 , exposure = 0.9 , phi.SS1_A0 = 1 , phi.SS2_0B = 1 , s.RR1_A0 = 1 , s.RR2_0B = 1, addCombinedStrategy=TRUE )
Error in xy.coords(x, y) : 'x' and 'y' lengths differ

no error when I set addCombinedStrategy=FALSE, but it doesn't plot a curve for i1 mixture ?? or maybe that i1 mix line is behind i2 mix line. Setting a transparency could solve that.

runcurtis_f2( P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 1 , h.RS2_0B = 1 , exposure = 0.9 , phi.SS1_A0 = 1 , phi.SS2_0B = 1 , s.RR1_A0 = 1 , s.RR2_0B = 1, addCombinedStrategy=FALSE )

where is the error ?

in plotcurtis_f2_generic.r

lines( gens, comb, col="red", lty=3 )

because comb contains nothing.
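
A minimal sketch of one way to guard against this, assuming the fix is simply to skip the line when the series is empty or the wrong length. `can_draw_series()` is a hypothetical helper, not something in plotcurtis_f2_generic.r.

```r
# Hypothetical helper: TRUE only when a series can be drawn against gens.
can_draw_series <- function(gens, series) {
  length(series) > 0 && length(series) == length(gens)
}

gens <- 1:5
comb <- numeric(0)   # stands in for a combined strategy with no results

if (can_draw_series(gens, comb)) {
  # only reached when comb has one value per generation
  lines(gens, comb, col = "red", lty = 3)
} else {
  message("combined strategy has no results, skipping its line")
}
```

With this guard the xy.coords() error is avoided because lines() is never called with mismatched lengths.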

27/4/16

email from Ian : I agree entirely with your analysis of why sequential sometimes lasts longer than mixtures (see below in case you have forgotten!). I updated the ms to reflect this and tried to get a coherent discussion of the Curtis results. See new text on pages 16 (paragraph starting “We next investigated the statement by Curtis “) and on page 19 (paragraph starting “The secondary objective was to apply”).

As you will see I put my head on the block (page 16 when I said) “Finally, for future use, we analysed the other sub-set of the sensitivity data i.e. when the first insecticide to be deployed in the sequence was the one with the higher frequency of resistance and determined whether resistance to that insecticide spread faster when it was deployed on its own or as part of a mixture. Again, in all cases, resistance spread faster when the insecticide was deployed on its own. In summary then, resistance to an insecticide always spreads faster when that insecticide is deployed alone rather than as part of a mixture, irrespective of the relative frequencies of resistance to the two insecticides.“

Can you check if that is true.

Secondly, a minor point, but it may become important when determining number of generations for resistance to reach 50%... do you number the initial generation zero so that subsequent generation numbers reflect the elapsed number of generations of selection? e.g. Figure 10 looks like the initial generation is #1. Fig 12 looks like it starts from 0 but the scale is small

I had immense problems understanding Table 3 which came from a very early version. I have checked back in my archives but have no previous instances of it. It makes no sense to me and it is not compatible with the dynamics shown on Figure 10 (see note on ms). For example I cannot understand why time to resistance under a mix is not affected by the starting frequency at locus A. If you cannot understand it either, can you re-derive the results and correct it. Thanks.

Table 4 looks sensible to me but if you are rechecking table 3, perhaps you could check a couple of values in table 4 just to make sure nothing unexpected has happened.

Like I said, happy to Skype if you want to discuss further. Once we get this bit finalised, we can circulate to the Curtis-fans. Like I explain on page 19, I think they may have misinterpreted the comparison he was trying to make.

~~ I checked tables 3 & 4, they were Beths so I sent an email to Ian asking him to ask her

28/4/16

https://www.bayer.co.za/en/bayer-develops-the-first-two-way-insecticide-mixture-for-indoor-residual-spraying.php

Consistent with its mission of delivering ‘Science For A Better Life’, Bayer has recently submitted a dossier to the World Health Organization Pesticide Evaluation Scheme (WHOPES) for the evaluation of a new two-way insecticide mixture which includes a new mode-of-action for indoor residual spraying (IRS) against disease vectors. Named Fludora™ Fusion, this first IRS based on two active ingredients is intended to provide an effective solution to help African disease control programs address the challenge of insecticide resistance in malaria-transmitting mosquitoes. Field testing of the product has shown excellent results against many different kinds of resistant mosquitoes and strong performance across a wide range of surfaces. Bayer foresees the WHOPES evaluation and testing process to take about 2 years and anticipates market availability of the product by the end of 2017.

This has some good stuff about IRM strategies : http://www.slideshare.net/srinivasnaik52643/insecticide-resistance-management-strategy

Curtis et al 1993, potential IRM strategies :
Georghiou1994 potential IRM strategies :
~ moderation : preserve susceptible genes, e.g. reduce mortality of SS by low dose or refugia
~ saturation : high dose so that heterozygous resistants are killed
~ multiple attack : independently acting pressures neither of which is strong enough to lead to resistance e.g. mixture rotation

Mani GS. Evolution of resistance in the presence of two insecticides. Genetics 1985;109(4):761–83.

Abstract A two-locus model is used to analyze the effectiveness of a mixture of insecticides in delaying resistance, compared to the use of the insecticides singly. The effects of factors such as recombination, effective dominance, initial value of allele frequencies and initial value of linkage disequilibrium are considered. It is shown that the use of mixtures is always more effective in delaying the onset of resistance, often by many orders of magnitude. It is shown that there exists a threshold value of recombination fraction, above which the evolution of resistance is extremely slow. Resistance evolves very rapidly for values of recombination fraction below the threshold. Finally, the relevance of these results on resistance management is discussed.

~ is mixture always better than sole ? Yes see this in paper1_results_figs.Rmd :
mix_minus_soleI1 <- resistPointsMixI1_T["gen_cP0.5"] - resistPointsI1_T["gen_cP0.5"]
which( mix_minus_soleI1 < 0 ) #0
mix_minus_soleI2 <- resistPointsMixI2_T["gen_cP0.5"] - resistPointsI2_T["gen_cP0.5"]
which( mix_minus_soleI2 < 0 )

17/5/16

Update Report for Ian :

Tsetse : 15 days remaining : defined things to do for Steve
Resistance : 55 days (from 85) remaining

done ~ set up a new UI allowing comparison of 2 model runs side by side (like my curtis one but be able to modify params for both plots)

55 + 15 days = 70.

at 5 days a week 70/5 = 14 weeks

4 days per week

ISSF : institutional strengthening fund, Matt Powney.

Birget 2015

Methods: population-genetic model of the spread of insecticide resistance in response to insecticides used either as adulticides (ITNs) or as larvicides (malaria control or agriculture).

Results: We show that indoor use of insecticides leads to less selection pressure than their use as larvicides.

Reasons for relatively low selection pressure by adulticides (i) males are not affected by the ITNs (ii) insecticides are also repellents, keeping mosquitoes at bay from contacting the insecticide but also driving them to bite either people who do not use the insecticide or alternative hosts.

Barbosa and Hastings [12] use a more complex formulation by including the proportion of houses that are covered by the bed nets. However, models that make mosquito fitness dependent on its behaviour and life-history provide significant advantages over others as they allow integration of knowledge of medical entomologists with the population genetics of the model. This approach has been followed by a number of authors [14, 15, 16].

For example, Koella et al. [16] combined a population genetic approach with aspects of the mosquito’s feeding cycle to calculate ”effective coverage”, the proportion of mosquitoes killed by the insecticide during a single gonotrophic cycle. We extended this approach by formulating a population genetic model that calculates exposure rates from the mosquito’s feeding cycle similarly to the model described by Le Menach et al. [17]. In doing so, we propose behaviourally and epidemiologically based fitness functions that help us to understand more fully the predictions of the genetic model.

Andrew F. Read, Penelope A. Lynch, and Matthew B. Thomas. How to make evolution-proof insecticides for malaria control. PLoS Biology, 7(4):e58, 2009. doi: 10.1371/journal.pbio.1000058. URL http://dx.doi.org/10.1371/journal.pbio.1000058.

Read 2009 : Late-life-acting insecticides could control infective mosquitos while not depressing lifetime reproductive success sufficiently to select for resistance.

23/5/16 liverpool, meeting IVCC Ian & Matt Powney

IVCC were very impressed with the results of Ians modelling and my new simple UI presented at the end.

Unprompted Tom Mclean suggested that we could develop a more accessible version of the paper to get out to vector control people. I said that I had started the same 2 weeks ago !

Tom said he liked the sliders on my new UI. They were keen that they might be able to use such a UI to talk through AI formulation options with their suppliers.

24/5/16 5 hour train back from liverpool

~ looking at plotallele.freq.andy() for plotting responses to single insecticides

probably make a new more flexible version.

I want to be able to plot multiple scenarios but just a single loc & insecticide i think.

added sex & locus selection options to plotallele.freq.andy()

25/5/16 tidying these notes

looking how to get frequency by generation data into a format suitable for ggplot to enable colouring, faceting etc.

1/6/16

good progress on get_resistance()

2/6/16
~ stopped runModel2 now showing plots as default
~ got examples working for get_resistance()

progress on adding single insecticide plots to paper2

Why does putting effectiveness down to 0.3 or even 0.4 give fitness vals > 1

a <- setExposure(exposure=0.5, insecticideUsed='insecticide1')

i1 <- setInputOneScenario( max_gen = 500, h.RS1_A0 = 0.5, h.RS2_0B = 0.5, a = a, phi.SS1_A0 = 0.3, phi.SS2_0B = 0.5, s.RR1_A0 = 0.5, s.RR2_0B = 0.5 )

listOut <- runModel2( i1 )

Warning messages:

1: In runModel2(i1) : 1 locus fitness values (Wloci) are >1 : 1.2

2: In runModel2(i1) : 7 niche fitness values (Wniche) are >1

3: In runModel2(i1) : 6 individual fitness values (Wloci) are >1

Principally setting selective advantage down to 0.3 seems to solve. Maybe 0.3 should be my default value for all params ?

problem with default of 0.3 for effectiveness is that then get warning for higher selective advantage.

can't quite get the combinations to work.

perhaps have effectiveness range as 0.5, 0.7 & 0.9 (& all others as 0.3, 0.5, 0.7) ? & default of 0.5 for all.

single insecticide plots plotted on a log scale to ease comparison with double insecticide plots.

3/6/16

can I make my resistance app fit on mobile ? currently columns get put underneath each other for both plots & sliders.

https://andysouth.shinyapps.io/MixSeqResist1

http://stackoverflow.com/questions/29179088/flexible-width-and-height-of-pre-rendered-image-using-shiny-and-renderimage

Figured it out myself. As far as I know, there is no way to handle this in R. Instead, use your style sheet (e.g., bootstrap.css) by adding something like this:

img { border: 1; max-width: 100%; } element.style { width: 33.33%; }

http://shiny.rstudio.com/articles/layout-guide.html

Responsive Layout The Bootstrap grid system supports responsive CSS, which enables your application to automatically adapt its layout for viewing on different sized devices. Responsive layout includes the following:

Modifying the width of columns in the grid
Stack elements instead of float wherever necessary
Resize headings and text to be more appropriate for devices

Responsive layout is enabled by default for all Shiny page types. To disable responsive layout you should pass responsive = FALSE to the fluidPage() or fixedPage() function.

Supported Devices

When responsive layout is enabled here is how the Bootstrap grid system adapts to various devices:

                   Layout width     Column width             Gutter width
Large display      1200px and up    70px                     30px
Default            980px and up     60px                     20px
Portrait tablets   768px and above  42px                     20px
Phones to tablets  767px and below  Fluid (no fixed widths)
Phones             480px and below  Fluid (no fixed widths)

Note that on smaller screen sizes fluid columns widths are used automatically even if the page uses fixed grid layout.

This suggests that maybe I should write a phone app as fluidPage to help me see on PC what it will look like on phone ?

Now I have changed to fluid. What happens is that when columns are forced to be less than a minimum size they stack rather than going side by side. Can I modify this minimum size ?

http://www.w3schools.com/css/css_rwd_mediaqueries.asp Column widths are switching to 100% when screen size gets below a certain level. I want to stop that! Perhaps I just want to stop it being responsive ? But that option has been deprecated, doh!

This code stops columns being resized to 100%, but the thinner columns are still getting stacked ??

#can add CSS controls in here
#http://shiny.rstudio.com/articles/css.html
#http://www.w3schools.com/css/css_rwd_mediaqueries.asp
tags$head(
  tags$style(HTML("
                .col-sm-1 {width: 8.33%;}
                .col-sm-2 {width: 16.66%;}
                .col-sm-3 {width: 25%;}
                .col-sm-4 {width: 33.33%;}
                .col-sm-5 {width: 41.66%;}
                .col-sm-6 {width: 50%;}
                .col-sm-7 {width: 58.33%;}
                .col-sm-8 {width: 66.66%;}
                .col-sm-9 {width: 75%;}
                .col-sm-10 {width: 83.33%;}
                .col-sm-11 {width: 91.66%;}
                .col-sm-12 {width: 100%;}
                "))
),

I think I want to stop the container-fluid from stacking

float: left; puts columns side by side. float: none; makes columns render one below the other.

aha! just adding float: left; like this seems to fix it

                .col-sm-1 {width: 8.33%; float: left;}
                .col-sm-2 {width: 16.66%; float: left;}
                .col-sm-3 {width: 25%; float: left;}
                .col-sm-4 {width: 33.33%; float: left;}
                .col-sm-5 {width: 41.66%; float: left;}
                .col-sm-6 {width: 50%;  float: left;}
                .col-sm-7 {width: 58.33%; float: left;}
                .col-sm-8 {width: 66.66%; float: left;}
                .col-sm-9 {width: 75%; float: left;}
                .col-sm-10 {width: 83.33%; float: left;}
                .col-sm-11 {width: 91.66%; float: left;}
                .col-sm-12 {width: 100%; float: left;}

bits taken out that didn't seem to do anything

                box-sizing: border-box;

                .row::after {
                content: '';
                  clear: both;
                  display: block;
                }

7/6/16 IVCC & Avecnet meetings in Manchester

Showed my phone model to :
Justin McBeath, Bayer
Mark Hoppe, Syngenta
Ellie Sherrard-Smith, Imperial
see notes in the yellow notebook

10/6/16 ~ work out why warnings about fitness being too high or low are generated.

under what inputs does it happen :

a <- setExposure(exposure=0.5, insecticideUsed='insecticide1')

i1 <- setInputOneScenario( max_gen = 500, h.RS1_A0 = 0.5, h.RS2_0B = 0.5, a = a, phi.SS1_A0 = 0.3, phi.SS2_0B = 0.5, s.RR1_A0 = 0.5, s.RR2_0B = 0.5 )

listOut <- runModel2( i1 )

Warning messages:
1: In runModel2(i1) : 1 locus fitness values (Wloci) are >1 : 1.2
2: In runModel2(i1) : 7 niche fitness values (Wniche) are >1
3: In runModel2(i1) : 6 individual fitness values (Wloci) are >1

This is one effectiveness set to 0.3, all other inputs at 0.5.

effectiveness   locus fitness val (Wloci)
0.3             1.2
0.4             1.1

occurs around line 240 in runModel2()

The problem is for RR for the insecticide with lower effectiveness :

Wloci
     exposure
loci  no lo   hi
  SS1  1  1 0.60
  RS1  1  1 0.85
  RR1  1  1 1.10
  SS2  1  1 0.50
  RS2  1  1 0.75
  RR2  1  1 1.00

problem is specifically with this bit of code :

  for( exposure in c('lo','hi') )
  {
    ...

    Wloci[ paste0('RR',locusNum), exposure] <- (1 - phi[locusNum, exposure]) + 
                                               (s[locusNum, exposure])
  }

fitness of RR = (1-effectiveness) + selection coefficient

fitness = (1-0.4) + 0.5 = 1.1

therefore generates warning about being > 1
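
A small worked sketch of the calculation above, assuming the locus fitnesses follow the pattern quoted in these notes (SS = 1 - effectiveness, RS adds dominance times the selection coefficient, RR adds the full selection coefficient). `locus_fitness()` is a hypothetical name for illustration, not a function in the package.

```r
# Locus fitness under high insecticide exposure.
# phi = effectiveness, h = dominance of resistance, s = selection coefficient.
locus_fitness <- function(phi, h, s) {
  w_ss <- 1 - phi
  c(SS = w_ss, RS = w_ss + h * s, RR = w_ss + s)
}

# effectiveness 0.4 with dominance and selection coefficient at 0.5
w <- locus_fitness(phi = 0.4, h = 0.5, s = 0.5)
print(w)              # SS 0.60, RS 0.85, RR 1.10

# which genotypes would trigger the > 1 warning
names(w)[w > 1]       # "RR"
```

Only RR goes above 1 here, which matches the hi exposure column for locus 1 in the Wloci table.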

Is this an issue with something that used to be normalised to 1 ? Where was this in Beth's code : runModel() ?

# high levels of insecticide A
W.SS1_A0 <- 1 - phi.SS1_A0
W.RS1_A0 <- W.SS1_A0 + (h.RS1_A0 * s.RR1_A0)
W.RR1_A0 <- W.SS1_A0 + s.RR1_A0

This is RR in Table2 of the manuscript.

at line 206, doesn't seem that there was normalisation afterwards, maybe we should have ?

What do i think should happen ?

From the paper : The second step is to define the fitness of the different genotypes (Table 2) ; these are all scaled relative to the fully sensitive (wildtype number of offspring left by the genotype and may be reduced by decreased viability) mosquito unexposed to insecticide (as in a previous single-locus model of resistance evolution [11]) whose fitness is denoted 1.

Hi Ian,

Trust you had a good time in Oxford.

The IVCC & AvecNet meetings were really good, I think both Matt & I got a lot out of them. Mixtures came up a lot.

Quick thing I've noticed, that I thought I better write down before I forget.

If we do a model run with effectiveness set to 0.4 & selection coefficient set to 0.5, the model generates warnings about fitnesses being greater than 1. This is because the initial locus fitness of RR ends up > 1.

locus fitness of RR = (1-effectiveness) + selection coefficient

fitness = (1-0.4) + 0.5 = 1.1

This seems to be as specified in the RR row of Table 2 in the manuscript.

What should we do about this ?
a) normalise to stop this happening
b) stop the user running for such param combinations
c) modify the locus fitness calculation

No rush, we can look at next week.

Have a good weekend, Andy

Reply from Ian : Hi Andy

Please don’t scare me like this!!

I assume the problem came up in your stand-alone models where users can input their own values.

If it came up in the sensitivity analysis for the paper we are in deep sh**. If you look at table 4 and at the text on page 12, you will see that we calculate a value phi that would generate a RR fitness of 1.0 and our sensitivity analysis is then 0.2->1 times phi; this is to prevent any RR fitness going above 1. Please, please tell me this was implemented in the sensitivity analysis!

Assuming the problem only arises in user-defined runs I would print them out a warning explaining that fitness of RR will exceed 1, and maybe provide them with the maximum value it can take i.e. calculate phi for them

Incidentally I have altered the description of the sensitivity analysis on page 12 and tracked changes so you can see. We described the sensitivity analysis in terms of varying fitness of SS (0 to 0.4) but the Tables described it in terms of varying insecticide effectiveness (0.6 to 1); they are the same thing but I tried to make things consistent.

Glad you enjoyed the IVCC and Avecnet meetings.

All the best, Ian
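
A rough sketch of the warning Ian suggests for user-defined runs. It assumes RR fitness is (1 - effectiveness) + selection coefficient as above, so the largest selection coefficient that keeps RR fitness at or below 1 equals the effectiveness. `check_rr_fitness()` and its message wording are hypothetical, not what the package does.

```r
# Hypothetical check: warn when RR fitness would exceed 1 and report the
# largest selection coefficient that avoids it (equal to phi under this formula).
check_rr_fitness <- function(phi, s, tol = 1e-9) {
  w_rr <- (1 - phi) + s
  if (w_rr > 1 + tol) {
    warning(sprintf(
      "fitness of RR will be %.2f (> 1); with effectiveness %.2f the selection coefficient can be at most %.2f",
      w_rr, phi, phi))
  }
  invisible(w_rr)
}

# RR fitness exactly 1, so no warning
check_rr_fitness(phi = 0.5, s = 0.5)

# RR fitness 1.1, capture the warning text rather than letting it print
msg <- tryCatch(check_rr_fitness(phi = 0.4, s = 0.5),
                warning = function(w) conditionMessage(w))
print(msg)
```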

13/6/16

Main bit to check from Ians message :

If you look at table 4 and at the text on page 12, you will see that we calculate a value phi that would generate a RR fitness of 1.0 and our sensitivity analysis is then 0.2->1 times phi; this is to prevent any RR fitness going above 1. Please, please tell me this was implemented in the sensitivity analysis!

The text on p12 says : • v) Selective advantage of the RR genotypes (the six “s” coefficients in Table 2). The SS genotype in the absence of insecticide is assumed to have the maximum possible fitness in the population and is assigned the reference fitness value of 1 (see Table 2). High values of selective advantage could enable the fitness of RR genotypes to exceed 1, so this prevented by setting the maximum value of [selective advantage] to be 1-effectiveness and Table 2 shows that [... some greek].

I find it tricky that we have 3 variable naming schemes, greek, english greek & my easier english. The greek doesn't copy & paste into here.

What actually happens in the code :

#andy 5/2/16 reducing to avoid fitness error
phi.SS1_A0 <- runif(1, min=0.45, max=1)
phi.SS2_0B <- runif(1, min=0.45, max=1)


## dominance of resistance
h.RS1_A0 <- runif(1, min=0, max=1)
h.RS2_0B <- runif(1, min=0, max=1)    
#to try to get very different (& some very low, values of dominance)
#h.RS1_A0 <- 10^-(runif(1, min=0, max=5))
#h.RS2_0B <- 10^-(runif(1, min=0, max=5))

## selective advantage of resistance
#s.RR1_A0 <- runif(1, min=0.2, max=1)

# s.RR2_0B <- runif(1, min=0.2, max=1)
# Ian suggested this should be dependent on phi to ensure fitness of RR stays below 1
# 5/2/16 i'm concerned this adds in a correlation between inputs that might fudge our ability to see what's going on
# try going back to above
#s.RR1_A0 <- runif(1, min=0.2, max=1) * phi.SS1_A0
#s.RR2_0B <- runif(1, min=0.2, max=1) * phi.SS2_0B
s.RR1_A0 <- runif(1, min=0.1, max=0.45)
s.RR2_0B <- runif(1, min=0.1, max=0.45)

So the description in the manuscript is not correct. From 5/2/16 it was changed to :

Effectiveness 0.45-1
Selection coefficient 0.1-0.45

(to avoid relationship between them which was confounding the sensitivity analysis)

I had sent Ian an email about that on 8/2/16

I sent an email to Ian to this effect, he was confused.

In the MS where it says : High values of selective advantage could enable the fitness of RR genotypes to exceed 1, so this prevented by setting the maximum value of [selective advantage] to be 1-effectiveness and Table 2 shows that [... some greek].

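In the sensitivity analysis the maximum value for selection coefficient (0.45) was instead set to be no greater than the minimum value for effectiveness (0.45), which keeps RR fitness = (1-effectiveness) + selection coefficient at or below 1.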

Skype with Ian. He is concerned that we don't cover the full range of potential selection coefficients. We do 0.1-0.45 whereas it could go up to 1. He will work on describing it in the paper and then we can decide whether we need to rerun the sensitivity analysis.

Suggestion that we may want to do a 'restoration coefficient' which is the value by which we multiply effectiveness.

~ we set fitness cost of resistance to 0 in sensi analysis. Ian said that was OK as we can't look at everything; a referee may ask us to look at it.

fitness cost of resistance allele in no insecticide : z

I don't think we explicitly say this in the MS, I mentioned to Ian.
Described on p7 of MS Fitness costs of resistance may arise in mosquitoes which do not make contact with the insecticide. This reflects the possibility that the metabolic changes that enable IR may have deleterious effects on their normal metabolic function (e.g. [18, 19]). We therefore allow the option of fitness costs i.e. by setting z>0 in column “-“ of Table 1 and these costs may exhibit different levels of dominance quantified by the associated ‘h’ parameter; fitness costs can be easily ignored (i.e. z=0) if they are believed, or assumed, to be absent.
& p20 The approach here has been to construct a flexible model that will remove the need for each new researcher to re-derive and re-programme the model. The flexibility lies in how the niche exposures are defined and how previous analyses may be run as sub-sets of the whole model with different calibrations e.g. previous analyses which may have assumed fitness costs to be negligible, assumed complete dominance, and so on.

• v) A RR_restoration_coefficient (RRrc) that is varied between 0.2 and 1 and used to generate the selective advantage of the RR genotypes (the six “s” coefficients in Table 2). This approach to generating the selection coefficients is needed because the SS genotype in the absence of insecticide is assumed to have the maximum possible fitness in the population and is assigned the reference fitness value of 1 (see Table 2). High values of selection coefficients could enable the fitness of RR genotypes to exceed 1, so this is prevented by noting that the maximum value of selection coefficient that can occur and the RR fitness still be ≤1 is [... some greek] and Table 2 shows that [... some greek], so the selection coefficients of the RR genotypes are sampled as RRrc*ϕ. This makes intuitive sense. The insecticide effectiveness, ϕ, defines the proportion of SS genotypes killed by the insecticide while the RRrc measures the ability of the RR genotype to overcome this killing and restore insect viability. We use the RRrc values in the sensitivity analysis rather than selection coefficient as the latter is correlated with insecticide effectiveness.

Ian's suggestion of using an RR_restoration_coefficient instead of a selection coefficient. Effectively it will be generated randomly (0.2-1) and then multiplied by the effectiveness to get the selection coefficient.
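
A minimal sketch of that sampling scheme, to convince myself it keeps RR fitness at or below 1. The effectiveness range of 0.3 to 1 is an assumption taken from the range mentioned later in these notes, the seed and number of samples are arbitrary, and the variable names are for illustration only.

```r
set.seed(1)   # arbitrary seed so the illustration is repeatable
n <- 1000

# effectiveness range is an assumption for this sketch
phi <- runif(n, min = 0.3, max = 1)
# restoration coefficient sampled independently of effectiveness
rr_restoration <- runif(n, min = 0.2, max = 1)

# selection coefficient derived from the two independent inputs
s <- rr_restoration * phi

# RR fitness = (1 - phi) + s = 1 - phi * (1 - rr_restoration), so never above 1
w_rr <- (1 - phi) + s
print(max(w_rr))

# s is correlated with effectiveness, which is why rr_restoration
# rather than s goes into the sensitivity analysis
print(cor(phi, s))
print(cor(phi, rr_restoration))
```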

14/6/16 emails from Ian

Yes, I think we need to re-run it. Assuming we go for something like PLOS Computational Biology then I think its important the model be internally consistent and that we allow high selection coefficients.

Happy to extend the parameter range of effectiveness given its importance.

I’ll leave it to you to name the new input parameter. I call it RRrc in the text (page 12) but happy to change this for something snappier.

In the original description of the policies we defined a Useful Operational Lifespan (UOL) for the policies (invariably the time until resistance allele frequency reached 50% at both loci) and when I went through the discussion again it seemed much more succinct to use the term i.e. UOL. It also made a clear distinction, and avoided potential confusion, between time to resistance for one locus and time for resistance at both loci. Consequently, I have suggested some of the labels embedded in the figures need to be updated to read UOL. I noted this in their captions so it would be a good time to incorporate/discuss/reject these suggestions.

If you have time, it would also be beneficial if you spend a day working on the ms, checking for errors, suggesting ways of clarifying the text, checking for consistent terminology and so on. I have read and edited it so many times recently that I lack the mental will power to do it again in the near future and it would benefit from a fresh look. Once we have incorporated the new figures we can then send it to Beth for a (final?) check.

I said I didn't like UOL

No strong feelings on “UOL” or “time to resistance”, so long as the text is clear and unambiguous. If we no longer use UOL we need to remove its definition from the text.

I hate quibbling about nomenclature but I think “resist_advantage_I1” can be misinterpreted. Even if its value is 1 then the (fitness) advantage may be much less because it is multiplied by its effectiveness. Also it is not clear whether resist_advantage is for the RS or RR genotype. It could also be confused with selective advantage.

I suggest we make the textual explanation (page 12) as clear as possible and design labels around that. I deliberately chose the word “restoration” as it makes it obvious, at least to me, that the parameter quantifies the extent to which the RR genotype restores the mosquitoes’ viability that is lost by the SS genotype after insecticide contact.

Happy to hear suggestions for words other than “restoration”.

Here are the rr_restoration changes that I made :

RR_restoration_coefficient or rr_restoration_ins1 & rr_restoration_ins2

What I changed : search 14/6/16 to see changes

sensiAnPaperPart.r

done ~ add generation of new inputs & multiplication to give s
done ~ save new inputs to input file to enable post-run analysis

setInputOneScenario.r

done ~ add new inputs as args
done ~ add doc for new args
done ~ add check that s & rr inputs are consistent

runModel2.r

i think doesn't need to be altered, it will just use the longer input object & output that for post-run analysis

sensiAnPaper1All.Rmd

~ copied needed bits into paper1_results_figs_rr.Rmd so not needed anymore

paper1_results_figs_rr.Rmd copied from paper1_results_figs.Rmd

done ~ change loading of model outputs
done ~ replace 's*' with rr_restoration_ins1 in this line
treePredictors <- c('P_1','P_2','exposure','phi.SS1_A0','phi.SS2_0B','h.RS1_A0','h.RS2_0B','s.RR1_A0','s.RR2_0B')
treePredictors <- c('P_1','P_2','exposure','phi.SS1_A0','phi.SS2_0B','h.RS1_A0','h.RS2_0B','rr_restoration_ins1','rr_restoration_ins2')
done ~ comment out these lines
rownames(treeInput)[rownames(treeInput)=="s.RR1_A0"] <- "selection_coef_allele1"
rownames(treeInput)[rownames(treeInput)=="s.RR2_0B"] <- "selection_coef_allele2"
done ~ modify curtisInputs to include r rather than s
# 14/6/16 rr_restoration = s / effectiveness
#"selection_coef_allele1"=0.23,
#"selection_coef_allele2"=0.43,
"rr_restoration_ins1"=0.23/0.73,
"rr_restoration_ins2"=0.43/1,

paper1_results_figs_slimmed50_rr.Rmd

done ~ modified same as previous to create new rr polished figs

Beware that although I may pass rr_restoration_ins1 to runModel2() it will not be used, it will just go in the inputs to enable post run analysis. s.RR1_A0 will be used by the model. I put a check into setInputOneScenario.r that s=newR * effectiveness.
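
A sketch of what such a consistency check could look like, using the Curtis fig 2 values for insecticide 1 from these notes (s = 0.23, effectiveness = 0.73). `inputs_consistent()` and its tolerance are hypothetical, the real check in setInputOneScenario.r may be written differently.

```r
# Hypothetical stand-in for the consistency check: s should equal
# rr_restoration * effectiveness to within a small tolerance.
inputs_consistent <- function(s, rr_restoration, phi, tol = 1e-8) {
  isTRUE(all.equal(s, rr_restoration * phi, tolerance = tol))
}

# consistent: rr_restoration was derived as s / effectiveness
inputs_consistent(s = 0.23, rr_restoration = 0.23 / 0.73, phi = 0.73)   # TRUE

# inconsistent: 0.5 * 0.73 = 0.365, which is not 0.43
inputs_consistent(s = 0.43, rr_restoration = 0.5, phi = 0.73)           # FALSE
```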

15/6/16 modifying to rr_restoration. Go with that as the variable name for now, we can always change later by find/replace.

#default setting of the random seed in sensiAnPaperPart() should make the input files mostly the same
#the exception is the exposure array a. The single exposure input will be the same, but the array changes
#according to setExposure( )

16/6/16

done ~ modify runcurtis_f2.r to accept rr_restoration inputs
done ~ get a new UI with RR inputs working

coolio have it working in resistmob2 and it's good being able to explore full range of selective advantage, even setting it to 0 does what I would expect. Reducing selective advantage reduces the rise of resistance in one insecticide, which leads to it being protected by the other in a mixture.

cool contrast between these scenarios modifying effectiveness and rr_restoration

in which all defaults set to 0.5 except

AA effectiveness I1 to 0.8 causes mixture to be slower, although I1 is faster it protects I2 in mixture, making mixture slower

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.8 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )

BB rr advantage I1 to 0.8 sequential remains slower, although I1 is faster as in previous it provides less protection to I2 in the mixture

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.5 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.8 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )

CC rr advantage I2 to 0.2 sequential remains slower (just), I2 slower than I1 in mixture similar to AA but receives less protection in mixture

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.5 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.2 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )

done ~ fix rr_ warnings from setInputOneScenario in paper2
done ~ finish trying to explain above scenarios #AA #BB #CC to myself, maybe put into my paper2 doc
done ~ shinyapps submit
done ~ send to ian

~ ran the 10000 simulations by code from paper1_results_figs_rr

17/6/2016

~ skyped with Matt, he has run the mosaics simulations and is initially seeing much lower correlations in PRCC and that the trees aren't working.

~ put the run datafiles onto github so Matt can get at, maybe in a separate repo, to avoid making resistance itself too big ? Matt has just set up dropbox. What are the new user limits ? 2GB so should be OK. particularly if I just share the 1000 runs.

~ should I do that directly from Git rather than from RStudio ? https://help.github.com/articles/create-a-repo/

GitHub will warn you when pushing files larger than 50 MB. You will not be allowed to push files larger than 100 MB.

How big are the 10,000 run results rdas ? ~ 500MB
1000 runs ~ 50 MB

~ modified and ran slimmed paper figs doc

Figures are much the same with effectiveness & exposure being key.

email from Ian :
One other thing for your “to do” list. It concerns the arbitrary line you drew through the points to decide whether or not mixtures are better i.e. Figure 8 and 9. The standard way of assessing the quality of this “diagnostic” is through “sensitivity” and “specificity” (I assume you are familiar with these, if not the internet will provide a better explanation than anything I could do) e.g.

https://en.wikipedia.org/wiki/Sensitivity_and_specificity

Can you therefore make a 2x2 table with rows the “true” result i.e. mixture better, sequential better. Columns are the “predicted” result i.e. mixture better, sequential better. I can then calculate the sensitivity and specificity. Actually they are such widely used metrics that R probably has a command to draw the table and calculate the results. Given that Plos Computational Biology is a likely destination, the more calculations we can pack in, the better and it’s a small task that may significantly improve our chances of acceptance.
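A minimal sketch of the 2x2 table Ian asks for, using base R table(). The two logical vectors are made-up example data and their names (true_mix_better, pred_mix_better) are hypothetical, not objects from the resistance package.

```r
# Made-up example data: TRUE = mixture better, FALSE = sequential better
true_mix_better <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
pred_mix_better <- c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)

# rows = "true" result, columns = "predicted" result
conf_tab <- table(
  true      = factor(true_mix_better, levels = c(TRUE, FALSE), labels = c("mix", "seq")),
  predicted = factor(pred_mix_better, levels = c(TRUE, FALSE), labels = c("mix", "seq"))
)
print(conf_tab)

# sensitivity: proportion of truly mixture-better runs predicted as mixture-better
sensitivity <- conf_tab["mix", "mix"] / sum(conf_tab["mix", ])
# specificity: proportion of truly sequential-better runs predicted as sequential-better
specificity <- conf_tab["seq", "seq"] / sum(conf_tab["seq", ])
```

Here "mixture better" is treated as the positive class; swapping the classes swaps the two measures.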

20/6/2016

~ added sensitivity/specificity calculation for Ian
~ removed 'mix either' from one of figs in paper1_results_figs_slimmed_50_rr Fig x3b which shows how time-to-resistance changes for diff strategies for effectiveness. The pattern that effectiveness causes which is best to change is still there, but is tricky to see.
~ added gridlines back into PRCC plots

21/6/2016

~ sent this link to Matt : http://kbroman.org/github_tutorial/
this from there is good : http://kbroman.org/github_tutorial/pages/fork.html

reading & editing the current MS

~ something weird going on. In the new figs. PRCC Fig x2. effectiveness now has a -ve correlation with time-to-resistance for mix & adaptive, where in the previous analysis it was +ve. Could this be because I have extended the range of effectivenesses we are looking at ? Fig. x3B shows this. Previously we looked at effectiveness from 0.45 to 1, in this param range effectiveness had a +ve effect on time-to-resistance in mixtures. Whereas over the larger range it has an overall -ve effect.

THIS IS KEY : see my figx3B (to replace MS fig5).

1 For sequential :
1.1 increasing effectiveness decreases time to resistance
2 For mixtures (including adaptive)
2.1 at effectiveness below 0.7, increasing effectiveness also decreases time-to-resistance
2.2 at effectiveness above 0.7, increasing effectiveness increases time-to-resistance

The PRCC doesn't pick 2.2 up because over the whole range of effectiveness (0.3 to 1) effectiveness has a negative correlation with time-to-resistance.
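A small illustration of why a rank correlation over the whole range can hide 2.2. This is a sketch only: it uses plain Spearman correlation as a simplified stand-in for PRCC, and the time-to-resistance curve is a made-up shape (steep decline below 0.7, shallower rise above), not model output.

```r
effectiveness <- seq(0.3, 1, by = 0.01)

# made-up response: falls steeply up to 0.7, then rises more gently
time_to_resistance <- ifelse(effectiveness < 0.7,
                             300 - 500 * (effectiveness - 0.3),
                             100 + 200 * (effectiveness - 0.7))

# rank correlation over the whole range (0.3 to 1)
cor_all <- cor(effectiveness, time_to_resistance, method = "spearman")

# rank correlation restricted to effectiveness above 0.7
high <- effectiveness > 0.7
cor_high <- cor(effectiveness[high], time_to_resistance[high], method = "spearman")

print(c(whole_range = cor_all, above_0.7 = cor_high))
```

The whole-range value comes out negative while the value above 0.7 is positive, which is the pattern described in the notes above.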

Thus we may want to
~ tweak the bits of the results section that refer to the PRCC fig to refer to the trend figs instead to make the story clearer
~ put the fig5 (trends) before fig4 (PRCC) in the MS.

There is still a nice story. The PRCC of each strategy shows (fig x2, 4) that effectiveness has less of a -ve effect on mixtures than sequential. The PRCC of mix-seq (fig x4) shows a very high positive correlation of effectiveness. So increasing effectiveness favours mixtures over sequences across the whole effectiveness range. (This would be even more marked if looking just at effectiveness above 0.7.)

22/6/2016

done ~ should we use strategy or policy for sequential versus mixture ? I prefer strategy.
done ~ in MS change numbers in linear models
done ~ in violin plot change adaptive to adaptive both

this is the paper we should cite for benefits of open-source : best practices for scientific computing http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001745

~ check through email for our discussion about which journal to submit to. From March 11th, I wrote : I had a quick look at PLOS Comp Biol, they do offer software papers but have a word limit of 3,500 (and other constraints). The manuscript is currently 9,300. My thought was that the Malaria Journal might get this to more of our target audience ?

Other options on where to submit :
Barbosa 2012 : Malaria journal
Birget & Koella 2015 : Evolution, Medicine, and Public Health

Looking at % exclusion, see percentExcluded in figs slimmed rr. Seems it can go up to ~30%. eeek

23/6/2016

checking on runs where resistance threshold not reached.

added plots.

seems that removing start freqs < 0.01 would remove most of the problem ... but would we lose part of our story then ?

also low exposure is a large contributor to failed runs

setting exposure to run from 0.25 or 0.3 rather than 0.1 would remove > half of failed runs, and I don't think it would change results markedly.

remember that
dif_mix2_seq : used for PRCC & the effectiveness/exposure plot (30% excluded)
ggInsOuts : used for indiv input plots (17% excluded, across I1,I2,mix,seq etc)

Can I get the 30% down for dif_mix2_seq by subsetting input ranges ??

#23/6/2016 experimenting to see if can reduce the % excluded by tweaking input ranges
dif_mix2_seq2 <- dif_mix2_seq[ dif_mix2_seq$start_freq_allele1 > 0.005, ]

down to 20% exclusion

Good chat with Ian :
~ what to do with runs that don't reach resistance threshold within 500 generations
Ian: make it clear that 30% of runs don't reach resistance threshold within 500 generations, therefore not relevant operationally. Can say that changing the input ranges would make this fewer.
~ text replacement for UOL
Ian: yes, fine to go with time to resistance
~ can I edit down 2 decision tree paragraphs to 1 ?
Ian: yes
~ where to submit
A PLOS comp biol research paper is OK. Ian is aiming for impact & generality for the REF. Malaria journal is good but has low generality. Ian will canvass Hillary & David.
~ is there a reason why he isn't 1st author
Ian: he didn't imagine it would take this much work. I said I would vote for him being first author if I could.

~ I expressed my concern about trickiness in the code and how this could trip us up in future (as it has already). I think to be safe and efficient we need to spend some time rationalising it. With time & resources tight this is tricky. Could IVCC help to fund (e.g. could we pitch it to them as the step to 3 insecticides & macro-mosaics).

Ian said he understood. He said once we have the paper out we can go to IVCC to ask for more money particularly to do 3 insecticides.

I said I would get edited paper to him tomorrow.

~ quick look at low effectiveness runs. Can they make it to thresholds ? yes

effectiveness2 : These produce identical results as you would expect (effectiveness1 & 2 set to 0.9 in each respectively)
A: runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.5 , phi.SS2_0B = 0.9 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )
B: runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.9 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )

~ effectiveness2
~~ seems that effectiveness2 doesn't quite have the humped appearance for mixtures that effectiveness1 does. Why could that be ? Effectiveness I2 does not lead to an increase in time-to-resistance above 0.7, why is that the case ? I could check in UI version
~~ in ui version effectiveness1&2 seem to have same effect
~~ therefore could it be to do with the excluded runs ? the below in paper1_results_figs_rr seems to suggest the diff remains even when most excluded runs are subsetted out

Fig. xx2 Quick check whether difference between effectiveness 1 & 2 disappears when minimising excluded runs

~~ seems it could be due to a few points at 999 that are still in the fig and are dragging the average up ?
~~ no apparently not.
~~ it could just be due to chance, doesn't look like there are a huge number of points, and it wouldn't take many to make a difference
~~ this suggests that my line plots can be potentially misleading, maybe I should put the points back in for the paper
~~ perhaps the line plots are encouraging me to look for patterns that aren't really there ?
~~ for the paper I could do a plot of effectiveness1+2 ? yes
~~ interestingly adding the effectivenesses shows that at very high effectivenesses of both insecticides resistance arises slower for mix either than sequential !!

~ what's the best terminology for time-to-resistance ? What do other authors use ?
Birget & Koella 2015 (A genetic model of the effects of ITN on the evolution of insecticide resistance)
time to evolve resistance
generations to reach 50% resistance (in fig4 legend)
Barbosa 2012
spread of resistance
rate of change of resistance

24/6/2016 editing the MS
done ~ edit text on decision trees
done ~ edit out UOL and replace with time to resistance

27/6/2016

~ do some find & replace changes to text labels (& maybe to variable names)

don't use I1 I2, because it looks like L1,2 in some fonts

correctMixDeployProp to correct_mix_deploy
maleExposureProp to male_exposure_prop
effectiveness_insecticide1 to effectiveness_ins1
effectiveness_insecticide2 to effectiveness_ins2
rr_advantage to rr_restoration

eeek I had to add these in to correct a problem that inputs from the previous runs had different names

rownames(treeInput)[ "maleExposureProp" == rownames(treeInput) ] <- "male_exposure_prop"
rownames(treeInput)[ "correctMixDeployProp" == rownames(treeInput) ] <- "correct_mix_deploy"
rownames(treeInput)[ "rr_advantage_I1" == rownames(treeInput) ] <- "rr_restoration_ins1"
rownames(treeInput)[ "rr_advantage_I2" == rownames(treeInput) ] <- "rr_restoration_ins2"
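The same renaming could be done from a single lookup vector, which is easier to extend if more input names change. A sketch only: the treeInput built here is a small stand-in matrix, not the real object, and the name `renames` is hypothetical.

```r
# Stand-in for treeInput: a matrix whose row names are input names from old runs
treeInput <- matrix(0, nrow = 5, ncol = 2,
                    dimnames = list(c("maleExposureProp", "correctMixDeployProp",
                                      "rr_advantage_I1", "rr_advantage_I2",
                                      "exposure"), NULL))

# lookup: old name = "new name"
renames <- c(maleExposureProp     = "male_exposure_prop",
             correctMixDeployProp = "correct_mix_deploy",
             rr_advantage_I1      = "rr_restoration_ins1",
             rr_advantage_I2      = "rr_restoration_ins2")

old <- rownames(treeInput)
hit <- old %in% names(renames)          # rows that have an old-style name
rownames(treeInput)[hit] <- unname(renames[old[hit]])

print(rownames(treeInput))
```

Row names that are not in the lookup (here "exposure") are left untouched.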

done ~ edit PRCC figs to remove repeated x axis labels
done ~ edit text about runs not reaching resistance
done ~ edit results text
done ~ edit text on rr_restoration
done ~ add few sentences to abstract
done ~ check the SI
done ~ copy & paste in new fig versions

archived my changes to : 2016_06_27 IR model manuscript plus SI.docx

28/6/16 ~ work out github advice for Matt, maybe test it first.

Yes sent see : "C:\Dropbox\R docs\git notes andy.txt"

based on these, Stage 3 from here : (it's a little tricky)
https://help.github.com/articles/fork-a-repo/
http://kbroman.org/github_tutorial/pages/fork.html

Instructions for creating a fork of the resistance repository, being able to add your own updates, and to receive updates from me.

https://github.com/AndySouth/resistance

1) click fork will create : https://github.com/Mapowney/resistance

2) click clone to get the URL to use in git clone

from new local folder

git clone https://github.com/Mapowney/resistance.git

3) set up syncing
cd resistance
git remote add upstream https://github.com/AndySouth/resistance.git

to view remote repos

git remote -v

4) SYNC FORK from upstream, to get updates from me
git fetch upstream
git checkout master
git merge upstream/master

5) COMMIT & PUSH local changes to the fork, to put your own updates on github (this step can be done from RStudio UI)
git commit -a -m "message"
git push

I suggest you initially put all your code in a master folder called e.g. matt/

putting figs side by side in paper2

29/6/16 briefly looking at mosaics versus mixtures

interestingly in this example where all inputs set to 0.5, reducing correct mix to 0 (i.e. a mosaic) does better than where correct mix is 1.
A: runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.5 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, correct_mix_deploy = 1 , strategyLabels = c('seq','','adapt','mix2') )
B: runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.5 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, correct_mix_deploy = 0 , strategyLabels = c('seq','','adapt','mix2') )

In this one mixture does better than mosaic :
A: runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.5 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, correct_mix_deploy = 1 , strategyLabels = c('seq','','adapt','mix2') )
B: runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.5 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, correct_mix_deploy = 0 , strategyLabels = c('seq','','adapt','mix2') )

My potential simple(ish) explanation of what is going on with mosaics. A mosaic is more like a sequence than a mixture. If inputs are set so that a sequence is better, then a mosaic will be better too.

Resistance tends to increase more slowly in a mosaic than a mixture because there is less selection for resistance. However is there a chance that the reduced application of one insecticide will reduce its protection of the other ? and thus lead to potentially faster resistance ?

but don't want to get too caught up in this right now.

30/6/16

checking on mechanisms in paper2

increasing exposure favours sequences e.g.
effect1 0.8, expose 0.5 -> mix better
effect1 0.8, expose 0.8 -> seq better
effect1 0.5, expose 0.5 -> seq better
effect1 0.5, expose 0.8 -> seq more better

that's good, I have that clearer in my head now.

Here is one particular example where increasing the difference in starting frequencies leads to a switching from sequential being preferred to a mixture being preferred. I may not put this into paper2. Increasing exposure reduces the difference.

A: runcurtis_f2( max_gen=500, P_1 = 0.001 , P_2 = 0.001 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.6 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )
B: runcurtis_f2( max_gen=500, P_1 = 0.1 , P_2 = 0.001 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.6 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )

added starting frequencies to paper2

added vlines to resistance plots
updated UIs and published : resistmob_mosaic

Justin Mcbeath (Bayer) emailed me. Video of a talk of his. https://vimeo.com/132308643

1/7/16

skype Matt about github etc.

Talking about mosaics he said that it makes intuitive sense that effectiveness is not so important for mosaics because the 2 insecticides are not present in the same place, thus the effectiveness of one will not influence its protection of the other. But my thought was: would a micro-mosaic be expected to be much different from a mixture ? E.g. would a mosquito that has been exposed to one insecticide in one hut then be expected to be exposed to the other in another hut ? (in which case I would expect effectiveness to continue to be important). Or even that the offspring of a mosquito exposed to one insecticide could be exposed to the other one.

4/7/16

query from Ian :
I am worried about the new classification tree i.e. Figure 7a.

Firstly, effectiveness of insecticide 1 and 2 have the same parameterisation so I'm not sure why one should be favoured over the other as a decision criterion. It may just be luck that #2 was chosen but if the tree is that unstable then we would expect a warning.

Second, I don't understand the logic. In the last version, mixtures were favoured if either effectiveness was >~70%. The current tree suggests mixtures are un-favoured if effectiveness of #2 is less than 0.67 even if effectiveness of insecticide 1 is 100%. Also 7(B) is as before i.e. mixtures are only favoured if both effectivenesses are relatively high.

first check whether we get any warnings
set warning=TRUE for Fig x7 in slimmed. Didn't give any warnings.
then maybe look at the unpruned tree.

What percentage of runs had no IR after 500 generations? (I remain a bit worried about runs where one strategy lasted e.g. 450 gens and the other >500; we are not specific and fudge this at the moment)

Among those where IR did spread in <500 gens, what proportion favoured mixtures and what proportion favoured sequential?

I figured I could extract the latter from the decision trees which give the number favouring each strategy. They add up to 10,000. Does that mean we discounted runs where IR did not spread in <500gens and kept going until we got to 10,000???

eek does that mean I haven't excluded the non-resistance runs from the classification trees ?

the classification trees use this data to start in figs_slimmed :

resistBetterMix_ASeqBoolean <- resistPointsMix_A > resistPointsSeq

I need to check which of these are >= 500 & exclude

resistPointsMix_A$gen_cP0.5
resistPointsSeq$gen_cP0.5
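A sketch of the exclusion step described above: drop runs where either strategy fails to reach the threshold within max_gen before building the boolean used for the trees. The vectors here are made-up and their names (gens_mix, gens_seq) are hypothetical; the 999 "not reached" marker follows the earlier note about points at 999.

```r
max_gen <- 500

# made-up generations-to-50%-resistance for 6 runs; 999 = threshold not reached
gens_mix <- c(120, 480, 999, 300, 999, 90)
gens_seq <- c(150, 999, 400, 250, 999, 110)

# keep only runs where BOTH strategies reach the threshold within max_gen
reached <- gens_mix < max_gen & gens_seq < max_gen
percent_excluded <- 100 * mean(!reached)

# TRUE where resistance takes longer to arise under the mixture
mix_better <- gens_mix[reached] > gens_seq[reached]

print(percent_excluded)
print(table(mix_better))
```

Whatever exclusion rule is used here would need to match the one applied to the ggplot data so the trees and the figures describe the same set of runs.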

Is the exclusion process the same as for the ggplot data

eek2 seems like I was presenting the trees for 25%, now corrected to 50%

treeResponses <- c( rownames(resistBetterMix_ASeq)[2], #2 is 25%
                    rownames(resistBetterMix_ASeq20)[2] )

I sent Ian new results where runs were corrected.

From Ian : 4 Jul

I am still bemused as to why Figure 7a decision tree should have only one parameter. Any thoughts?

I have used the data on the decision tree to infer that 24% of the runs did not reach IR >50% in <500 generations. Among those that did, 44% favoured mixtures and 56% favoured sequential.

I asked Matt to look at the pruning in the decision trees.

11/7/2016

Matt will do the new linkage calc I can have

Ian said : Gould (1986) uses mean fitness as an indicator of mosquito abundance, we could look at producing an output like that.

Ian talked about macro-mosaics, where gene-flow (<1) happens every generation. Current micro-mosaics are a special case of this where gene-flow=1.

Models need to run along in parallel

6 subpops. Ian thinks this should be straightforward. (more so than the 3 insecticide mixtures which require re-doing the genetic tables.)

Ian said he will spend some time next week or after to submit paper.
If it goes to 3 reviewers it might take 6 weeks to come back.

Ian doesn't think IVCC ready to pull out checkbook again yet, they probably get tired of people asking them for cash.

Scoping out 1) 6 subpop macro-mosaics

~ this is how the outputs are created
~ to do multiple interconnected ones I could create a list of multiple instances of this existing list of lists
~ listOut <- list( results=list(), fitness=list(), genotype=list(), input=input )
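One way the "list of multiple instances" could look for 6 subpopulations. A sketch only: new_subpop() and subpops are hypothetical names, and `input` here is a placeholder matrix rather than the real input matrix.

```r
n_subpops <- 6
input <- matrix(0, nrow = 3, ncol = 1)   # placeholder, not the real input matrix

# one instance of the existing output structure
new_subpop <- function(input) {
  list(results = list(), fitness = list(), genotype = list(), input = input)
}

# a named list holding one output structure per subpopulation
subpops <- lapply(seq_len(n_subpops), function(i) new_subpop(input))
names(subpops) <- paste0("subpop", seq_len(n_subpops))

str(subpops, max.level = 1)
```

Each subpopulation can then be addressed as e.g. subpops$subpop3$results, which keeps existing code that expects a single listOut usable on one element at a time.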

12/7/16

Ian said concentrate on insecticide resistance in my remaining 30 days, not enough time to do drug resistance and Gates has been saying they will fund when they need.

~ updated plotlinkage
~ accepted pull request from Matt with updated dprime calc

starting to look at fitness, remember it is stored here :

listOut <- list( results=list(), fitness=list(), genotype=list(), input=input )

listOut <- resistSimple(male_exposure_prop=0.1, sexLinked=1)

plotlinkage(listOut$results[[1]])

PLOS comp biol instructions http://journals.plos.org/ploscompbiol/s/submission-guidelines

~ Double spaced ~ Add line numbers (page numbers already in) ~ References in Vancouver style (http://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-references)

~ Upload a cover letter as a separate file in the online system, address the following questions:
Why is this manuscript suitable for publication in PLOS Computational Biology?
Why will your study inspire the other members of your field, and how will it drive research forward?
~ Title page : The title, authors, and affiliations should all be included on a title page as the first page of the manuscript file.
~ Methods section can go before the Results. Please clarify the reasons for including Methods before the Results in your cover letter.
~ Abstract must not exceed 300 words.

~ Author Summary : a 150–200 word non-technical summary of the work. This text is subject to editorial change, should be written in the first-person voice, and should be distinct from the scientific abstract.

to get fitness results for one run

listOut <- resistSimple()
listOut$fitness[[1]]

BUT it is not by generation :
listOut$fitness[[1]]
       -,- a,- A,- -,b -,B a,b A,B A,b a,B
SS1SS2   1   1   0   1   0   1   0   0   0
SS1RS2   1   1   0   1   1   1   0   0   1
SS1RR2   1   1   0   1   1   1   0   0   1
RS1SS2   1   1   1   1   0   1   0   1   0
RS1RS2   1   1   1   1   1   1   1   1   1
RS1RR2   1   1   1   1   1   1   1   1   1
RR1SS2   1   1   1   1   0   1   0   1   0
RR1RS2   1   1   1   1   1   1   1   1   1
RR1RR2   1   1   1   1   1   1   1   1   1

so to get mean fitness over time I would need to multiply these by the genotype frequencies & would need to sort the cis/trans business
+ would need to have the frequencies of the different exposures (I think just a case of multiplying by the input exposure array (a), which can be got from setExposure() )

, , niche2 = 0
   niche1
sex   0 a A
  m 0.1 0 0
  f 0.1 0 0

, , niche2 = b
   niche1
sex 0 a A
  m 0 0 0
  f 0 0 0

, , niche2 = B
   niche1
sex 0 a A
  m 0 0 0.9
  f 0 0 0.9

listOut$genotype[[1]]
     gen       SS1SS2       SS1RS2       SS1RR2       RS1SS2   RS1RS2_cis RS1RS2_trans       RS1RR2       RR1SS2
[1,]   1 9.960060e-01 1.994006e-03 9.980010e-07 1.994006e-03 1.996002e-06 1.996002e-06 1.998000e-09 9.980010e-07
[2,]   2 9.842249e-01 1.976290e-03 9.920810e-07 1.371651e-02 1.960623e-05 1.377115e-05 1.968432e-08 4.778956e-05
[3,]   3 9.105204e-01 1.890433e-03 9.812346e-07 8.534421e-02 1.484353e-04 8.859631e-05 1.540916e-07 1.999855e-03

Ian confirmed & then said that we probably need to calculate fitness across the whole population AND across just the exposed portion.
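A sketch of the mean fitness calculation for one generation, across the whole population and across just the exposed portion. Assumptions, all mine: the fitness table is the one printed above; exposure is 0.1 in niche "-,-" and 0.9 in niche "A,B" for both sexes (as in the exposure array above); cis and trans double heterozygotes are summed; and genotype frequencies are built from an allele frequency of 0.001 at both loci under Hardy-Weinberg with no linkage, which reproduces the generation 1 row shown above. Variable names are hypothetical.

```r
niches    <- c("-,-", "a,-", "A,-", "-,b", "-,B", "a,b", "A,B", "A,b", "a,B")
genotypes <- c("SS1SS2", "SS1RS2", "SS1RR2", "RS1SS2", "RS1RS2",
               "RS1RR2", "RR1SS2", "RR1RS2", "RR1RR2")

# fitness of each genotype (rows) in each niche (columns), copied from above
fitness <- matrix(c(1,1,0,1,0,1,0,0,0,
                    1,1,0,1,1,1,0,0,1,
                    1,1,0,1,1,1,0,0,1,
                    1,1,1,1,0,1,0,1,0,
                    1,1,1,1,1,1,1,1,1,
                    1,1,1,1,1,1,1,1,1,
                    1,1,1,1,0,1,0,1,0,
                    1,1,1,1,1,1,1,1,1,
                    1,1,1,1,1,1,1,1,1),
                  nrow = 9, byrow = TRUE, dimnames = list(genotypes, niches))

# proportion of the population in each niche
exposure <- setNames(rep(0, 9), niches)
exposure["-,-"] <- 0.1
exposure["A,B"] <- 0.9

# genotype frequencies at generation 1 (cis + trans combined)
p <- 0.001
one_locus <- c(SS = (1 - p)^2, RS = 2 * p * (1 - p), RR = p^2)
geno_freq <- setNames(as.vector(t(outer(one_locus, one_locus))), genotypes)

# mean fitness across the whole population
w_genotype <- as.vector(fitness %*% exposure)
mean_w_all <- sum(geno_freq * w_genotype)

# mean fitness across just the exposed portion (drop the unexposed niche, renormalise)
exposure_exposed <- exposure
exposure_exposed["-,-"] <- 0
exposure_exposed <- exposure_exposed / sum(exposure_exposed)
mean_w_exposed <- sum(geno_freq * as.vector(fitness %*% exposure_exposed))

print(c(all = mean_w_all, exposed = mean_w_exposed))
```

To get mean fitness over time the same two sums would be repeated for each row of the genotype output, after adding the cis and trans columns together.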

Dave:

Dave said that they are interested in rotations. Possibly annual for IRS. With maybe 2 AIs with

bednets last for 3 years

So many potential scenarios

Bednets Do we create a 2 mixture bednet

If mosaics are nearly as good as mixtures then would be much simpler

e.g. to distribute nets with different chemicals on.

When you have net & IRS is that a mixture or a mosaic (its a mixture if one fly gets exposed to )

GPIRM recommends rotations for IRS, but only reason not done yet is that no new AIs available. Once new AIs

Bayer have developed an IRS with permethrin and a new AI, they are now concerned that

Dave said they are running this side meeting at ASTMH and they would want u to be there, and they can support us to go there.

(but if Bayer offer to pay then take it from them)

3 AIs. IVCC developing 3 AIS. No-one has modelled. IVCC looking for support for their 3AI approach.

Bti used as a larvicide for mosquitoes. 5 toxins, 1 seems to be critical. Never been any field resistance to it, though it has been used loads (good paper on). Supporting arg for IVCC.

Declining doses, bednets over time the effectiveness declines. Once it has declined to a certain level it will promote resistance.

Net people say that the dose stays high. IRS may decline faster.

Ian has a BBSRC grant in to see how the effectiveness of insecticides changes over time.

Dave wants to know when mixtures are worse, how much worse are they.

Sumitomo. likely to want to say : mixtures are bad
Bayer. likely to want to say : mixtures are good

Dave said ask Bayer to fund us.

Justin is based in Lyon.

IVCC happy to give letters of support for funding applications.

Justin Mcbeath new mixture through whopes neonic + pyrethroid potential for the pyrethroid to increase rate of resistance to the new AI grenoble, jp davide & theresa, neonics in Ag

best chance of neonic remaining viable into the future (pyrethroid is their only option in term of cost & safety)

potential future role of mixtures in vector control, focusing on IRS. To discuss with PMI, NMCPS

Justin happy to support us going to ASTMH.

Justin, feeling that this topic of mixtures likely to become more important. Particularly until 2023 when the new IVCC AIs are due.

How best should mixtures fit into a resistance management program.

experimental hut trials & simple hut trials ~ 9+ countries 6-9 month trials performance against resistant & susceptible strains

next question to run

open with data being generated from trials (within constraints) online platform that will drip-feed data from trials (recognise that some people have concern contributing to something that has a commercial entity behind it, )

molly robertson from Path involved

justin will have a discussion with Dave about the meeting.

is there something that we can share with their biologists, Ian mentioned the manuscript.

Ian suggested we could ask them if they had anything

We asked them for ~10k. Justin said that given how important this is to them they will give this serious consideration. and then later I think he said something even more positive than that.

workshop < 20 people, 1.5 to 2 hours.

I said if we could have a model to use in a workshop to prompt discussion that would be good. Justin agreed.

18/7/16

Question from Matt :

I was just working on rotations and wanted your opinion on something. Off the top of my head I can’t see why rotations would be done in any other way than post simulation processing (much like the sequences). Do you agree that this may be the best way forwards? Or does something need to be written into the model for the simulations itself?

Hi Matt,

I was pleased with last week and the work we got done.

Good question about rotations. I agree that post-processing is probably the best way to go for rotations in the first instance. Although if I/we get a bit more time I think it would be safer and more efficient to have the model running strategies directly (i.e. to have something like runModel2() accepting rules about when to change insecticides).

So for the first approach I'd suggest creating a new function starting a bit like this.

findResistancePointsRotation <- function( rotation_generations = 10, listOutI1, listOutI2, criticalPoints = c(0.1,0.25,0.5) )

We'll need to be careful in how we deal with runs where the resistance thresholds aren't reached.

We'll probably want to create a plotting function based on plotcurtis_f2_generic() so we can plot what's going on.

If you want to start on the first function I'm happy to look at whenever you want. (and I'm also open to other suggestions of how to do, I've only thought about quite briefly).

cheers, Andy
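A rough sketch of how the post-processing for rotations might work. This is not the findResistancePointsRotation() function itself; the helper names and the plain frequency vectors are hypothetical stand-ins for what would be read from listOutI1 and listOutI2. It rests on a strong assumption of mine: resistance to an insecticide only changes during generations when that insecticide is in use (no fitness costs and no effect of the other insecticide), so each single-insecticide run is simply paused while the other insecticide is deployed.

```r
# first generation of exposure at which a frequency series reaches the threshold
# (NA if the threshold is never reached)
gens_to_threshold <- function(freq, critical_point) {
  hit <- which(freq >= critical_point)
  if (length(hit) == 0) NA_integer_ else hit[1]
}

# convert "generations of exposure" into the calendar generation of a rotation
# that alternates insecticides every rotation_generations, starting with I1
rotation_calendar_gen <- function(gens_exposed, rotation_generations, first = TRUE) {
  block  <- ceiling(gens_exposed / rotation_generations)   # which period of use
  within <- gens_exposed - (block - 1) * rotation_generations
  offset <- if (first) 0 else rotation_generations
  (block - 1) * 2 * rotation_generations + offset + within
}

# made-up resistance frequencies from two single-insecticide runs
freq_I1 <- (1:50) * 2 / 100
freq_I2 <- pmin(1, (1:50) * 4 / 100)

rotation_generations <- 10
t1 <- rotation_calendar_gen(gens_to_threshold(freq_I1, 0.5), rotation_generations, first = TRUE)
t2 <- rotation_calendar_gen(gens_to_threshold(freq_I2, 0.5), rotation_generations, first = FALSE)

print(c(I1 = t1, I2 = t2))
```

Runs that never reach the threshold come back as NA rather than a number, so they can be counted and excluded explicitly later.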

25/7/16 email from Ian about final submission of MS

Andy and Beth. Can you check your addresses on the cover page. Beth, I guessed you are in IIB. Andy, does your consultancy have a name??

Andy. Can you help with the figures. 2D and 8A both seem low resolution and/or small font on this scale. Can we improve this? I think the RH caption of Fig 10 is incorrect. Shouldn't it be "Sequential time to resistance >80% longer". In retrospect it may be clearer to label both figures 9 and 10 the same with the LH caption being "Adaptive mixtures last longer" and the RH "Sequential use lasts longer"

Fig 2D is part of a picture so I can't edit.

I could increase res. for Fig 3.

Fig 10 captions : changed to :
adaptive mix time to resistance >= 20% longer
adaptive mix time to resistance <= 20% longer

improved res of fig 8 (trees) & others by increasing zoom on pdf to 150% & snapshotting into the word doc.

31/8/2016

making claim

Matts results on mosaics see Initial Micro-Mosaic Findings.pdf

~ mosaics are nearly always better than sequences (93%), and when a sequence is better the difference is very small.
~ when sequences are better than mixtures, mosaics are better than sequences
~ when mixtures are better than sequences,
~ high exposure and dominance make mosaics more preferable.

I could try to make those results clearer ...

chatting to Ian :

He has been thinking about a way of representing LD across all the genotypes, though it might not work.

new Barbosa paper too.

Ian is going to prompt IVCC for more money after sending them stuff: do they want to fund research beyond the end of the year ?

Matt's money goes to end of year.

Ian has a BBSRC grant will hear in October with a 3 year postdoc, including monitoring and maintaining vector colonies so would have to be local. Matt & I are welcome to apply. But its not guaranteed to come off.

Ian wants to keep looking at mosaics, looking at simulations in parallel with gene flow.

Gates money is for insecticide resistance, rather than looking at policy advice, but if the policy questions are for Bayer, Ian would be prepared to lend a couple of days money from the Gates grant.

Have a look at questions that Bayer sent.

I should keep some of the Gates time in reserve to be able to address referees comments on the 1st paper.

Ian asked me how I thought it was best to spend my time. I said that I wasn't sure yet, and I'll look into further.

5/9/2016

resistance papers : http://blogs.biomedcentral.com/bugbitten/2016/09/02/sub-lethal-effects-of-insecticide-resistance/

Interactive cost of Plasmodium infection and insecticide resistance in the malaria vector Anopheles gambiae.
Alout H, Dabiré RK, Djogbénou LS, Abate L, Corbel V, Chandre F, Cohuet A.
Abstract : Insecticide resistance raises concerns for the control of vector-borne diseases. However, its impact on parasite transmission could be diverse when considering the ecological interactions between vector and parasite. Thus we investigated the fitness cost associated with insecticide resistance and Plasmodium falciparum infection as well as their interactive cost on Anopheles gambiae survival and fecundity. In absence of infection, we observed a cost on fecundity associated with insecticide resistance. However, survival was higher for mosquito bearing the kdr mutation and equal for those with the ace-1(R) mutation compared to their insecticide susceptible counterparts. Interestingly, Plasmodium infection reduced survival only in the insecticide resistant strains but not in the susceptible one and infection was associated with an increase in fecundity independently of the strain considered. This study provides evidence for a survival cost associated with infection by Plasmodium parasite only in mosquito selected for insecticide resistance. This suggests that the selection of insecticide resistance mutation may have disturbed the interaction between parasites and vectors, resulting in increased cost of infection. Considering the fitness cost as well as other ecological aspects of this natural mosquito-parasite combination is important to predict the epidemiological impact of insecticide resistance.

6/9/2016

looking at the selection calculation to understand what it does.

mostly it comes down to this :

fs[sex,paste0(locus1,locus2)] <- (f[sex,paste0(locus1,locus2)] * Windiv[sex,locus1,locus2]) / W.bar[sex]
genotype frequency after selection = (frequency * fitness) / ( sum of freq * fitness for this sex )

I could probably make that more concise using properties of the arrays.
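A sketch of what the more concise version might look like if frequencies and fitnesses are both held as sex-by-genotype matrices. This is an assumption about the layout: in the current code Windiv is indexed [sex, locus1, locus2], so it would first need reshaping to match f. The genotype list is shortened and all numbers are made up.

```r
sexes     <- c("m", "f")
genotypes <- c("SS1SS2", "RS1SS2", "SS1RS2", "RS1RS2")   # shortened for illustration

# genotype frequencies before selection, one row per sex (made-up values)
f <- matrix(c(0.7, 0.1, 0.1, 0.1,
              0.7, 0.1, 0.1, 0.1),
            nrow = 2, byrow = TRUE, dimnames = list(sexes, genotypes))

# fitness of each genotype for each sex (made-up values)
W <- matrix(c(0.2, 0.5, 0.5, 1.0,
              0.1, 0.4, 0.4, 1.0),
            nrow = 2, byrow = TRUE, dimnames = list(sexes, genotypes))

# selection in one step for all sexes and genotypes:
# frequency after selection = (frequency * fitness) / (sum of freq * fitness for this sex)
fW    <- f * W
W_bar <- rowSums(fW)
fs    <- sweep(fW, 1, W_bar, "/")

print(round(fs, 4))
```

The element-wise product and sweep() replace the loops over sex, locus1 and locus2, and each row of fs sums to 1 by construction.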

oooo here there is a very nice shiny app demonstrating single locus selection : http://rosetta.ahmedmoustafa.io/selection/

i sent a twitter direct message, haven't heard back yet.

done ~ work out order and how fitness translates to gamete frequencies
done ~ develop a graphic of the model
done ~ write the methods section model description for paper 2. It will help me get back in.

12/9/2016

msg from Ian :

Dear Matt and Andy

How is the gene-flow/swapping progressing for the "macro" mosaics? I think the micro-mosaic work is publishable but Reviewers are bound to ask about geographically-distinct macro-mosaics.

I am still trying to think of an explanation of why mosaics are better despite generating positive LD between the resistance alleles. After several walks along beaches over the last few weeks, the best I can come up with is the following. I think it may be that this positive LD reduces the variance in fitness (increased frequencies of double heterozygotes). It's a classical result in population genetics that speed of response depends on variance in fitness so it may be worth testing this idea.

https://en.wikipedia.org/wiki/Fisher%27s_fundamental_theorem_of_natural_selection

Compared to mosaics, it may be that positive LD in mixtures is counterproductive because the fitness values tend to be so skewed towards favouring the double homozygote so positive LD increases variance in fitness in this case. If the dynamics do depend on LD and fitness variance then it may be a short cut to predicting when mixtures are favoured over sequential/mosaics.

Matt, assuming you cannot think of an alternative explanation, can you compute variance in fitness over the time course of selection, using the default values we identified previously.

I'm not best placed to suggest how you do this, but the following strategy sprang to mind (they were long beaches!). The algorithm (and I assume the code) calculates the frequency of gametes each generation, then uses these to calculate the frequencies of the diploid genotypes. You can presumably write a fairly simple function to obtain variance in fitness from these diploid frequencies. [Store these 'correct' diploid frequencies for future use]. Now generate gamete frequencies assuming LD=0 (i.e. as the product of their constituent allele frequencies). Then, as above, use the existing code to generate the frequencies of diploid genotypes and obtain the variance in fitness using the new function. Then restore the correct diploid genotype frequencies and proceed as normal.....

We need to come up with some sort of rational explanation of why mosaics tend to be better. Even if we can't come up with a convincing one, we still need to show we have tried.....

Hi Ian, Matt,

Thanks, the Fisher (speed of change being dependent on variance in fitness) is interesting, I'll have a think about it.

I've been working on fully explaining the mechanisms behind the mixtures vs sequence results and their responses to the different inputs. If I get you something by tomorrow will you be able to read before Thursday ? I could try to think about relevance to the micro-mosaics too, or we can talk about when we meet.

I haven't done anything on the macro-mosaics, my recollection was that we agreed and said to IVCC we couldn't do anything on that without more funding. I have only ~ 4 weeks left which may need to include revisions and figures for the first paper. Also I feel the code needs improving before we try to stretch it to the next task. If there isn't more funding for me then I'm only really going to be able to tie up loose ends. I'm hoping Bayer can help with this but I haven't heard anything yet and time is short.

Trying to understand why exposure & effectiveness have different effects on mixtures. Doesn't make sense to me now.

lef <- runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.8 , phi.SS2_0B = 0.8 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, correct_mix_deploy = 1 , strategyLabels = c('seq','','adapt','mix2') )

lex <- runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.8 , phi.SS1_A0 = 0.5 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, correct_mix_deploy = 1 , strategyLabels = c('seq','','adapt','mix2') )

lef$fitness[1]
lex$fitness[1]

Hi Ian, Just in case you are on your summer work timetable and are still around and can answer this very easily. No worries if not.

I've been scratching my head to explain why exposure & effectiveness have different effects on mixtures, but the same effect on sequences and single insecticides.

These are the fitnesses for when all inputs set to 0.5 except for :

effectiveness 1 & 2 set to 0.8.

             -,-   a,-   A,-   -,b   -,B   a,b   A,B   A,b   a,B
    SS1SS2     1     1   0.2     1   0.2     1  0.04   0.2   0.2
    SS1RS2     1     1   0.2     1   0.4     1  0.08   0.2   0.4
    SS1RR2     1     1   0.2     1   0.6     1  0.12   0.2   0.6
    RS1SS2     1     1   0.4     1   0.2     1  0.08   0.4   0.2
    RS1RS2     1     1   0.4     1   0.4     1  0.16   0.4   0.4
    RS1RR2     1     1   0.4     1   0.6     1  0.24   0.4   0.6
    RR1SS2     1     1   0.6     1   0.2     1  0.12   0.6   0.2
    RR1RS2     1     1   0.6     1   0.4     1  0.24   0.6   0.4
    RR1RR2     1     1   0.6     1   0.6     1  0.36   0.6   0.6

exposure set to 0.8

             -,-   a,-    A,-   -,b    -,B   a,b       A,B    A,b    a,B
    SS1SS2     1     1  0.500     1  0.500     1  0.250000  0.500  0.500
    SS1RS2     1     1  0.500     1  0.625     1  0.312500  0.500  0.625
    SS1RR2     1     1  0.500     1  0.750     1  0.375000  0.500  0.750
    RS1SS2     1     1  0.625     1  0.500     1  0.312500  0.625  0.500
    RS1RS2     1     1  0.625     1  0.625     1  0.390625  0.625  0.625
    RS1RR2     1     1  0.625     1  0.750     1  0.468750  0.625  0.750
    RR1SS2     1     1  0.750     1  0.500     1  0.375000  0.750  0.500
    RR1RS2     1     1  0.750     1  0.625     1  0.468750  0.750  0.625
    RR1RR2     1     1  0.750     1  0.750     1  0.562500  0.750  0.750

They are different. Is this how you would expect them to be ?

It may be an obvious answer.

Thanks, Andy

seems that exposure doesn't come in to the calculation of fitness ?

is it applied afterwards ? yes.

      # multiplies exposure by fitness for all niches & then sums
      # creates a weighted average of exposure in each niche
      Windiv[sex,locus1,locus2] <- sum( a[sex,,] * Wniche[locus1,locus2,,])

For exposure 0.8 :

    Windiv
    , , locus2 = SS2

       locus1
    sex SS1 RS1 RR1
      m 0.6 0.7 0.8
      f 0.6 0.7 0.8

    , , locus2 = RS2

       locus1
    sex SS1 RS1 RR1
      m 0.6 0.7 0.8
      f 0.6 0.7 0.8

    , , locus2 = RR2

       locus1
    sex SS1 RS1 RR1
      m 0.6 0.7 0.8
      f 0.6 0.7 0.8

For effectivenesses 0.8 :

    Windiv
    , , locus2 = SS2

       locus1
    sex SS1 RS1 RR1
      m 0.6 0.6 0.6
      f 0.6 0.6 0.6

    , , locus2 = RS2

       locus1
    sex SS1 RS1 RR1
      m 0.7 0.7 0.7
      f 0.7 0.7 0.7

    , , locus2 = RR2

       locus1
    sex SS1 RS1 RR1
      m 0.8 0.8 0.8
      f 0.8 0.8 0.8

I think these could be just the same but dimensioned differently ? Is it that normalisation works differently on them ?

e.g. these are the same :
exposure 0.8 : SS1SS2:0.6 SS1RR2:0.8
effectivenesses 0.8 : SS1SS2:0.6 SS1RR2:0.8

gametes G in generation1 are identical in each case :

exposure 0.8 :

    , , locus2 = S2

       locus1
    sex     S1     R1
      m 0.9801 0.0099
      f 0.9801 0.0099

    , , locus2 = R2

       locus1
    sex     S1    R1
      m 0.0099 1e-04
      f 0.0099 1e-04

effectivenesses 0.8 :

    , , locus2 = S2

       locus1
    sex     S1     R1
      m 0.9801 0.0099
      f 0.9801 0.0099

    , , locus2 = R2

       locus1
    sex     S1    R1
      m 0.0099 1e-04
      f 0.0099 1e-04

to get at elements

G['m','R1','R2']
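The generation 1 gametes printed above are what linkage equilibrium gives when both resistance alleles start at 0.01. A minimal sketch of that calculation follows; `G_le` and `ld()` are hypothetical names used only here, and this is not the package's own linkage code.

```r
# allele frequencies at each locus (resistance alleles at 0.01, as P_1 and P_2 above)
p1 <- c(S1 = 0.99, R1 = 0.01)
p2 <- c(S2 = 0.99, R2 = 0.01)

# with no linkage disequilibrium each gamete frequency is simply the
# product of its two allele frequencies
G_le <- outer(p1, p2)

# D for a locus1 x locus2 table of gamete frequencies:
# freq(R1R2) minus the product of the two resistance allele frequencies
ld <- function(G) G['R1', 'R2'] - sum(G['R1', ]) * sum(G[, 'R2'])

G_le['R1', 'R2']   # gamete carrying both resistance alleles
ld(G_le)           # zero (up to rounding) by construction
```

The same `outer()` call is one way to build the "LD = 0" gametes that Ian's message suggests comparing against.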

check W.bar & fs : identical

exposure 0.8 :

    W.bar
        m     f
    0.602 0.602

    Browse[4]> fs
       loci
    sex    SS1SS2     SS1RS2       SS1RR2     RS1SS2   RS1RS2_cis RS1RS2_trans       RS1RR2       RR1SS2       RR1RS2
      m 0.9574047 0.01934151 9.768439e-05 0.02256509 0.0002279302 0.0002279302 2.302326e-06 0.0001302458 2.631229e-06
      f 0.9574047 0.01934151 9.768439e-05 0.02256509 0.0002279302 0.0002279302 2.302326e-06 0.0001302458 2.631229e-06
       loci
    sex       RR1RR2
      m 1.328904e-08
      f 1.328904e-08

effectivenesses 0.8 :

    W.bar
        m     f
    0.602 0.602

    Browse[2]> fs
       loci
    sex    SS1SS2     SS1RS2       SS1RR2     RS1SS2   RS1RS2_cis RS1RS2_trans       RS1RR2       RR1SS2       RR1RS2
      m 0.9574047 0.01934151 9.768439e-05 0.02256509 0.0002279302 0.0002279302 2.302326e-06 0.0001302458 2.631229e-06
      f 0.9574047 0.01934151 9.768439e-05 0.02256509 0.0002279302 0.0002279302 2.302326e-06 0.0001302458 2.631229e-06
       loci
    sex       RR1RR2
      m 1.328904e-08
      f 1.328904e-08

check genotypes f after random mating : seem identical too

need to do this f[], i should probably rename array from f

exposure 0.8 :

       loci
    sex    SS1SS2     SS1RS2       SS1RR2     RS1SS2   RS1RS2_cis RS1RS2_trans       RS1RR2       RR1SS2       RR1RS2
      m 0.9574073 0.01934156 9.768466e-05 0.02255979 0.0002278767 0.0002278767 2.301785e-06 0.0001328965 2.684777e-06
      f 0.9574073 0.01934156 9.768466e-05 0.02255979 0.0002278767 0.0002278767 2.301785e-06 0.0001328965 2.684777e-06
       loci
    sex       RR1RR2
      m 1.355948e-08
      f 1.355948e-08

effectivenesses 0.8 :

       loci
    sex    SS1SS2     SS1RS2       SS1RR2     RS1SS2   RS1RS2_cis RS1RS2_trans       RS1RR2       RR1SS2       RR1RS2
      m 0.9574073 0.01934156 9.768466e-05 0.02255979 0.0002278767 0.0002278767 2.301785e-06 0.0001328965 2.684777e-06
      f 0.9574073 0.01934156 9.768466e-05 0.02255979 0.0002278767 0.0002278767 2.301785e-06 0.0001328965 2.684777e-06
       loci
    sex       RR1RR2
      m 1.355948e-08
      f 1.355948e-08

seems that by end of generation1 everything is the same ...

but is this just because I'm looking at the first scenario of 3 which is a single insecticide ?

    lef$input
    a.m_00 5e-01 5e-01 5e-01
    a.m_a0 0e+00 0e+00 0e+00
    a.m_A0 5e-01 0e+00 0e+00
    a.m_0b 0e+00 0e+00 0e+00
    a.m_0B 0e+00 5e-01 0e+00
    a.m_ab 0e+00 0e+00 0e+00
    a.m_AB 0e+00 0e+00 5e-01

look at the results outputs for each mixture scenario

then may need to go back & browse through the mixture run by calling runModel2 directly

    lef$results[3]
          Gen       m.R1       m.R2         m.LD       f.R1       f.R2         f.LD M F   dprime.m         r2   dprime.f
     [1,]   1 0.01000000 0.01000000 7.449750e-05 0.01000000 0.01000000 7.449750e-05 1 1 0.01482686 0.01482686 0.01482686
     [2,]   2 0.01038778 0.01038778 8.216336e-05 0.01038778 0.01038778 8.216336e-05 1 1 0.01573650 0.01573650 0.01573650
     [3,]   3 0.01079100 0.01079100 8.964014e-05 0.01079100 0.01079100 8.964014e-05 1 1 0.01652218 0.01652218 0.01652218
     [4,]   4 0.01121017 0.01121017 9.726565e-05 0.01121017 0.01121017 9.726565e-05 1 1 0.01725288 0.01725288 0.01725288
     [5,]   5 0.01164588 0.01164588 1.052481e-04 0.01164588 0.01164588 1.052481e-04 1 1 0.01796595 0.01796595 0.01796595
     [6,]   6 0.01209874 0.01209874 1.137252e-04 0.01209874 0.01209874 1.137252e-04 1 1 0.01868184 0.01868184 0.01868184
     [7,]   7 0.01256943 0.01256943 1.227971e-04 0.01256943 0.01256943 1.227971e-04 1 1 0.01941196 0.01941196 0.01941196
     [8,]   8 0.01305865 0.01305865 1.325449e-04 0.01305865 0.01305865 1.325449e-04 1 1 0.02016289 0.02016289 0.02016289
     [9,]   9 0.01356716 0.01356716 1.430412e-04 0.01356716 0.01356716 1.430412e-04 1 1 0.02093867 0.02093867 0.02093867
    [10,]  10 0.01409571 0.01409571 1.543562e-04 0.01409571 0.01409571 1.543562e-04 1 1 0.02174195 0.02174195 0.02174195

    lex$results[3]
          Gen       m.R1       m.R2         m.LD       f.R1       f.R2         f.LD M F   dprime.m         r2   dprime.f
     [1,]   1 0.01000000 0.01000000 7.449750e-05 0.01000000 0.01000000 7.449750e-05 1 1 0.01482686 0.01482686 0.01482686
     [2,]   2 0.01123748 0.01123748 9.475104e-05 0.01123748 0.01123748 9.475104e-05 1 1 0.01677017 0.01677017 0.01677017
     [3,]   3 0.01262660 0.01262660 1.199875e-04 0.01262660 0.01262660 1.199875e-04 1 1 0.01888719 0.01888719 0.01888719
     [4,]   4 0.01418541 0.01418541 1.515796e-04 0.01418541 0.01418541 1.515796e-04 1 1 0.02122186 0.02122186 0.02122186
     [5,]   5 0.01593404 0.01593404 1.912074e-04 0.01593404 0.01593404 1.912074e-04 1 1 0.02381205 0.02381205 0.02381205
     [6,]   6 0.01789488 0.01789488 2.409450e-04 0.01789488 0.01789488 2.409450e-04 1 1 0.02669330 0.02669330 0.02669330
     [7,]   7 0.02009277 0.02009277 3.033612e-04 0.02009277 0.02009277 3.033612e-04 1 1 0.02990097 0.02990097 0.02990097
     [8,]   8 0.02255524 0.02255524 3.816395e-04 0.02255524 0.02255524 3.816395e-04 1 1 0.03347153 0.03347153 0.03347153
     [9,]   9 0.02531272 0.02531272 4.797215e-04 0.02531272 0.02531272 4.797215e-04 1 1 0.03744322 0.03744322 0.03744322
    [10,]  10 0.02839878 0.02839878 6.024782e-04 0.02839878 0.02839878 6.024782e-04 1 1 0.04185638 0.04185638 0.04185638

differences are visible in gen2

call runModel2 directly & work backwards in the generation loop to find the difference

thats tricky ...

or do runcurtis_f2() with max_gen set to 2 or 3, & step through scenarios until the 3rd

lef <- runcurtis_f2( max_gen=3, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.8 , phi.SS2_0B = 0.8 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, correct_mix_deploy = 1 , strategyLabels = c('seq','','adapt','mix2') )

lex <- runcurtis_f2( max_gen=3, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.8 , phi.SS1_A0 = 0.5 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, correct_mix_deploy = 1 , strategyLabels = c('seq','','adapt','mix2') )

so actually comparing Windiv, run & press continue *2 to get to the 3rd mix scenario

trimmed out f because mf same

effectiveness 0.8

    , , locus2 = SS2

       locus1
    sex  SS1  RS1  RR1
      m 0.52 0.54 0.56

    , , locus2 = RS2

       locus1
    sex  SS1  RS1  RR1
      m 0.54 0.58 0.62

    , , locus2 = RR2

       locus1
    sex  SS1  RS1  RR1
      m 0.56 0.62 0.68

exposure 0.8

    , , locus2 = SS2

       locus1
    sex  SS1    RS1   RR1
      m  0.4   0.45   0.5

    , , locus2 = RS2

       locus1
    sex  SS1    RS1   RR1
      m 0.45 0.5125 0.575

    , , locus2 = RR2

       locus1
    sex  SS1    RS1   RR1
      m  0.5  0.575  0.65

putting them next to each other

    , , locus2 = SS2

       locus1
    sex  SS1    RS1   RR1
     ef 0.52   0.54  0.56
     ex  0.4   0.45   0.5

    , , locus2 = RS2

       locus1
    sex  SS1    RS1   RR1
     ef 0.54   0.58  0.62
     ex 0.45 0.5125 0.575

    , , locus2 = RR2

       locus1
    sex  SS1    RS1   RR1
     ef 0.56   0.62  0.68
     ex  0.5  0.575  0.65

remember fitnesses & exposures are different

just SS1SS2

    ef 0.52
    ex 0.4

what are the expected calculations for SS1SS2 ?

fitness * exposure

    fitness, ef0.8
            -,-  A,-  -,B   A,B
    SS1SS2    1  0.2  0.2  0.04

    fitness, ef0.5
            -,-  A,-  -,B   A,B
    SS1SS2    1  0.5  0.5  0.25

    exposure, ex0.8
            -,-  A,-  -,B   A,B
            0.2    0    0   0.8

    exposure, ex0.5
            -,-  A,-  -,B   A,B
            0.5    0    0   0.5

SS1SS2

ef8ex5 :

              -,-  A,-  -,B   A,B
    fitness     1  0.2  0.2  0.04
    exposure  0.5    0    0   0.5
    expect    0.5    0    0  0.02  : 0.52

ef5ex8 :

              -,-  A,-  -,B   A,B
    fitness     1  0.5  0.5  0.25
    exposure  0.2    0    0   0.8
    expect    0.2    0    0   0.2  : 0.4
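The hand calculation above can be repeated for all nine genotypes at once. This is a sketch only: `fit_locus()` and `fit_mixture()` are hypothetical helpers, and the rules for SS, RS and RR fitness are inferred from the fitness tables earlier in these notes (dominance and rr_restoration both 0.5), not taken from the package code.

```r
# single-locus fitness in the exposed niche, assuming:
#   SS = 1 - effectiveness
#   RR = SS + rr_restoration * effectiveness
#   RS = SS + dominance * (RR - SS)
fit_locus <- function(effectiveness, dominance = 0.5, rr_restoration = 0.5) {
  ss <- 1 - effectiveness
  rr <- ss + rr_restoration * effectiveness
  rs <- ss + dominance * (rr - ss)
  c(SS = ss, RS = rs, RR = rr)
}

# mixture: a fraction (1 - exposure) is in the unexposed niche with fitness 1,
# the rest is in the A,B niche where the two single-locus fitnesses multiply
fit_mixture <- function(effectiveness, exposure) {
  w <- fit_locus(effectiveness)
  (1 - exposure) * 1 + exposure * outer(w, w)
}

ef <- fit_mixture(effectiveness = 0.8, exposure = 0.5)  # rows locus1, cols locus2
ex <- fit_mixture(effectiveness = 0.5, exposure = 0.8)
ef['SS', 'SS']
ex['SS', 'SS']
```

Under these assumptions `ef` and `ex` should match the two sets of Windiv values listed side by side above, which supports the conclusion that the code is doing the intended arithmetic.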

SO THE CALCULATIONS SEEM TO BE DOING WHAT THEY SHOULD BE

I JUST DON'T UNDERSTAND THE BIOLOGICAL MEANING ??

Two insecticides with same effectiveness reduce fitness more.

14/9/2016

Skyped with Ian about the model :

I was trying to work out: Why does effectiveness increase time-to-resistance in a mixture when exposure decreases it ?

Ian said that increasing exposure will always increase selection pressure.

Seems that I may be misunderstanding something about the model structure. I was thinking that you could potentially have 2 different exposures for each insecticide within a single niche.

BUT Ian said that exposure determines which niche an individual experiences. Thus if exposure is 0.8, 80% experience the high niche & 20% the low niche.

How then does it work with the 3 niches ?

I showed Ian the model diagrams and he said that they could be improved by taking out the 1 niche bit from the titles, & saying that the niches only apply to the selection stage.

( gametes all go in together across all niches).

Ian will read the paper & that should help.

aha! as.data.frame(Windiv) does what I want in getting an array into column format

    Windiv <- fitnessIndiv()
    tst <- as.data.frame(Windiv)
    tst[1,]
      SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    m     0.6   0.725    0.85     0.6   0.725    0.85     0.6   0.725    0.85

    Wniche <- fitnessNiche()
    tst2 <- as.data.frame(Wniche)
    tst2[1,]

Wniche is a bit more complicated, is for all niches. I just want the AB bit for now.

    Wniche[,,'A','B']
          locus2
    locus1  SS2  RS2  RR2
       SS1 0.50 0.50 0.50
       RS1 0.75 0.75 0.75
       RR1 1.00 1.00 1.00

    as.data.frame(Wniche[,,'A','B'])
         SS2  RS2  RR2
    SS1 0.50 0.50 0.50
    RS1 0.75 0.75 0.75
    RR1 1.00 1.00 1.00

as.data.frame does what I want on Windiv because it has 3 dimensions, but not on Wniche[,,'A','B'] because it just has 2

this is a hack but does what I want

    tst2 <- as.data.frame( aperm(Wniche[,,c('A'),c('B','0')]) )

      SS2.SS1 RS2.SS1 RR2.SS1 SS2.RS1 RS2.RS1 RR2.RS1 SS2.RR1 RS2.RR1 RR2.RR1
    B     0.5     0.5     0.5    0.75    0.75    0.75       1       1       1
    0     0.0     0.0     0.0    0.00    0.00    0.00       0       0       0

    tst2[1,]
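The behaviour noted above (3-d arrays get flattened into columns, 2-d ones do not) can be seen on a toy array. A sketch with made-up values follows; `arr3`, `df3`, `df2` and `flat` are illustrative names, and the last step is just one possible alternative to the `aperm()` hack for a 2-d slice.

```r
# toy 3-d array: sex x locus1 x locus2 (values illustrative only)
arr3 <- array(seq_len(18) / 100, dim = c(2, 3, 3),
              dimnames = list(sex = c('m', 'f'),
                              locus1 = c('SS1', 'RS1', 'RR1'),
                              locus2 = c('SS2', 'RS2', 'RR2')))

# 3 dimensions: the first becomes rows, the other two are combined into columns
df3 <- as.data.frame(arr3)

# 2 dimensions: treated as an ordinary matrix, so nothing is flattened
df2 <- as.data.frame(arr3['m', , ])

# flatten a 2-d slice by hand into one named row, locus1 varying fastest
flat <- as.vector(arr3['m', , ])
names(flat) <- as.vector(outer(dimnames(arr3)$locus1,
                               dimnames(arr3)$locus2, paste, sep = '.'))
```

Flattening by hand keeps the locus1.locus2 column order used for Windiv, whereas the `aperm()` version reverses it to locus2.locus1.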

15/9/16 on train to liverpool developing fitnessPrint()

    fitnessPrint(effectiveness=0.8, exposure=0.5, insecticideUsed='insecticide1')
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    niche     0.2    0.45     0.7     0.2    0.45     0.7     0.2    0.45     0.7
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    ind_m     0.5     0.5     0.5     0.5     0.5     0.5     0.5     0.5     0.5

    fitnessPrint(effectiveness=0.8, exposure=0.5, insecticideUsed='insecticide2')
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    niche     0.2    0.45     0.7     0.2    0.45     0.7     0.2    0.45     0.7
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    ind_m     0.5     0.5     0.5     0.5     0.5     0.5     0.5     0.5     0.5

    fitnessPrint(effectiveness=0.8, exposure=0.5, insecticideUsed='mixture')
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    niche     0.2    0.45     0.7     0.2    0.45     0.7     0.2    0.45     0.7
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    ind_m     0.6   0.725    0.85     0.6   0.725    0.85     0.6   0.725    0.85

eeek this suggests something weird going on with individual fitnesses for insecticide 1 & 2. they are all set to just the exposure value.

setExposure(insecticideUsed='insecticide1', exposure=0.8) seems OK

    as.data.frame(setExposure())
      0.0 a.0 A.0 0.b a.b A.b 0.B a.B A.B
    m 0.1   0   0   0   0   0   0   0 0.9
    f 0.1   0   0   0   0   0   0   0 0.9

Is it perhaps just something weird about when exposure is 0.5 ?

    fitnessPrint(effectiveness=0.8, exposure=0.5, insecticideUsed='insecticide1')
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    niche     0.2    0.45     0.7     0.2    0.45     0.7     0.2    0.45     0.7
      0.0 a.0 A.0 0.b a.b A.b 0.B a.B A.B
    m 0.5   0 0.5   0   0   0   0   0   0
    f 0.5   0 0.5   0   0   0   0   0   0
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    ind_m     0.5     0.5     0.5     0.5     0.5     0.5     0.5     0.5     0.5

no even setting it to 0.6 does the same ? it shouldn't matter that both effectivenesses are the same because we are just concerned with 1 insecticide ...

but doesn't seem to happen in runModel2()
seems it may be a problem with fitnessPrint()

something about how fitnessPrint calls fitnessIndiv() is different from how runModel2() calls it

aha no problem is with calc of Wniche in fitnessNiche()
maybe it is because of the niche toggle in niche[]

yes in fitnessPrint() this did make a big difference.

    # maybe set all toggles on if nothing has been passed ?
    niche[] <- 1

WTF ?

    fitnessPrint(effectiveness=0.8, exposure=0.5, insecticideUsed='insecticide2')
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    m           1       1       1       1       1       1       1       1       1
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    niche     0.2    0.45     0.7     0.2    0.45     0.7     0.2    0.45     0.7
      0.0 a.0 A.0 0.b a.b A.b 0.B a.B A.B
    m 0.5   0   0   0   0   0 0.5   0   0
          SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
    ind_m       1       1       1       1       1       1       1       1       1

in fitnessNiche, problem with Wloci :

    Wloci
         exposure
    loci  no lo   hi
      SS1  1  1 0.20
      RS1  1  1 0.45
      RR1  1  1 0.70
      SS2  1  1 1.00
      RS2  1  1 1.00
      RR2  1  1 1.00

seems to be a problem with calling of fitnessSingleLocus() ?

i think it may be this :

    phi <- createArray2(locusNum=c(1,2), exposure=c('no','lo','hi'))
    phi[1, 'hi'] <- effectiveness

should be

    phi[,'hi'] <- effectiveness
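The difference between the two assignments shows up on a small stand-in array. Here `array()` is used in place of `createArray2()` (whose fill value and exact dimnames are not assumed), and `phi_one`, `phi_both` are names made up for this sketch.

```r
# stand-in for createArray2(): a locus x exposure-niche array filled with 0
phi <- array(0, dim = c(2, 3),
             dimnames = list(locusNum = c('1', '2'),
                             exposure = c('no', 'lo', 'hi')))

phi_one <- phi
phi_one[1, 'hi'] <- 0.8    # row index 1: effectiveness set for locus 1 only

phi_both <- phi
phi_both[, 'hi'] <- 0.8    # empty row index: effectiveness set for both loci
```

With the first form locus 2 keeps its initial value in the 'hi' niche, which is consistent with the SS2, RS2, RR2 rows of Wloci all staying at 1.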

something still not right with locus fitnesses :

         exposure
    loci  no lo   hi
      SS1  1  1 0.20
      RS1  1  1 0.45
      RR1  1  1 0.70
      SS2  1  1 0.20
      RS2  1  1 0.20
      RR2  1  1 0.20

In liverpool :

Ian said that what Matt needs to do is to calculate the change in variance in fitness over time. He can get that from the indiv fitness that I've just calculated, and the change in genotype frequency over time.
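A minimal sketch of the variance-in-fitness calculation, assuming genotype frequencies and genotype fitnesses are available as matching vectors for each generation. `fitness_variance()` is a hypothetical helper and the frequencies are toy values, not model output.

```r
# variance in fitness for one generation:
# frequency-weighted squared deviation from the mean fitness
fitness_variance <- function(freq, fit) {
  stopifnot(length(freq) == length(fit))
  w_bar <- sum(freq * fit)
  sum(freq * (fit - w_bar)^2)
}

# toy single-locus example followed over two generations
fit <- c(SS = 0.6, RS = 0.7, RR = 0.8)
freqs_by_gen <- rbind(gen1 = c(0.81, 0.18, 0.01),
                      gen2 = c(0.64, 0.32, 0.04))

# one variance per generation (rows of freqs_by_gen)
var_by_gen <- apply(freqs_by_gen, 1, fitness_variance, fit = fit)
```

For the two-locus model the same function could be applied to the 10 genotype frequencies per generation and the corresponding genotype fitnesses, separately for each sex if they differ.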

Hill-Robertson effect : negative linkage disequilibrium, Ian doesn't quite get it

Macromosaics : How best to do ? Initially just need 2 popns, but might want more. Specify immigration & emigration between.

Does it need to be in or out of runModel2 ? Gene flow every time step. Will need to make sure that this potentially allows for 3 loci.
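One possible shape for the gene flow step between 2 populations is sketched below. It assumes equal-sized populations and symmetric exchange of a fraction `m` each time step; `migrate()` is a hypothetical function and none of this is in the package yet.

```r
# freqs: matrix with one row per population and one column per genotype
#        (or gamete) frequency; works for any number of columns, so it
#        would not need changing for 3 loci
# m:     fraction of each population replaced by migrants per time step
migrate <- function(freqs, m) {
  stopifnot(nrow(freqs) == 2, m >= 0, m <= 1)
  rbind((1 - m) * freqs[1, ] + m * freqs[2, ],
        (1 - m) * freqs[2, ] + m * freqs[1, ])
}

# toy example: resistance common in population 1, rare in population 2
pops <- rbind(pop1 = c(SS = 0.2, RS = 0.5, RR = 0.3),
              pop2 = c(SS = 0.9, RS = 0.1, RR = 0.0))
after <- migrate(pops, m = 0.1)
```

Because the function only mixes frequencies it could sit outside runModel2() and be called between generations, although whether that is the better design is still the open question above.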

Skype call with Bayer went really well, they loved the UI and how it helped with discussion and to investigate scenarios.

Bayer press release from Oct 2015 on their mixture IRS : http://www.press.bayer.com/baynews/baynews.nsf/id/Bayer-develops-the-first-two-way-insecticide-mixture-for-indoor-residual-spraying

With Ian we talked about the arithmetic of how effectiveness can have a different effect to exposure. Ian started on paper & showed how for 2 insecticides the exp & eff have identical effect. He started on paper on the mixture case, and said he would try to get it done. I have the initial paper calculations so I can look at in combination with the Maynard Smith book.
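The arithmetic can be written down directly. This is a sketch under the assumptions used elsewhere in these notes: unexposed individuals have fitness 1, and in a mixture the two insecticides act multiplicatively in the exposed niche. `fit_single()`, `fit_mix()` and the shorthand `k` (the fraction of the kill a genotype still suffers, 1 for SS) are hypothetical.

```r
# single insecticide: (1 - ex) + ex * (1 - ef * k) = 1 - ex * ef * k,
# so exposure and effectiveness only appear as their product
fit_single <- function(exposure, effectiveness, k = 1) {
  (1 - exposure) + exposure * (1 - effectiveness * k)
}

# mixture: exposure appears once but effectiveness appears twice,
# so swapping the two inputs changes the answer
fit_mix <- function(exposure, effectiveness, k1 = 1, k2 = 1) {
  (1 - exposure) +
    exposure * (1 - effectiveness * k1) * (1 - effectiveness * k2)
}

single_a <- fit_single(exposure = 0.8, effectiveness = 0.5)
single_b <- fit_single(exposure = 0.5, effectiveness = 0.8)
mix_a <- fit_mix(exposure = 0.8, effectiveness = 0.5)
mix_b <- fit_mix(exposure = 0.5, effectiveness = 0.8)
```

For SS1SS2 the two single-insecticide values agree, while the mixture values are the 0.4 and 0.52 found when stepping through the model on 12/9/2016.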

Ian pointed out that the SS & SR genotypes will be most common early in the simulations in the critical period when resistance is rising. (Can I offer an option to add SS, SR & RR to the present allele frequency plots ? Indeed might not need SS. Is there an existing func to do this ?)

20/9/16

~ can I create a plot of frequency of SS,SR,RR ? see Beths plothaplotype().

    input <- setInputOneScenario()
    listOut <- runModel2(input)
    plothaplotype( listOut$genotype[[1]] )

but this is just for single strategy.

can I add option to plotallele.freq.andy() to add SR & RR freqs to curtis type plots ? for each strategy ? It's likely to get messy. Perhaps I just need to add as unlabelled pale grey lines because the reader can work out (with help) what they are. SR will go up first to a peak & decline, RR will approach the overall frequency line.

new code can't go in the func below because it needs to access listOut$genotype

    plotalleleLinesOneScenario()

    listOut$results[[1]]
         Gen        m.R1        m.R2         m.LD        f.R1        f.R2         f.LD M F
    [1,]   1 0.001000000 0.001000000 7.494997e-07 0.001000000 0.001000000 7.494997e-07 1 1

    listOut$genotype[[1]]
         gen       SS1SS2       SS1RS2       SS1RR2       RS1SS2   RS1RS2_cis RS1RS2_trans
    [1,]   1 9.960060e-01 1.994006e-03 9.980010e-07 1.994006e-03 1.996002e-06 1.996002e-06

Beth gets freqs out like this :

    for (k in 1:nrow(mat)){
      ## haplotype at locus 1
      m.SS1 <- sum( mat[k,2], mat[k,3], mat[k,4] ) #SS1SS2, SS1SR2, SS1RR2
      m.RS1 <- sum( mat[k,5] + mat[k,6] + mat[k,7] + mat[k,8] )
      m.RR1 <- sum( mat[k,9] + mat[k,10] + mat[k,11] )
      ## haplotype at locus 2
      m.SS2 <- sum( mat[k,2] + mat[k,5] + mat[k,9] )
      m.RS2 <- sum( mat[k,3] + mat[k,6] + mat[k,7] + mat[k,10] )
      m.RR2 <- sum( mat[k,4] + mat[k,8] + mat[k,11] )
    }

but I could do by querying the column names to check which contain SS1 etc.
have I done anything like this in the code already ? seems not

    namesLoci <- c('SS1SS2','SS1RS2','SS1RR2', 'RS1SS2','RS1RS2cis','RS1RS2trans','RS1RR2', 'RR1SS2','RR1RS2','RR1RR2')

Cool these do what I want to get indices :

    grep('SS1',namesLoci)
    grep('RR1',namesLoci)

    #' @param add_haplotype whether to add haplotype (freqs of SR & RR) to plots

add_haplotype = TRUE

if (add_haplotype) { #uses $genotype

# probably write a plotSSRS() function or similar
# first get it working on the case I want then see if I can extend

names_genotypes <- colnames(listOut$genotype[[1]])

RS1 <- rowSums( listOut$genotype[[1]][, grep('RS1',names_genotypes)] )
RR1 <- rowSums( listOut$genotype[[1]][, grep('RR1',names_genotypes)] )
RS2 <- rowSums( listOut$genotype[[1]][, grep('RS2',names_genotypes)] )
RR2 <- rowSums( listOut$genotype[[1]][, grep('RR2',names_genotypes)] )

}
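The grep approach can be tried on a toy genotype matrix before wiring it into the plotting code. In this sketch `geno` stands in for `listOut$genotype[[1]]` (without its `gen` column) and `locus_freq()` is a hypothetical helper; the frequencies are made up.

```r
names_genotypes <- c('SS1SS2', 'SS1RS2', 'SS1RR2',
                     'RS1SS2', 'RS1RS2_cis', 'RS1RS2_trans', 'RS1RR2',
                     'RR1SS2', 'RR1RS2', 'RR1RR2')

# two generations of made-up genotype frequencies, each row sums to 1
geno <- rbind(c(0.50, 0.10, 0.05, 0.10, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03),
              rep(0.1, 10))
colnames(geno) <- names_genotypes

# sum the columns whose name contains the pattern, e.g. 'RS1';
# drop = FALSE keeps a matrix even if only one row or column is selected
locus_freq <- function(geno, pattern) {
  cols <- grep(pattern, colnames(geno), fixed = TRUE)
  rowSums(geno[, cols, drop = FALSE])
}

SS1 <- locus_freq(geno, 'SS1')
RS1 <- locus_freq(geno, 'RS1')
RR1 <- locus_freq(geno, 'RR1')
RS2 <- locus_freq(geno, 'RS2')
```

A useful check is that SS, RS and RR frequencies at a locus add up to 1 in every generation.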

aha! remember I want it in plotcurtis_f2_generic() rather than in plotallele.freq.andy()

and the problem is that it accepts just the $results bit of the output ... plotcurtis_f2_generic( listOut$results[[3]]

so my options are :

1) to optionally pass listOut$genotype[[3]], listOut$genotype[[2]], listOut$genotype[[1]]

2) write a new function that accepts the genotype bits, and adds them to an existing plot

1) is probably better, but it's a bit ugly to be passing more args ..., if they are optional, not so bad.

maybe I don't need to do it anyway ? Will it really help with understanding ?

26/9/16 looking at how macromosaics could work & tidying. see below.
done ~ move the linkage disequilibrium calc into a function : linkage_calc()

26/9/16 tidying runModel2() as prep for macromosaics
~ deleted calibration == 104 because it didn't do anything

ag did my old problem of 'amend previous commit' after I'd pushed

    To git@github.com:AndySouth/resistance.git
     ! [rejected] master -> master (non-fast-forward)
    error: failed to push some refs to 'git@github.com:AndySouth/resistance.git'
    hint: Updates were rejected because the tip of your current branch is behind
    hint: its remote counterpart. Integrate the remote changes (e.g.
    hint: 'git pull ...') before pushing again.
    hint: See the 'Note about fast-forwards' in 'git push --help' for details.

RStudio, Pull gives

    0 [main] sh.exe" 6308 sync_with_child: child 5216(0x1AC) died before initialization with status code 0x1
    45194 [main] sh.exe" 6308 sync_with_child: *** child state waiting for longjmp
    C:\Program Files (x86)\Git/libexec/git-core\git-pull: fork: Resource temporarily unavailable

from ohshitgit.com

    git reflog
    git reset HEAD@{0}

then a bit of pushing & pulling worked

~ changed genotype outputs to mean of m&f

    genotype[k,2:11] <- sum( 0.5*(f['f',] + f['m',]))

~ genotype frequency outputs (SS1SS2 etc.) are now mean of m&f where they were m only, I don't think this affects any outputs we've used in the paper or elsewhere
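A toy check of the m&f mean. If the line recorded above is taken literally, `sum()` reduces the ten genotype means to a single number, so presumably the element-wise mean is what is intended; the sketch below shows the difference with made-up frequencies (`f_toy` is not the model array).

```r
# toy sex x genotype frequencies (illustrative only)
f_toy <- rbind(m = c(SS = 0.80, RS = 0.15, RR = 0.05),
               f = c(SS = 0.60, RS = 0.30, RR = 0.10))

# element-wise mean of the two sexes: one value per genotype
mean_mf <- 0.5 * (f_toy['f', ] + f_toy['m', ])

# wrapping the same expression in sum() collapses it to one number
collapsed <- sum(0.5 * (f_toy['f', ] + f_toy['m', ]))
```

`colMeans(f_toy)` gives the same per-genotype means and may read more clearly.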

done ~ put selection into a function
done ~ put fitness outputs into a function fitnessOutput(Wniche)

weds 28/9/16 ~ started adding a couple of simple unit-tests of some of the helper functions

thurs 29/9/16

? what to call the array f, before it causes us any problems !! fgenotypes

do I need a separate genotypes_s for after selection ? seems not ...

beware that we already have genotype[] as part of the output lists ...

genotype[gen,] is a matrix of summarised gen frequencies per generation
fgenotypes[] will be an array of genotype frequencies in this generation

~ created resistance_freq_count() function
~ renamed a to expos, i to scen_num, moved code into setExposureFromInput()
~ renamed k to gen_num

Sent msg to Ian & Matt

Hi Ian, Matt,

I trust you are both well.

An update on what I've been up to and a paper I came across that you might like to read.

I've been tidying the code, and putting more of it into functions so that it will make it easier to implement macro-mosaics & 3+ loci. Now the main function running the model is down to < 250 lines (from >1500 when I got it from Beth).

Macro-mosaics is possible to implement, maybe a week's work for an initial version with just 2 popns. But, we might want to think about re-organising some things before we did that (e.g. to allow us to represent sequences within a single run rather than having to post-process).

3+ loci is a bigger job, it would need a restructuring of much of the present code, and again it would be good to think about whether to try to extend the current code or to re-design from the bottom up.

The changes I've made shouldn't alter the outputs. Matt can you double check that outputs are looking the same ? (after updating your version with the changes from my repository using the instructions we went over before).

Matt, I haven't been able to find your mosaic code, have you tried to commit it to your repository yet ? When we were in Liverpool I tried to suggest it's a good idea regularly to commit what you are doing. This would help you have a backup and for us all in the project to be able to see what stage we are all at. If you need any questions answered on doing this, I'm happy to answer. e.g. you can see on my page the frequency I commit stuff : https://github.com/AndySouth/resistance/commits/master

The attached paper has some recommendations about good-enough software practice and you'll see in section 5 that some of these issues are mentioned.

All the best, Andy

Fri 30/9/16

~ rename arrays to a_*
s to a_sel

~ rename :
Wloci : a_fitloc : fitness by locus
Wniche : a_fitnic : fitness by niche
Windiv : a_fitgen : fitness by genotype

~ rename fgenotypes to a_gtypes
~ rename expos to a_expos

~ I chose not to put a_cost into a_sel yet. It could cause confusion. The cost is subtracted where the selection coefficients are added.

~ try to recreate Ians diagrams in the ppt he gave to IVCC. Hastings resistance May 2016.pptx
~ cool starting to get simpfitvis working ...

3/10/2016

~ Why does fitnessSingleLoc() have different effectivenesses per no,lo,hi exposure ? So far I think we just use for the hi exposure niche :

    a_effect <- createArray2(locusNum=c(1,2), exposure=c('no','lo','hi'))
    a_effect[1, 'hi'] <- effectiveness1
    a_effect[2, 'hi'] <- effectiveness2

There is the potential to set effectiveness, selection co and dominance to be different in the lo exposure niche, it's just that we don't do it yet :

a_effect[1,'lo'] <- input[26,scen_num]
a_effect[1,'hi'] <- input[27,scen_num]
a_effect[2,'lo'] <- input[28,scen_num]
a_effect[2,'hi'] <- input[29,scen_num]

# dominance = dominance coefficient
a_dom[1,'no'] <- input[32,scen_num]
a_dom[1,'lo'] <- input[33,scen_num]
a_dom[1,'hi'] <- input[34,scen_num]
a_dom[2,'no'] <- input[35,scen_num]
a_dom[2,'lo'] <- input[36,scen_num]
a_dom[2,'hi'] <- input[37,scen_num]

# a_sel = selection coefficient
a_sel[1,'lo'] <- input[38,scen_num]
a_sel[1,'hi'] <- input[39,scen_num]
a_sel[2,'lo'] <- input[40,scen_num]
a_sel[2,'hi'] <- input[41,scen_num]

This is helpful for understanding whats happening with the different fitness calcs.

    dimnames(fitnessSingleLocus())
    $loci
    [1] "SS1" "RS1" "RR1" "SS2" "RS2" "RR2"
    $exposure
    [1] "no" "lo" "hi"

    dimnames(fitnessNiche())
    $locus1
    [1] "SS1" "RS1" "RR1"
    $locus2
    [1] "SS2" "RS2" "RR2"
    $niche1
    [1] "0" "a" "A"
    $niche2
    [1] "0" "b" "B"

    dimnames(fitnessGenotype())
    $sex
    [1] "m" "f"
    $locus1
    [1] "SS1" "RS1" "RR1"
    $locus2
    [1] "SS2" "RS2" "RR2"

Aha! I think where I've been getting a little confused is that the exposure input determines what proportion of the population are in the 'hi' exposure niche versus the 'no' niche. (Whereas I was thinking that there was the potential to have different exposure values within each niche). If that had been the case, what would have determined the proportions of the different niches ?

All of the work that we've done so far is on a 4 niche situation. Maybe I should make that clearer in paper2 ?

~ try producing some flexible fitness vis outputs

    library(ggplot2)
    df_fit1 <- as.data.frame(fitnessSingleLocus())
    ggplot(df_fit1, aes(x=1, y=hi) ) + geom_point()

    library(ggrepel)
    ggplot(df_fit1, aes(x=1, y=hi, label=rownames(df_fit1) )) +
      ylim(0,1) +
      geom_point() +
      #geom_text(hjust = 0) #, nudge_x = 0.05)
      geom_text_repel(nudge_x=0.1)

transpose to get in a similar format to earlier

    df_fit3 <- as.data.frame(t(as.data.frame(fitnessGenotype())))

    ggplot(df_fit3, aes(x=1, y=f, label=rownames(df_fit3) )) +
      ylim(0,1) +
      geom_point() +
      #geom_text(hjust = 0) #, nudge_x = 0.05)
      geom_text_repel(nudge_x=0.1)

Tue 4/10/16

~ Q in fitnessSingleLocus() why is there just a single dominance ? just a shortcut, fixing it now, separate dominances are used in the main analysis.

~ plot_fit_rs() fitvis function working well
~ points coloured by R & S ?
~ generic to work on any dataframe with rownames of S&R

wed 5/10/16

emailed Justin

skyped with Ian about :
~ any news on review or grant ?
~ future : possibility of a 2:3 day split, it is a possibility
~~ Ian suggested the BBSRC grant would be better for me in publication terms, it's more of a conventional research job. Beth could apply for it. I think I would rather work separately for M&I. I said M very ambitious. Also potential for open malaria resistance modelling work funded by Gates.
~ arithmetic for 2 locus fitness effect of effectiveness & exposure. I forgot to ask about this one.
~ Matts work, how he getting on, Ian hasn't heard from him. He was off on holiday ? Might that explain not hearing since last week.
~ ASTMH booking, we'll coordinate next week

~ running package check on resistance
~ fixed things to pass R CMD check

7/10/16

~ looking at setExposure for diff insecticides
~ TRICKY BEWARE
~ allow exposure to be set separately for I1 & I2 (maybe retain ability to set a single value with 'exposure')
~~ start with setExposure(), exp1 & exp2, set to NULL by default
~~ later may also need to allow male_exposure_prop & correct_mix_deploy to be set for each insecticide
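One way the NULL defaults could work is sketched below. `resolve_exposure()` is a hypothetical helper written for illustration, not the planned change to setExposure(); it only shows how a single `exposure` value can be kept as a fallback for either insecticide.

```r
# exp1 and exp2 default to NULL; when not supplied they fall back to the
# single 'exposure' value, so existing calls keep working unchanged
resolve_exposure <- function(exposure = 0.9, exp1 = NULL, exp2 = NULL) {
  if (is.null(exp1)) exp1 <- exposure
  if (is.null(exp2)) exp2 <- exposure
  c(insecticide1 = exp1, insecticide2 = exp2)
}

resolve_exposure(exposure = 0.5)              # same exposure for both
resolve_exposure(exposure = 0.5, exp2 = 0.8)  # override insecticide 2 only
```

The default of 0.9 here is arbitrary; the real default would come from the existing function.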

10/10/16

trying to diagnose Matts problems with github pushing (2.5 hours)

We changed to ssh which changed the previous error. Generated a new ssh key, and made sure the origin was Matts repo not mine.

Now getting an error about public key not working.

    ssh_askpass: exec(/usr/X11R6/bin/ssh-askpass): No such file or directory
    Permission denied (publickey).
    fatal: Could not read from remote repository.

These instructions from here seem to suggest a solution : I emailed them to Matt.

https://support.rstudio.com/hc/en-us/community/posts/200660237-Using-Git-with-password-authentication-on-OS-X

Since OS X El Capitan you have to install ssh-askpass yourself and in the right location (indicated by the error message above).

Simple solution:

Download ssh-askpass from the link mentioned by Ian Pylvainen in his comment: https://github.com/markcarver/mac-ssh-askpass
Copy the file into a local folder you have user rights
Create a symbolic link in the place where OS X is looking for the file: sudo ln -s /ssh-askpass /usr/X11R6/bin/ssh-askpass

or just run the INSTALL script

that didn't work either. I suggested trying to write a file to the usr folder, failed.

so suggested Matt rebooted following these instructions that I had seen somewhere else too.

reboot your mac and press cmd+r when booting up. Then go into utilities > terminal and type the following commands:

    csrutil disable
    reboot

we did that & it still had a problem. Matt created the folder that didn't exist & then the command to create the symbolic link did work.

Matts work is in InteractionPlotsMosaics.R & sensiAnMosaicInteractions.R

11/10/16

looking at Matts stuff

bugs in sensiAnMosaicInteractions.R

A: sensiAnMosaicInteractions <- function( ... propwithboth=propboth )
B: correct_mix_deploy <- propboth

Hi Matt,

This is a good start. I would like to see some more detail about what is going on in the results from the simulations. The plots that you show are good but a lot could be hidden in the clouds of points. Which other inputs are influencing the magnitude of difference between sequence and mosaic ?

I'd also like to see you create a reproducible Rmd file of the analysis as detailed below to address potential problems. It's good that we now have the github sharing working again, keeping up committing what you are doing can help avoid problems.

If you have any questions, I'm happy to answer.

Andy

You created 2 files.

InteractionPlotsMosaics.R : a script to create the plots
sensiAnMosaicInteractions.R : a function helping with the creation of the plots

I suggested before that you convert InteractionPlotsMosaics.R to an Rmd (rmarkdown) file. This will allow you to put figures directly into a document and make your workflow more efficient. There are very good beginners resources here : http://rmarkdown.rstudio.com/lesson-1.html. And you can see lots of examples of how I have used it in the documents folder of the resistance package (e.g. testfigs.Rmd is a very short one I created to look at a problem I had with figure positioning on the page).

As a part of this you should aim to make each document reproducible (i.e. I or anyone else should be able to run it to repeat the analysis).

For example in your script (InteractionPlotsMosaics.R), there are a couple of lines :

Find resistance points of previously generated sequential data

resistPointsI1 <- findResistancePoints(listOutI1, locus=1)

If I try to run this script R says 'listOutI1' does not exist. I don't know how that object was generated. To make it reproducible would just require that you copy in the code that you used to generate the object.

I noticed that there are a couple of bugs in your sensiAnMosaicInteractions.R that could also cause problems with the analysis, although it may have been OK for yours. I'll try to explain them to help your understanding. It gets a bit complicated, so feel free to ask any questions if I fail.

In sensiAnMosaicInteractions.R you have :

Line 26: sensiAnMosaicInteractions <- function( nScenarios = 10, ... propwithboth=propboth)

Line 90: correct_mix_deploy <- propboth

Line 26: In the function definition any values assigned to inputs are default values that are used if a value is not supplied when the function is called. Thus nScenarios would be set to 10. If no value was supplied at the call R would try to set propwithboth to propboth. In a new R session propboth has no value and this would fail.

Line 90: correct_mix_deploy <- propboth
At line 26 you set propwithboth in the function. But here you have used propboth. R would look to see if you have a value for this in your current session and use it if you do or fail if you don't. This also contributes to the analysis not being reproducible.

Actually I notice that in your script you call sensiAnMosaicInteractions(...propboth=*) which when I run it gives : Error in sensiAnMosaicInteractions(nScenarios, insecticideUsed = "mosaic", : unused argument (propboth = 0.1)

The solution is actually very simple. Line 26 change : propwithboth=propboth TO propboth=1 This will set the default value to 1.
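The two failure modes described above can be reproduced with a small stand-in. This is only a sketch: `sensi_bad` and `sensi_good` are hypothetical names that mimic the signature at line 26, they are not the real sensiAnMosaicInteractions(), and the sketch assumes a session in which no object called `propboth` exists.

```r
# Hypothetical stand-in for the signature at "Line 26"; not the real function.
sensi_bad <- function(nScenarios = 10, propwithboth = propboth) {
  propwithboth  # the default is evaluated lazily, only when first used here
}

# Suggested fix: a literal default, and the same name used inside the body.
sensi_good <- function(nScenarios = 10, propboth = 1) {
  propboth
}

# In a session where 'propboth' does not exist the lazy default fails on use.
msg_default <- tryCatch(sensi_bad(), error = function(e) conditionMessage(e))

# Calling with a name that is not a formal argument gives 'unused argument'.
msg_unused <- tryCatch(sensi_bad(propboth = 0.1),
                       error = function(e) conditionMessage(e))

print(msg_default)
print(msg_unused)
print(sensi_good())                # falls back to the default
print(sensi_good(propboth = 0.1))  # uses the supplied value
```

With the fixed signature the call made in the script, with propboth supplied by name, matches a formal argument and the default no longer depends on the state of the session.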

start using tidyverse : install.packages("tidyverse")

because I should start using tidyr::gather instead of melt

~ working on exposure UI : expovis

12/10/16

~ why doesn't cost of resistance seem to affect single loc fitness ? It's because cost only affects fitness of RR in the 'no' exposure niche. Is this how we want it to work ???

fitnessSingleLocus(cost1 = 1, plot = TRUE)
     exposure
loci  no lo    hi
  SS1  1  1 0.500
  RS1  1  1 0.625
  RR1  0  1 0.750
  SS2  1  1 0.500
  RS2  1  1 0.625
  RR2  1  1 0.750

done ~ add ability to see fit in no insecticide to fitvis

13/10/16

~ uploaded fitvis & expovis & sent to Ian & Matt
~ add ability to set exposures separately to resistmob_mosaic

wehay! PLOS comp biol paper has been accepted (nearly)

From: em.pcompbiol.0.4e821a.369efd4a@editorialmanager.com [mailto:em.pcompbiol.0.4e821a.369efd4a@editorialmanager.com] On Behalf Of PLOS Computational Biology
Sent: 13 October 2016 14:24
To: Ian Hastings Ian.Hastings@lstmed.ac.uk
Subject: Decision on your article submitted to PLOS Computational Biology (PCOMPBIOL-D-16-01210) - [EMID:f9886403228b3245]

Dear Dr Hastings,

Thank you very much for submitting your manuscript, 'A two-locus model of the evolution of insecticide resistance to inform and optimise public health insecticide deployment strategies.', to PLOS Computational Biology. As with all papers submitted to the journal, yours was fully evaluated by the PLOS Computational Biology editorial team, and in this case, by independent peer reviewers. The reviewers appreciated the attention to an important topic and the valuable contribution of the paper. Referee 2 identified however some aspects of the manuscript that should be addressed to improve the clarity of the methods and results, and in particular the figures.

We would therefore like to ask you to modify the manuscript according to the reviewer recommendations before we can consider your manuscript for final acceptance. Your revisions should address the specific points made by reviewer 2 concerning the clarity and length of the figures and text. Regarding the figures, please consider any issues with copyrights that would require replacement or modification.

In addition, when you are ready to resubmit, please be prepared to provide the following:

(1) A detailed list of your responses to the review comments and the changes you have made in the manuscript. We require a file of this nature before your manuscript is passed back to the editors.

(2) A copy of your manuscript with the changes highlighted (encouraged). We encourage authors, if possible to show clearly where changes have been made to their manuscript e.g. by highlighting text.

(3) PLOS offers a figure-checking tool, PACE (http://pace.apexcovantage.com/), to help authors to ensure all figures meet PLOS requirements so that the quality of published figures is as high as possible. Please use this tool to help you format your figures. PACE is a digital diagnostic and conversion tool for figure files. It will provide information about any failed check(s) and, if able, will automatically convert the figure file into an acceptable file that passes quality checks. PACE requires you to register for an account to ensure your figure files are processed securely.

(4) A striking still image to accompany your article (optional). If the image is judged to be suitable by the editors, it may be featured on our website and might be chosen as the issue image for that month. These square, high-quality images should be accompanied by a short caption. Please note as well that there should be no copyright restrictions on the use of the image, so that it can be published under the Open-Access license and be subject only to appropriate attribution.

Before you resubmit your manuscript, please consult our Submission Checklist to ensure your manuscript is formatted correctly for PLOS Computational Biology: http://www.ploscompbiol.org/static/checklist.action. Some key points to remember are:

Figures uploaded separately as TIFF or EPS files (if you wish, your figures may remain in your main manuscript file in addition).
Supporting Information uploaded as separate files, titled 'Dataset', 'Figure', 'Table', 'Text', 'Protocol', 'Audio', or 'Video'.
Funding information in the 'Financial Disclosure' box in the online system.

We hope to receive your revised manuscript within the next 30 days.

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: This is a very useful paper and provides both very useful general guidance and a framework for more detailed study of one of the most important and under-researched problems in resistance management, and has done much to sort out confusion from Curtis' paper. My apologies to both the authors and editors for taking so long with it, but amid other interruptions, it took me some time to get through the complexity. On personal preferences, I would have changed emphasis on topics covered, especially mosaics, but in the end, I think the authors are correct to stop here, and explore this in future papers.

There are a number of minor typos in the paper, which is understandable in such a long and detailed paper, and the authors are likely to catch on review, but examples are line 487 (analyses or analysis?), line 551 (organism or organisms?, punctuation in lines 805 and 809, and the symbol guide in Fig.11 has 3 solid but only one dashed line.

Reviewer #2: The manuscript develops a framework to determine whether mixtures or the sequential use of insecticide is better at delaying the spread of resistance. The work is important and timely for reasons the authors outline and the manuscript is generally well written and is an interesting read. The methods are generally clear and discussion is extensive and relevant. I found some of the figures and results hard to interpret, requiring further reading of references cited. The paper is already very long and though this isn’t a problem (and is indeed necessary given the complexity of the model) I think the manuscript would benefit from improved figure legends and abstracts (as some readers may not make it through the detailed methods and results). The online tool of the model is useful.

Major points

(1) With the exception of Figure 1 all other figures need greater detail in the caption. Ideally figures should stand alone and in long papers this is especially important. .

(1.1) It took me a while to get my head round all panels in Figure 2 and similar later versions. A more detailed explanation of what the lines meant and what it shows would help the reader grasp the concepts quicker.

1.1.1. In figure 2D why does the switch to DDT happen after the dashed line has gone past 50% (on a logged figure this could be substantially after).

1.1.2. Figure 3. What does mix2 and seq on the top of the figure mean? Similar notation is seen on figure 11.

1.2. A more detailed explanation of what PRCC does would help the reader interpret the text and Figures 5 and 7. A sentence like “ values above the dashed line indicate…..” would help.

1.3. Equally with Figures 9 and 10. After reading the text a few times I got it but explanation in the caption would be helpful.

In Line 743 it clearly states that developing an insecticide in a mixture always delays the spread of resistance to this insecticide. This is an important point and might get missed in the abstract by the less informed reader. The public health penalty of sequential use (as mentioned by the authors) could also be mentioned if space allowed.
In the discussion the frailties of using a relative measure for the time till resistance (generation time) could be discussed. The danger is that people look at column 1 in figure 4, see on average (mean) approximately 50 generations to resistance and then over interpret and say resistance will take 50/12 years. A simple sentence would reduce the chance of this happening.

Minor points

Typo Line 172 – Captial L in Lines.
Typo Line 276 and onwards, plos comp boil call it Supporting Information and not Supplementary.
Line 202 – talk about retreating bednets though I am unsure of how common this is now.
Line 235 – should it be “whose fitness is denoted 1 IN THE ABSENCE OF INSECTICIDE”?
HCH and DDT need to be defined initially.
Typo Line 463 “lifespan” not “Lifespan”?
Caption of Table 3 could be more informative
The “rr restoration coefficient” could do with more explanation including why the impact is predicted to be what it is.
Typo Line 527 “and is”
Line 630 “2.5 fold” instead of “2.5x”.
Equations 7 and 8 could go in enhanced figure legends.
Typo Line 805 “data.”

A couple of points that I can address :

Rev1 : the symbol guide in Fig.11 has 3 solid but only one dashed line

A This was because the final 2 solid lines indicate which colours refer to insecticides 1 & 2. I've changed the final two lines to squares to reduce any confusion.

Rev2 : 1.1.1. In figure 2D why does the switch to DDT happen after the dashed line has gone past 50% (on a logged figure this could be substantially after).

A The switch to the second insecticide is triggered one generation after the threshold of 50% is crossed. This is because the model operates on discrete generational timesteps. Thus there is the potential for the level of resistance to 'overshoot' the threshold, particularly when the rate of increase in resistance is high.
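A rough sketch of that overshoot, for my own checking. `switch_generation` is a hypothetical helper, not a function in the resistance package, the frequency vector is made up purely for illustration, and treating 'crossed' as strictly greater than the threshold is an assumption.

```r
# Hypothetical helper: when does a sequence switch to the second insecticide ?
# 'freq' is the resistance allele frequency per generation (index = generation).
switch_generation <- function(freq, threshold = 0.5) {
  crossed <- which(freq > threshold)   # assumption : 'crossed' means strictly above
  if (length(crossed) == 0) return(NULL)  # threshold never reached
  first_over <- crossed[1]
  list(first_over = first_over,              # generation in which threshold is exceeded
       switch_at  = first_over + 1L,         # switch happens one generation later
       overshoot  = freq[first_over] - threshold)
}

# Illustrative (made-up) trajectory with a fast rise in resistance.
freq_demo <- c(0.01, 0.05, 0.20, 0.45, 0.70, 0.90)
res_demo <- switch_generation(freq_demo)
print(res_demo)
```

The faster resistance rises between generations, the larger the gap between the threshold and the frequency at which the switch is actually made.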

1.1.2. Figure 3. What does mix2 and seq on the top of the figure mean? Similar notation is seen on figure 11.

A mix2 indicates the generation at which the resistance threshold is reached for the second insecticide in a mixture, and seq indicates when the threshold is reached for the second insecticide in a sequence. We will add an explanation of this to the figure legends.

A [I could change these to just mix & seq, and actually it seems seq in Fig 3 isn't in the right place contributing to the reviewers confusion. Ian you generated Fig 3. from one of my web interfaces, if you can remind me of the input values next week I can check it.]

Can I regenerate Fig 3 ? (which I think Ian generated from one of my web UIs). I should before final submission create a slimmed doc with all the figs in the paper in it.

This is the Fig3 legend : Fig 3. Recreating example vi of Curtis as given in his Table 1. Curtis reported results for only the first generation of selection so this plot shows the subsequent, multi-generational dynamics of resistance spread assuming, as he did, that the first insecticide deployed in a sequence become non-beneficial, and is switched, when alleles encoding resistance to the insecticide reach 50%.

As a first step change the symbols in the legend for the 2 coloured lines ? plotcurtis_f2_generic()

So increasing the exposure to just one insecticide from 0.5 to 0.8 decreases both the time-to-resistance to the first insecticide and the 2nd in the mixture.

runcurtis_f2( max_gen=500,
              P_1 = 0.01, P_2 = 0.01,
              h.RS1_A0 = 0.5, h.RS2_0B = 0.5,
              exp1 = 0.8, exp2 = 0.5,
              phi.SS1_A0 = 0.5, phi.SS2_0B = 0.5,
              rr_restoration_ins1 = 0.5, rr_restoration_ins2 = 0.5,
              addCombinedStrategy = FALSE,
              correct_mix_deploy = 1,
              strategyLabels = c('seq','','adapt','mix2') )

Whereas when effectiveness is increased to 0.8 the time-to-resistance to the 2nd insecticide in the mixture is increased.

Finally try to get to the bottom of why this is ...

My fitvis UI may help.

These show the fitnesses for females under different inputs :

all 0.5

as.data.frame(fitnessGenotype(plot=TRUE))[2,]
  SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2   RS1.RS2  RR1.RS2 SS1.RR2  RS1.RR2 RR1.RR2
f   0.625 0.65625  0.6875 0.65625 0.6953125 0.734375  0.6875 0.734375 0.78125

exposure1 0.8

as.data.frame(fitnessGenotype(exp1=0.8, exp2=0.5, plot=TRUE))[2,]
  SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2   RS1.RS2  RR1.RS2 SS1.RR2  RS1.RR2 RR1.RR2
f   0.475 0.54375  0.6125 0.50625 0.5828125 0.659375  0.5375 0.621875 0.70625

effectiveness1 0.8

as.data.frame(fitnessGenotype(exp1=0.5, exp2=0.5, eff1=0.8, plot=TRUE))[2,]
  SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
f    0.55     0.6    0.65  0.5625   0.625  0.6875   0.575    0.65   0.725

Can I check out the ratios between the genotypes within them ? Just divide by the maximum to normalise.

df_ef_ <- as.data.frame(fitnessGenotype(exp1=0.5, exp2=0.5, eff1=0.8, plot=TRUE))[2,]
df_ef <- df_ef_/max(df_ef_)
df_ef
    SS1.SS2   RS1.SS2   RR1.SS2   SS1.RS2   RS1.RS2   RR1.RS2   SS1.RR2   RS1.RR2   RR1.RR2
f 0.7586207 0.8275862 0.8965517 0.7758621  0.862069 0.9482759 0.7931034 0.8965517         1

df_ex_ <- as.data.frame(fitnessGenotype(exp1=0.8, exp2=0.5, plot=TRUE))[2,]
df_ex <- df_ex_/max(df_ex_)
df_ex
    SS1.SS2   RS1.SS2   RR1.SS2   SS1.RS2   RS1.RS2   RR1.RS2   SS1.RR2   RS1.RR2   RR1.RR2
f 0.6725664 0.7699115 0.8672566 0.7168142 0.8252212 0.9336283 0.7610619  0.880531         1

df_base_ <- as.data.frame(fitnessGenotype(exp1=0.5, exp2=0.5, plot=TRUE))[2,]
df_base <- df_base_/max(df_base_)

to allow me to plot them

normalised ones

df_base <- data.frame(t(df_base),row.names=row.names(t(df_base)))
df_ef <- data.frame(t(df_ef),row.names=row.names(t(df_ef)))
df_ex <- data.frame(t(df_ex),row.names=row.names(t(df_ex)))

not normalised

df_base_ <- data.frame(t(df_base_),row.names=row.names(t(df_base_)))
df_ef_ <- data.frame(t(df_ef_),row.names=row.names(t(df_ef_)))
df_ex_ <- data.frame(t(df_ex_),row.names=row.names(t(df_ex_)))

x11()
plot_fit_rs(df_base, column='f', title='base')
x11()
plot_fit_rs(df_ef, column='f', title='effectiveness')
x11()
plot_fit_rs(df_ex, column='f', title='exposure')

Seems that in the case where effectiveness is increased the fitnesses are closer together so does that mean that there is less selection for resistance ?

14/10/16

what is happening with the 2nd insecticide ?

calc_fit_rs(df_base,'f')
      allele meanfit
R1        R1   0.923
S1        S1   0.856
R2        R2   0.923
S2        S2   0.856
R1-S1  R1-S1   0.066
R2-S2  R2-S2   0.066

calc_fit_rs(df_ef,'f')
      allele meanfit
R1        R1   0.919
S1        S1   0.804
R2        R2   0.885
S2        S2   0.839
R1-S1  R1-S1   0.114
R2-S2  R2-S2   0.045

calc_fit_rs(df_ex,'f')
      allele meanfit
R1        R1   0.897
S1        S1   0.752
R2        R2   0.862
S2        S2   0.788
R1-S1  R1-S1   0.144
R2-S2  R2-S2   0.073

So yes, increasing effectiveness I1 results in a decrease in the difference between the relative fitness of S & R to I2 (from 0.066 to 0.045), this can explain the slower selection for resistance.

In contrast increasing exposure to I1 results in an increase in the difference between the relative fitness of S & R to I2 (from 0.066 to 0.073), this can explain the faster selection for resistance.

Be a bit careful, does my normalising do anything weird ? I might be able to run calc_fit_rs() on the non normalised versions.

not normalised

calc_fit_rs(df_base_,'f')
      allele meanfit
R1        R1   0.721
S1        S1   0.669
R2        R2   0.721
S2        S2   0.669
R1-S1  R1-S1   0.052
R2-S2  R2-S2   0.052

calc_fit_rs(df_ef_,'f')
      allele meanfit
R1        R1   0.666
S1        S1   0.583
R2        R2   0.641
S2        S2   0.608
R1-S1  R1-S1   0.083
R2-S2  R2-S2   0.033

calc_fit_rs(df_ex_,'f')
      allele meanfit
R1        R1   0.633
S1        S1   0.531
R2        R2   0.608
S2        S2   0.556
R1-S1  R1-S1   0.102
R2-S2  R2-S2   0.052

So in the non normalised version the difference in fitness for S&R I2 stays the same when I1 exposure is increased but decreases when I1 effectiveness is increased.
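As a cross-check, the non normalised R and S mean fitnesses can be recomputed from the nine female genotype fitnesses printed earlier. This is a sketch under an assumption: it weights each genotype by the number of copies of the allele it carries (all genotypes otherwise equally frequent), which is not necessarily how calc_fit_rs() does it internally, although it gives the same R-S differences. `allele_mean_fit` is a hypothetical name.

```r
# Female genotype fitnesses copied from the fitnessGenotype() output above,
# in the order SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
fit_base <- c(0.625, 0.65625, 0.6875, 0.65625, 0.6953125, 0.734375, 0.6875, 0.734375, 0.78125)
fit_ex   <- c(0.475, 0.54375, 0.6125, 0.50625, 0.5828125, 0.659375, 0.5375, 0.621875, 0.70625)
fit_ef   <- c(0.55, 0.6, 0.65, 0.5625, 0.625, 0.6875, 0.575, 0.65, 0.725)

# Copies of the R allele carried at each locus by each genotype (same order).
n_r1 <- rep(c(0, 1, 2), times = 3)  # locus 1 changes fastest
n_r2 <- rep(c(0, 1, 2), each = 3)

# Assumed weighting : mean genotype fitness weighted by allele copy number.
allele_mean_fit <- function(fit) {
  wmean <- function(w) sum(w * fit) / sum(w)
  out <- c(R1 = wmean(n_r1), S1 = wmean(2 - n_r1),
           R2 = wmean(n_r2), S2 = wmean(2 - n_r2))
  c(out, "R1-S1" = out[["R1"]] - out[["S1"]], "R2-S2" = out[["R2"]] - out[["S2"]])
}

res_base <- allele_mean_fit(fit_base)
res_ex   <- allele_mean_fit(fit_ex)
res_ef   <- allele_mean_fit(fit_ef)
print(round(rbind(base = res_base, exposure = res_ex, effectiveness = res_ef), 4))
```

Under this weighting the R2-S2 difference is identical for the base and increased-exposure cases and smaller for the increased-effectiveness case, in line with the tables above.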

In text form :

In a mixture, increasing the exposure to one insecticide has very little effect on the selection pressure for resistance to the other. In contrast increasing the effectiveness of one insecticide in a mixture decreases the selection pressure for resistance to the other.

Can I recreate this result by arithmetic ?

selection coefficient = resistance restoration * effectiveness

Locus fitness :
SS = 1-effectiveness
RS = 1-effectiveness + (dominance*selection coeff.)
RR = 1-effectiveness + selection coeff.

Allele fitness :
SS1SS2 = SS * SS
SS1RS2 = SS * RS
. . . and for other combinations
RR1RR2 = RR * RR

Population fitness = Allele fitness * exposure to insecticide

Locus fitness using rr :
SS = 1-effectiveness
RS = 1-effectiveness + (dominance*resistance restoration*effectiveness)
RR = 1-effectiveness + (resistance restoration*effectiveness)

f1, f2 : effectiveness
x1, x2 : exposure
r1, r2 : resistance restoration
d1, d2 : dominance
fitSS : fitness

fitSS1 = 1-f1
fitRS1 = 1-f1 + (d1*r1*f1)
fitRR1 = 1-f1 + (r1*f1)
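The arithmetic can be pushed through all nine genotypes in a few lines. This is a sketch with hypothetical names (`locus_fit`, `genotype_fit`), not the package code. It assumes females, no resistance cost, and that a mosquito is either unexposed (fitness 1), exposed to one insecticide only, or exposed to both, with the share exposed to both taken as min(x1, x2). That niche split is an assumption inferred from the fitnessGenotype() values printed earlier, not read from the source.

```r
# Single locus fitness in the presence of its insecticide.
locus_fit <- function(eff, rr, dom) {
  c(SS = 1 - eff,
    RS = 1 - eff + dom * rr * eff,
    RR = 1 - eff + rr * eff)
}

# Two locus genotype fitness averaged over exposure niches (assumed split).
genotype_fit <- function(exp1 = 0.5, exp2 = 0.5, eff1 = 0.5, eff2 = 0.5,
                         rr1 = 0.5, rr2 = 0.5, dom1 = 0.5, dom2 = 0.5) {
  w1 <- locus_fit(eff1, rr1, dom1)
  w2 <- locus_fit(eff2, rr2, dom2)
  ones <- rep(1, 3)
  x_both  <- min(exp1, exp2)       # exposed to both insecticides
  x_only1 <- max(exp1 - exp2, 0)   # exposed to insecticide 1 only
  x_only2 <- max(exp2 - exp1, 0)   # exposed to insecticide 2 only
  x_none  <- 1 - max(exp1, exp2)   # unexposed, fitness 1
  fit <- x_none +
    x_only1 * outer(w1, ones) +
    x_only2 * outer(ones, w2) +
    x_both  * outer(w1, w2)        # fitness multiplied across loci
  dimnames(fit) <- list(c("SS1", "RS1", "RR1"), c("SS2", "RS2", "RR2"))
  fit
}

fit_base <- genotype_fit()             # all inputs 0.5
fit_ex   <- genotype_fit(exp1 = 0.8)   # exposure to I1 raised
fit_ef   <- genotype_fit(eff1 = 0.8)   # effectiveness of I1 raised
print(fit_base)
print(fit_ex)
print(fit_ef)
```

Rows are the locus 1 genotype and columns the locus 2 genotype, so fit_base["RS1", "SS2"] corresponds to RS1.SS2 in the fitnessGenotype() output.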

fitSS1

eek is this really going to get me anywhere ? I would need to repeat for all of RS1RS2 etc. then I might want to get back to R & S like I've done in my version.

perhaps ask Ian or Matt to look at the arithmetic

17/10/16

~ working on effectiveness_exposure_difference.Rmd ~ nice lining up of fitness plots

~ will need to produce high res figs for PLOS, see point (3) of the decision email above about the PACE figure-checking tool (http://pace.apexcovantage.com/)

Figures uploaded separately as TIFF or EPS files (if you wish, your figures may remain in your main manuscript file in addition).

http://journals.plos.org/ploscompbiol/s/figures

~ TIFF or EPS
~ Width: 789 – 2250 pixels (at 300 dpi). Height maximum: 2625 pixels (at 300 dpi).
~ Resolution 300 – 600 dpi
~ File Size <10 MB
~ Text Within Figures : Arial, Times, or Symbol font only in 8-12 point
~ Fig1.tif, Fig2.eps, and so on. Match file name to caption label and citation.
~ Captions : In the manuscript, not in the figure file.

Tiff is recommended as being easier. From the web I got this :

library(knitr)
opts_chunk$set(dev="tiff", dev.args=list(compression="lzw"), dpi=300)

Global Options To set global options that apply to every chunk in your file, call knitr::opts_chunk$set in a code chunk. Knitr will treat each option that you pass to knitr::opts_chunk$set as a global default that can be overwritten in individual chunk headers

Pandoc problems starting to get some tiffs

I put the figures through Pace and they passed although some of them don't look very good.

trying to work out what the fonts are in ggplot2. It's apparently complicated and depends on your system. Seems it will be one of the ones indicated by this :

windowsFonts()
$serif
[1] "TT Times New Roman"

$sans
[1] "TT Arial"

$mono
[1] "TT Courier New"

18/10/16

Place all panels from a multipart figure into a single page and file. If you have a multipart figure spanning multiple files:

Combine multiple panels into one page, or break them apart into separate figures. Re-number all figures and in-text citations accordingly.

To create a multipanel figure from individual files, use a presentation program such as Microsoft PowerPoint. Then convert to TIFF.

To set up the page, use the values listed in Dimensions. Use an Insert tool to place figures. Do not drag/drop or copy/paste images into the file. If your figures have numerous pictures, charts, or small text, they will render best at a resolution of 600 dpi.

failed attempt to create Ians fig1 from the main model. not quite sure why. Not very useful anyway. try to create from v.simple scratch instead.

a <- setExposure(exposure=1, insecticideUsed='insecticide1')

effectiveness1 <- 1
rr_restoration_ins1 <- 1

i1 <- setInputOneScenario( max_gen = 500, h.RS1_A0 = 0, h.RS2_0B = 0, a = a, phi.SS1_A0 = effectiveness1, phi.SS2_0B = 0.5, rr_restoration_ins1 = rr_restoration_ins1, rr_restoration_ins2 = 0.5 )

i2 <- setInputOneScenario( max_gen = 500, h.RS1_A0 = 0.5, h.RS2_0B = 0, a = a, phi.SS1_A0 = effectiveness1, phi.SS2_0B = 0.5, rr_restoration_ins1 = rr_restoration_ins1, rr_restoration_ins2 = 0.5 )

i3 <- setInputOneScenario( max_gen = 500, h.RS1_A0 = 1, h.RS2_0B = 0, a = a, phi.SS1_A0 = effectiveness1, phi.SS2_0B = 0.5, rr_restoration_ins1 = rr_restoration_ins1, rr_restoration_ins2 = 0.5 )

#input <- cbind(input, inputOneScenario)
input <- cbind(i1,i2,i3)

listOut <- runModel2( input )

df_resist <- get_resistance(locus=1, listOut)

print( ggplot(df_resist, aes(x=generation, y=resistance, colour=factor(h.RS1_A0))) + theme_bw() +
theme(legend.position = "bottom", legend.key = element_blank()) + guides(colour = guide_legend(reverse=TRUE)) +
labs(colour = "dominance") + #coord_trans(y = "log10") +
geom_line()
)

created this function to help generate fig1 in paper1 : selection_simple()

these are the dimension requirements : ~ Width: 789 – 2250 pixels (at 300 dpi). Height maximum: 2625 pixels (at 300 dpi).

Right click on the tiffs can get dimensions. (but presumably these are checked by the online tool ??) Alt-enter within windows photo viewer

4.55 start trying to create fig 1B, see if I can do in 30 mins.

19/10/16 Bayer purchase order came through 13,650 EUR, hurrah!

These were my travel quotes : So cheaper to register at ASTMH before tomorrow !

Labour. 30 days @ EUR 360 per day : EUR 10,800
ASTMH Registration (before Oct 20th) at member rate : $ 605 = EUR 540
ASTMH membership : $ 230 = EUR 205
ASTMH Accommodation (Sun-Weds) : $ 1,250 = EUR 1,116
ASTMH Travel : £ 850 = EUR 989

http://www.astmh.org/about-astmh/about-us/join-astmh

Fig dimensions in inches : Minimum width 2.63
Maximum width 7.5 or 5.2 to fit in text column. Maximum height 8.75

At the height maximum, the figure occupies the whole page and excludes the caption

“Dimensions” refers to the dimensions of the entire figure, excluding any white space. The closer figures match these dimensions, the closer they will meet expectations on publication.

TIPS

To align your figure with the text column of the PDF version of the article, make it no wider than 5.2 inches.
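A small helper to check a planned figure size against the PLOS limits noted above before exporting. This is a sketch: `check_plos_dims` is a hypothetical name, and the limits are simply the inch and dpi values copied from these notes (2.63 to 7.5 in wide, at most 8.75 in high, 300 to 600 dpi).

```r
# Hypothetical helper : convert inches to pixels and check against the PLOS
# limits copied from the notes above.
check_plos_dims <- function(width_in, height_in, dpi = 300) {
  ok <- width_in >= 2.63 && width_in <= 7.5 &&
        height_in <= 8.75 &&
        dpi >= 300 && dpi <= 600
  list(width_px  = round(width_in * dpi),
       height_px = round(height_in * dpi),
       ok        = ok)
}

# Text-column width figure at 300 dpi.
dims_col <- check_plos_dims(width_in = 5.2, height_in = 4)
print(dims_col)

# Too wide for a PLOS page.
dims_wide <- check_plos_dims(width_in = 8, height_in = 4)
print(dims_wide$ok)

# If the size passes, the same values can go straight into the base device, e.g.
# tiff("Fig1.tif", width = 5.2, height = 4, units = "in", res = 300, compression = "lzw")
```

The pixel values at 300 dpi line up with the limits quoted in pixels (2.63 in is 789 px, 7.5 in is 2250 px, 8.75 in is 2625 px).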

20/10/16

working on responses to editors for paper1

Note from CUP website : http://www.cambridge.org/rights/permissions/permission.htm

Cambridge University Press grants permission freely for the reproduction in another work of a short prose extract (less than 400 words), a single figure or a single table in which it holds rights (see the important caveat in the Notes below). In such cases a request for permission need not be submitted, but the reproduced material must be accompanied by a full citation of the original source.

In the case of some journals published by Cambridge on behalf of a learned society, permission may need to be obtained directly from the society. Check the cover of the journal before proceeding.

ASTMH booked hotel

21/10/16

I skyped Ian & showed him the effectiveness, exposure genotype fitness comparison. He said he liked it.

He asked whether this was at linkage equilibrium - I didn't quite know how to answer. I talked about how this was at the start of the simulation before any evolution.

I should look into. Will it change when the frequency of resistance changes ? I think not. I think that this is the selection pressure that is exerted every generation.

So is it independent of linkage ?

Also I talked Ian through my calculation of R-S mean fitness for each locus. He didn't say much about it (at least not much negative!). Before doing this I realised it's probably similar to what goes on in the selection() function.

Ian said that he had spoken to IVCC about me developing a 'plain english' resistance modelling paper and they liked the idea.

24/10/16 train to liverpool

reading through paper1 intro & discussion to work out what I can put in paper2

starting to work on paper2 grid figs for exposure & effectiveness difference. Getting there looking good. This will radically reduce num figs in paper2 and make it much easier for the reader to see what is going on (and for me to explain in the legend).

should I stick with 2x2 plots or could I go up to 3x3 ?

2x2 might make it easier to understand for the reader.

grid1
           eff1
           0.5  0.8
exp  0.5    1    2
     0.8    3    4

potential 3x3 eff1 eff2

~ plot of mean popn fitness over time
~~ we have
~~ fitness by genotype : listOut$fitness
~~ genotype by generation : listOut$genotype

BUT the values seem not as I'd expect (all 0s & 1s) for

input <- setInputOneScenario(max_gen=5)
listOut <- runModel2(input)

but Matt suggested the fitness needs to come from fitness_indiv as in fitnessPrint()

      SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2   RS1.RS2  RR1.RS2 SS1.RR2  RS1.RR2 RR1.RR2
ind_m   0.625 0.65625  0.6875 0.65625 0.6953125 0.734375  0.6875 0.734375 0.78125

which does look better ...

listOut$fitness[[1]][,'A,B']
SS1SS2 SS1RS2 SS1RR2 RS1SS2 RS1RS2 RS1RR2 RR1SS2 RR1RS2 RR1RR2
     0      0      0      0      1      1      0      1      1

there was a bug here before 24/10/16

listOut$genotype[[1]]
     gen       SS1SS2       SS1RS2       SS1RR2       RS1SS2   RS1RS2_cis RS1RS2_trans       RS1RR2
[1,]   1 9.960060e-01 1.994006e-03 9.980010e-07 1.994006e-03 1.996002e-06 1.996002e-06 1.998000e-09
[2,]   2 9.959523e-01 2.011825e-03 1.015973e-06 2.011825e-03 1.995876e-05 2.031946e-06 2.015837e-08
[3,]   3 9.956472e-01 2.118965e-03 1.127410e-06 2.118965e-03 1.100964e-04 2.254820e-06 1.171552e-07
[4,]   4 9.940823e-01 2.673485e-03 1.797517e-06 2.673485e-03 5.619192e-04 3.595034e-06 7.556126e-07

a_fitgen <- fitnessGenotype()
df_indiv <- as.data.frame(a_fitgen)

25/10/16 with Ian & Matt in Liverpool

Ian & Matt explained that if two different AIs on different bednets that is technically a Mosaic.

Mixture (where the molecules are next to each other), individuals are always exposed to both. Mosaic, most are exposed to just one but some are exposed to both (through the correct deployment parameter).

Ian wasn't that keen on the exposure vis and didn't quite get the need to be able to set the exposures differently.

input <- setInputOneScenario(max_gen=200)
listOut <- runModel2(input)
layout(matrix(c(1:2),2,1))
plot(listOut$fit_time_genotype[,'f','variance'])

seems to be going up over time and not decreasing as Ian & Matt expected ...

can I plot frequency of RR to check ?

plot(listOut$fit_time_genotype[,'f','RR1RR2'])

tst <- as.data.frame(listOut$fit_time_genotype[,'f',])

Ian worked out how the calculation of the variance should be done.

for each sex :

mean fitness = mean( fitness per genotype * genotype frequency )

sum( (fitness of each genotype - mean fitness)^2 ) #squared
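A sketch of that calculation for one sex. Assumptions to flag: `fitness_mean_var` is a hypothetical name, the mean is taken as the frequency-weighted sum, and the squared deviations are also weighted by genotype frequency (the usual population variance), which goes slightly beyond what is written above. The example frequencies are Hardy-Weinberg at both loci with resistance allele frequency 0.001 and no linkage disequilibrium, which is what generation 1 of listOut$genotype looks like.

```r
# Hypothetical helper : frequency-weighted mean and variance of fitness.
fitness_mean_var <- function(fitness, freq) {
  stopifnot(length(fitness) == length(freq))
  freq <- freq / sum(freq)                        # make sure frequencies sum to 1
  mean_fit <- sum(fitness * freq)                 # weighted mean fitness
  var_fit  <- sum(freq * (fitness - mean_fit)^2)  # weighted squared deviations
  c(mean = mean_fit, variance = var_fit)
}

# Female genotype fitnesses from the default fitnessGenotype() output earlier,
# order SS1.SS2 RS1.SS2 RR1.SS2 SS1.RS2 RS1.RS2 RR1.RS2 SS1.RR2 RS1.RR2 RR1.RR2
fit_f <- c(0.625, 0.65625, 0.6875, 0.65625, 0.6953125, 0.734375,
           0.6875, 0.734375, 0.78125)

# Illustrative starting frequencies : independent loci, allele frequency 0.001.
p  <- 0.001
hw <- c(SS = (1 - p)^2, RS = 2 * p * (1 - p), RR = p^2)
freq_start <- as.vector(outer(hw, hw))  # locus 1 changes fastest, as in fit_f

res_start <- fitness_mean_var(fit_f, freq_start)
print(res_start)
```

With resistance this rare almost every individual is SS1SS2, so the variance starts close to zero and can only grow while the resistant genotypes become more common.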

26/10/16 with Ian & Matt in Liverpool

Matt said :
~ he will start committing his changes to the Matt folder now that he is starting to work with Rmd files
~ he will produce an Rmd of the mosaic work, he is just working out how best to summarise because he has lots of simulations to compare. I said try to produce a summary at the start & then fine to have more details afterwards.

~ has my removing calibration from runModel2() messed something up in pap1 plotting ? i.e. I'm getting LD etc. plots that I wasn't getting before ... But I thought that it shouldn't make any difference ... check github for previous version and may need to revert

calibration <- 1011
input <- createInputMatrix( params.csv=FALSE, calibration=calibration )
listOut <- runModel2( input, calibration )

is it that calibration is saved in input as 1011.00 so maybe that doesn't work in runModel2() ? however this suggests not a problem :

1011 == 1011.00
[1] TRUE

Ah yes it was because calibration was put into the produce.plots arg !
listOut <- runModel2( input, calibration )
solution just change to :
listOut <- runModel2( input=input )
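A reminder to self of how this happens, as a sketch. `run_demo` is a hypothetical stand-in with the same argument order as described above (input first, produce.plots second), not the real runModel2().

```r
# Hypothetical stand-in : second formal argument is now produce.plots.
run_demo <- function(input, produce.plots = FALSE) {
  list(n_scenarios = NCOL(input),
       plots = isTRUE(as.logical(produce.plots)))  # any non-zero number counts as TRUE
}

calibration <- 1011
input_demo <- matrix(0, nrow = 2, ncol = 3)

# Old style call : calibration is matched by position to produce.plots.
res_bad <- run_demo(input_demo, calibration)

# Fixed call : name the argument and drop calibration.
res_good <- run_demo(input = input_demo)

print(res_bad$plots)   # plots switched on by accident
print(res_good$plots)  # plots stay off
```

Naming arguments in calls makes this kind of silent mismatch much less likely when a function signature changes.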

27/10/16

~ emailed Ian & Matt, asked Matt if he had committed the code to do the rotation plots

~ updated pap1 figs e.g. fignames ~ emailed Ian & Beth about email from PLOS, suggesting that we go without the Curtis copies.

~ have a quick look at creating Linkage disequilibrium plots like the ones on p122 of GPIRM ~~ see yellow notebook notes from 26/10/16

#' input <- setInputOneScenario()
#' listOut <- runModel2(input)
#' plot_ld_gpirm(genotype=listOut$genotype[[1]], gen_num=5)
#' plot_ld_gpirm(genotype=listOut$genotype[[1]], gen_num=6)
#' plot_ld_gpirm(genotype=listOut$genotype[[1]], gen_num=7)
#' plot_ld_gpirm(genotype=listOut$genotype[[1]], gen_num=8)
#' plot_ld_gpirm(genotype=listOut$genotype[[1]], gen_num=9)

Seems that I can do something similar. But is it really showing what we want ??

This version doesn't include dominance. Come back to this after back from SA when I have more time.

Fig changes paper1 :

done ~ Fig 1 add A) & B)
done ~ Fig 2 add A) & B) potentially add scans of Curtis figs
done ~ Fig 3. got input values from Ian, removed mix2 & seq labels because on top of each other
done ~ Fig 6 add A) ,B), C)
done ~ Fig 8.1 tree 1. Make text less bold. Turn ovals into rectangles.
done ~ Fig 11 do I want to keep mix1,2 & seq labels ? I think so.

9/11/16 train back from heathrow after SA

fitness variance : This is where I got to in Liverpool a few weeks back : results stored in listOut$fit_time_genotype

called from runModel2() : listOut$fit_time_genotype <- fit_time_genotype(genotype, a_fitgen)

input <- setInputOneScenario(max_gen=200)
listOut <- runModel2(input)
layout(matrix(c(1:2),2,1))
plot(listOut$fit_time_genotype[,'f','variance'])
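For reference, a sketch of what a fitness variance for one generation could look like, assuming it is the frequency-weighted variance of genotype fitness. fitness_variance() and the values below are hypothetical; this is not necessarily how fit_time_genotype() does it.

```r
# Hypothetical helper: frequency-weighted mean and variance of fitness for one generation.
fitness_variance <- function(freq, fitness) {
  stopifnot(length(freq) == length(fitness), abs(sum(freq) - 1) < 1e-8)
  mean_fit <- sum(freq * fitness)
  c(mean = mean_fit, variance = sum(freq * (fitness - mean_fit)^2))
}

# Illustrative values only (three genotypes at one locus), not model output.
freq    <- c(SS = 0.81, RS = 0.18, RR = 0.01)
fitness <- c(SS = 0.2,  RS = 0.5,  RR = 1.0)
fitness_variance(freq, fitness)
```

The variance is zero when all genotypes present have the same fitness, and largest when frequencies are spread across genotypes with very different fitness.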

10/11/16 Bayer working on slides for ASTMH

Interesting comment from Justin on Ians ppt the decision tree slide I think this slide could be interesting to spend some time on. For instance it might be worth mentioning – what is the theoretical exposure level of IRS vs LLINs? With universal coverage (and usage) of LLINs is exposure assumed to be high? IRS tends to be more limited in its application (only about 5% of people at risk protected by IRS) – does that mean an assumption of low exposure or are the geographies of discrete mosquito populations so well understood that IRS coverage in one or two districts can be assumed to be high exposure to ‘discrete’ mosquito populations. Is there any such thing as a discrete population anyway?

What does it mean if there is LLIN use and IRS deployment ? I think that question might come.

11/11/16 sent Ian & Matt first draft of my presentation

modifying fig2 again for Ian. He wants in the LD plot
#f1png <- readPNG(system.file("documents","pap1_curtis1.png", package="resistance"))
#both the ld and the rfreq plots (ld on its own means little with the way axes are labelled)
f1png <- readPNG(system.file("documents","pap1_curtis1_both.png", package="resistance"))

recheck figures through PACE : http://pace.apexcovantage.com/ couldn't remember passwd : w77

all passed

Can I upload them for Ian ? Thank you for registering for the PLOS Editorial Manager online submission and review system for PLOS Computational Biology.

see email from July 27 Here is your username and confidential password, which you will need to access the site at http://pcompbiol.edmgr.com/.

I logged in and it didn't seem to have any record of resubmission needed.

14/11/16

problem with fitvis : Warning: Error in : 'as_tibble' is not an exported object from 'namespace:tibble' Stack trace (innermost first):

online it said this should work : install.packages("tibble")

I think stopping return of an object somewhere fixed it.

~ before ASTMH improve my understanding of LD. E.g. looking at GPIRM figure. maybe see Charlesworth 2012 p 422. Ian may have said that LD can lead to higher than expected frequencies of RR & SS which may lead to less variance in fitness because most of the gametes are in the middle. [I may have got this wrong].
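To pin down the definition while reading up: a minimal sketch of the standard two-locus linkage disequilibrium coefficient D, computed from gamete frequencies. linkage_disequilibrium() is a hypothetical helper and the gamete ordering is an assumption for illustration, not taken from the package.

```r
# Gamete frequencies for two loci, in the order R1R2, R1S2, S1R2, S1S2 (must sum to 1).
linkage_disequilibrium <- function(x) {
  stopifnot(length(x) == 4, abs(sum(x) - 1) < 1e-8)
  p1 <- x[[1]] + x[[2]]   # frequency of R at locus 1
  p2 <- x[[1]] + x[[3]]   # frequency of R at locus 2
  x[[1]] - p1 * p2        # D: excess of R1R2 gametes over the independent expectation
}

linkage_disequilibrium(c(0.25, 0.25, 0.25, 0.25))  # 0, loci independent
linkage_disequilibrium(c(0.5, 0, 0, 0.5))          # 0.25, the maximum possible
```

Positive D means the two resistance alleles travel together in gametes more often than expected from their separate frequencies.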

ASTMH

Disease outbreak library for disease categorisation in social media posts. This could be useful for Thibaut.
For CDC Emergency communication group.
University of Georgia : Zion Tsz Ho Tse, College of Engineering, The University of Georgia.
He said that they are still testing in beta, but they hope to make it open-source in future. Also said he was happy to discuss offline.

Tatiana Vorovchenko Nuffield, Oxford Media consultant for WHO

15/11/16

my talk at ASTMH went well, people looked like it was quite hard work for them but when I asked them they said they got it. I had been slightly concerned that I had made things too easy, I don't think I did. For me it showed that there are still opportunities to improve the communication of how these models work.

Next time I could prepare explicit input combinations to show in the UI. Otherwise I get blinded in the headlights and don't know which to show. e.g. there is interest in insecticides with existing resistance.

One question that I didn't answer very well. I think I passed it on to Ian and I'm not sure he answered it too well either. We show that higher exposure and effectiveness lead to more rapid evolution of resistance. How does this fit with the conventional wisdom that poor spraying leads to more rapid evolution of resistance ?

I asked Ian afterwards and he referred to our Figure 1. I'm still not quite sure I get it.

Our figure 1 shows how dominance of the resistance allele can increase over time as the insecticide degrades and presumably there could be a similar effect due to poor application.

So I think Ians suggestion is that poor application leads to higher dominance of resistance which in turn leads to faster increase in resistance.

But we don't look at it this way in the model ... So if it is this important perhaps we could change how we present dominance ?
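A sketch to make the dominance argument concrete, using the standard definition h = (w_RS - w_SS) / (w_RR - w_SS). The log-logistic survival curve, the LC50 values and the concentrations are all hypothetical, chosen only to show the direction of the effect; none of this is from our model.

```r
# Hypothetical log-logistic survival curve; lc50 is the concentration killing half.
survival <- function(conc, lc50, slope = 4) 1 / (1 + (conc / lc50)^slope)

# Dominance of resistance: 0 = heterozygote survives like SS, 1 = like RR.
dominance <- function(conc, lc50 = c(SS = 1, RS = 3, RR = 10)) {
  w <- survival(conc, lc50)
  unname((w[["RS"]] - w[["SS"]]) / (w[["RR"]] - w[["SS"]]))
}

h_fresh    <- dominance(conc = 10)  # freshly applied, high concentration
h_degraded <- dominance(conc = 2)   # degraded or poorly applied deposit
c(fresh = h_fresh, degraded = h_degraded)
```

With these illustrative curves the heterozygote behaves almost like SS at the high concentration and almost like RR at the low one, which is the sense in which decay or poor application could raise the dominance of resistance.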

16/11/16 systematic review of 21 vector control tools (VCTs) Yasmin Wiliams, UCSF (Gerry Killeen on author list)

WHO recommendation requires phase 3 community trials

Insecticide resistance session, Tessa Knox

Five country study starting in 2009. Little previously known about relationship between resistance and malaria outcomes. Known that plenty of resistance but what are the epidemiological outcomes.

Benin, Cameroon, Kenya, India,

One interesting result (final photo) that pyrethroid resistance increased more slowly in an area with another effective IRS AI, than in an area where nets were used alone (supports our results). Or does it : The development of pyr resistance was slower in areas with pyr IRS than in an area with LLINs only. That doesn't quite make sense to me

Nets continue to provide protection despite resistance.

17/11/16

Hilary Ranson
Global decline of malaria
But locally some less +ve
Burkina Faso since 2006 steady increase in malaria cases

Prevailing view is bednets still work despite WHO 5 countries study : No relationship between pyrethroid resistance and malaria cases Nets still give personal protection where there is resistance

So : A) is resistance not a problem ? or B) is it a time bomb waiting to explode, and we'll get much lower control soon

22/11/16 back from ASTMH

~ skyped Ian, worked on new fig 1b

23/11/16

~ completed copyright request form & sent to Ian with request to send to Cambridge
~ accepting Matt's pull requests :

https://help.github.com/articles/merging-a-pull-request/

When someone sends you a pull request from a fork or branch of your repository, you may want to merge it locally to resolve a merge conflict or to test and verify the changes on your local computer before merging on GitHub. For more information, see https://help.github.com/articles/checking-out-pull-requests-locally/

seems like I only have the option to review rather than merge the pull request

https://help.github.com/articles/about-pull-request-reviews/

Yes after I accepted in review, then I got option to merge.

I emailed Matt to ask him to fix the example.

Ian asked me to check the exact numbers for the caption to fig 11. I did this by putting a breakpoint towards the end of plotcurtis_f2_generic() and using the Rstudio viewer to look at combmat, amat & bmat to find when f.R1 or 2 exceeds 0.5.
Make these changes (from -> to) :
98 -> 100
51 -> 53
71 -> 75
53+75 =128 100
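Rather than reading the numbers off in the viewer, the first generation over the threshold could be found directly. A minimal sketch, assuming the resistance frequency is available as a plain numeric vector indexed by generation; first_gen_above() and the example series are hypothetical.

```r
# Hypothetical helper: first generation at which a frequency series passes a threshold.
first_gen_above <- function(freq, threshold = 0.5) {
  hit <- which(freq > threshold)
  if (length(hit) == 0) NA_integer_ else hit[1]
}

# Illustrative logistic rise over 150 generations, not model output.
gens <- 1:150
f_R1 <- 1 / (1 + exp(-(gens - 100) / 5))
first_gen_above(f_R1)
```

It returns NA when the threshold is never passed, so a scenario where resistance does not take off is easy to spot.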

~ emailed copyright form to publisher

25/11/16

~ looking at question about variance calc from Matt

fitness variance : This is where I got to in Liverpool a few weeks back : results stored in listOut$fit_time_genotype

called from runModel2() : listOut$fit_time_genotype <- fit_time_genotype(genotype, a_fitgen)

input <- setInputOneScenario(max_gen=200)
listOut <- runModel2(input)
layout(matrix(c(1:2),2,1))
plot(listOut$fit_time_genotype[,'f','variance'])

Good skype chat with Ian

~ I suggested Matt create a doc for the initial variance results, if they change it will be easy for him to rerun
~ good to work on paper2 for Malaria journal over next few weeks
~ claim all the time on my contract because Ian thinks he won't be able to claim more after year end on ISSF (although I think my money was coming out of Gates)
~ meet with Ian on Monday after work after his meeting with IVCC, he will be telling them that they risk losing me

28/11/16 on train to liverpool to meet Ian in eve (then for zikaGIS)

~ added optional resistance plot to plot_fit_variance(), plots don't line up exactly due to different lengths of y-axis labels & the point legends, but pretty close.

1/12/16
~ catchup call with Justin
~ I am ~ half way through writing a simpler paper for the Malaria journal about the modelling work.
~ development of the 'public health' paper could proceed in tandem, and ideally could refer to the malaria journal paper
~ I think we should explore the 'poor application promotes resistance' angle. If it does that exclusively through dominance we need to show it.

~ applying for LSTM gaming job

9/12/16

~ trying to make some progress on paper2

fighting with fig legends

pty = 's' in the fail version even though I tried to set it to 'm' ??

seems that setting pty='m' in the par statement before plot fixed it

par(mar=c(0, 0, 0, 0), pty='m') #b,l,t,r default c(5, 4, 4, 2)
plot( 0, type="n", axes=FALSE, ann=FALSE, pty='m')

done ~ finalise 4 figure format in effectiveness_exposure_difference.Rmd
done ~ copy figure across

I've now done this (panel numbers) :
           eff1 0.5   eff1 0.8
exp 0.5    1          2
exp 0.8    3          4
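The panel numbering could be generated rather than typed by hand. A small sketch using expand.grid(), whose first factor varies fastest, so the row order matches panels 1 to 4; the column names here are illustrative, not the model's input names.

```r
# Rows come out as (eff1 0.5, exp 0.5), (eff1 0.8, exp 0.5), (eff1 0.5, exp 0.8), (eff1 0.8, exp 0.8).
panels <- expand.grid(eff1 = c(0.5, 0.8), exposure = c(0.5, 0.8))
panels$panel <- seq_len(nrow(panels))
panels
```

The same data frame could then be looped over to set the inputs for each panel, which would make other 4 panel figures a matter of changing the two value vectors.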

What other 4 panel figs do I want ?

Malaria journal research article submission guidelines :

https://malariajournal.biomedcentral.com/submission-guidelines/preparing-your-manuscript/research-article https://malariajournal.biomedcentral.com/submission-guidelines/preparing-your-manuscript

Abstract

~ not exceed 350 words
~ must include separate sections:
Background: the context and purpose of the study
Results: the main findings
Conclusions: a brief summary and potential implications

References

All references, including URLs, must be numbered consecutively, in square brackets, in the order in which they are cited in the text, followed by any in tables or legends. The reference numbers must be finalized and the reference list fully formatted before submission.

BioMed Central reference style : Article within a journal Smith JJ. The world of science. Am J Sci. 1999;36:234-5.

Article within a journal (no page numbers) Rohrmann S, Overvad K, Bueno-de-Mesquita HB, Jakobsen MU, Egeberg R, Tjønneland A, et al. Meat consumption and mortality - results from the European Prospective Investigation into Cancer and Nutrition. BMC Med. 2013;11:63.

Article within a journal by DOI Slifka MK, Whitton JL. Clinical implications of dysregulated cytokine production. Dig J Mol Med. 2000; doi:10.1007/s801090000086.

how to deal with references in RMD, need a .bib file

http://stackoverflow.com/questions/32946203/including-bibliography-in-rmarkdown-document-with-use-of-the-knitcitations

Simple advice for using rmarkdown with mendeley. https://rosannavanhespenresearch.wordpress.com/2016/02/17/writing-your-thesis-with-rmarkdown-2-making-a-chapter/

12/12/16

paper2
~ reading '10 simple rules for structuring papers' by Kording
~ the title is the most important bit

Thinking of some good potential titles :

More effective insecticides predicted to speed evolution of resistance when used alone but can delay it when used in a mixture.

Insecticide effectiveness predicted to be most important factor determining whether insecticide mixtures can slow evolution of resistance.

Speed of evolution of Insecticide resistance ...

But also I don't want to spend too long delaying. Might be good to try to get a more straightforward general paper out quickly and have paper3 be the really killer one ?

~ rewriting abstract

GPIRM notes

p36 Most cases of resistance in the field are attributable to a few genes of major effect. Therefore, the spread of resistance throughout mosquito populations requires understanding of the evolution of those genes. A resistance gene starts as a rare gene, but, with further exposure to the same insecticide, the frequency of the gene increases until it becomes common in a population (Figure 12). Other factors being equal, resistance is likely to evolve more quickly if it is functionally dominant in field exposures.

p38 about fitness costs Example in Colombia: resistance genes disappeared from the vector population when selection pressure was removed. A report submitted by WHO AMRO regional entomologists showed that, in 2005–2006, resistance to pyrethroids and DDT was identified in An. darlingi. A decision was quickly made to change to fenitrothion, an organophosphate with a different mode of action, for IRS. Rapid implementation of this alternative, which thereby removed the selection pressure, reduced the frequency of resistance. In 2010, susceptibility tests showed that the frequency of resistance genes in the vector population had dropped below the level of detection, and pyrethroids were once again introduced into the IRS programme, albeit on a more limited scale.

p45 IRM strategies can have different effects on resistant vector populations.

Reduce the proportion of resistance (or delay the emergence of resistance) by removing selection pressure. This strategy is based on the assumption that owing to the ‘fitness cost’ (see section 1.2), resistance genes will recede from a vector population if selection pressure is removed. The strategy involves reducing the selection pressure, for example by rotations of different classes of insecticide and mosaic applications (the spatial reduction of use). These strategies aim to encourage or preserve susceptibility.

Continue to kill resistant vectors. This strategy is based on the assumption that if vectors are exposed simultaneously to multiple insecticides and are not killed by the insecticide to which they are resistant, they will be killed by the alternative insecticide. Currently, combination strategies use this approach, as will mixtures once they become available. This strategy aims to manage resistance by killing or reducing the proportion of resistance carriers by the simultaneous or near simultaneous use of alternative classes of insecticides. See Annex 9 for more detailed descriptions of each IRM strategy, including considerations for implementation and associated costs.

p49 The long-term goal of the malaria community is to maintain the effectiveness of malaria vector control. The susceptibility of malaria vectors to the insecticides used in vector control is a global public health good which must be preserved and which is essential for reaching the targets for reducing the malaria burden.1 It is our collective responsibility to act immediately in a coordinated manner against insecticide resistance in order to maximize the effective lifespan of current and future malaria control tools.

Near-term objective. Given the limited number of alternatives to pyrethroids and the current situation of resistance, it will not be possible to maintain susceptibility to pyrethroids forever; therefore, in the near term, all efforts should be focused on preserving the susceptibility of major malaria vectors to pyrethroids and other classes of insecticides, at least until new insecticides have become available for wide-scale vector control.

Achieving this near-term objective will be subject to several requirements. Firstly, IRM strategies should not prevent the implementation and scaling up of vector control; instead, they should support plans for increasing coverage with vector control. Secondly, resistance must be integrated into the cost-effectiveness equation that forms the basis for deciding on vector control interventions at all levels. Thirdly, sustaining the susceptibility of vectors to insecticides will require routine monitoring of the effectiveness of IRM and vector control programmes. It will also be important to monitor potential threats and consider ways to mitigate them. Fourth, success will depend on securing sufficient funding for capacity-building and implementation of IRM strategies. Finally, developing an adequate range of new insecticide classes will require accelerating the research and development of new products and active ingredients.

[note no mention of poor application promotes resistance]

p68 Pillar IV. Fill gaps in knowledge on mechanisms of insecticide resistance and on the impact of current IRM STRATEGIES ... Research agenda: Many questions remain on the efficacy, feasibility and applicability of different strategies for managing insecticide resistance under different circumstances.

p69 Genetics. Background: With limited genetic information on resistance genes, it is difficult to track and anticipate the course of resistance, and understand which IRM strategies would be most effective. The evolution of resistance and the possibility of reducing and even reversing resistance cannot be predicted because of limited information on factors such as baseline frequency (mutation rates), fitness cost, genetic mode of inheritance and the selection pressure due to different uses of insecticides in agriculture and public health. Inability to track resistance genetically makes the consequences of insecticide resistance more difficult to anticipate; it is also difficult to measure the efficacy of IRM strategies. Research agenda: The genes that confer metabolic resistance must be identified in order to answer several important research questions. Outputs from this research agenda would have immediate practical implications for decisions on resistance management taken in national malaria control programmes. The topics for research should include genetic dominance, fitness cost, cross-resistance, linkage disequilibrium, drivers of selection pressure and behavioural resistance. See Annex 10 for more details of the genetic research agenda.

p111 Possible reasons for widespread insecticide resistance with no obvious impact on the effectiveness of vector control. • The effect of resistance may be visible only when combined with poor implementation, gaps in coverage and aging insecticide deposits. IRS and LLINs are useful in vector control not only because they have a powerful immediate impact on transmission, but also because they are often found to be still effective even when they are not implemented perfectly, when coverage is incomplete or when the spray deposits/nets are old in relation to their expected lifespan. In other words, they are robust and long-lived and tolerant of imperfect conditions, and for programmatic purposes this is as important as their immediate impact on transmission. This points to the hypothesis that resistance may erode the robustness of an intervention, so that its impact on vector control may be seen most clearly when implementation is poor, when there are gaps in coverage, or when the intervention gets older. In other words, in the presence of resistance, previously robust interventions may require perfect implementation and complete coverage in order to be effective, and the duration of effective control may be reduced. ...

Annex 9 Use of insecticide resistance management STRATEGIES (rotations, combinations, mosaics and mixtures)

Annex 10 Genetic research agenda, includes some good stuff.

No mention of poor implementation promoting evolution of resistance.

13/12/16

Looking at Ians original figure of change in dominance with insecticide concentration from: Barbosa S, Black WC IV, Hastings I (2011) Challenges in Estimating Insecticide Selection Pressures from Mosquito Field Data. PLoS Negl Trop Dis 5(11): e1387. doi:10.1371/journal.pntd.0001387

It cites an earlier paper related to drug resistance:

Hastings I, Watkins W (2006) Tolerance is the key to understanding antimalarial drug resistance. Trends Parasitol 22: 71–77.

Abstract The evolution of antimalarial drug resistance is often considered to be a single-stage process in which parasites are either fully resistant or completely sensitive to a drug. However, this does not take into account the important intermediate stage of drug tolerance. Drug-tolerant parasites are killed by the high serum concentrations of drugs that occur during direct treatment of the human host. However, these parasites can spread in the human population because many drugs persist long after treatment, and the tolerant parasites can infect people in which there are residual levels of the drugs. This intermediate stage between fully sensitive and fully resistant parasites has far-reaching implications for the evolution of drug-resistant malaria.

From Churcher 2016 For simplicity and following (Griffin et al., 2010) it is assumed that the killing activity of pyrethroid over time (the half-life in years, denoted Hy) is proportional to the loss of morbidity caused by washing (the half-life in washes, Hw). A prior estimate of the half-life in years (Mahama et al., 2007) from a durability study of a non-PBO LLIN with susceptible mosquitoes(Hs y) is then used to reflect changes caused by pyrethroid resistance ... Following Griffin et al., 2010 it is assumed that the activity of the insecticide decays at a constant rate according to a decay parameter gp, which is related to the half-life.
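A sketch of the constant-rate decay described above, assuming plain exponential decay so that the rate is log(2) divided by the half-life. That relation, the function names and the half-life value are my assumptions for illustration; they are not taken from Churcher or Griffin.

```r
# Assumed relation for exponential decay: rate = log(2) / half-life.
decay_rate <- function(half_life) log(2) / half_life

# Fraction of the initial killing activity remaining after t years.
activity <- function(t, half_life) exp(-decay_rate(half_life) * t)

# Hypothetical half-life of 2.5 years, chosen only for illustration.
activity(t = c(0, 2.5, 5), half_life = 2.5)  # 1, 0.5, 0.25
```

A shorter half-life under resistance would simply steepen this curve, which is one way the change caused by pyrethroid resistance could be represented.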

14/12/16 ~ working on paper2

15/12/16 ~ working on paper2

From IRAC IRM manual (2011) http://www.irac-online.org/documents/irm-vector-manual/?ext=pdf

3.5.1 Fitness cost Populations of insects that have never been exposed to insecticides are usually fully susceptible, and resistance genes within those populations are very rare. This is usually due to a “fitness cost”, which means that insects possessing the resistance gene lack some other attribute or quality such that it gives an advantage to the susceptible insects in the absence of the insecticide. Differences in the number of offspring, longevity or overall robustness are often found in resistant insects. There is good laboratory and field evidence to suggest that the absence of selection pressure, in the form of insecticide treatment, in most cases, selects for susceptible insects. Resistant colonies in the laboratory often revert to susceptibility if the insecticide selection pressure is not maintained. Similarly once resistance in the field has been selected it often rapidly reverts once the insecticide treatment regime is changed. A good example of this occurred in Anopheles arabiensis in Sudan, where malathion specific insecticide resistance was selected in the early 1980s through antimalarial house spraying. The development of resistance prompted a switch of insecticide treatment to fenitrothion and the malathion resistance rapidly reverted in the following years.

It is this reversion to susceptibility which is the underlying assumption behind any effective resistance management strategy. However, reversion rates are variable and may be very slow, particularly when an insecticide has been used for many years. If there is no fitness cost for the resistance mechanism there is no reason for the resistance genes to be lost in the population and for resistance to fully revert. For example, DDT was used extensively for malaria control over a 20 year period up to the 1960s in Sri Lanka to control Anopheles culicifacies and Anopheles subpictus. DDT was replaced by malathion in Sri Lanka in the early 1970s when a total and effective ban on DDT use was implemented. Subsequent regular monitoring has shown that DDT resistance has reverted very slowly towards susceptibility. Around 80% of the adult mosquito population was resistant in the 1970s compared to about 50% in the 1990s. This rate of reversion is clearly too slow to establish any effective resistance management strategy involving the reintroduction of DDT.

3.6 Major factors that influence resistance development

3.6.1 Frequency of application More frequent = faster resistance

3.6.2 Dosage and persistence of effect More persistent = faster resistance

3.6.3 Rate of reproduction Faster reproduction = faster resistance

3.6.4 Population isolation More isolated (i.e. fewer susceptible immigrants) = faster resistance

So no mention of poor application promoting resistance.

Aha! but end of p22. Insecticide resistance develops in an insect population when individuals carrying genes that allow them to survive exposure to the insecticide pass these genes on. Thus, any activities that control the individuals with the resistance trait will delay the spread of the resistance genes in the population. IRM in the context of IVM therefore also includes activities such as habitat management, community education and mosquito larviciding.

Mosquitoes with reduced susceptibility to an insecticide may still be controlled at the recommended label rate. However, exposure to sub-label rate applications may allow these individuals to survive and pass on the resistance genes. Sub-lethal exposure may arise in IRS due to poor choice of product, under dosing during application or poor application technique. In each case the residuality of the product may not be sufficient, delivering a sub-lethal dose before the next scheduled spray round. LNs may also deliver sub-lethal doses within their expected lifetime due to poor product choice, inappropriate storage, use or washing. These factors which reduce the efficacy of a vector control programme, can lead to a shift in the susceptibility status of the mosquito population, and should be avoided through informed product choice, effective IRS application and LN distribution, and education.

16/12/2016 * From Liu(2012) Insecticide resistance has been proposed to be a preadaptive phenomenon in the sense that, prior to an organism’s exposure to a stressor (in this case, a mosquito’s exposure to an insecticide), there already exist rare individuals who carry one or more resistance alleles (such as polymorphisms in the resistance allele sequence or increased expression of the resistance allele) that allow them to survive exposure to that stressor (139, 149).

19/12/16 on train home for xmas looking at plot_ld_gpirm(), with dominance may not be quite as straightforward as I thought.

TODO this seems to expose an error in my plotting routine OR can this plot idea not cope with our data SEE whitespace between generations 20 & 40

exposure 0.5, eff1 0.8, eff2 0.5, dominances 1

a <- setExposure( exposure=0.5, insecticideUsed = 'mixture' )
input <- setInputOneScenario( a=a, phi.SS1_A0=0.8, phi.SS2_0B=0.5, h.RS1_A0=1, h.RS2_0B=1)
listOut <- runModel2(input)
plot_ld_gpirm(genotype=listOut$genotype[[1]], gen=seq(from=5, to=50, by=5))

plot_ld_gpirm(genotype=listOut$genotype[[1]], gen=seq(from=20, to=40, by=4))

I think this is the equivalent resistance frequency plot for comparison

runcurtis_f2( exposure=0.5, phi.SS1_A0=0.8, phi.SS2_0B=0.5, h.RS1_A0=1, h.RS2_0B=1)

Responses to my email does poor application promote resistance :

Ian and I are working on a simpler IR modelling paper to follow up on our first one. The idea is to target this at non-modellers probably in the malaria journal.

As a part of that I want to address the question that came from the audience in my talk at ASTMH and I've come across before.

There appears to be a general perception that poor implementation of an intervention is likely to promote resistance. Is this written down anywhere ?

Our model can (if we don't consider dominance effects), suggest the opposite, that poor implementation should reduce the selection pressure for resistance by reducing exposure to or effectiveness of the insecticide.

I have just been through GPIRM and it is not mentioned.

Can anyone point me to any references suggesting that 'poor' interventions promote resistance ?

Sarah Rees, IVCC

Good question, and an important one. I assume by ‘poor’ intervention you mean insufficient exposure, i.e. the a.i. is not encountered by the insect at an effective dose rate. It is a ‘universal truth’ that if an organism is exposed to a sub-lethal dose of a killing agent it will more easily be able to develop resistance mechanisms, I assume because it can remain alive and the necessary mutations can be selected and survive. So if either an insufficient dose is applied, eg. through prior dilution or uneven application of an IRS, equally the levels of an a.i. may decrease before the following treatment. The logic of always finish a course of antibiotics is the same, if the a.i. reduces below the effective dose at any point then insects are going to be able to more easily develop resistance. I can’t give you a ready to hand reference, but I imagine there will be something somewhere on the IRAC website http://www.irac-online.org/teams/public-health/, or you might call up Mark Hoppe who is the lead of the public health IRAC group, his email is mark.hoppe@syngenta.com (you can say I suggested him).

Nick Hamon, IVCC

A lot of work was done in the 1990s in ag….looking at sub-lethal doses of pesticides on resistance development. Sub-lethal can come from poor application, poor formulations (cheap generics perhaps), under-application/dosing (to save money)….. and from everything I have read this leads to an acceleration of the development of resistance. I suspect a quick literature search will pull out everything you need here, and more. I believe this has been demonstrated with insecticides, fungicides and herbicides

Justin

http://www.fao.org/fileadmin/templates/agphome/documents/Pests_Pesticides/Code/FAO_RMG_Sept_12.pdf

See page 12.

I have never been particularly convinced about this topic – if you apply a less effective intervention then the selection pressure declines, meaning you have the potential for more diversity to remain in the gene pool (right?). I guess the line about only leaving heterozygous resistance genes at lower dose rates makes sense but then no VC interventions target 100% of the population anyway because male mosquitoes don’t seek blood meals and therefore shouldn’t be exposed to the same selection pressures... Which, to my mind points more and more to environmental exposure (eg. role of agricultural insecticide usage) in certain situations.

From FAO2012 : Guidelines on Prevention and Management of Pesticide Resistance

A good accessible report, which has nice entry level descriptions of resistance that I might like to use.

p5-6 Certain pest control practices have consistently been shown to exacerbate the loss of susceptible pest populations and the development of resistance.

These include:
• continued and frequent use of a single pesticide or closely related pesticides on a pest population;
• the use of application rates that are below or above those recommended on the label;
• poor coverage of the area being treated;
...

The objective of resistance management is to prevent or at least slow the accumulation of resistant individuals in pest populations, so as to preserve the effectiveness of available pesticides. Resistance management can also be thought of as susceptibility management, as the aim is to maintain a high percentage of susceptible genes within the pest population while keeping genes for resistance at a minimum. The challenge is to reduce the selection pressure for resistance while providing the necessary level of crop protection.

If the principle of resistance management is relatively simple, putting it into practice for a given crop or pest is often not. There is unfortunately no single resistance management prescription that can be applied globally to all pesticides, pests, and crops. Nor is resistance solely a technical problem that can be readily overcome with the right new pesticide with a new mode of action, or an adjustment in the way conventional pesticides are used.

p8 Resistance alleles can range from dominant through semi-dominant to recessive. If dominant or semi-dominant, only one parent need possess the characteristic for it to be fully or partially expressed in the offspring. If recessive, both parents must possess the trait. Fortunately, most resistance mechanisms are controlled by recessive or semi-dominant alleles, which slows their spread within the population.

The genetic trait that allows the organism to survive exposure to the pesticide will be found in one or both of the gene’s alleles. When the trait is in both alleles (written RR), the pest is homozygous resistant; the pest will likely be highly resistant to the pesticide and will pass on one resistant allele (R) to its offspring. If the offspring also receive an R from their other parent, they too will be RR. If the trait for resistance is found in just one of the gene’s alleles (RS), the pest is heterozygous resistant; the pest will be less resistant to the pesticide, and may or may not pass on the gene for resistance to its offspring. Individuals that are homozygous susceptible, SS, are susceptible to the pesticide.

p11 The factors that affect resistance development can be grouped into three categories: the pest’s genetic make-up, the pest’s biology, and “operational factors” including cropping practices and the pesticide characteristics and application (see Table 1). While it is not possible to precisely predict the development of resistance to a particular compound, it is possible to assess the risk generally by evaluating these factors for each pesticide-pest-crop situation.

p12 table of potential for resistance development

Pesticide application rate
Lower potential for resistance: label rate; heterozygotes killed (if R gene is incompletely dominant)
Higher potential for resistance: less than label rate, heterozygotes survive; more than label rate, only some homozygous resistant individuals survive and reproduce (especially if there is little immigration)

Application coverage [but there is no justification for why ...]
Lower potential for resistance: good coverage
Higher potential for resistance: poor coverage

p14 In situations where resistance is operationally recessive only a few homozygous resistant (RR) individuals will survive after treatment with an insecticide. As homozygous susceptible (SS) individuals move into the area and mate with the survivors many offspring will be heterozygotes (RS) or (SS) susceptible individuals. If a treatment is made and the proper dose is used and good application coverage is achieved, the SS and most if not all of the RS individuals will be killed. However, if a reduced rate is used and/or coverage is poor, subsequent applications can result in the survival of many RS individuals and result in faster selection of a resistant population.

p16 dominance In insects, incompletely recessive or dominant genes can be made functionally dominant when the individuals carrying those genes are exposed to reduced rates of the pesticide. This lower dose can result from the deliberate use of a low rate, inadequate coverage of the plant or area being treated, or exposure to pesticide residues that are degrading on the treated surface. When this occurs, heterozygote individuals survive and pass on the resistant gene ...

p17 Protection provided by the “R” gene [this is our resistance restoration] If the resistance gene provides a high degree of protection from the pesticide, then individuals carrying that gene have a very high probability of surviving a pesticide application and passing the resistance gene on to the next generation. However, if the resistance gene provides only a moderate level of protection, then the individuals carrying the resistant gene will be protected from lower doses of the pesticide but not high doses. This is another reason to ensure that full label rates of a pesticide are used and that the best coverage possible is achieved. Lower doses and poor coverage permit the accumulation of the resistance genes in the population.

p18 Application rate Although pesticide application rates are not set with regard to resistance, it is important to apply the recommended rate and not underdose. Ideally this rate should eliminate all susceptible and essentially all heterozygous resistant individuals from the pest population while reducing pest numbers below the economic threshold. If the dose is too low, the susceptible individuals will be eliminated but the partially resistant heterozygotes will survive. A dose that is too low will also have the effect of making the resistance gene functionally dominant and resistance may develop rather quickly. However, attempting to eliminate heterozygote individuals is most effective if the population is not extremely large, consists mostly of susceptible individuals, and is subject to immigration by susceptible individuals; then highly resistant homozygous resistant individuals should be rare and will likely suffer from reduced fitness because of the resistance genes.

The use of higher than recommended application rates is not recommended either. This is because if there are any survivors from a high rate, these are likely to be mainly homozygous resistant. In particular when there is no immigration of susceptible individuals, high dose rates are then very likely to increase the development of resistance.

[ ??? p18 seems to have contradictory logic: underdosing lets RS survive, so is bad; overdosing leaves just RR, so is also bad ]

[coverage bit also doesn't make complete sense to me ...] Coverage If coverage is good, with the correct amount of pesticide applied to the entire area, the pests will more likely encounter the desired, lethal rate. If coverage is poor, with some areas receiving more pesticide and others less or none at all, the result will be similar to what happens when below-label rates are used. Homozygous [RR] individuals will be selected and the development of resistance will be promoted.

p27 Table 8 Factors affecting resistance development in insects lower than label use rate & inadequate coverage both increase survival of RS and thus frequency of R.

Looking at Fig8 for paper1 for Ian, PLOS want font size increasing, try upping cex from 0.4 in final prp() to 0.6 ... and forced \n into the split labels

4/1/17 Looking at formatting for Malaria journal : https://malariajournal.biomedcentral.com/submission-guidelines/preparing-your-manuscript

Preparing main manuscript text

Use double line spacing
Include line and page numbering
Do not use page breaks in your manuscript
File formats: Microsoft word (DOC, DOCX), Rich text format (RTF), TeX/LaTeX (use BioMed Central's TeX template)

editable files are required for processing in production. If your manuscript contains any non-editable files (such as PDFs) you will be required to re-submit an editable file if your manuscript is accepted.

Note that figures must be submitted as separate image files, not as part of the submitted manuscript file. For more information, see Preparing figures below.

So my options are to try to get working with their tex template (I'm not sure how I would do that), or to try to get Rmarkdown to output a doc or rtf.

For the next stage of submitting to people maybe keep going with the pdf. OR maybe switch to google docs now for the manuscript with the tables & figures generated from RMarkdown ?

Knitting to word works OK except the Figure legends not quite right, they don't get the Figx bit of label.

Figure rules
~ separate files, not embedded in the main manuscript file.
~ each a single file that fits on 1 page in portrait format.
~ Tables should NOT be submitted as figures but should be included in the main manuscript.
~ Multi-panel figures should be submitted as a single composite file.
~ titles (max 15 words) and legends (max 300 words) should be provided in the main manuscript.
~ keys should be incorporated into the graphic, not into the legend of the figure.
~ closely cropped to minimize the amount of white space surrounding the illustration.
~ Individual figure files should not exceed 10 MB.

Figure file types
EPS (suitable for diagrams and/or images)
PDF (suitable for diagrams and/or images)
TIFF (suitable for images)
PNG (suitable for images)

So can I get Rmarkdown to generate a doc for the text and tiffs for each figure, as I did for paper1 ?

seems I can add this to the YAML for doublespace & linenums

header-includes:
  - \usepackage{setspace}
  - \doublespacing
  - \usepackage{lineno}
  - \linenumbers

But it didn't work when I knitted to Word (these are LaTeX commands, so they only affect the pdf output).

Possible paper2 protocol

~ keep knitting to pdf to send final drafts to Ian & collaborators
~ for submission MS knit to word & delete the figs from the docx
~ for submission figs knit with the first chunk in the Rmd uncommented

10/1/17 paper2

~ had a quick look at tweaking recombination param from 0.5 to affect linkage ? e.g. see Roush(1998) p1782 (also Mani 1985 covers recombination). If genes were close together on a chromosome then recombination would be expected to be below 0.5. But what are the chances of that by chance ? recomb_rate

base scenario with effectiveness I1 0.8

mixture best

as above but lower recomb_rate from 0.5 to 0.1

NO NOTICEABLE DIFFERENCE

runcurtis_f2( recomb_rate=0.1, max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.8 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )

as above but lower recomb_rate from 0.5 to 0.01

MIXTURE STILL BEST but does get a bit faster as curve for 2nd insecticide gets steeper, closer to steepness for 2nd use in sequence

runcurtis_f2( recomb_rate=0.01, max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.8 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )

MIXTURE STILL BEST

runcurtis_f2( recomb_rate=0.001, max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.8 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )

So even at very low recombination mixtures can be better. Maybe come back to this in a follow-up LD paper.
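For reference when coming back to this: a minimal sketch of why recomb_rate might only matter at very low values. It uses the textbook result that, under random mating and with no selection, linkage disequilibrium between two loci shrinks by a factor (1 - r) each generation. The function name `ld_remaining` is made up for this note and is not part of the resistance package; the model itself has selection acting every generation, so this is only the neutral baseline, not a description of what runcurtis_f2() does.

```r
# Hypothetical helper (not in the resistance package):
# LD remaining after `generations` of random mating with no selection,
# starting from an initial disequilibrium d0.
ld_remaining <- function(d0, recomb_rate, generations) {
  d0 * (1 - recomb_rate)^generations
}

# the recombination rates tried in the runs above
recomb_rates <- c(0.5, 0.1, 0.01, 0.001)

# fraction of the initial LD left after 10 generations for each rate
ld_after_10 <- sapply(recomb_rates, function(r) ld_remaining(1, r, 10))
names(ld_after_10) <- recomb_rates
print(signif(ld_after_10, 3))
```

At r = 0.5 and r = 0.1 most of any association is gone within 10 generations, whereas at r = 0.01 and below most of it persists, which is at least in line with the differences above only showing up at the lowest rates.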

12/2017 after paper review : fig 8D in paper2 with a new & old insecticide, mix slower

runcurtis_f2( recomb_rate=0.5, max_gen=500, P_1 = 0.001 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.8 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), ylab="", ylabs = FALSE, cex.axis = 0.8, addLegend=FALSE, main='', maxX = 150, labelMixSeqRatio = 1 )

at recom rate 0.01 mixture still slower than sequence

runcurtis_f2( recomb_rate=0.01, max_gen=500, P_1 = 0.001 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.8 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), ylab="", ylabs = FALSE, cex.axis = 0.8, addLegend=FALSE, main='', maxX = 150, labelMixSeqRatio = 1 )

at recomb rate 0.001 mixture sped up so it's now the same as sequence

runcurtis_f2( recomb_rate=0.001, max_gen=500, P_1 = 0.001 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.8 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), ylab="", ylabs = FALSE, cex.axis = 0.8, addLegend=FALSE, main='', maxX = 150, labelMixSeqRatio = 1 )

Interestingly, changing the recomb rate barely affects the difference between the rate of rise of resistance to the new insecticide in the presence and absence of the old one, which is the part suggested in GPIRM. Instead, lower recombination makes the curve for the old insecticide in a mixture steeper (closer to that when it is used alone), but only after the new insecticide has 'worn off'.

Maybe I should setup a UI to show this before VCWG 2018.

From Liu 2015 Insecticide resistance has been proposed to be a preadaptive phenomenon in the sense that, prior to an organism’s exposure to a stressor (in this case, a mosquito’s exposure to an insecticide), there already exist rare individuals who carry one or more resistance alleles (such as polymorphisms in the resistance allele sequence or increased expression of the resistance allele) that allow them to survive exposure to that stressor (139, 149).

23/1/17 paper2
done ~ added ratio mix/seq to plots
done ~ sort s,m labels on plots
done ~ improve figure legends

1/2/17 Martin Donnelly lecture

~ LSTM strategy translational research
~ Most of the known resistance markers have proceeded to fixation and are therefore of limited predictive value.
~ Found that a small number of genes are consistently over-expressed in insecticide resistant insects.
~ Now they have 6 good markers associated with insecticide resistance.
~ 1000 anopheles gambiae genomes project (Ag1000G)
~ Huge variation in mosquito genomes
~ Likely that any insecticide resistance mechanism is already out there and doesn't need mutation to arise.
~ are resistance associated variants common across all areas or specific to particular locations
~ selective swept regions (selection sweeps out genetic variance)
~ soft versus hard selective sweeps
~ 10 independent mutations of kdr resistance to DDT & PYR
~ now have ~25 markers of resistance
~ moving towards creating risk maps for resistance, based upon genotypes
~ embedding genomics into intervention programs to see if they can detect evolution of resistance happening
~ he talked about using genomics to detect how resistance is evolving and using this to suggest how that rise in resistance can be mitigated
~ Mark Paine asked about new insecticides entering the market with naive populations, Martin said they are involved with sumishield trials and they should be able to make suggestions about how resistance is likely to evolve
~ Martin said it is likely that we need to be more creative with mosaics and combinations rather than just using mono-therapies
~ in Uganda cotton growing areas, agriculture may be important in driving emergent resistance

pre-print of ag1000G results biorxiv.org/content/early/2016/12/22/096289

7/2/17 editing Ians draft plan for IVCC file:///C:/Dropbox/resistance LSTM/ivcc2017/Plan of work andy.docx

I want to move developing and maintaining the code up the priority list.

To get total lines in a git repo (excludes binaries but includes blanks)

git diff --shortstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904
162 files changed, 51032 insertions(+)

so c 50k lines of code. On the github resistance repo page, go to contributors and can see for me 85k additions and 35k deletions.

21/2/17 working on fig1 for paper2 (plot_fit_calc()), getting there.

22/2/17 ~ check that cost of resistance is calculated the way I indicate with dominance ~ first in the code then see what we said in paper1

Ian said that the two dominances should be different and can be called dominance of resistance and dominance of cost.

fixed weird fig thing due to a single % within a figure legend ?

checking on dominance in the model, it's within fitnessSingleLocus()

#exposure 0 'no'
a_fitloc[ paste0('RS',locusNum), 'no'] <- 1 - (a_dom[locusNum, 'no'] * a_cost[locusNum])
a_fitloc[ paste0('RR',locusNum), 'no'] <- 1 - a_cost[locusNum]

for( exposID in c('lo','hi') )
{
  a_fitloc[ paste0('SS',locusNum), exposID] <-  1 - a_effect[locusNum, exposID]

  a_fitloc[ paste0('RS',locusNum), exposID] <- (1 - a_effect[locusNum, exposID]) + 
    (a_dom[locusNum, exposID] * a_sel[locusNum, exposID])

  a_fitloc[ paste0('RR',locusNum), exposID] <- (1 - a_effect[locusNum, exposID]) + 
    (a_sel[locusNum, exposID])
}

so it seems it can cope with these being different :
dominance_restoration : a_dom[locusNum, exposID]
dominance_cost : a_dom[locusNum, 'no']
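To check my reading of the excerpt, here is a minimal standalone sketch of the same single-locus fitness table with the two dominances kept separate. The function name and argument names are made up for this note (they are not the package's), the values are purely illustrative, and setting SS fitness to 1 with no exposure is an assumption because that line isn't in the excerpt above.

```r
# Hypothetical standalone version of the single-locus fitness calculation;
# mirrors the formulas in the excerpt but is not the package code.
fitness_single_locus_sketch <- function(effect, sel, dom, cost) {
  fit <- matrix(NA_real_, nrow = 3, ncol = 3,
                dimnames = list(genotype = c("SS", "RS", "RR"),
                                exposure = c("no", "lo", "hi")))

  # no exposure: only the cost of resistance applies
  fit["SS", "no"] <- 1                          # assumption, not shown in excerpt
  fit["RS", "no"] <- 1 - dom[["no"]] * cost     # dominance of cost
  fit["RR", "no"] <- 1 - cost

  # lo and hi exposure: insecticide effect, partly restored by resistance
  for (e in c("lo", "hi")) {
    fit["SS", e] <- 1 - effect[[e]]
    fit["RS", e] <- (1 - effect[[e]]) + dom[[e]] * sel[[e]]  # dominance of restoration
    fit["RR", e] <- (1 - effect[[e]]) + sel[[e]]
  }
  fit
}

# illustrative values only
fit <- fitness_single_locus_sketch(
  effect = c(lo = 0.4, hi = 0.8),
  sel    = c(lo = 0.2, hi = 0.4),
  dom    = c(no = 0.2, lo = 0.5, hi = 0.5),
  cost   = 0.1
)
print(fit)
```

Changing dom[["no"]] only moves the RS fitness in the 'no' column, and changing dom[["lo"]] or dom[["hi"]] only moves RS in the exposed columns, which is the separation Ian was asking for.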

where are they set ? in runModel2()
a_dom[1,'no'] <- input[32,scen_num]
a_dom[1,'lo'] <- input[33,scen_num]
a_dom[1,'hi'] <- input[34,scen_num]
a_dom[2,'no'] <- input[35,scen_num]
a_dom[2,'lo'] <- input[36,scen_num]
a_dom[2,'hi'] <- input[37,scen_num]

and from setInputOneScenario()
input[ 32 ] <- h.RS1_00
input[ 33 ] <- h.RS1_a0
input[ 34 ] <- h.RS1_A0
input[ 35 ] <- h.RS2_00
input[ 36 ] <- h.RS2_0b
input[ 37 ] <- h.RS2_0B

so also there is the potential for dominance to be different in 'lo'

cost:
# a_cost = fitness cost of resistance allele in insecticide free environment
a_cost[1] <- input[42,scen_num]
a_cost[2] <- input[43,scen_num]

input[ 42 ] <- z.RR1_00
input[ 43 ] <- z.RR2_00

Oooo cost-of-resistance stops resistance reaching fixation, I hadn't expected that ! e.g. even with cost set fairly low at 0.25 and other params at 0.5 the resistance frequency stabilises at 50%.
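A possible explanation, sketched with a single locus only: if fitness is averaged over exposed and unexposed individuals, a cost can leave the heterozygote with the highest overall fitness, and then the resistance allele stalls at an intermediate frequency. Everything here is an assumption for illustration and not taken from the package: the function names, averaging fitness by exposure, the selection coefficient being effectiveness x restoration, and dominance of cost set to 0. I have not confirmed that this is the mechanism in the full two-locus model.

```r
# Hypothetical single-locus sketch; not the package's code.
# Overall genotype fitness = exposure-weighted mean of exposed and unexposed fitness.
marginal_fitness <- function(effect, restoration, dom_res, cost, dom_cost, exposure) {
  sel <- effect * restoration                 # assumption: selection coefficient
  exposed   <- c(SS = 1 - effect,
                 RS = (1 - effect) + dom_res * sel,
                 RR = (1 - effect) + sel)
  unexposed <- c(SS = 1,
                 RS = 1 - dom_cost * cost,
                 RR = 1 - cost)
  exposure * exposed + (1 - exposure) * unexposed
}

# One generation of selection with random mating; p = frequency of R.
next_freq <- function(p, w) {
  q <- 1 - p
  w_bar <- p^2 * w[["RR"]] + 2 * p * q * w[["RS"]] + q^2 * w[["SS"]]
  p * (p * w[["RR"]] + q * w[["RS"]]) / w_bar
}

w <- marginal_fitness(effect = 0.5, restoration = 0.5, dom_res = 0.5,
                      cost = 0.25, dom_cost = 0, exposure = 0.5)
print(w)   # RS has the highest overall fitness with these inputs

p <- 0.01
for (gen in 1:1000) p <- next_freq(p, w)
print(p)   # settles at an intermediate frequency rather than going to 1
```

With these inputs SS and RR both end up with fitness 0.75 and RS with 0.8125, so the standard equilibrium (w_RS - w_SS) / (2 w_RS - w_SS - w_RR) is 0.5. Setting dom_cost to 0.5 instead makes all three fitnesses equal in this sketch, so the outcome depends strongly on the dominance of cost.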

What happens if I modify dominance ?

Cost is interesting and seems to nullify the benefit of mixtures, even with a relatively small cost to just 1 of the insecticides.

add description in the text of fig 1.

see paper2/cost_of_resistance.Rmd

23/02/17

considering fitness costs :

Ian said :

In paper 1 we said this, but then didn't explicitly mention that we kept fitness costs to 0 in the analyses. Fitness costs of resistance may arise in mosquitoes which do not make contact with the insecticide. This reflects the possibility that the metabolic changes that enable IR may have deleterious effects on their normal metabolic function (e.g. [25, 26]). We therefore allow the option of fitness costs i.e. by setting z>0 in column “-” of Table 2 and these costs may exhibit different levels of dominance quantified by the associated 'h' parameter; fitness costs can be easily ignored (i.e. z = 0) if they are believed, or assumed, to be absent.

These are the cost of resistance refs : Rivero A, Vézilier J, Weill M, Read AF, Gandon S. Insecticide Control of Vector-Borne Diseases: When Is Insecticide Resistance a Problem? PLoS Path. 2010; 6(8):e1001000.

Kliot A, Ghanim M. Fitness costs associated with insecticide resistance. Pest Manage Sci. 2012; 68:1431–7.

Rivero 2010 is good : From the abstract. Insecticide resistance is generally considered to undermine control of vector-transmitted diseases because it increases the number of vectors that survive the insecticide treatment. Disease control failure, however, need not follow from vector control failure. Here, we review evidence that insecticide resistance may have an impact on the quality of vectors and, specifically, on three key determinants of parasite transmission: vector longevity, competence, and behaviour. We argue that, in some instances, insecticide resistance is likely to result in a decrease in vector longevity, a decrease in infectiousness, or in a change in behaviour, all of which will reduce the vectorial capacity of the insect. If this effect is sufficiently large, the impact of insecticide resistance on disease management may not be as detrimental as previously thought. In other instances, however, insecticide resistance may have the opposite effect, increasing the insect’s vectorial capacity, which may lead to a dramatic increase in the transmission of the disease and even to a higher prevalence than in the absence of insecticides. Either way—and there may be no simple generality—the consequence of the evolution of insecticide resistance for disease ecology deserves additional attention.

In their toy model in the appendix they use a cost (c) of 0.2 : The insecticide is assumed to reduce the fecundity of susceptible vectors (larvicide) and measures the coverage of insecticides ( varies between 0 and 1). In contrast, resistant vectors (denotes the density of resistant vectors) do not suffer from the effect of insecticides but pay a cost of resistance on adult survival.

Insecticide resistance could reduce vector transmission of disease (e.g. by reduced lifespan of the vector) or increase transmission (e.g. by reduced immunity of the vector to the disease) [@Rivero2010].

Good concluding paragraph : In some instances, insecticide resistance may impair the ability of the vector to transmit diseases. If this effect is sufficiently large, the impact of insecticide resistance on disease management may not be as detrimental as previously thought. If so, current paradigms might be leading to a misallocation of research and control resources. We contend that there are surprisingly few well-documented cases of disease outbreaks in response to the evolution of insecticide resistance (in marked contrast to the well-documented public health problems caused by the evolution of drug resistance). Alternatively, insecticide resistance could improve the individual vectorial capacity of insects, further emphasising the urgent need for novel insecticides and resistance management strategies. Either way—and there may be no simple generality—the consequence of the evolution of insecticide resistance for disease ecology deserves additional attention.

6/3/17 back from Mexico put cost figs into paper2 and modify text

7/3/17 done ~ I probably need to change fig1 to 1-dominance because thats the way that it goes.

stackoverflow question :

A while ago I wanted a simpler way of creating multidimensional arrays with named dimensions.

I ended up writing a function that has worked really well for me but I worry it may be somewhat of a hack and that there may be a better way of doing. So before passing this on to a colleague I'm seeking advice here.

What I want to do is to be able to create multi-dimensional arrays where the name of a dimension is specified by the name of a vector and the names of the elements are specified by the contents of that vector. e.g.

sex <- c("F","M")
name2 <- c("a","b","c")

This can be written out like so.

dimnames1 <- list( sex=sex, name2=name2 )
dim1 <- sapply(dimnames1, function(x) length(x))
a <- array(0,dim=dim1, dimnames=dimnames1)
a

   name2
sex a b c
  F 0 0 0
  M 0 0 0

But I wanted to be able to keep this more compact :

I wrote this function that enables that.

array_named <- function( ...) {

listArgs <- as.list(match.call()[-1])

#works only if args are specified by actual ranges not by a varname
#dimnames1 <- lapply(listArgs,eval)

#works, I'm not sure why n=3
dimnames1 <- lapply(listArgs,function(x){eval.parent(x, n=3)})

#setting dimensions of array from dimnames1
dim1 <- sapply(dimnames1, function(x) length(x))

#creating array and filling with fill value
a <- array(0, dim=dim1, dimnames=dimnames1)

return(a)

}

This allows passing vectors by name :
array_named( sex=sex, name2=name2 )

   name2
sex a b c
  F 0 0 0
  M 0 0 0

and directly e.g.
array_named( a=c(1,2), b=c('x','y') )

   b
a   x y
  1 0 0
  2 0 0

Are there problems with this ? Is there a more sensible way of doing it ?
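One simpler form to consider, noted here as a sketch rather than as the answer that came back: `list(...)` already evaluates the arguments and keeps their names, so the `match.call()` / `eval.parent()` step may not be needed at all. The name `array_named2` and the `fill` argument are made up for this note.

```r
# Hypothetical alternative to array_named(): list(...) evaluates the
# arguments in the caller's environment and keeps the argument names.
array_named2 <- function(..., fill = 0) {
  dimnames1 <- list(...)

  # every dimension needs a name, otherwise the array would be ambiguous
  stopifnot(length(dimnames1) > 0,
            !is.null(names(dimnames1)),
            all(names(dimnames1) != ""))

  # one dimension per argument, its extent is the length of the vector
  array(fill, dim = unname(lengths(dimnames1)), dimnames = dimnames1)
}

sex <- c("F", "M")
name2 <- c("a", "b", "c")

# passing vectors by name
a <- array_named2(sex = sex, name2 = name2)
print(a)

# passing values directly
b <- array_named2(a = c(1, 2), b = c("x", "y"))
print(b)
```

This avoids depending on how many frames up the call stack the arguments live, which is the part of the original (n=3) that I couldn't explain.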

submitted stackoverflow question about createArray2()

looking at Ians rotation code

done ~ fig 4B I want cost 0,0.1,0.2
done ~ add cost to table of inputs

8/3/17 paper2 done ~ add cost fig 9

The dominance of cost thing confuses me and I don't want to confuse readers with something that is not the focus and perhaps we don't know very much about. Therefore I may want to take dominance of cost out of paper2.

This used to be in Table 2. Effect of inputs on resistance when insecticides used singly or in sequence

Cost of resistance | slower | reduced fitness of resistants in absence of insecticide
Dominance of cost | slower | increased fitness of heterozygotes in absence of insecticide

I'm suggesting that Dominance of cost slows evolution of resistance by increased fitness of SR (whereas Cost of resistance slows resistance by decreased fitness of RR)

~ add cost & dominance of cost to tables 2-4
~ change order of resistance restoration and dominance of resistance in tables 2-4
~ change dominance to dominance of resistance throughout paper

Gave Ian draft to review.

15/3/17

Ian suggested Parasites and Vectors as a potential journal for paper2
Parasites and Vectors Impact Factor: 3.234
Malaria Journal Impact Factor: 3.079

submission guidelines : https://parasitesandvectors.biomedcentral.com/submission-guidelines/preparing-your-manuscript/research

seems like submission guidelines are similar to malaria journal. similar 350 word limit for abstract.

from a while ago
~ find a quote that poor implementation of insecticide interventions promotes resistance ...
~ gpirm ? no
~ iarc2011 - yes
~ FAO2012 - yes, good

21/3/17
~ addressing Ians comments on paper2
~ re-read '10 simple rules for structuring papers' by Kording
~ tweaked abstract to get under 350 words

24/3/17 ~ getting draft of paper2 ready to send to external people

11/4/17
~ got line numbers working to make it easier for people to comment
~ advice on writing manuscripts from rmarkdown http://svmiller.com/blog/2016/02/svm-r-markdown-manuscript/

output:
  pdf_document:
    citation_package: natbib
    keep_tex: true
    fig_caption: true
    latex_engine: pdflatex
    template: ~/Dropbox/miscelanea/svm-r-markdown-templates/svm-latex-ms.tex
title: "A Pandoc Markdown Article Starter and Template"
thanks: "Replication files are available on the author's Github account..."
author:
- name: Steven V. Miller
  affiliation: Clemson University
- name: Mary Margaret Albright
  affiliation: Pendelton State University
- name: Rembrandt Q. Einstein
  affiliation: Springfield University
abstract: "This document provides an introduction to R Markdown, argues for its..."
keywords: "pandoc, r markdown, knitr"
date: "`r format(Sys.time(), '%B %d, %Y')`"
geometry: margin=1in
fontfamily: mathpazo
fontsize: 11pt
spacing: double
bibliography: ~/Dropbox/master.bib
biblio-style: apsr

BUT when I looked at the tex file, it is designed to be copied and then edited rather than being specified as a template from my file as I expected.

In test_linenumbers.Rmd this does work to make double spaced and add line numbers :

header-includes:
  - \usepackage{setspace}
  - \doublespacing
  - \usepackage{lineno}
  - \linenumbers

Hello All,

Ian and I would be grateful if you have time to look at the attached draft manuscript.

This is intended to be an accessible account of the evolution of insecticide resistance aimed at those with little knowledge of population genetics or modelling.

It follows from our recent longer paper in PLOS Computational Biology and the presentation we gave at ASTMH. It focuses on developing a mechanistic explanation of why mixtures or sequences are likely to be favoured in different situations. We add consideration of resistance cost that was not included in the previous paper.

We plan to submit to the Malaria Journal but other suggestions are welcome where we might access those outside of the malaria community interested in resistance too.

Comments can be annotated on the pdf or referencing line numbers.

If you could get these back to us in 2 weeks (by Wednesday April 26th) that would be appreciated.

Best wishes, Andy

21/4/17

~ working on simpler version of plot_fit_calc() figure for the pubHealthJournal article. ~ can I have as an arg to the function (e.g. simple) ?

done ~1 y axis fitness low to high instead of numeric
done ~2 x axis change SS & RR to susceptible & resistant
done ~3 remove 0,1 labels on plot
done ~4 change dominance of resistance and cost to just dominance

8/8/17 Remember that Ian's new rotation code is elsewhere, in a different github repo called rotations. We should make sure that this model and that one give the same results for a simple 2 insecticide rotation.

From ffrench-constant(2013) Thus the ability to amplify resistance-associated genes from historical pinned specimens of the Australian sheep blowfly allowed Hartley et al. (2006) to show that the mutations conferring resistance to malathion (but not to diazinon) were already present in 21 pinned specimens collected before the introduction of the organophosphorus insecticides themselves.

9/8/17 Took out this sentence from abstract of paper2 : We look principally at the ability of the insecticides to kill susceptible mosquitoes, how much resistance counteracts this, the proportion of mosquitoes that are exposed to insecticides and costs of resistance.

see paper2_abstract_reworking.doc

10/8/17 ffrench-constant 2017 paper on 'does resistance really carry a fitness cost'

intro : Our ability to manage xenobiotic resistance (both to drugs and pesticides), relies on the ‘alternation’ (or ‘mixture’) of classes of compound with differing modes of action. Management strategies using such alternation of differing chemical classes assume that resistance to compound A will decline during the subsequent use of compound B. This assumption is based on the prediction that de novo resistance to compound A will carry a fitness cost and that the frequency of resistance to A will therefore decline while compound B (or no compound) is used instead. This assumption, that resistance carries a cost in the absence of the xenobiotic, is therefore central to current resistant management strategies in both agriculture (pesticide resistance) and medicine (antibiotic resistance and cancer tumour drug resistance). Despite the widespread reliance on such predicted fitness costs to decrease the frequency of xenobiotic resistance, and an ample literature on the subject, the documentation of such costs is in fact fraught with technical difficulty.

Numerous case studies of fitness costs attributed to insecticide resistance have been recently and comprehensively reviewed elsewhere [5]. A review of this review suggests to us several basic rules for experiments designed to study the fitness costs of resistance. First and foremost, if resistance is defined as a genetic change leading to control failure in the field, then resistant strains should be both field derived and the costs of resistance should be studied in the field. Experiments on chronically selected resistant laboratory strains or on field collected strains tested in the laboratory, cannot really tell us much about likely fitness costs in the field. Second, the field collected strains that are compared should be both of known resistance genotype (homozygous susceptible SS, homozygous resistant RR or heterozygous RS) and should be compared in a similar genetic background (usually achieved by back-crossing resistance into a known susceptible background). Finally, if an experiment is conducted in the field, then ideally the resistant and susceptible strains should be competed directly against one another. If we apply these simple genetic criteria to the plethora of studies on fitness costs in the literature then very few studies pass all three of these tests.

end of discussion : However what is clear is that if the costs of resistance are small or non-existent then resistance management strategies that rely on alternations will not work in the longer term. Therefore in the absence of a cost, resistance can only be overcome by the introduction of a new class of chemistry to which no preexisting mechanisms confer cross-resistance. Thus, just as in the search for new antibiotics, the need for new classes of insecticide remains paramount.

Meeting with Dave Malone & Graham Small, IVCC. see notebook notes.

Interesting points :

~ in new AIs they are aiming for high effectiveness from 95-100%
~ one exception is Clor.. which consistently has mortalities of 60-70%

8/9/17 Last week came across REX consortium paper from 2013 'Heterogeneity of selection and the evolution of resistance'. It's a good review of strategies to avoid resistance.

Collates both modelling and empirical papers, for sequences, mixtures, rotations and mosaics.

Will require me to rewrite one paragraph of intro.

A recent comprehensive review of strategies to avoid resistance evolution across pesticides and drugs [@Consortium2013] concluded that mixtures (combination of molecules) are usually the best resistance management strategy.

Of 14 modelling studies comparing mixtures and sequences, 11 favour mixtures and in 3 results depend on other inputs. Of 10 empirical studies, 8 favour mixtures and in 2 the results are the same.

They suggest that mixtures are better in most models irrespective of the variation of other parameters.

They conclude that the advantage of mixtures is greatest when :
i)* resistance is initially rare
ii) independent loci (i.e. no cross resistance)
iii) high recombination between loci
iv)** high mortality of homozygous susceptibles
v)* resistance is recessive
vi) similar persistence of insecticides
vii)** some of population remains untreated (lower exposure)

** consistent with our model (i.e. high effectiveness and low exposure favour mixtures)

* our model did not indicate that these affected the difference between mixtures and sequences.

Couple of sentences added to intro : A recent comprehensive review of strategies to avoid resistance evolution across pesticides and drugs [@Consortium2013] concluded that mixtures (combination of molecules) are usually the best resistance management strategy. This was based on both empirical and modelling work. Modelling studies have investigated the evolution of insecticide resistance in insecticide mixtures including in a public health context e.g. [@Curtis1985][@Mani1985][@Roush1989] but much of the work was done more than 20 years ago and there remained some confusion about the results [@Levick2017].

Eek, seems I've accidentally deleted most of the discussion. Get it back via github. Ah no, remember I've copied it into a word doc to allow Ian & myself to edit.

11/9/2017

~ finished edits to abstract & discussion in word ~ see test_linenumbers.Rmd for how to get to word doc

https://malariajournal.biomedcentral.com/submission-guidelines/preparing-your-manuscript/research-article

Title page : list the full names, institutional addresses and email addresses for all authors; indicate the corresponding author

Figures should be provided as separate files, not embedded in the main manuscript file.

Each figure of a manuscript should be submitted as a single file that fits on a single page in portrait format. Tables should NOT be submitted as figures but should be included in the main manuscript file.

Multi-panel figures (those with parts a, b, c, d etc.) should be submitted as a single composite file that contains all parts of the figure.

Figures should be numbered in the order they are first mentioned in the text, and uploaded in this order.

Figures should be uploaded in the correct orientation.

Figure titles (max 15 words) and legends (max 300 words) should be provided in the main manuscript, not in the graphic file.

Figure keys should be incorporated into the graphic, not into the legend of the figure.

Each figure should be closely cropped to minimize the amount of white space surrounding the illustration. Cropping figures improves accuracy when placing the figure in combination with other elements when the accepted manuscript is prepared for publication on our site. For more information on individual figure file formats, see our detailed instructions.

Individual figure files should not exceed 10 MB. If a suitable format is chosen, this file size is adequate for extremely high quality figures.

Figure formats :
~ PDF (suitable for diagrams and/or images)
~ Microsoft Word (suitable for diagrams and/or images, figures must be a single page)
~ TIFF (suitable for images)

Are there recommendations for figure file naming ?

done - Figure titles (max 15 words) and legends (max 300 words) should be provided in the main manuscript, not in the graphic file.

Tables should be in text because < A4.

Referencing : Smith JJ. The world of science. Am J Sci. 1999;36:234-5.

Article within a journal (no page numbers)

Rohrmann S, Overvad K, Bueno-de-Mesquita HB, Jakobsen MU, Egeberg R, Tjønneland A, et al. Meat consumption and mortality - results from the European Prospective Investigation into Cancer and Nutrition. BMC Med. 2013;11:63.

Article within a journal by DOI

Slifka MK, Whitton JL. Clinical implications of dysregulated cytokine production. Dig J Mol Med. 2000; doi:10.1007/s801090000086.

For software, this section should include:

Project name: e.g. My bioinformatics project
Project home page: e.g. http://sourceforge.net/projects/mged
Archived version: 10.5281/zenodo.889012
Operating system(s): e.g. Platform independent
Programming language: e.g. Java
Other requirements: e.g. Java 1.3.1 or higher, Tomcat 4.0 or higher
License: e.g. GNU GPL, FreeBSD etc.

Software and code

Any previously unreported software application or custom code described in the manuscript should be available for testing by reviewers in a way that preserves their anonymity. The manuscript should include a description in the Availability of Data and Materials section of how the reviewers can access the unreported software application or custom code. This section should include a link to the most recent version of your software or code (e.g. GitHub or Sourceforge) as well as a link to the archived version referenced in the manuscript. The software or code should be archived in an appropriate repository with a DOI or other unique identifier. For software in GitHub, we recommend using Zenodo. If published, the software application/tool should be readily available to any scientist wishing to use it for non-commercial purposes, without restrictions (such as the need for a material transfer agreement).

So I need to use Zenodo to set up a link to an archived version of the code.

https://guides.github.com/activities/citable-code/ https://zenodo.org/account/settings/github/

Created a github release, now any github release will create a zenodo doi.

18/9/17 ~ fixed all refs

20/9/17 ~ at faculty forum, both Martin Donnelly and Dave Weetman suggested that resistance may be more likely to be polygenic.

Dave Weetman's points : ~ whilst in the past it may have taken single mutations of large effect to provide resistance, now it seems that there is such genetic diversity in mutations that multiple ones can be selected for to provide resistance. ~ new IVCC compounds have more complex binding to target sites so more likely that multiple genes will be required to disrupt this.

21/9/17 ~ look to add in paragraph about Roush to discussion.

Why does Roush opt for rotations and Rex for mixtures ?

Also remember that Roush(1989) said in abstract : 'The number of genes involved in resistance and the fitness disadvantages they may confer in the absence of use appear to be of relatively little significance in choosing management tactics'.

How does he justify that number of genes is not important in choosing use strategy ?

Main message from Roush(1989) ~ choices about insecticide use strategies can be based on currently available modest amounts of information. ~ 'of the three possible ways that two or more non-cross-resistant compounds can be used, mixtures, alternations, or mosaics, most situations will be best served by the alternations of pesticides across generations.' ~ concludes that single gene models are sufficient for evaluating use strategies (but has this been superseded since ?) because a single gene is usually responsible for control failures.

I cut this sentence and replaced with one talking about prelim ideas in Via 1986 : A quantitative genetic model would likely lead to a more rapid initial evolution of resistance, a slower middle phase and a continuing increase.

22/9/17 From last meeting IVCC talked about Chlorfenapyr, a new AI mixed with Pyr in bednets. Always gets relatively low 60-70% effectiveness. We could look at implications of that ?

Ian asked me to cut this sentence about selection coefficients from the paper. saying "Its not obvious to me that this is true because it has to be scaled against the SS genotype." However I'm pretty sure it is true remembering the process we went through to get to it - we used to use selection coeff in Beths code. The 'selection coefficient' of resistance, that may be seen referred to elsewhere [@Curtis1985], can be calculated by multiplying resistance restoration by insecticide effectiveness.
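Quick sketch to check the sentence above and Ian's objection. It assumes fitness in the exposed niche is 1 - effectiveness for SS, and that resistance restoration gives a proportion of that lost fitness back to RR. Variable names and input values are illustrative only, not taken from the package or from Curtis. Under those assumptions restoration x effectiveness is the absolute fitness difference between RR and SS; scaling against SS fitness gives a different number, which may be what Ian was getting at.

```r
# illustrative values, not from any particular run
effectiveness  <- 0.8   # proportion of exposed SS killed
rr_restoration <- 0.5   # proportion of that lost fitness restored in RR

# assumed fitnesses in the exposed niche
fit_SS <- 1 - effectiveness
fit_RR <- fit_SS + rr_restoration * effectiveness

# selection coefficient as in the cut sentence : an absolute fitness difference
s_absolute <- rr_restoration * effectiveness

# the same advantage scaled against SS fitness
s_relative <- (fit_RR - fit_SS) / fit_SS

c(fit_SS = fit_SS, fit_RR = fit_RR, s_absolute = s_absolute, s_relative = s_relative)
```

So whether the sentence is 'true' depends on which definition of selection coefficient the reader has in mind.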

25/9/2017 last push to submission

relooking at REX(2010) review of models paper : quantitative multiple gene resistance has not been the subject of any modelling approach by the 187 articles selected

A small number of models simulated quantitative resistance, recombination and cross-resistance between molecules. When more than one molecule was considered (35% of the models), the resistance mechanisms considered tended to be monogenic, independent and nonepistatic. This may be a reasonable assumption, because there is considerable evidence to suggest that resistance to pesticides and drugs mostly evolves through the selection of alleles with a major effect, and this view is supported by theoretical models (Roush and McKenzie 1987; Neve 2007). However, in some cases, resistance is clearly because of genes located on several chromosomes (Denholm and Rowland 1992) or has emerged from the addition of several mechanisms of small effect such as limited detoxification, sequestration and/or translocation (Park and Brown 2002), thus evolving as a quantitative genetic trait. The assumption that resistance is monogenic may thus reflect a reluctance to increase model complexity. Whatever the reason, quantitative multiple gene resistance has not been the subject of any modelling approach by the 187 articles selected. Furthermore, although multi-drug resistance is frequent and despite the fact that many pesticide programs use a combination of nonindependent chemicals, cross-resistance is seldom considered into the models.

registered for malaria journal w77.

http://www.editorialmanager.com/malj/default.aspx

27/9/2017

final edits to : paper2_resistance_mechanisms_mixtures_20170925_submission.docx

Todo to paper 2 MS doc post knitting before submission :

done ~ Ctrl A, Page layout, line nums, double space.
done ~ Insert, page number.
done ~ check that tables and other sections don't cross pages but do not use page breaks in your manuscript
done ~ modify any table column widths that need
done ~ delete figures

reading section on competing interests : https://www.biomedcentral.com/getpublished/editorial-policies#competing+interests

done ~ submission letter

Suggested Jo Lines & Matt Thomas & Denis Bourguet as referees.

2/10/2017 submitted paper2

Bits of text that didn't make it into final version : For those who distrust models, work in a moth and genetically modified broccoli system has shown that toxins used in combination can delay the onset of resistance in a way that is consistent with theoretical models [@Zhao2003].

Insecticide resistance becomes a problem when genes coding for resistance firstly arise in a population and secondly increase in frequency. We concern ourselves with this second process of how insecticide resistance increases in frequency within a population. It is likely that genes conferring resistance are present in populations even prior to exposure to novel insecticides thus leading to the potential for selection [@Liu2015]. The changing frequency of insecticide resistance is a population genetic process that can be influenced by (among others) the parameters outlined in Table 1.

The mechanisms involved in the response of resistance in mixtures are summarised in Table 3. Whether a mixture or sequence is favoured for reducing the spread of resistance will depend on these mechanisms and is summarised in Table 4.

How do the rest of our results relate to recommendations that have been produced by WHO[@WHO2012], IRAC[@IRAC2011] and FAO[@FAO2012] ?

[@Birget2015] We show that indoor use of insecticides leads to less selection pressure than their use as larvicides. Reasons for relatively low selection pressure by adulticides (i) males are not affected by the ITNs (ii) insecticides are also repellents, keeping mosquitoes at bay from contacting the insecticide but also driving them to bite either people who do not use the insecticide or alternative hosts.

Relevance of these mechanisms in the field e.g. exposure can be reduced if either insecticide has a repellent effect, which can be set in formulation. ([@Birget2015a] mention this not for resistance but for ITNs with repellency requiring higher coverage for elimination)

[@Kliot2012] fitness costs associated with insecticide resistance mini-review, not very good. No mention of the magnitude of costs.

Good review of tactics for IRM including mention of mixtures and new AIs [@Denholm1992] "Once developed and marketed, new products introduce a fresh challenge how to exploit their characteristics while restricting exposure to preserve their effectiveness."

(@) In a mixture each insecticide reduces the rate of increase in resistance to the other by killing individuals that are resistant to the other.

(@) In a mixture higher effectiveness of either insecticide kills more individuals resistant to the other and thus increases time-to-resistance for the other. Thus although increasing effectiveness of an insecticide decreases time-to-resistance when used alone, when used in a mixture time-to-resistance for both insecticides is increased.

(@) For a mixture of 2 insecticides with differing effectiveness. Resistance to the insecticide with a greater effectiveness increases faster. The less effective insecticide is 'protected' by the more effective. Resistance to the less effective insecticide increases slowly until resistance to the more effective insecticide reaches a high level.

The latter is as expected because it reduces the selective advantage of resistance.

For single insecticide use resistance responded identically to changing exposure and effectiveness (compare Fig 3A to 3B). This makes sense as vector kill is effectively a product of exposure times effectiveness. For example exposing 50% of a population to an insecticide which is 75% effective would be expected to have the same result as exposing 75% of the population to one which is 50% effective. This observation points to the mechanism by which increasing both exposure and effectiveness lead to a faster increase in resistance. In both cases the increased deaths of susceptible vectors cause a higher selection pressure that can explain the faster increase.
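The worked example in the paragraph above as a two-line check. This assumes vector kill is simply exposure multiplied by effectiveness, as the text says; the function name is made up for illustration.

```r
# proportion of susceptible vectors killed, assuming kill = exposure * effectiveness
vector_kill <- function(exposure, effectiveness) {
  exposure * effectiveness
}

# 50% exposed to a 75% effective insecticide
kill_1 <- vector_kill(exposure = 0.5, effectiveness = 0.75)

# 75% exposed to a 50% effective insecticide
kill_2 <- vector_kill(exposure = 0.75, effectiveness = 0.5)

c(kill_1 = kill_1, kill_2 = kill_2)
```

Both give the same kill, which is why Fig 3A and 3B look identical for single insecticide use.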

The pattern of more rapid increase in resistance at higher dominance of restoration levels (Fig 3C) can be explained by increased survival of heterozygotes in the presence of the insecticide. Higher dominance of the resistant allele causes it to contribute more to the phenotype of the heterozygotes leading to higher survival. Thus selection pressure for the resistance allele will be increased because it confers more of an advantage when only present on one chromosome. The faster development of resistance under higher resistance-restoration (Fig 3D) can be explained by its effect on the survival of the resistant genotypes. Resistance-restoration restores the survival of resistant genotypes in the presence of the insecticide back towards what it would be in the absence of the insecticide, thus increasing the selective advantage of the resistance allele. The effect of the starting frequency of resistance (Fig 4) is the most different from the other inputs. Simply, when starting from a higher frequency of resistance there is a smaller change to make to reach the higher resistance thresholds.
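A minimal sketch of how dominance and resistance-restoration move the genotype fitnesses described above. The function is hypothetical and not the package code. It assumes SS fitness under exposure is 1 - effectiveness, RR gets a proportion of the lost fitness restored, and RS sits between SS and RR according to dominance. With 0.5 for all three inputs it gives 0.5, 0.625 and 0.75, the same values that appear in the fitnessNiche() output in the 4/10/17 notes.

```r
# fitness of the three genotypes at one locus when exposed to the insecticide
# assumption : RS lies between SS and RR in proportion to dominance
fitness_exposed <- function(effectiveness, rr_restoration, dominance) {
  fit_SS <- 1 - effectiveness
  fit_RR <- fit_SS + rr_restoration * effectiveness
  fit_RS <- fit_SS + dominance * (fit_RR - fit_SS)
  c(SS = fit_SS, RS = fit_RS, RR = fit_RR)
}

f_base <- fitness_exposed(effectiveness = 0.5, rr_restoration = 0.5, dominance = 0.5)

# higher dominance moves RS up towards RR
f_dom <- fitness_exposed(effectiveness = 0.5, rr_restoration = 0.5, dominance = 1)

# higher restoration moves RR (and so RS) back towards 1
f_rr <- fitness_exposed(effectiveness = 0.5, rr_restoration = 1, dominance = 0.5)

rbind(f_base, f_dom, f_rr)
```

In both cases the gap between resistant genotypes and SS widens, which is the increased selective advantage described in the text.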

[@Roush1989] Predicting the time to resistance or accurately simulating its development requires very detailed information on initial gene frequencies, genotypic fitnesses, and many other factors that are difficult to measure. However, choosing between options requires far less information. The primary purposes of this paper are to demonstrate that choosing the correct resistance management tactics can be a straightforward process and to suggest how such choices can be made on the basis of current information, or when current information is inadequate, how the necessary data might be gathered. In particular, it will be questioned whether species-specific simulation models and information on the number of genes controlling resistance or fitness disadvantages caused by resistance are necessary.

~ potential strategies all need to work by managing selection pressure.
~~ selection pressure is created by more resistants surviving than susceptibles
~~ therefore can either :
~~~ increase resistants killed (e.g. as in mixtures)
~~~ reduce susceptibles killed (e.g. as in rotation)

Modelling and theory have been used to generalise Insecticide Resistance Management into three potential strategies [@Georghiou1994]. These are A) moderation : preserving susceptible genes by limiting selection pressure, B) saturation : high dose so that heterozygous resistants are killed and C) multiple attack : independently acting pressures neither of which is strong enough to lead to resistance. Our work here principally focuses on multiple attack but the other strategies are relevant too. todo expand this paragraph

~ Ian said that he was unsure of the multiple intragenerational killing argument, because the 2nd insecticide would kill both S & R of the first & therefore wouldn't exert any selective pressure to reduce resistance to the first (i think that's what he said ...)

time 2015-16

june : Hours 127:45:00 Hours rounded 127.50 Days (/7.5) 17.00 Pay (217 per day) 3689 Prev days remaining 47.75 Days remaining (from 85) 30.75

claimed June 10th : april, may : Days (/7.5) 12.00 Prev days remaining 59.75 Days remaining (from 85) 47.75

claim Aug 31 Days (/7.5) 22.00 Pay (217 per day) 4774 Prev days remaining 47.75 Days remaining (from 85) 25.75

193.125 to use up

Sep - Nov 25 2016 168/7.5 = 22.4 add some of Bayer time in to use up.

~~ for linkage disequilibrium it will be cool to be able to put the gpirm LD figures (from plot_ld_gpirm) on a common scale for mixtures & sequences

~ reacquainting myself with resistance code

~ thinking about SSI code review, I had a quick look, current call is not open, there is a self-evaluation but it is mostly focused on identifying whether the software is good for users. I would like to know how best to modify the software to make it more useable by us and potentially others in future.

If we wanted to get it reviewed I would probably first need to improve the 'getting started' and 'explaining what it does' documentation.

Came across paper citing Levick paper : Huijben2017

A greater understanding of the general evolutionary principles that are at the core of emerging resistance are urgently needed if we are to develop improved resistance management strategies with the ultimate goal to achieve a malaria-free world.

In this review we discuss the evolutionary consequences of the way we currently implement antimalarial interventions and how evolutionary principles can be applied to extend the lifespan of current and novel interventions. The emergence and spread of resistant parasites and mosquitoes is a result of simple Darwinian principles of fitness costs/benefits in the presence/absence of the drug or insecticide. When the failure of malaria interventions is seen as an evolutionary process, i.e. the outcome of the competitive interactions between wildtype (susceptible) and mutant (resistant) organisms, resistance management strategies can be designed to minimize the fitness of mutants, hence slowing down the spread of resistance. ... In other words, mutant parasites and mosquitoes only spread when there is sufficient chemical pressure and will reduce in frequency in the absence of this pressure if the associated fitness cost is sufficiently high. As this cost of resistance is the Achilles heel of resistant parasite and mosquito populations, for effective resistance management strategies we need to create an environmental context where the benefit of resistance is low and the fitness cost high. ... While it is obvious that mono-application of insecticides will pave the path for rapid evolution of resistance, it is not at all clear which of the above resistance management strategies should be deployed to maximally deter resistance evolution. What is even less clear is how to determine the optimal deployment conditions (REX Consortium, 2013): How frequently should insecticides be rotated? What spatial scale is needed for mosaic application? The answer to these questions lies in understanding the ecology and evolution of resistant mosquitoes: what are the fitness costs/benefits in the presence/absence of insecticide exposure and what are the patterns of mosquito gene flow? 
These parameters are likely to differ between (classes of) insecticides, different environments and different vector species, meaning that one size does not fit all when it comes to resistance management. But the key requirement for any strategy to work is to select costly resistance mutations that have a selective disadvantage in absence the insecticide.

Mathematical models will be of great value to predict the efficacy of the various resistance management strategies, especially as large-scale field trials whereby different strategies are evaluated will be expensive and require years of testing, a luxury we do not have. A recent model predicts that insecticide mixtures are a favoured resistance management strategy when insecticide effectiveness is high and insecticide exposure low. If the insecticides do not reliably kill sensitive vectors, sequential deployment appears to be a more robust strategy (Levick, South, & Hastings, 2017).

A greater understanding of the general evolutionary principles that are at the core of emerging resistance are urgently needed, and will allow us to develop improved resistance management strategies for both drugs and insecticides, as has been argued in various other disciplines, including herbicide (Neve, 2007) and cancer resistance management (Enriquez-Navas, Wojtkowiak, & Gatenby, 2015). This involves a greater investment in modelling evolutionary dynamics and performing lab and (semi)field trials to develop and test novel resistance management strategies (which include techniques such as mosaics, rotations, mixtures, refugees, and/or the combination of various (non)chemical interventions).

An additional obstacle is that when it comes to evolutionary principles, there is a general disconnect between the academic world and policy makers ...

Therefore we should think ahead, assume evolution will happen, and wear our evolutionary hat at all times when rolling out existing and new interventions.

Refs that look good : Ocampo, D., & Booth, M. (2016). The application of evolutionary medicine principles for sustainable malaria control: a scoping study. Malaria Journal, 15, 383. https://doi.org/10.1186/s12936-016-1446-8

Volkman, S. K., Herman, J., Lukens, A. K., & Hartl, D. L. (2017). Genome-wide association studies of drug-resistance determinants. Trends in Parasitology, 33(3), 214–230. https://doi.org/10.1016/j.pt.2016.10.001

Sternberg 2017 Insights from agriculture for the management of insecticide resistance in disease vectors.

Moreover, we emphasize that the success of insecticide resistance management strategies is strongly dependent on the biological specifics of each system. We suggest that the biological, operational, and regulatory differences between agriculture and public health limit the wholesale transfer of knowledge and practices from one system to the other. Nonetheless, there are some valuable insights from agriculture that could assist in advancing the existing Global Plan for Insecticide Resistance Management framework.

Ooo a newly accepted paper modelling IR strategies in Evolutionary Applications :

Sudo M, Takahashi D, Andow DA, Suzuki Y, Yamanaka T (2017) Optimal management strategy of insecticide resistance under various insect life histories: heterogeneous timing of selection and inter-patch dispersal. Evolutionary Applications, online in advance of print. http://dx.doi.org/10.1111/eva.12550

Accepted manuscript online: 10 September 2017 DOI: 10.1111/eva.12550

http://www.sudori.info/english.html

The simulations demonstrated the optimality of the mixture strategy either when insecticide efficacy was incomplete or when some part of the population disperses between patches before mating. The rotation strategy, which uses one insecticide on one pest generation and a different one on the next, did not differ from sequential usage in the time to resistance, except when dominance was low. It was the optimal strategy when insecticide efficacy was high and pre-mating selection and dispersal occur.

With this plethora of possible management strategies, a theory that integrates them would be beneficial for all stakeholders, including theoretical and empirical ecologists, field practitioners and farmers. Such a theory could allow ready comparison of the effectiveness of alternative strategies, facilitating the decision-making and the development of policy. With the acceleration of the rate of resistance evolution and the withdrawal of toxins, and the deceleration of discovery of new toxins (Bielza et al., 2008), it is urgent to improve resistance management and to know when and how we should use each strategy. Particularly we need to evaluate the relative advantage among the strategies for use of multiple toxins, i.e., rotation or mixture, to prolong their life spans.

Recently the REX Consortium reviewed 16 published theoretical papers and found that the mixture strategy was superior to the rotation strategy in 14 cases, with one case the opposite and another indeterminant (REX Consortium, 2013). On the other hand, the majority of empirical researchers is still skeptical about the mixture strategy (IRAC, 2012). This is partly because rotation intuitively sounds better than mixture; mixture intensifies the selection pressure while rotation relaxes it (Denholm & Rowland, 1992). We suspect that empirical researchers have a fundamental distrust of the simple assumptions in many theoretical models. ... Both delaying resistance evolution and suppressing pest populations are essential, while many models merely report the time to resistance (but see Peck & Ellner, 1997; Peck et al., 1999). ... Though our model assumes more realistic life histories and insecticide applications, we made its structure minimally simple as the basic formulation in the Comins' model by having two-patches (Comins, 1977) ... In this study, we apply the insecticides during the juvenile and/or adult stages in three different management strategies, sequential, rotation, and mixture. ... In the general model, populations pass through eight events, sequentially, (I) juvenile selection, (II) density-dependent survival, (III) pre-mating dispersal, (IV) pre-mating selection, (V) mating, (VI) post-mating dispersal, (VII) post-mating selection, and (VIII) oviposition (Fig. 1). ...

All the simulations were executed using R! R source code used is from the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.4gv44

It's written in Rcpp and is a little greek to me, source functions are here. I might be able to load package & call them without having to tinker with code. C:\Dropbox\resistance LSTM\Sudo2017\ResistanceDLDP\src\DLDP_source.cpp ... The performance of the rotation strategy is always equal to or better than the sequential use of each insecticide, but it is optimal only under a few conditions. When systemic insecticides are used (s ≅ 0), the rotation strategy performs better than the mixture strategy when there is pre-mating dispersal combined with the pre-mating adult selection (Fig. 2, rows 2 and 4, columns 1, 2, 5, and 6). This advantage stems from the alternating selection in the rotation, which allows susceptible alleles of the unselected toxin to increase in alternating generations in the treated patch. ... It has been a longstanding debate about whether the rotation or mixture strategy will be most effective for pesticide resistance management. Our results support the robustness of the mixture strategy for non-high-dose insecticides when insecticide application is imperfect or a refuge supplies susceptible insects to the mating pool and the initial resistance frequency is low. We showed this superiority using a mechanistic model applied to a variety of diploid insect life histories. Although the REX Consortium reported a similar conclusion, they merely counted the number of theoretical studies that compared the rotation and mixture strategies (REX Consortium, 2013). We found that the mixture strategy shares the same mechanism as the HDR strategy to delay resistance evolution as suggested by Ives et al. (2011). ...

Should I submit Malaria J. article to preprint ? https://en.wikipedia.org/wiki/Manuscript_(publishing)

In January 2017, the Medical Research Council announced that they will now be actively supporting preprints with effect from April 2017.[8] Also in January 2017, Wellcome Trust stated that they will now accept preprints in grant applications.[9]

PeerJ PrePrints is a free preprint server operated by PeerJ. Articles submitted undergo a basic screening process but are not peer-reviewed. Commenting is allowed by any registered user, and download and pageview data are supplied. All articles are published with a CC-BY license. As of September 2016, 2,439 articles have been made available.[20] Zenodo is a repository for research data that has been used also as preprint repository, because offers document preview and a DOI number for the submitted document.

The e-print archive arXiv (pronounced "archive") is one of the best-known preprint servers. It was created by Paul Ginsparg in 1991 at Los Alamos National Laboratory for the purpose of distributing theoretical high-energy physics preprints.[25] In 2001, arXiv.org moved to Cornell University and now encompasses the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics.

4/10/17 ~ reviewing resistance code and thinking how it could be extended to 3+ insecticides

~ can I design more flexible arrays ?
~ e.g. just this simple example, how could it be extended to 3+ niches
a_nichetog <- array_named( niche1=c('0','a','A'), niche2=c('0','b','B') )

static way
array_named( niche1=c('0','a','A'), niche2=c('0','b','B'), niche3=c('0','c','C') )

This sorts the problem with the names of the dimnames, but would still be tricky to get the c('0','c','C') in the example above.

array_named_flex <- function(namesdimnames, ...) {
  a_ <- array(0, dim = lengths(list(...)), dimnames = list(...))
  names(dimnames(a_)) <- namesdimnames
  return(a_)
}

Better to just be able to build up the list of args for the array_named function. Probably just put in a list & use do.call ?

l_s <- list(sex=c("f","m"))
do.call(array_named,l_s)

nearly there but how to name the components of the list from a vector ?
names(l_s) <- "a1"

l_s2 <- list(c("f","m"),c(1,2)) #note dimensions can be named with numbers too
names(l_s2) <- c("a1","a2")
do.call(array_named,l_s2)

Cool getting there !! One solution is to create an un-named list of the dimnames first. Then to name these dimnames from a vector (which can be flexibly created in the code) Then to call do.call(array_named,l_s2)

so for the example above
array_named( niche1=c('0','a','A'), niche2=c('0','b','B'), niche3=c('0','c','C') )
instead :
l_s2 <- list(c('0','a','A'), c('0','b','B'), c('0','c','C'))
names(l_s2) <- c("niche1","niche2","niche3")
do.call(array_named,l_s2)

Actually I probably wouldn't want to use the ABC notation because it is redundant, the position in niche 1, 2 or 3 is indicated by the position in the array

So it could be :
num_insecticides <- 4

list(c()) is important below to create one list for each dim

l_dimnames <- rep(list(c('no','lo','hi')), num_insecticides)
names(l_dimnames) <- paste0("niche", 1:num_insecticides)
a_4 <- do.call(array_named,l_dimnames)

Hurrah ! this works.
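Self-contained version of the steps above, so it can be run without the package loaded. array_named_sketch() is a stand-in written for this note. It assumes the package's array_named() does roughly the same thing (a zero-filled array with named dimnames), which I have not checked here.

```r
# stand-in for array_named(), assumed behaviour only
array_named_sketch <- function(...) {
  array(0, dim = unname(lengths(list(...))), dimnames = list(...))
}

num_insecticides <- 3

# one set of dimnames per insecticide, then name them from a vector
l_dimnames <- rep(list(c("no", "lo", "hi")), num_insecticides)
names(l_dimnames) <- paste0("niche", 1:num_insecticides)
a_niche <- do.call(array_named_sketch, l_dimnames)

dim(a_niche)
names(dimnames(a_niche))

# indexing by name stays readable whatever the number of dimensions
a_niche["hi", "no", "no"] <- 1
a_niche["hi", "hi", "hi"] <- 1

# summing across all but the first dimension is still a one-liner
niche1_totals <- apply(a_niche, MARGIN = 1, FUN = sum)
niche1_totals
```

Changing num_insecticides means the index expressions need one entry per dimension, so any code that indexes the array would also need to build its indices flexibly.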

But I should just think for a minute. Why am I using arrays ? I think it's to make the indexing transparent and to make it easy to sum across dimensions. But arrays are not common modern R practice. Could we do this with dataframes to be more conventional and take advantage of more recent R tools ?

A dataframe for this could look like :

insecticide niche toggle
1           no    1
1           lo    0
1           hi    1
2           no    1
2           ..    ..
..
3

For other arrays

a_fitnic <- array_named( locus1 = c('SS1','RS1','RR1'), locus2 = c('SS2','RS2','RR2'), niche1=c('0','a','A'), niche2=c('0','b','B') )

Does this capture all of the dimensions ?
insecticide niche haplotype fitness
1           no    SS        1
1           lo    RS        0.5
1           hi    RR        1
2           no    SS        1
2           ..    ..        ..
..
3

Aha! this can reshape an array, and responseName determines the name of the column values are put into

a_fitnic <- fitnessNiche()

df_fitnic <- as.data.frame.table(a_fitnic, responseName = 'fitness', stringsAsFactors = FALSE)

   locus1 locus2 niche1 niche2  fitness
1     SS1    SS2      0      0 1.000000
2     RS1    SS2      0      0 1.000000
3     RR1    SS2      0      0 1.000000
4     SS1    RS2      0      0 1.000000
5     RS1    RS2      0      0 1.000000
6     RR1    RS2      0      0 1.000000
7     SS1    RR2      0      0 1.000000
8     RS1    RR2      0      0 1.000000
9     RR1    RR2      0      0 1.000000
10    SS1    SS2      a      0 1.000000
11    RS1    SS2      a      0 1.000000
12    RR1    SS2      a      0 1.000000
13    SS1    RS2      a      0 1.000000
14    RS1    RS2      a      0 1.000000
15    RR1    RS2      a      0 1.000000
16    SS1    RR2      a      0 1.000000
17    RS1    RR2      a      0 1.000000
18    RR1    RR2      a      0 1.000000
19    SS1    SS2      A      0 0.500000
20    RS1    SS2      A      0 0.625000
21    RR1    SS2      A      0 0.750000
22    SS1    RS2      A      0 0.500000
23    RS1    RS2      A      0 0.625000
24    RR1    RS2      A      0 0.750000
25    SS1    RR2      A      0 0.500000
26    RS1    RR2      A      0 0.625000
27    RR1    RR2      A      0 0.750000
28    SS1    SS2      0      b 1.000000
29    RS1    SS2      0      b 1.000000
30    RR1    SS2      0      b 1.000000
31    SS1    RS2      0      b 1.000000
32    RS1    RS2      0      b 1.000000
33    RR1    RS2      0      b 1.000000
34    SS1    RR2      0      b 1.000000
35    RS1    RR2      0      b 1.000000
36    RR1    RR2      0      b 1.000000
37    SS1    SS2      a      b 1.000000
38    RS1    SS2      a      b 1.000000
39    RR1    SS2      a      b 1.000000
40    SS1    RS2      a      b 1.000000
41    RS1    RS2      a      b 1.000000
42    RR1    RS2      a      b 1.000000
43    SS1    RR2      a      b 1.000000
44    RS1    RR2      a      b 1.000000
45    RR1    RR2      a      b 1.000000
46    SS1    SS2      A      b 0.500000
47    RS1    SS2      A      b 0.625000
48    RR1    SS2      A      b 0.750000
49    SS1    RS2      A      b 0.500000
50    RS1    RS2      A      b 0.625000
51    RR1    RS2      A      b 0.750000
52    SS1    RR2      A      b 0.500000
53    RS1    RR2      A      b 0.625000
54    RR1    RR2      A      b 0.750000
55    SS1    SS2      0      B 0.500000
56    RS1    SS2      0      B 0.500000
57    RR1    SS2      0      B 0.500000
58    SS1    RS2      0      B 0.625000
59    RS1    RS2      0      B 0.625000
60    RR1    RS2      0      B 0.625000
61    SS1    RR2      0      B 0.750000
62    RS1    RR2      0      B 0.750000
63    RR1    RR2      0      B 0.750000
64    SS1    SS2      a      B 0.500000
65    RS1    SS2      a      B 0.500000
66    RR1    SS2      a      B 0.500000
67    SS1    RS2      a      B 0.625000
68    RS1    RS2      a      B 0.625000
69    RR1    RS2      a      B 0.625000
70    SS1    RR2      a      B 0.750000
71    RS1    RR2      a      B 0.750000
72    RR1    RR2      a      B 0.750000
73    SS1    SS2      A      B 0.250000
74    RS1    SS2      A      B 0.312500
75    RR1    SS2      A      B 0.375000
76    SS1    RS2      A      B 0.312500
77    RS1    RS2      A      B 0.390625
78    RR1    RS2      A      B 0.468750
79    SS1    RR2      A      B 0.375000
80    RS1    RR2      A      B 0.468750
81    RR1    RR2      A      B 0.562500

Can then filter using dplyr :
library(dplyr)
filter(df_fitnic, locus1=='RR1')

But I don't think this is as compact as using the array dimension names for referencing.

BUT then again a dataframe approach might be clearer for some things. I need to look into more.
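A minimal sketch of the two referencing styles side by side, using base R only so it runs without dplyr or the package's own array_named(). The small array `a_fit` is a made-up stand-in for a_fitnic, holding just the locus1 fitnesses for niche1 = 0 and A from the table above.

```r
# Stand-in for a_fitnic: fitness by locus1 genotype and niche1 exposure
# (values copied from the niche2 = 0 rows of the table above)
a_fit <- array(
  c(1, 1, 1, 0.5, 0.625, 0.75),
  dim = c(3, 2),
  dimnames = list(locus1 = c("SS1", "RS1", "RR1"), niche1 = c("0", "A"))
)

# Array route: pick a cell directly by its dimension names
fit_rr_A <- a_fit["RR1", "A"]

# Data frame route: one row per cell, then filter rows
df_fit <- as.data.frame.table(a_fit, responseName = "fitness",
                              stringsAsFactors = FALSE)
df_rr <- subset(df_fit, locus1 == "RR1")
```

The array lookup is a single index expression, while the data frame keeps every dimension visible as a column, which is probably what makes it clearer for plotting or summarising.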

Starting with one of the earlier arrays, how could it cope with flexible number of insecticides ?

a_fitloc works in a slightly different way and I think would be easier to make flexible.

a_fitloc <- array_named( loci=c('SS1','RS1','RR1','SS2','RS2','RR2'), exposure=c('no','lo','hi') )

num_insecticides <- 4
loci <- paste0(c('SS','RS','RR'),rep(1:num_insecticides,each=3))
a_fitloc <- array_named( loci=loci, exposure=c('no','lo','hi') )

OR as in rotations
a_fitloc <- array_named(insecticide=1:n_insecticides, genotype=c('SS','RS','RR'), exposure=c('no','lo','hi'))
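A rough sketch of the second (rotations-style) layout for any number of insecticides. It uses base array() with dimnames instead of array_named(), and the helper name `make_fitloc_sketch` is hypothetical, not something in the package.

```r
# Hypothetical helper: empty fitness array for n insecticides,
# indexed by insecticide number, genotype and exposure level
make_fitloc_sketch <- function(n_insecticides) {
  array(
    NA_real_,
    dim = c(n_insecticides, 3, 3),
    dimnames = list(
      insecticide = as.character(seq_len(n_insecticides)),
      genotype = c("SS", "RS", "RR"),
      exposure = c("no", "lo", "hi")
    )
  )
}

a_fitloc4 <- make_fitloc_sketch(4)

# Fill one cell: heterozygote for insecticide 3 under high exposure
a_fitloc4[3, "RS", "hi"] <- 0.625

# The flat naming from the first option, for comparison
loci <- paste0(c("SS", "RS", "RR"), rep(1:4, each = 3))
```

Keeping insecticide as its own dimension means the genotype labels stay the same three strings however many insecticides there are, whereas the flat version needs the locus number pasted into every name.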

11/12/2017 addressing referees' comments (saved in word doc)

First looking at linkage disequilibrium. It might be good to include a plot of LD.

Perhaps just for mixture ? not sure that I can for sequence ?
listOut <- resistSimple()
plotlinkage(listOut$results[[1]])
plotlinkage(listOut$results[[1]], plot_dprime = FALSE) # to show just d

This is the scenario I want to look at.

12/2017 after paper review : fig 8D in paper2 with a new & old insecticide, mix slower

listOut <- runcurtis_f2( recomb_rate=0.5, max_gen=500, P_1 = 0.001 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.8 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), ylab="", ylabs = FALSE, cex.axis = 0.8, addLegend=FALSE, main='', maxX = 150, labelMixSeqRatio = 1 )

in listOut from runcurtis_f2, 1=I1, 2=I2, 3=mix

I want to compare the new insecticide on its own, listOut$results[[1]] I think, and in a mixture, listOut$results[[3]]

plotlinkage(listOut$results[[1]], plot_dprime = FALSE)
plotlinkage(listOut$results[[3]], plot_dprime = FALSE)

these seem to show negative LD ?

we are mostly interested in what happens in the first 80 generations, by that time the frequency of I1 has reached 0.5 both alone and in the mixture.

par(mfrow=c(3,1))
plotlinkage(listOut$results[[1]], plot_d = TRUE, max_gen_plot = 80, main = "insecticide1 alone")
plotlinkage(listOut$results[[2]], plot_d = TRUE, max_gen_plot = 80, main = "insecticide2 alone")
plotlinkage(listOut$results[[3]], plot_d = TRUE, max_gen_plot = 80, main = "mixture")

It's not obvious to me what this shows. It seems that LD is negative for I1 alone, positive for I2 alone, and stays around 0 for the mixture.
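To help read the sign of LD, here is a sketch of the textbook definitions of D and D' from two-locus gamete frequencies. This is an assumption about what is being plotted: plotlinkage() may calculate or scale things differently, and the function and argument names below are made up for illustration.

```r
# Hypothetical helper: standard D and D' from gamete frequencies.
# First letter is the allele at locus 1, second letter at locus 2,
# so x_RS is the frequency of gametes carrying R at locus 1 and S at locus 2.
ld_sketch <- function(x_RR, x_RS, x_SR, x_SS) {
  p1 <- x_RR + x_RS            # frequency of R at locus 1
  p2 <- x_RR + x_SR            # frequency of R at locus 2
  D <- x_RR - p1 * p2          # excess of RR gametes over random association
  # Largest |D| possible given the allele frequencies and the sign of D
  D_max <- if (D >= 0) {
    min(p1 * (1 - p2), (1 - p1) * p2)
  } else {
    min(p1 * p2, (1 - p1) * (1 - p2))
  }
  D_prime <- if (D_max > 0) D / D_max else NA_real_
  c(D = D, D_prime = D_prime)   # D' here keeps the sign of D
}

# Made-up gamete frequencies with fewer RR gametes than expected
ld_example <- ld_sketch(x_RR = 0.1, x_RS = 0.2, x_SR = 0.3, x_SS = 0.4)
```

Under these definitions a negative D means the two resistance alleles occur together in the same gamete less often than their frequencies would predict, and a positive D means they occur together more often.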

check that insecticide2 is the old one ? hch is red, amat, & 3rd matrix passed. so is insecticide1.

Good papers in PLOS on malaria elimination :

combination interventions and modelling : http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1002453 "The interface between modelling and implementation has not developed as was perhaps envisaged, in terms of appropriate portals to allow 'end users' access to relevant software and explore the effect of varying conditions on the ideal choice of control measures."

resistance : malERA2017_updated-research-agenda-resistance-in-malaria.pdf

"What are the benefits of insecticide rotations, mixtures, or spatial mosaics of different compounds? What is the impact of adding nonpyrethroid IRS where LLINs are already deployed at high coverage and quality? When should new insecticides be adopted? What is the ideal rotation period or mosaic configuration? How many insecticide classes are needed for effective rotation or mosaic strategies? Despite the absence of data to answer these questions, some countries have already developed operational frameworks for resistance management that could be adopted by other programmes."

The malERA Refresh Consultative Panel on Insecticide and Drug Resistance (2017) malERA: An updated research agenda for insecticide and drug resistance in malaria elimination and eradication. PLoS Med 14(11): e1002450. https://doi.org/10.1371/journal.pmed.1002450

12/12/17

revisiting the question of poor implementation following reviewer's comment.

Nick Hamon, IVCC

Justin

http://www.fao.org/fileadmin/templates/agphome/documents/Pests_Pesticides/Code/FAO_RMG_Sept_12.pdf See page 12.

From Guedes 2017, "Sublethal exposure, insecticide resistance, and community stress". Doesn't seem to point to any concrete data that poor implementation promotes resistance.

The role and potential consequences of sublethal insecticide exposure for insecticide resistance are frequently neglected, but their relevance may be recognized on three fronts. First, sublethal exposure may delay selection for major single gene resistance while favoring multifactorial or polygenic resistance [14]. This is the likely consequence of the accumulation of low-level resistance genes and mechanisms (e.g., reduced penetration, behavioral avoidance, etc.) allowing small increases in the magnitude of insecticide resistance distinct from the selection of a major mutation (e.g., altered target site sensitivity) leading to a high resistance [14,15]. Furthermore, sublethal stress may also contribute to resistance by promoting increased mutation rates of genes involved in DNA repair, as observed in bacteria and weeds [14,16,17].

GOOD PAPER with some good references 14. Gressel J: Low pesticide rates may hasten the evolution of resistance by increasing mutation frequencies. Pest Manage. Sci. 2011, 67:253-257.

The early models of evolution of insecticide,1 fungicide2 and herbicide3,4 resistance were based on the evolution of resistant populations from a fixed low mutation frequency while applying the high, overkill doses then used, which led to major-gene target site resistance. Those models suggested that lower pesticide rates, with their lower selection pressure, might delay the evolution of target site resistance. Indeed, field epidemiology did demonstrate that, within a chemical class, the compounds with less persistence, or used at lower rates did not engender the evolution of pesticide resistance as quickly as the higher selection pressure conditions, where target site resistance did evolve. Whereas the lower rates delayed the evolution of a major gene, target site resistance, a new type of resistance, began appearing when very low rates were used, often lower than the lowest recommended rates; a slow shift of whole populations to slightly higher mean resistance with each subsequent generation.5 This is probably caused by a slow accumulation of small resistance factors.

1 is georghiou 1977

The only way to stop this slow accumulation of small resistance factors is to hit the population of pests with a higher dose.6 Thus, there is a conundrum with a no-win situation: low rates delay the evolution of major single gene resistance, and high rates preclude the accumulation of the multiplicity of factors leading to multifactorial resistance, whereas high rates engender the evolution of major gene resistance, and low rates rapidly select for multifactorial resistance.7

6 Gardner SN, Gressel J and Mangel M, A revolving dose strategy to delay the evolution of both quantitative vs. major monogene resistances to pesticides and drugs. Internat J Pest Manag 44:161–180 (1998).

this gressel paper has an updated 2017 version, see lower.
7 Gressel J, Catch 22 – mutually exclusive strategies for delaying/preventing polygenically vs. monogenically inherited resistances, in Options 2000, ed. by Ragsdale N. American Chemical Society, Washington, DC, pp. 330–349 (1995).

From Shea 1999 : Strategies to Delay the Evolution of Resistance in Pests: Dose Rotations and Induced Plant Defenses.

The genetic mechanism responsible for resistance implicates the dosage strategy most likely to delay resistance. Resistance conferred by many incremental changes, each with a small effect, whether by many different (poly)genes, by gene amplification, or by sequential mutations within a gene, each increasing resistance (hereafter grouped under the term "quantitative resistance") appears gradually after selection with low or incrementally increasing doses (Via, 1986, Shaw, 1989, Caretto et al. 1994). In contrast, high doses applied from the start prevent quantitative resistance from accumulating because of the improbability that a large number of resistant alleles required for resistance to the high dose will initially be found in a single individual. Major gene resistance, however, conferred by a single gene having a large effect rises exponentially at a rate that depends on the dosage (i.e. selection pressure) applied. Monogenically-resistant populations usually appear to burst forth suddenly after a number of successive exposures to high doses, as predicted in models (Georghiou & Taylor 1977; Gressel & Segel, 1978). Some organisms have evolved resistance by both mono- and quantitative genetic mechanisms (Raymond, et al. 1989; Devonshire & Field, 1991; Lande, 1983; Crow, 1957; Galun & Khush, 1980; Putwain et al., 1982) and some organisms have evolved one or the other due to different pesticide or drug regimes (Gressel, 1995a; McKenzie et al., 1992; Putwain et al., 1982). Major monogene resistance had been the prevalent cause of pesticide resistance until farmers started reducing doses (Gressel, 1995b; Gressel et al., 1996), a problem likely to escalate as a result of current environmental and economic pressures to reduce pesticide use. Ideally, one would opt to rotate crops or pesticides, and we advocate such rotations as better all around management strategies. However, some crops cannot be easily rotated (e.g. 
wheat due to extreme climate or soil conditions, or orchards) and no alternative pesticides exist to which there is no cross-resistance. In these situations, we argue for a preventive tactic of rotating low and intermediate doses to delay the appearance of resistance (Gressel et al., 1996). This strategy delays both types of resistance longer than continuously applying either a series of only low doses or a series of only high doses. The key notion is that when quantitative resistance begins to build up after the series of low doses, the intermediate dose eliminates individuals with quantitative resistance before they accumulate enough of the small genetic changes to resist that intermediate dose. This resets quantitative resistance back to near its original low level, as the only survivors are the rare individuals with monogenic resistance. Monogenic resistance emerges more slowly with the moderate selection pressure of rotating doses than if exclusively intermediate or high doses were applied. The size of the pest population also affects the probability that there will be individuals with sufficiently high quantitative resistance to survive the intermediate dosage, so even low doses must be adequately high to keep pest populations in check. We developed a model to mathematically test this proposal, incorporating quantitative evolution, major gene evolution, and pest population dynamics, which was described in detail elsewhere (Gardner et al., 1998).

Duke 2017 Pesticide dose a parameter with many implications. (not very useful) The dose used can influence the mechanism of evolved resistance to the pesticide, with high doses favoring target site resistance and low doses favoring other mechanisms.

Lower than toxic doses of toxicants can result in stimulation of growth and other pest parameters such as fecundity or longevity, a phenomenon known as hormesis (40).

Gressel 2017 Catch 22: All Doses Select for Resistance. When Will This Happen and How To Slow Evolution?

My summary of Gressel 2017 : In agriculture a switch to lower pesticide doses, driven largely by economics, led to progressive selection for quantitative resistance caused by accumulation of numerous mutations of small effect. When doses had been higher these mutations of small effect were insufficient to give higher survival and did not accumulate.

The evolution of quantitative resistance under low doses was promoted by relatively high pest populations giving evolution more material to work on.

Control of weeds in wheat crops in Australia using low doses of herbicides prompted evolution of polygenic metabolic resistance, whereas control of the same weeds in the US using high doses led to the evolution of monogenic target-site resistance.

Similarly, low insecticide doses led to polygenic resistance in the Australian sheep blowfly, whereas a higher dose led to the evolution of monogenic resistance (McKenzie, 1994).

There is also evidence that sublethal doses can increase mutation rates thus increasing the potential for the evolution of polygenic resistance [Gressel2017][Gressel2011].

[Curtis1998] on poor implementation effects : "If there is any truth in the general belief that low doses select for resistance, this could be due to a failure of low doses to kill resistance heterozygotes."

[Gardner1998][Gardner1999] develop a mathematical model of a revolving dose strategy. Because low doses prompt evolution of polygenic resistance, and high doses prompt monogenic resistance they suggest using low doses for a few generations until quantitative resistance builds up and then applying a high dose that will be sufficient to kill the partially resistant individuals but because only used occasionally will not lead to an increase in the frequency of monogenic resistance.

The model is unusual in that it represents both monogenic and polygenic resistance.

This has yet to be applied in the field [Gressel2017]

[Gardner1999] quote : "Ideally, one would opt to rotate crops or pesticides, and we advocate such rotations as better all around management strategies. However, some crops cannot be easily rotated (e.g. wheat due to extreme climate or soil conditions, or orchards) and no alternative pesticides exist to which there is no cross-resistance. In these situations, we argue for a preventive tactic of rotating low and intermediate doses to delay the appearance of resistance (Gressel et al., 1996). This strategy delays both types of resistance longer than continuously applying either a series of only low doses or a series of only high doses. The key notion is that when quantitative resistance begins to build up after the series of low doses, the intermediate dose eliminates individuals with quantitative resistance before they accumulate enough of the small genetic changes to resist that intermediate dose. This resets quantitative resistance back to near its original low level, as the only survivors are the rare individuals with monogenic resistance. Monogenic resistance emerges more slowly with the moderate selection pressure of rotating doses than if exclusively intermediate or high doses were applied."

This describes the model from 1999 (more details in 98 but text not copy-pasteable) In the model, after the pesticide is applied, resistant individuals survive and mate, producing offspring with a Gaussian distribution of quantitative resistance. After each generation of toxin application, the mean quantitative resistance shifts to a higher level that depends on the dose of the pesticide applied. We model the evolution of quantitative resistance according to the standard methods of population genetics (Falconer,1989). The major gene conferring resistance (the allele R) also increases in frequency at a rate determined by the relative fitnesses of the genotypes RR, rR and rr under a given toxin dose according to standard Mendelian genetic models (Crow, 1986). Individuals may survive the toxin as a result of either quantitative or major monogene resistance, or both. Alternatively, they may survive because they receive a dose of toxin lower than that intended, due to a number of possible factors. For example, field sprays cannot be applied as evenly as in the laboratory, some individuals may be sheltered by leaves or clods of dirt, and pests of different ages may have different levels of natural resistance.
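For the major-gene part of that description, a sketch of the usual one-locus selection recursion (random mating, viability selection) that "standard Mendelian genetic models" most likely refers to. This is not the Gardner code, which I don't have; the function name and fitness values are invented for illustration.

```r
# Hypothetical helper: frequency of R in the next generation given
# the current frequency p and relative fitnesses of the three genotypes
next_freq_sketch <- function(p, w_RR, w_RS, w_SS) {
  q <- 1 - p
  # Mean fitness under Hardy-Weinberg genotype proportions
  w_bar <- p^2 * w_RR + 2 * p * q * w_RS + q^2 * w_SS
  # R alleles come from all of RR and half of RS survivors
  (p^2 * w_RR + p * q * w_RS) / w_bar
}

# Follow the R allele for 50 generations under one made-up dose
freq_R <- numeric(51)
freq_R[1] <- 0.01
for (gen in 1:50) {
  freq_R[gen + 1] <- next_freq_sketch(freq_R[gen], w_RR = 1, w_RS = 0.6, w_SS = 0.5)
}
```

In a revolving dose scheme the three fitnesses would change with the dose applied in each generation, which is where the low and intermediate dose rotation would enter.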

It says to email Gardner for a copy of the code in true basic ... but she died a couple of years ago.

~ downloaded this for possible help estimating model parameters : Shi M and Renton M, Numerical algorithms for estimation and calculation of parameters in modelling pest population dynamics and evolution of resistance. Math Biosci 233(2):77–89 (2011).

Glunt2013 thesis UNDERSTANDING THE CONSEQUENCES OF SUB-LETHAL INSECTICIDE CONCENTRATIONS FOR INSECTICIDE RESISTANCE MANAGEMENT AND MALARIA CONTROL not very useful for us i think

I found this text about costs in a doc in Beths old folder : Although the persistence and high prevalence of the DDTR gene suggest that it does not carry a fitness cost (Abdalla, 2008)(Daborn 2002), laboratory studies have demonstrated that there is a noticeable reduction in fecundity and negative alteration of behaviour in Anopheles mosquitoes carrying the Dieldrin resistance gene (Rowland 1991(1), Rowland 1991(2)).

As it is unlikely that there is a fitness cost associated with possession of the DDTR allele in the absence of insecticide, this was omitted from our version of the model as in the Curtis model. However, the considerable fitness costs associated with dieldrin resistance genes mean that this could make a significant impact in the growth of the R allele in the population, especially when the lower exposure of males to the insecticide is considered.

Laboratory studies indicate that homozygote resistant mosquitos are 22% less fecund than their susceptible counterparts (Rowlings 1991), with additional fitness costs created by behavioural change caused by the nervous inhibition in the resistant phenotype. From the evidence presented in both Rowlings papers of the fecundity and behavioural costs, we estimated fitness cost of the resistant form in absence of insecticide to be around 40% (-0.4).

looking to find model params to be able to run the model (it would be good to have even just one run where we use field derived parameters)

looking at supplementary material from Rex consortium for empirical papers comparing mixtures and sequences : Of the 10 empirical papers using insecticides none of these use malaria vectors. Just one : Georghiou(1983) Culex quinquefasciatus (a vector of avian malaria and arboviruses)

potential sources of data
Curtis1985 DDT & HCH (lindane an organochlorine )
Curtis1989 from Kasumba lambda. RS & SS but not RR ?
Guillet2001 combined nets, mortality & deterrency, not by genotype
Kolaczinski2000 pyr & non pyr mortalities for RR/RS/SS KDR Table3 & 4
Asidi2004 Asidi2005 some genotype data for Ace1 but vlow numbers of RR
Chandre2000 pyr on kdr RR/RS/SS diff doses
Assogba2014 2 genes together, pyr and ddt, Table 3 %KD
Edi2012 pyr & ddt on kdr genotypes, some low sample sizes
*Essandoh2013 carbamate & fenitrothion(op), see in supp. material

see survival_from_genotype_livedead.xls ?

maybe I can produce a table of resultant eff/rr_/dom even if not for this

Can I combine data from Curtis1985 & Kolaczinski2000 ? ooo does Kolaczinski2000 give any info on cost ? kind of but alas RS mort in unexposed is lower than SS. But combining pyr & ddt is not good because of cross-resistance

In Levick SI we show how we derive inputs (but the exact algebra is not given).
Curtis1985 (in legend to Fig. 2) fitness in presence of insecticide
DDT SS 0.27 RS 0.31 RR 0.5
HCH SS 0 RS 0.0007 RR 0.43

check how I get effectiveness, dominance and resistance restoration from this in paper1 code.
in createInputMatrix() callibration = 1012 "B (HCH)", "A (DDT)"
Effectiveness (1-fitness of SS in insecticide)
input[27,1] <- 0.73 #Phi SS1 in A
input[29,1] <- 1 #Phi SS2 in B
Exposure 0.9
Selection coefficient (fitness RR-SS in insecticide)
input[39,1] <- 0.23 #in A
input[41,1] <- 0.43 #in B
Resistance restoration = selection coefficient / effectiveness
A(DDT) 0.23/0.73 = 0.315
B(HCH) 0.43/1 = 0.43

Dominance of restoration, fitnesses in insecticide (RS-SS)/(RR-SS)
A(DDT) (0.31-0.27)/(0.5-0.27) = 0.17
B(HCH) (0.0007-0)/(0.43-0) = 0.0016
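The algebra above can be wrapped up as a small function. This is only a sketch of the three formulas written out in these notes, with a made-up name, and is not the package's inputs_from_gen_fit().

```r
# Hypothetical helper: model inputs from genotype fitnesses (survival)
# measured in the presence of the insecticide
derive_inputs_sketch <- function(fit_SS, fit_RS, fit_RR) {
  effectiveness <- 1 - fit_SS          # proportion of SS killed
  selection_coef <- fit_RR - fit_SS    # fitness advantage of RR over SS
  c(
    effectiveness = effectiveness,
    rr_restoration = selection_coef / effectiveness,
    dominance = (fit_RS - fit_SS) / selection_coef
  )
}

# Curtis1985 Fig. 2 legend values
ddt <- derive_inputs_sketch(fit_SS = 0.27, fit_RS = 0.31, fit_RR = 0.5)
hch <- derive_inputs_sketch(fit_SS = 0, fit_RS = 0.0007, fit_RR = 0.43)
```

Note that dominance is undefined when RR and SS have the same fitness, and resistance restoration is undefined when effectiveness is 0, so real data with those values would need handling separately.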

I had a little panic that the HCH value should be 0.16 rather than the 0.0016 in Levick et al., because I copied 0.07% as 0.07 when it should be 0.0007.

If I set dominance of restoration to 0.16 the plot becomes very different from Curtis Fig. 2.

input[34,1] <- 0.17 #Dominance coefficient in A
input[37,1] <- 0.0016   #Dominance coefficient in B

Sent email to Ian & Beth.

Solved, apologies, my mistake. The equation is correct, but I mistyped the raw value for RS survival from Curtis. It is 0.07% so should be 0.0007 :

0.0007 is 1 mosquito surviving from 1500, seems a bit odd too (looking up original paper it was 2/7392)

Rawlings1981 that the data came from, good paper, may have other stuff we can use : There was no heterozygote survival at the former dose, and only 2 out of 7392 survived on exposure to the lower dose of HCH-0.027% or 0.07% when corrected for the survival of the heterozygote unsprayed controls. This estimate has an extremely high standard error, since it is based on only two survivors.

Note that this was for different anopheles in Pakistan.

inputs_from_gen_fit() function to calc inputs from these data :

15/12/17 Kolaczinski2000 pyr & non pyr mortalities for RR/RS/SS/ KDR Table3 & 4

saved table as csv in extdata

Chandre2000 I can use % knockdown

Could combine these 2 Kolaczinski2000 pyr & ddt Essandoh2013 carbamate & Ace1

1 Kolaczinski2000 etofenprox (pyr) kdr eff 0.43, rr_ 0.63, dom 0.30

2 Essandoh2013 An. gambiae bendiocarb Ace1, eff 0.98, rr_ 0.84, dom 0.66

thought I could take frequency from the field data but it's already v high, e.g. 0.8 for pyr & 0.2 for carb

I'm tempted just to put 2 plots in, or maybe just one.

I1 pyr red

I2 car blue

same starting frequencies, resistance to CAR rises quickly because it has high effectiveness and dominance, mixture better than the sequence because car provides high protection to the pyr

etofenprox

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.3 , h.RS2_0B = 0.66 , exposure = 0.5 , phi.SS1_A0 = 0.43 , phi.SS2_0B = 0.98 , rr_restoration_ins1 = 0.63 , rr_restoration_ins2 = 0.84 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), cex.axis = 0.8, addLegend=FALSE, labelMixSeqRatio = 1 )

start pyr same and car lower

mixture still slower

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.001 , h.RS1_A0 = 0.3 , h.RS2_0B = 0.66 , exposure = 0.5 , phi.SS1_A0 = 0.43 , phi.SS2_0B = 0.98 , rr_restoration_ins1 = 0.63 , rr_restoration_ins2 = 0.84 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), cex.axis = 0.8, addLegend=FALSE, labelMixSeqRatio = 1 )

exposure 0.8

mix & seq now the same

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.3 , h.RS2_0B = 0.66 , exposure = 0.8 , phi.SS1_A0 = 0.43 , phi.SS2_0B = 0.98 , rr_restoration_ins1 = 0.63 , rr_restoration_ins2 = 0.84 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), cex.axis = 0.8, addLegend=FALSE, labelMixSeqRatio = 1 )

18/12/17

Ian sent better calculation for cost.

Can I derive cost from the field data ?

Kolaczinski2000 Alpha-cypermethrin (pyr) including control mortality. eff 0.29, rr_ 1, dom 0.54

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01 , h.RS1_A0 = 0.54 , h.RS2_0B = 0.66 , exposure = 0.5 , phi.SS1_A0 = 0.29 , phi.SS2_0B = 0.98 , rr_restoration_ins1 = 1 , rr_restoration_ins2 = 0.84 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), cex.axis = 0.8, addLegend=FALSE, labelMixSeqRatio = 1 )

done 1. calc cost from Kolaczinski2000
cost 0.19 domcost 1 (truncated from 2)
Can I pass cost too ? z.RR1_00 = 0.19
How to set dominance of cost ? I think h.RS1_00 Dominance coefficient locus1 in 00

in setInputOneScenario()
input[ 32 ] <- h.RS1_00
in runModel2.r
a_dom[1,'no'] <- input[32,scen_num]
in fitnessSingleLocus.r
a_fitloc[ paste0('RS',locusNum), 'no'] <- 1 - (a_dom[locusNum, 'no'] * a_cost[locusNum])

So yes it is used in calc.
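A small sketch of what that means for the three genotypes when there is no insecticide. The RS line follows the fitnessSingleLocus.r expression traced above; SS = 1 and RR = 1 - cost are my assumptions about the rest of the function, and the helper name is made up.

```r
# Hypothetical helper: single-locus fitnesses in the absence of insecticide
# cost     = fitness cost of RR (z.RR1_00 in the inputs)
# dom_cost = dominance of that cost (h.RS1_00 in the inputs)
fitness_no_insecticide_sketch <- function(cost, dom_cost) {
  c(
    SS = 1,                       # assumed: no cost for susceptibles
    RS = 1 - dom_cost * cost,     # as in fitnessSingleLocus.r
    RR = 1 - cost                 # assumed: full cost for homozygotes
  )
}

fit_dominant  <- fitness_no_insecticide_sketch(cost = 0.19, dom_cost = 1)
fit_additive  <- fitness_no_insecticide_sketch(cost = 0.19, dom_cost = 0.5)
fit_recessive <- fitness_no_insecticide_sketch(cost = 0.19, dom_cost = 0)
```

With dominance of cost at 1 the heterozygote pays the full 0.19, which matters most while the allele is rare because nearly all copies are then in heterozygotes.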

without cost

with cost, messes up the plot completely, because the frequency of resistance to insecticide1 declines under the cost

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01, z.RR1_00 = 0.19, h.RS1_00 = 1, h.RS1_A0 = 0.54 , h.RS2_0B = 0.66 , exposure = 0.5 , phi.SS1_A0 = 0.29 , phi.SS2_0B = 0.98 , rr_restoration_ins1 = 1 , rr_restoration_ins2 = 0.84 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), cex.axis = 0.8, addLegend=FALSE, labelMixSeqRatio = 1 )

when dominance of cost is reduced to 0.5 it does produce an instructive plot where the freq declines to start and then increases, makes mix much better than seq.

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01, z.RR1_00 = 0.19, h.RS1_00 = 0.5, h.RS1_A0 = 0.54 , h.RS2_0B = 0.66 , exposure = 0.5 , phi.SS1_A0 = 0.29 , phi.SS2_0B = 0.98 , rr_restoration_ins1 = 1 , rr_restoration_ins2 = 0.84 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), cex.axis = 0.8, addLegend=FALSE, labelMixSeqRatio = 1 )

etofenprox also declines to loss

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01, z.RR1_00 = 0.19, h.RS1_00 = 1 , h.RS1_A0 = 0.3 , h.RS2_0B = 0.66 , exposure = 0.5 , phi.SS1_A0 = 0.43 , phi.SS2_0B = 0.98 , rr_restoration_ins1 = 0.63 , rr_restoration_ins2 = 0.84 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), cex.axis = 0.8, addLegend=FALSE, labelMixSeqRatio = 1 )

Alpha-cypermethrin (pyr) incl cost, exposure 0.8

runcurtis_f2( max_gen=500, P_1 = 0.01 , P_2 = 0.01, z.RR1_00 = 0.19, h.RS1_00 = 1, h.RS1_A0 = 0.54 , h.RS2_0B = 0.66 , exposure = 0.8 , phi.SS1_A0 = 0.29 , phi.SS2_0B = 0.98 , rr_restoration_ins1 = 1 , rr_restoration_ins2 = 0.84 , addCombinedStrategy = FALSE, strategyLabels = c('s','','a','m'), cex.axis = 0.8, addLegend=FALSE, labelMixSeqRatio = 1 )

old way of doing table

Insecticide/Mutation|survival RR|survival RS|survival SS|Effectiveness|Resistance restoration|Dominance of restoration
---|---|---|---|---|---|---
Pyrethroid/Kdr|0.84|0.65|0.57|0.43|0.63|0.30
Carbamate/Ace1|0.84|0.56|0.02|0.98|0.84|0.66

2/1/2018

fig1new of the 9 genotypes

create in word, copy image to powerpoint, save as tiff, reimport to Rmd (but can just submit tiff from ppt directly to journal) thought about trying to do with kable, probably possible but bit of a rush now ...

3/1/2018

Final discussion revisions.

modifications to Rmd for resubmission

done ~ add fig11 (field data)
done ~ add text from field_data_in.Rmd
done ~ add new fig 1, was made in word, I might need to paste into ppt to save as a tiff
done ~ renumber figs in legends
done ~ renumber fig references in text
done ~ renumber tables (new tables just added on end)
done ~ deal with typos already sorted in word version
done ~ sort intro to Fig 1 & mention of 9 genotypes
done ~ add in new text sections
done ~ italicise species names in references (probably in Mendeley), Bhatt & Tallebois
done ~ add references to R packages

letter to editor

done ~ add in reviewers text & put responses in

mods in word post knitting before submission :

done ~ delete figures
done ~ Ctrl A, Page layout, line nums, double space.
done ~ Insert, page number.
done ~ check that tables and other sections don't cross pages but do not use page breaks in your manuscript
done ~ modify any table column widths that need
done ~ reduce font of footer to Table 6
done ~ search for in references and italicize species names
done ~ save an accepted changes version
FINAL READ THROUGH
done ~ 'compare' to previous submitted version
done ~ accept formatting changes

12/1/18 ~ created new shiny app to include cost and to be cited from MJ paper : resistmixseq

dominance of cost :

#' @param h.RS1_00 Dominance coefficient locus1 in 00

#' @param h.RS2_00 Dominance coefficient locus2 in 00

h.RS2_00 = 0 ,

Ooooo ! by default they are set to 0, so that cost is completely recessive. Maybe I should set the default to 0.5. I wonder if this would make major diff. to the paper2 figs ??

Try adding it to the UI now so I can explore.

seems probable that dominance of cost doesn't have a major effect. In which case for the paper I think I should just not mention results of dominance of cost ?

15/1/18 Making last changes to MJ paper : Ian's suggested addition to Table 2 : A value of zero means the resistance allele is recessive (fitness of SR genotype is identical to that of SS) while a value of 1 indicates the resistance allele is dominant (fitness of SR is the same as RR)

Reformatting references : New version of csl file downloaded from here : https://github.com/citation-style-language/styles/blob/master/biomed-central.csl

e.g. it has et-al max as 6

done ~ reknit
done ~ copy refs 8 onwards into docx
done ~ check for any
done ~ remove page numbers or doi
~ add accessed date to web pages

Modifying colour of hr() lines in shiny plots https://stackoverflow.com/questions/43592163/horizontal-rule-hr-in-r-shiny-sidebar

fighting with cowplot error on shinyapps.io
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called ‘cowplot’

its not used in the app, but is used in plot_fit_variance.r ???

I tried reinstalling from CRAN

Is it related to the msg I get when building ? Warning message: replacing previous import ‘cowplot::ggsave’ by ‘ggplot2::ggsave’ when loading ‘resistance’

removed an import in the roxygen for a function

Aha now it has moved onto : gridExtra

solution was to have in Description:imports AND in namespace via roxygen import or importFrom
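For my own reference, roughly what "in both places" looks like. The DESCRIPTION lines are shown as comments, and the function is a made-up placeholder rather than anything actually in the resistance package; only the pattern matters.

```r
# DESCRIPTION (package root) lists the dependency, e.g.
#   Imports:
#       cowplot,
#       gridExtra

# ... and the roxygen block of a function that uses it asks for the
# matching NAMESPACE entry (written when the docs are regenerated):

#' Placeholder two-panel plot, for illustration only
#'
#' @param p1 first plot object
#' @param p2 second plot object
#' @importFrom gridExtra grid.arrange
#' @export
plot_two_panels_sketch <- function(p1, p2) {
  gridExtra::grid.arrange(p1, p2, ncol = 2)
}
```

After editing the roxygen tags the NAMESPACE file has to be regenerated (e.g. with devtools::document()) before reinstalling and redeploying.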

tweaked UIs

resistmixseq : choice to run

added About tab to resistmixseq

Now why is the shinyapp not working ? It makes me wait a very long time and then : ERROR: An error has occurred. Check your logs or contact the app author for clarification

The logs don't clarify. Does it need to reinstall the resistance package each time, in which case I should probably cut down examples etc. ?

does install github work locally ? yes and runUImix works locally ??

Aha!, Once logs refreshed : Warning: Error in loadNamespace: there is no package called ‘markdown’

added this above runUImix()

#' @import shiny markdown

18/1/2018 ~ MJ paper copyedited version resubmitted http://MALJ.edmgr.com/ s.. w77 Manuscript Number: MALJ-D-17-00632

23/1/2018

I'm looking into rotations and mostly I find very little difference from sequences even with costs or refuges. In a sequence resistance builds up once and then declines, in rotations it builds up in steps and declines in between but the time-to-resistance is the same for both.

Then I refound this idea in GPIRM that costs of resistance are likely to decline when frequency of resistance gene has been high.

"Insecticide resistance management strategies must, however, be implemented before the resistance gene becomes common and stable in the population; otherwise, the resistant gene will not recede even if use of the insecticide causing selection pressure is discontinued. As demonstrated by a study of blowflies by McKenzie and Whitten in 1982 (3), fitness cost is not an intrinsic property of the gene. Therefore, if that gene is allowed sufficient time to become common in a population, the rest of the genome will adapt to incorporate it without a significant fitness cost. At this point, even if the selection pressure is removed, the resistance gene will remain in the population."

If this was the case it would probably favour rotations over sequences.

Do you have any thoughts ? Do you think it likely that costs will decline over time at high resistance frequencies ?

I looked at the 1982 paper (attached) and I don't think it supports the idea that costs decline over time (but it does seem to have data on relative fitness of RR,RS,SS that we could use in the WoS paper, gives evidence for dominance changing over time and could be used to derive our model parameters.)

Thanks Andy

That result should put the cat amongst the pigeons i.e. little difference between sequences and rotations.

There are examples of “compensatory” mutations offsetting the fitness cost of a mutation. Often from bacteria where populations can be very high and the population is asexual so that the mutation and modifier are always linked. It is less well studied in larger animals. There may be examples from IR but I am unaware. Often it is a post hoc explanation plucked out the air to explain why there is no apparent fitness effect i.e. “there were fitness effect, but they must have been modified”.

I like the 1982 paper… I have put it in the WoS folder with a note to discuss it.

Hi Ian,

I've been thinking about the link between the wos data and our modelling work.

Is it right to say the following :

In the WoS plots, where the mortality of the SS changes over time this is a change in insecticide effectiveness. Where the mortality of the RR changes over time (and the SS is not changing) this suggests a change in Resistance Restoration.

Currently in our model both effectiveness and resistance restoration are kept constant over time and we don't talk about them changing. Is that something we want to think about ?

Andy

Hi Andy

That is all true. It's why geneticists always stress their estimates are for a specific mutation in a specific environment, the latter being insecticide concentration in our case.

But we can think of fitness and restoration as being mean values i.e. averaged across the whole range of concentrations.

Does that sound reasonable?

~ created simple annotated graphic of pap2 fig 6 (for tweeting & presentations) see summary_fig_pap2_labelled.png, which is saved from the ppt after the graphic bit created in the similarly named Rmd.

From Groeters 2000: The most important practical implication of this study comes from simulation results showing that the ability of refuges to delay evolution of resistance is not influenced greatly by the relative contributions of major and minor genes for resistance (Fig. 5). Instead, under the assumptions of each of the four models of inheritance examined (i.e., models A–D), refuges were more effective when selection was intense (1% survival) than when selection was weaker (10 or 50% survival) (Fig. 5). Thus, for predicting the success of refuges, understanding of selection intensity is critical, but knowledge of the number of loci and their relative contributions to resistance is not.

LSM is larval source management

25/1/2018

Anopheles sinensis genome database : http://www.asgdb.org/index.php/maps

26/1/2018

Kafy 2017, impact of IR in Sudan :

A positive finding was modest evidence of retardation in the speed of evolution of insecticide resistance when two active ingredients with differing modes of actions were used in the LLIN + IRS arm. This is important for malaria control program managers as they struggle to develop plans for the monitoring and management of insecticide resistance in line with WHO GPIRM recommendations (6). Curiously, across our study site, there was a significant decrease in the Vgsc-1014F resistance marker frequency. While there are a number of instances of kdr markers sweeping rapidly to fixation (30–32), the obverse trend shown here has not been reported elsewhere. There are numerous studies showing that in An. arabiensis, Vgsc-1014F is a strong predictor of pyrethroid resistance (33), so this may suggest a decline in its importance in conferring a resistant phenotype due to the emergence of additional resistance mechanism(s).

29/1/2018 After the workshop Dave Weetman was concerned that Bayer had misinterpreted our results, because resistance to pyrethroids is already at very high starting frequencies.

But when I tried setting the starting frequency to 1, it shows that the pyrethroid still provides some protection to the new AI:

runcurtis_f2( max_gen=500, P_1 = 1 , P_2 = 0.01 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 0.5 , phi.SS2_0B = 0.5 , rr_restoration_ins1 = 0.5 , rr_restoration_ins2 = 0.5 , z.RR1_00 = 0 , z.RR2_00 = 0 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )

I should send that plot to Dave.

plus I can change max starting freq from 0.1 to 1 in resistmixseq

31/1/2018 If you set the frequency for an insecticide to 1, that insecticide can still protect the other in a mixture, unless you set resistance restoration to 1 too. In the latter case no RR will be killed.

runcurtis_f2( max_gen=500, P_1 = 1 , P_2 = 1e-04 , h.RS1_A0 = 0.5 , h.RS2_0B = 0.5 , exposure = 0.5 , phi.SS1_A0 = 1 , phi.SS2_0B = 1 , rr_restoration_ins1 = 1 , rr_restoration_ins2 = 0.5 , z.RR1_00 = 0 , z.RR2_00 = 0 , addCombinedStrategy = FALSE, strategyLabels = c('seq','','adapt','mix2') )

Ah yes I get it. With resistance alleles at fixation the whole popn is RR for the one insecticide. Therefore when resistance restoration is 1, no RR are killed and it doesn't matter what the dominance level is (because there are no SR). In that case the effectiveness level probably makes no difference either.
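
A minimal sketch of the reasoning above. It assumes the single-locus fitness definitions W_SS = 1 - effectiveness, W_RR = W_SS + rr_restoration * effectiveness and W_RS = W_SS + dominance * rr_restoration * effectiveness; these definitions and the helper names `fitness_exposed()` and `mean_fitness()` are assumptions for illustration, not functions from the package.

```r
# Hypothetical helper: fitness of exposed genotypes at one locus.
# ASSUMPTION: resistance restoration gives RR back a proportion of the
# fitness that the insecticide removes from SS.
fitness_exposed <- function(effectiveness, rr_restoration, dominance) {
  w_ss <- 1 - effectiveness
  s <- rr_restoration * effectiveness  # fitness restored in RR
  c(SS = w_ss, RS = w_ss + dominance * s, RR = w_ss + s)
}

# Mean fitness of exposed individuals at resistance allele frequency p,
# with genotypes in Hardy-Weinberg proportions.
mean_fitness <- function(p, w) {
  geno <- c(SS = (1 - p)^2, RS = 2 * p * (1 - p), RR = p^2)
  sum(geno * w[names(geno)])
}

# At fixation (p = 1) only RR is present, so dominance drops out.
w_h0 <- fitness_exposed(effectiveness = 1, rr_restoration = 0.5, dominance = 0)
w_h1 <- fitness_exposed(effectiveness = 1, rr_restoration = 0.5, dominance = 1)
mean_fitness(1, w_h0)
mean_fitness(1, w_h1)

# With resistance restoration of 1 the RR fitness is 1 whatever the
# effectiveness, i.e. no RR are killed.
fitness_exposed(effectiveness = 0.3, rr_restoration = 1, dominance = 0.5)[["RR"]]
fitness_exposed(effectiveness = 1, rr_restoration = 1, dominance = 0.5)[["RR"]]
```

Under these assumptions the protection an insecticide at fixation can offer depends only on how many RR it still kills, which is set by resistance restoration together with effectiveness.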

So when an allele is at fixation I would expect resistance restoration to be the key parameter in determining how much protection that old insecticide can provide to a new one. But is it OK to use a single gene representation for the old insecticide at fixation even if there are multiple genes ?

In practice, might including a pyrethroid within the mixture cause the resistance phenotype (intensity?) to increase over time as more genes are selected for ?

12/2/18 Can the model do assortative mating ? For luca and following the Rowland1991 paper I think I would need to modify the createGametes() function.

As a part of that he has to outline follow on funding and outlines a 5 year BBSRC fellowship application that includes :

4) development of a mathematical model merging fitness data, sub-lethal effects of insecticides, and parasite-vector interactions. The goal is to predict the vector role of populations having different insecticide resistance mechanisms and will be available as an open-source tool on the LSTM website;

I suggest he could replace that with : 4) incorporation of results on assortative mating and fitness costs into an existing population genetics model of the evolution of insecticide resistance;

From Sternberg & Thomas 2017 (insights from agriculture). I think our model disagrees with the 2nd sentence below, that mixtures require resistance alleles to be rare and fully recessive.

Co-formulations can be difficult to develop and mixtures only offer an advantage as long as the component insecticides both continue to work, such that the efficacy of mixtures for controlling resistance depends on the co-persistence of the component insecticides. Additionally, resistance alleles in the insects must be rare and fully recessive, to avoid strong selection for double heterozygous individuals and the rapid evolution of resistance to both insecticides (Curtis, 1985; Tabashnik, 1989). Similar caveats exist for the use of mosaic and rotation strategies, and modeling efforts aimed at evaluating different strategies for insecticide resistance management have demonstrated that the outcomes are highly sensitive to parameters that are system specific (Lenormand & Raymond, 1998; Slater, Stratonovitch, Elias, Semenov, & Denholm, 2016), such that there is no universal best strategy.

Because of the situation-specific nature of insecticide resistance management plans, public health may have an advantage over agriculture. Agricultural pest management is concerned with hundreds of species, and in many cases, detailed data are not available for each pest species (and potential nontarget organisms). Compared to agriculture, public health is concerned with a small number of insect species. This should enable a strongly data-driven approach to selecting resistance management strategies, but in reality, very little is currently known. GPIRM does highlight the success of a rotation-based management strategy in one public health example: the West African Onchocerciasis Control Program (OCP).

2.3 | Summary • Insecticide resistance management (IRM) strategies are urgently needed for vector control.

• The feasibility of IRM depends on having multiple insecticides and vector control tools; vector control is currently dependent almost exclusively on a single insecticide class.

• The success of an IRM strategy is highly dependent on ecological and biological specifics; there is not yet a solid evidence base of recommending one strategy over another.

• If control is achieved via the use of multiple tools with diverse modes of action, not only will selection for insecticide resistance be reduced but also, the impact of resistance will likely be less severe. The pending resistance crisis creates an urgent need to develop and implement integrated, multitactic IVM strategies that parallel IPM in agriculture.

Because it is so difficult to achieve adequate sensitivity, in many cases by the time resistance is detected, it will already be too late to implement resistance management measures and slow the spread of resistance.

However, difficulties with monitoring resistance need not prevent the implementation of IRM strategies and it is possible, and perhaps even preferable, to implement strategies that do not rely on early detection of resistance. Although the OCP did conduct frequent entomological surveys on black flies, the program relied on set rotation program determined beforehand based on cost as well as other factors (Hougard et al., 1993). In other words, the decision on which insecticide to use was not based on the detection of resistance, other than the initial observation that control with a single insecticide was failing due to resistance (Curtis et al., 1993).

The best strategy may be to use resistance monitoring to inform which insecticides can be used in a rotation, but not when they should be used.

Perhaps the most effective persistence profile from a resistance management perspective is a sustained high dose with rapid decay—in other words, “hit them hard or not at all” (REX Consortium, 2013).

Efforts to develop long-lasting products for vector control might be counterproductive with regard to resistance management if the result is a long and shallow decay curve. The ideal persistence profile from a resistance management perspective needs to be balanced, however, with the need to provide continuous protection from disease. A product with rapid decay could leave people unprotected from disease if too much time elapses between treatments.

In public health, population growth does not necessarily translate to more disease transmission. For example, increasing densities of vector populations might influence phenotypic traits that make better or worse disease vectors (Moller-Jacobs, Murdock, & Thomas, 2014; Russell et al., 2011; Shapiro, Murdock, Jacobs, Thomas, & Thomas, 2016).

To our knowledge, there is only one field trial aimed specifically at testing resistance management strategies against malaria mosquitoes (Penilla et al., 2007). Perhaps worryingly, this trial revealed no clear benefit of either insecticide rotations or mosaics in slowing the spread of resistance when compared to monotherapy. However, these results do not necessarily mean that rotations or mosaics cannot be effective, but rather highlights a critical gap in understanding when, how, and why such strategies might work.

Penilla, R. P., Rodríguez, A. D., Hemingway, J., Trejo, A., López, A. D., & Rodríguez, M. H. (2007). Cytochrome P450-based resistance mechanism and pyrethroid resistance in the field Anopheles albimanus resistance management trial. Pesticide Biochemistry and Physiology, 89(2), 111–117. https://doi.org/10.1016/j.pestbp.2007.03.004.

Field testing of IRM strategies is difficult but necessary. GPIRM makes numerous references to approaches such as rotations, mixtures, or mosaics, as if they are proven strategies and have equal merit. At present, there are no sound empirical data to inform the selection of any one strategy over another.

Arguably, one of the most successful resistance management program in agriculture is the management of Bt resistance in genetically modified (GM) crop in the United States (Bates, Zhao, Roush, & Shelton, 2005; Tabashnik, Gassmann, Crowder, & Carriére, 2008; Tabashnik et al., 2003). This is a high-dose/ refuge strategy based on expression of high doses of Bt toxin within GM crops, combined with non-Bt refuges, which can be non-GM cotton or corn or even another plant used by the target pest. The high dose of Bt is designed to kill fully susceptible and heterozygote-resistant insects. Any homozygote-resistant individuals will be initially rare, and there is a high probability that they will mate with the numerically dominant susceptible individuals emerging from the non-Bt refuge. These matings result in heterozygotes that are functionally susceptible to the Bt crop, effectively clearing resistance alleles from the pest population. Factors that have likely contributed to the success of this strategy include low initial frequencies of resistance alleles, recessive inheritance of those resistance alleles, and fitness costs associated with resistance (Tabashnik et al., 2003).

16/2/18 Dengela 2018 PMI data : Multi-country assessment of residual bio-efficacy of insecticides used for indoor residual spraying in malaria control on different surface types: results from program monitoring in 17 PMI/USAID supported IRS countries

Insecticide residual activity data collected using wild mosquito populations from areas of known or suspected resistance were not included in this report.

Malaria journal paper out yesterday ! https://doi.org/10.1186/s12936-018-2203-y

Mathematical polygenic paper : Multi-gene-loci inheritance in resistance modeling. Dirk Langemann, Otto Richter, Antje Vollrath. https://doi.org/10.1016/j.mbs.2012.11.010

14/3/18

Composing a few paragraphs on IRM for the WHO insecticide resistance report.

"If control is achieved via the use of multiple tools with diverse modes of action, not only will selection for insecticide resistance be reduced but also, the impact of resistance will likely be less severe." (Sternberg, 2017)

"Field testing of IRM strategies is difficult but necessary. GPIRM makes numerous references to approaches such as rotations, mixtures, or mosaics, as if they are proven strategies and have equal merit. At present, there are no sound empirical data to inform the selection of any one strategy over another." (Sternberg, 2017)

Rex consortium review

REX Consortium. Heterogeneity of selection and the evolution of resistance. Trends Ecol Evol. 2013;28:110–8.

Responding to Graham's request about Sudo 2017 :

Hi Graham, Dave,

Apologies for the delayed reply, I was in Peru last week doing some GIS training. I gave a short talk to the insecticide resistance group there about our work, first time I've talked about it in my rough Spanish. They were interested in potential solutions to the resistance issues they are facing.

Thanks for forwarding the Sudo 2017 paper on. We saw it just after we had submitted our Malaria Journal paper. We will refer to it when we are writing up our rotations work.

There are some similarities to our work and some differences. Firstly, in relation to rotations, which we are working on now, they use a tighter definition of rotation as changing every generation, which is of less relevance for vector control.
"The rotation strategy, which uses one insecticide on one pest generation and a different one on the next"

They use two patches, one treated and one a refuge as we do for rotations. They also look at pest suppression as well as the evolution of resistance which is interesting.

Their model is more detailed than ours in terms of life history. This has the advantage that they can change life history parameters, but the disadvantages that the model is more complicated and that they have to specify parameter values for which there may be few data.

They apply insecticides during juvenile and/or adult stages allowing them to represent larviciding and adult control.

"In the general model, populations pass through eight events, sequentially, (I) juvenile selection, (II) densitydependent survival, (III) pre-mating dispersal, (IV) pre-mating selection, (V) mating, (VI) post-mating dispersal, (VII) post-mating selection, and (VIII) oviposition."

In our mixtures papers we show insecticide effectiveness to be the most important input determining whether mixtures or sequences are favoured. It seems they only run their model with two values of effectiveness 1 and 0.9 (they express this as survival s of 0 and 0.1). At these high values of effectiveness it is consistent with our work that they find mixtures to be favoured most of the time.

I'll look at this again in more detail when we have our rotation results. We may be able to look at Table 2 which specifies the different life histories and when insecticide selection occurs to work out which of their scenarios are most applicable to mosquitoes.

Their code is available and I downloaded it a while ago for a quick look so we have the potential to run their model for comparison, however the code appears quite complicated and not very well documented so this may not be worthwhile.

Best wishes,

Andy

5/6/2018 adding plot_wos() to generate idealised diagrams of windows of selection

want to use it to generate inputs for single insecticide resistance runs to calculate time-to-resistance.

paper: Brooke2010, Major effect genes or loose confederations? The development of insecticide resistance in the malaria vector Anopheles gambiae. Good clear summary of evidence about single v multi-gene & costs & dominance in An. gambiae.

In An. gambiae, insecticide resistance phenotypes usually develop under the control of single major genetic factors. Those factors involving mutations in target site loci are likelier to reduce fitness and are only advantageous to carriers in the presence of insecticide. Selection generally acts against these alleles and they tend to drift out of populations in the absence of insecticide. However, a combination of factors producing a single resistance phenotype also occurs in some instances. These factors invariably involve metabolic detoxification, are less likely to reduce reproductive and physiological fitness in carriers, and tend to be stable over time, even in the absence of insecticide selection.

11/6/2018 Ian concerned that resistance thresholds aren't reached almost immediately when mortality of SS & SR is 100% (i.e. on left of wos plots).

I reran with exposure set to 0.99 and then indeed the time-to-resistance is super-quick for the whole of the window of selection.

(Note that the sim fails if exposure is set to 1.) The first warning is :
In selection(a_gtypes = a_gtypes, a_fitgen = a_fitgen, calibration = calibration) : m genotype frequencies after selection total != 1 NaN
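
A rough sketch of why time-to-resistance shortens so sharply as exposure approaches 1 when SS and SR are all killed. This is a single-locus recursion using the same bookkeeping as Ian's selective-advantage code further down these notes, not the package's runModel2(). The function names, the threshold of 0.5 and the default of RR fitness 1 are assumptions for illustration.

```r
# Hypothetical single-locus recursion (NOT runModel2()).
# Genotypes are in Hardy-Weinberg proportions before selection.
next_freq <- function(p, exposure, w_ss, w_rs, w_rr) {
  rr <- p^2
  rs <- 2 * p * (1 - p)
  ss <- (1 - p)^2
  r_next <- exposure * (rr * w_rr + 0.5 * rs * w_rs) +
    (1 - exposure) * (rr + 0.5 * rs)
  s_next <- exposure * (0.5 * rs * w_rs + ss * w_ss) +
    (1 - exposure) * (0.5 * rs + ss)
  r_next / (r_next + s_next)
}

# Generations until the resistance allele frequency reaches `threshold`.
# ASSUMPTION in the defaults: exposed SS and RS are all killed, RR all survive.
time_to_resistance <- function(exposure, start_freq = 0.001,
                               w_ss = 0, w_rs = 0, w_rr = 1,
                               threshold = 0.5, max_gen = 5000) {
  p <- start_freq
  for (gen in seq_len(max_gen)) {
    p <- next_freq(p, exposure, w_ss, w_rs, w_rr)
    if (!is.finite(p)) return(NA_integer_)  # 0/0 if every genotype is killed
    if (p >= threshold) return(gen)
  }
  NA_integer_  # threshold not reached within max_gen
}

t_half <- time_to_resistance(exposure = 0.5)
t_high <- time_to_resistance(exposure = 0.99)
c(exposure_0.5 = t_half, exposure_0.99 = t_high)
```

In this sketch the unexposed fraction keeps feeding S alleles back into the population, so with recessive resistance the threshold is reached far sooner at exposure 0.99 than at 0.5.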

reorganising plot_wos()

wos_diagram()
dfout <- wos_sim()
wos_plot_timetor(dfout)
wos_plot_input(dfout, input='dom_resist', label='dominance')

12/6/2018 reorganising wos_diagram()

13/6/2018 Code from Ian to calculate selective advantage for WoS paper :

# user-defined inputs

exposed = 0.5 # the proportion exposed
R_freq = 0.001 # frequency of the resistance allele. This will affect results if resistance is recessive because it determines the proportion of R alleles in RR genotypes
fitness_RR = 0.9
fitness_RS = 0.2
fitness_SS = 0.1

# now the calculations:

R_next_gen = exposed*(R_freq*R_freq*fitness_RR + 2*R_freq*(1-R_freq)*0.5*fitness_RS) + (1-exposed)*(R_freq*R_freq + 2*R_freq*(1-R_freq)*0.5)

S_next_gen = exposed*(2*R_freq*(1-R_freq)*0.5*fitness_RS + (1-R_freq)*(1-R_freq)*fitness_SS) + (1-exposed)*(2*R_freq*(1-R_freq)*0.5 + (1-R_freq)*(1-R_freq))

R_freq_next_gen = R_next_gen/(R_next_gen+S_next_gen)

selective_advantage = (R_freq_next_gen/R_freq) - 1 # so if no increase, advantage = 0
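
The calculation above can be wrapped as a function with the multiplication signs written out explicitly. This is a sketch only: the name `selective_advantage_sketch()` is hypothetical and it is not claimed to match the package's wos_advantage().

```r
# Hypothetical wrapper around Ian's calculation (not the package function).
# Returns the proportional change in resistance allele frequency over one
# generation, so 0 means no increase.
selective_advantage_sketch <- function(R_freq, exposed,
                                       fitness_RR, fitness_RS, fitness_SS) {
  freq_RR <- R_freq * R_freq
  freq_RS <- 2 * R_freq * (1 - R_freq)
  freq_SS <- (1 - R_freq) * (1 - R_freq)

  # R and S alleles contributed by the exposed and unexposed fractions;
  # heterozygotes contribute half of their alleles to each
  R_next_gen <- exposed * (freq_RR * fitness_RR + 0.5 * freq_RS * fitness_RS) +
    (1 - exposed) * (freq_RR + 0.5 * freq_RS)
  S_next_gen <- exposed * (0.5 * freq_RS * fitness_RS + freq_SS * fitness_SS) +
    (1 - exposed) * (0.5 * freq_RS + freq_SS)

  R_freq_next_gen <- R_next_gen / (R_next_gen + S_next_gen)
  (R_freq_next_gen / R_freq) - 1
}

# the example inputs from the notes
adv <- selective_advantage_sketch(R_freq = 0.001, exposed = 0.5,
                                  fitness_RR = 0.9, fitness_RS = 0.2,
                                  fitness_SS = 0.1)
adv  # about 0.091

# sanity check: equal fitnesses give no advantage
selective_advantage_sketch(0.001, 0.5, 1, 1, 1)
```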

wos_advantage()

Now within wos_diagram() I could offer to call wos_advantage() instead (or in addition) to wos_sim()

(and I could generalise wos_plot_timetor() to be able to plot advantage)

aha, wos_plot_timetor() is already generalised to accept x & y

plotting log selective advantage gives clearer mirror image of time-to-resistance at startfreq 0.01 but not very good at 0.001.

14/6/2018

Ian suggested relative fitness, but log of that goes back to not giving a good prediction of time-to-resistance at startfreq 0.01

for selective advantage and relative fitness see wos_diagram(adv=TRUE)

remember that when rr_cost > 0 this can cause runModel2 to crash

~ run wos plots for the field data

wow sorted tidyeval :
enquo : to quote a passed unquoted variable within a function
!! : to unquote a quoted variable for dplyr

wow2 : ggplot theme vars can be accessed (e.g. gg_gt$theme$legend.position) and applied to other objects using theme_update

19/6/2018
done ~ get wos_sim() and wos_advantage() working better together
done ~ get wos_advantage() to work when no SR

20/6/2018 See recent paper suggesting costs lead to loss of pyrethroid resistance within 10 generations in Aedes : Grossman2018. Phenotypic susceptibility can be restored in a highly resistant field-derived strain of Aedes aegypti in only 10 generations through rearing them in the absence of insecticide. They also investigate the frequency of different kdr mutations; the picture is not straightforward, it seems there may be a cost to just one.

21/6/2018

TO DO

~ get wos_plot_sim() to accept two columns so that can put starting frequency on same plot

~ ?add some checks into runModel2()

~ what exposure value should we use for wos ? Remember that male_proportion is probably set to 1 by default, maybe set male to 0 ? and what paper did Ellie recommend ? Griffin et al 2010 PLOS.

~ for the field data where we don't have heterozygotes, can we show upper and lower limits for time-to-resistance at dominance 1 and 0 ?

~ see p1593 in Bourguet2000. Dwt dominance in presence of insecticide, Dwnt dominance in absence of insecticide : In conclusion, three dominance measures related to resistance have been used in insecticide resistance studies. These dominance measures reflect three different phenotypic traits: the insecticide concentration required to give a particular ML, the ML at a particular insecticide dose, and the fitness in treated areas. To distinguish clearly among these dominance measures we refer to them as DLC, DML, and DWT, respectively. All these dominance measures can be estimated by the same general formula given above. Other dominance measures related to an insecticide resistance allele, such as the dominance of fitness in untreated areas (DWNT) can also be calculated with this simple formula. All these dominance measures vary between 0 (complete recessivity) to 1 (complete dominance). ... No study of the genetics of insecticide resistance has yet focused on calculation of the dominance measure DWT. We strongly suggest that the time has come to focus on the estimation of DWT and DWNT. We are aware that such estimations will be a difficult task notably in natural populations.

~ post paper 2 resubmission :

~ can I design data structures of resizeable arrays that could cope with changing number of insecticides ?

~ set aside a day to go through resistance (& rotation) code with Ian
~ I feel that we need to be able to reproduce rotation results within the resistance model, if just for 2 insecticides
~ do we definitely need the 3 niches ? we haven't really used them yet.

~ reacquaint myself with resistance code and improve documentation and testing

~ think about SSI code review grant, we could submit with what we want to change. While it's in review I could focus on rotations.

~ how can we represent a rotation within the existing resistance code ?
~ if it was sufficiently easy then maybe we should focus on that rather than the new method.

~ check plot_fit_variance() & what it would do if multiple scenarios

~ rationalise what is happening with genotype freq calculations in runModel2 (e.g. think I may be able to remove the 'genotype' matrix). This should help me do the fitness variance calculations that Ian wants for both when LD=0 and LD is present.

~ I think I should probably change the model to be able to represent sequences directly rather than through post-processing. Quite a big job, maybe a week or more.

Fitness calc. I need to work out how to multiply these together. a_fitgen or df_indiv['m',] listOut$genotype[[1]]

and how to output them so that we can use ? can I add it to the list object ? Would this cause any issues with existing outputs. I may even be able to replace the existing fitness output if we don't use it for anything else ?

one issue is how to combine RS1RS2_cis and RS1RS2_trans into a single RS1RS2

I think it may be similar to this bit of code in selection.r, although I think this just calculates fitness by M&F.

# W bar - Sum of numerators
W.bar <- createArray2(sex=c('m','f'))
for( sex in dimnames(a_fitgen)$sex)
{
  for( locus1 in dimnames(a_fitgen)$locus1)
  {
    for( locus2 in dimnames(a_fitgen)$locus2)
    {
      #have to do cis/trans specially
      if ( locus1=='RS1' & locus2=='RS2' )
      {
        W.bar[sex] = W.bar[sex] + (a_gtypes[sex,'RS1RS2_cis'] * a_fitgen[sex,locus1,locus2])
        W.bar[sex] = W.bar[sex] + (a_gtypes[sex,'RS1RS2_trans'] * a_fitgen[sex,locus1,locus2])
      }else
      {
        W.bar[sex] = W.bar[sex] + (a_gtypes[sex,paste0(locus1,locus2)] * a_fitgen[sex,locus1,locus2])
      }
    }
  }
}
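
One possible way to handle the cis/trans issue noted above is to sum the two double-heterozygote frequencies into a single RS1RS2 class first, after which mean fitness is a plain sum of frequency times fitness. The sketch below is for one sex only. The frequencies and fitness values are made up, and `collapse_cis_trans()` is a hypothetical helper, not a package function.

```r
locus1 <- c('SS1', 'RS1', 'RR1')
locus2 <- c('SS2', 'RS2', 'RR2')

# made-up fitness by individual genotype for one sex (rows locus1, cols locus2)
fit <- matrix(c(0.1, 0.2, 0.4,
                0.2, 0.4, 0.6,
                0.4, 0.6, 1.0),
              nrow = 3, dimnames = list(locus1 = locus1, locus2 = locus2))

# made-up frequencies for the ten genotypes (cis and trans kept separate)
gnames <- c('SS1SS2', 'SS1RS2', 'SS1RR2',
            'RS1SS2', 'RS1RS2_cis', 'RS1RS2_trans', 'RS1RR2',
            'RR1SS2', 'RR1RS2', 'RR1RR2')
gtypes <- setNames(rep(0.1, 10), gnames)

# Hypothetical helper: sum cis and trans into one RS1RS2 class (10 -> 9)
collapse_cis_trans <- function(g) {
  g[['RS1RS2']] <- g[['RS1RS2_cis']] + g[['RS1RS2_trans']]
  g[!names(g) %in% c('RS1RS2_cis', 'RS1RS2_trans')]
}

g9 <- collapse_cis_trans(gtypes)

# arrange the nine frequencies to match the fitness matrix, then sum
genotype_names <- as.vector(outer(locus1, locus2, paste0))
freq <- matrix(g9[genotype_names], nrow = 3, dimnames = dimnames(fit))
w_bar <- sum(freq * fit)
w_bar
```

This gives the same total as the loop in selection.r would for these inputs, because cis and trans share the same fitness entry and only their frequencies differ.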

~ check whether my R-S calc in calc_fit_rs() is similar to what happens in selection()

Could I test this SR mean fitness difference as a predictor of time-to-resistance in the full simulation results ? I can probably calculate it for the 10,000 runs with a little work. Would probably just need to write a short script that queries ggSubset or similar to get input values and creates a new field to store the new potential predictor.

List of other tidying things todo if/when I get proper time (oct 2016)
~ can runModel2() accept exposure over time input to be able to do sequences ?
~ spend a day deciding whether to rewrite parts from scratch or to continue improving
~ check make.genotypemat() probably replace with more efficient code

Macromosaics

~ Start with 2 popns
~ allow for more if possible
~ immigration & emigration specified at each time step.
~ allow for change to 3 loci.
~ i think transfer should be between the genotype frequency arrays (f & fs for before & after selection)
~ i think the best way to do this would be within a successor to runModel2() where there is effectively a loop near the start, for(pop_num in 1:num_pops) & migration is done if num_pops > 1.
~ there could be a single migration 'exchange' input for 2 pops. Would get trickier for 3+ pops. If just done as exchange:

Num pops : num ex : exchanges
2 : 1 : 1-2
3 : 3 : 1-2, 1-3, 2-3
4 : 6 : 1-2, 1-3, 2-3, 1-4, 2-4, 3-4
5 : 10 : 1-2, 1-3, 2-3, 1-4, 2-4, 3-4, 1-5, 2-5, 3-5, 4-5

num exchanges = f(num_pops)
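
If every pair of populations has one exchange, the function is just the number of unordered pairs, n(n-1)/2. A small base R sketch; the function names are hypothetical.

```r
# number of pairwise exchanges between num_pops populations
num_exchanges <- function(num_pops) choose(num_pops, 2)

num_exchanges(2:5)
# 1 3 6 10

# label each exchange as "i-j"; combn() lists all unordered pairs
exchange_pairs <- function(num_pops) {
  pairs <- combn(num_pops, 2)
  paste(pairs[1, ], pairs[2, ], sep = "-")
}

exchange_pairs(4)
# "1-2" "1-3" "1-4" "2-3" "2-4" "3-4"
```

combn() orders the pairs by first population rather than in the order listed above, but the set of exchanges is the same.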

3+ loci

~ it is a big job, certainly weeks rather than days (> 4 weeks, important to plan first, & maybe run the plan past another coder, SSI, even apply for a SSI code review grant ?)
~ might be better to plan from scratch ?
~ would want to decide whether to keep 2 locus model & rewrite a 3 locus, or whether, ideally, could have run_model(num_loci=2)
~ most functions would need to be re-written incl. : createGametes(), fitnessIndiv(), fitnessNiche(), randomMating(), genotypeLong2Short()
~ currently there is some redundancy in dimension naming, e.g. using SS1 & SS2, rather than could have, locusNum=c(1,2) haplotype=c('SS','RS','RR')
~ array (probable) changes :

exposure (a)
from : sex=c('m','f'), niche1=c('0','a','A'), niche2=c('0','b','B')
to : sex=c('m','f'), niche1=c('0','a','A'), niche2=c('0','b','B'), niche3=c('0','c','C')

fitness by locus
from : loci=c('SS1','RS1','RR1','SS2','RS2','RR2'), exposure=c('no','lo','hi')
to :
n_insecticides <- 4
loci <- paste0(c('SS','RS','RR'),rep(1:n_insecticides,each=3))
a_fitloc <- array_named( loci=loci, exposure=c('no','lo','hi') )
OR as in rotations
a_fitloc <- array_named(insecticide=1:n_insecticides, genotype=c('SS','RS','RR'), exposure=c('no','lo','hi'))

fitness by niche
from : locus1=c('SS1','RS1','RR1'), locus2=c('SS2','RS2','RR2'), niche1=c('0','a','A'), niche2=c('0','b','B')
to : locus1=c('SS1','RS1','RR1'), locus2=c('SS2','RS2','RR2'), locus3=c('SS3','RS3','RR3'), niche1=c('0','a','A'), niche2=c('0','b','B'), niche3=c('0','c','C')

fitness by individual
from : sex=c('m','f'), locus1=c('SS1','RS1','RR1'), locus2=c('SS2','RS2','RR2')
to : sex=c('m','f'), locus1=c('SS1','RS1','RR1'), locus2=c('SS2','RS2','RR2'), locus3=c('SS3','RS3','RR3')

an easy bit : 4+ arrays that have these dimensions can be changed
from : locusNum=c(1,2), exposure=c('no','lo','hi')
to : locusNum=c(1,2,3), exposure=c('no','lo','hi')
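
A sketch of how the "as in rotations" layout for fitness by locus could be built for any number of insecticides in base R. array_named() is the package helper used in the notes above; here plain array() with named dimnames stands in for it, and `make_fitloc()` is a hypothetical name.

```r
# Hypothetical constructor: fitness by insecticide, genotype and exposure.
# The number of insecticides is a dimension length rather than being baked
# into the dimension names, so the same code serves 2, 3 or more.
make_fitloc <- function(n_insecticides) {
  array(0,
        dim = c(n_insecticides, 3, 3),
        dimnames = list(insecticide = as.character(seq_len(n_insecticides)),
                        genotype = c('SS', 'RS', 'RR'),
                        exposure = c('no', 'lo', 'hi')))
}

a_fitloc <- make_fitloc(4)

# fill by name, e.g. fitness of RR for insecticide 3 at high exposure
a_fitloc['3', 'RR', 'hi'] <- 0.9

dim(a_fitloc)
a_fitloc['3', , ]
```

Indexing the insecticide by a character label avoids confusion between position and name if insecticides are ever dropped or reordered.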

runModel2 tidying

~ I want to put the input reading into a function
~ trickiness is that inputs are mostly read into single named variables so can't all be returned from a single function unless I return them as a list ? Maybe do that as in_ or something similar
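
A minimal sketch of the list idea, assuming the input matrix has parameter names as row names and one column per scenario (as listOutMix$input[c('P_1','P_2'),] suggests). `read_inputs()` and the example values are hypothetical.

```r
# made-up input matrix: rows are parameters, columns are scenarios
input <- matrix(c(0.01, 0.01, 0.5,
                  0.1, 0.001, 0.8),
                ncol = 2,
                dimnames = list(c('P_1', 'P_2', 'exposure'), NULL))

# Hypothetical helper: return all inputs for one scenario as a named list,
# so they can be passed around as a single object
read_inputs <- function(input, scenario) {
  setNames(as.list(input[, scenario]), rownames(input))
}

in_ <- read_inputs(input, scenario = 2)
in_$P_1
in_$exposure
```

Functions inside the model could then take `in_` as one argument and refer to `in_$P_1` and so on, rather than relying on many separately named variables.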

~ think about whether adding additional plotting options to plotcurtis_f2_generic() and or the UIs (as an optional tab?) could be useful ?

~ can I create a plot of frequency of SS,SR,RR ? see Beth's plothaplotype().

~ create a vignette explaining how fitness is calculated, so that it can be stored in the package in future, it can access my new fitness functions.

set up some unit tests for the fitness functions

Scope out
1) 6 subpop macro-mosaics
2) fitness outputs & constraints to when mixtures stopped, (be careful that the rr param determines what the maximum fitness can be ... need to think about that a bit). First step, output mean fitness over time.
3) 3 locus (we suggested Ian & I would need to work on together) not yet
4) rotations

I suggested that best to get the basic code streamlined before trying to add the other stuff on top of it. ~ split up runModel2() e.g. have functions to fill arrays from the input matrix

~ i could try to create a graphical model of how effectiveness, exposure, resistance restoration and dominance affect the development of resistance.

~ we could eventually create a game of it ...
effectiveness of the insecticide (image of bottle)
exposure (depends on spray coverage and mosquito behaviour)
resistance restoration (property of the gene)
dominance (property of the gene)

~ write up why having correctDeploy of 0 equates to mosaics ? Effectively it means that half of exposure is set to one insecticide and half to the other.

~ maybe delete or at least deprecate timeToFifty() & singlealleleFrequency()

~ check why, when correctMixDeployProp is set to 0, the curve for the first insecticide is not the same as for the insecticide in single use ? Ian explained that this was as expected, but I need to confirm my understanding.

Potential for me to look at in a later paper : Differential exposure of males & females. (I have shown exposure is important so this could have a big effect).

~ check code in findResistancePointsMixResponsive()

create a vignette that shows how best to run current model

fix the testthat tests to avoid new input addition

Good simple test

input <- setInputOneScenario(max_gen=5)
tst2 <- runModel2(input)
tst <- runModel(input)

inputs

a : exposure to insecticide
W.SS1_00 & W.SS2_00 : fitness of SS in no insecticide
s : selection coefficient
phi : fitness of one locus (baseline)
h : dominance coefficient
z : fitness cost of resistance allele in no insecticide
niche_ : insecticide niche toggle

outputs

From runModel2()
A list of 4 elements, each holding one or more scenarios:
1. results : by generation, freq of R allele at each locus in each sex, plus linkage stuff
2. genotype : frequencies of each of the ten genotypes, per generation
3. fitness : fitness scores of each genotype/niche combination (table 4. of Main Document)
4. input
e.g.
listOut$results[1] gives a results matrix for the first scenario
listOutMix$input gives the inputs for all scenarios
listOutMix$input[c('P_1','P_2'),] gives selected inputs

current best UIs

https://andysouth.shinyapps.io/shinyMixSeqCompare1 https://andysouth.shinyapps.io/shinyFig2Curtis

works on mobile

https://andysouth.shinyapps.io/resistmob2

includes cost, doesn't fit well on mobile

https://andysouth.shinyapps.io/resistmixseq

to sync a fork and get updates

git fetch upstream
git checkout master
git merge upstream/master

EduRoam-LSTM

Laptop: left click wifi icon, Network settings, Manage wifi settings, scroll down to manage known networks and forget.
Phone: EAP: PEAP, Phase-2 auth: MSCHAPV2. Scroll down to password.

start a new project

create new repo on github

create RStudio project from github repo

To get ssh push working.

RStudio Tools, Shell

git remote set-url origin git@github.com:AndySouth/resistance.git

to fix git problem (if I amend a commit I've already pushed)

git reset HEAD@{1}
this takes git back to before the last commit (from ohshitgit.com)

files to do runs

sensiAnPaper1All.Rmd : code to run whole sensitivity analysis & produce plots

sensiAnPaperPart.r : sets up param values and runs for one treatment (e.g. mixture, insecticide1)

setInputOneScenario.r: creates input object for each run

runModel2.r : runs model for the passed scenarios

paper1_results_figs_slimmed_50_rr.Rmd : final figures for paper

brief summary of my current understanding

High doses of insecticide are likely to lead to monogenic resistance because only a gene causing a major change can allow resistant individuals to survive. With monogenic resistance, lower insecticide doses are likely to reduce selection pressure and delay the evolution of resistance.

Low doses of insecticide are likely to lead to polygenic resistance because small genetic changes provide some advantage over the sublethal effects and accumulate over time. With polygenic resistance lower insecticide doses allow the accumulation of mutations and promote the evolution of resistance.

Control of weeds in wheat crops in Australia using low doses of herbicides prompted evolution of polygenic metabolic resistance, whereas control of the same weeds in the US using high doses led to the evolution of monogenic target-site resistance. Similarly, low insecticide doses led to polygenic resistance in the Australian sheep blowfly, whereas a higher dose led to the evolution of monogenic resistance (McKenzie, 1994).

AndySouth/resistance documentation built on Nov. 12, 2020, 3:39 a.m.


Structure of the results matrices

control=rpart.control(minsplit=50)

minsplit the minimum number of observations that must exist in a node in order for a split to be attempted.

then prune the tree to avoid overfitting the data
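A minimal sketch of the fit-then-prune steps noted above, using the `rpart` package. The data here are synthetic and made up for illustration only (a noisy threshold on a variable named `effectiveness`); they are not output from the resistance model, and the variable names are assumptions. Picking the complexity parameter with the lowest cross-validated error is one common pruning choice, not necessarily the one used for the paper.

```r
library(rpart)

set.seed(1)
n <- 1000

# Synthetic predictors, NOT sensitivity analysis output
dat <- data.frame(effectiveness = runif(n), exposure = runif(n))

# Toy response: a threshold rule with 10% of labels flipped as noise
best <- ifelse(dat$effectiveness > 0.6, "mix", "seq")
flip <- runif(n) < 0.1
best[flip] <- ifelse(best[flip] == "mix", "seq", "mix")
dat$best <- factor(best)

# minsplit = 50: a node needs at least 50 observations before a split is attempted
tree <- rpart(best ~ effectiveness + exposure, data = dat, method = "class",
              control = rpart.control(minsplit = 50))

# prune back to the complexity parameter with the lowest cross-validated error
cp_best <- tree$cptable[which.min(tree$cptable[, "xerror"]), "CP"]
pruned <- prune(tree, cp = cp_best)

print(pruned)
```

A larger `minsplit` stops the tree from splitting small nodes in the first place, while pruning removes splits afterwards that do not improve cross-validated error.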

refactored

note a.f_A0 rather than a.f_AB

[1] 0.01

to do re Joe

GOOD RESULTS ! :

1 effectiveness < 0.6 sequential best. > 0.6 mix best.

2 selection coeff < 0.2 sequential best. > 0.2 mix best.

3

#' @param correctMixDeployProp proportion of times that mixture is deployed correctly,

#' assumes that when not deployed correctly the single insecticides are used instead
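A minimal sketch of how this parameter could divide exposure, based only on the description here and on the earlier note that a value of 0 means half of the exposure goes to one insecticide and half to the other. `splitExposure()` is a hypothetical helper written for illustration, not a function from the package, and the even split of the incorrectly deployed share is an assumption.

```r
# Hypothetical helper, not part of the package API.
# Splits total exposure between mixture and single-insecticide niches.
splitExposure <- function(exposure, correctMixDeployProp) {
  stopifnot(exposure >= 0, exposure <= 1,
            correctMixDeployProp >= 0, correctMixDeployProp <= 1)
  # share of exposure where the mixture is deployed correctly (both insecticides)
  both <- exposure * correctMixDeployProp
  # assumption: the remainder is split evenly between the two single insecticides
  single <- exposure * (1 - correctMixDeployProp) / 2
  c(none = 1 - exposure, I1_only = single, I2_only = single, both = both)
}

fully_correct <- splitExposure(0.8, 1)  # all exposure is to the mixture
mosaic_like   <- splitExposure(0.8, 0)  # exposure split evenly between single insecticides
```

With `correctMixDeployProp = 0` no mosquito meets both insecticides together, which is why that setting behaves like a mosaic rather than a mixture.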

replacements done :

previous current state of play 23/3/16

a previous example

putting params of 1st run in

start frequencies of each resistance allele

dominances

exposures

effectivenesses

selection coefficients

sequential best

relative to curtis fig2

sequential best. effectivenesses both reduced by 0.2 from curtis fig2

trying to make sequential better, high exposure, low phi

Warning messages:

1: In runModel2(i1) : 1 locus fitness values (Wloci) are >1 : 1.2

2: In runModel2(i1) : 7 niche fitness values (Wniche) are >1

3: In runModel2(i1) : 6 individual fitness values (Wloci) are >1

fitness cost of resistance allele in no insecticide : z

sensiAnPaperPart.r

setInputOneScenario.r

runModel2.r

sensiAnPaper1All.Rmd

paper1_results_figs_rr.Rmd copied from paper1_results_figs.Rmd

paper1_results_figs_slimmed50_rr.Rmd

coolio have it working in resistmob2 and it's good being able to explore full range of selective advantage, even setting it to 0 does what I would expect. Reducing selective advantage reduces the rise of resistance in one insecticide, which leads to it being protected by the other in a mixture.

cool contrast between these scenarios modifying effectiveness and rr_restoration

AA effectiveness I1 to 0.8 causes mixture to be slower, although I1 is faster it protects I2 in mixture, making mixture slower

BB rr advantage I1 to 0.8 sequential remains slower, although I1 is faster as in previous it provides less protection to I2 in the mixture

CC rr advantage I2 to 0.2 sequential remains slower (just), I2 slower than I1 in mixture similar to AA but receives less protection in mixture

THIS IS KEY : see my figx3B (to replace MS fig5).

Fig. xx2 Quick check whether difference between effectiveness 1 & 2 disappears when minimising excluded runs

from new local folder

to view remote repos

I suggest you initially put all your code in a master folder called e.g. matt/

to get fitness results for one run

to get at elements

check W.bar & fs : identical

check genotypes f after random mating : seem identical too

need to do this f[], i should probably rename array from f

seems that by end of generation1 everything is the same ...

but is this just because I'm looking at the first scenario of 3 which is a single insecticide ?

look at the results outputs for each mixture scenario

then may need to go back & browse through the mixture run by calling runModel2 directly

differences are visible in gen2

call runModel2 directly & work backwards in the generation loop to find the difference

that's tricky ...

or do runcurtis_f2() with max_gens set to 2 or 3, & step through scenarios until the 3rd
