require(knitr) require(formatR) require(dplyr) options(width=200) knitr::opts_chunk$set(cache=FALSE,prompt=FALSE,comment=">",message=FALSE,echo=TRUE,warning=FALSE,tidy=TRUE,strip.white=TRUE,size="small", fig.align = "center",fig.show='hold')
R
In this tutorial on estimating Effect Size Confidence Intervals (ESCI) there are a lot of examples on how to use R
. It was written as an addendum for a post on the Open Science Collaboration Blog, which contains many interesting entries on diverse subjects (like behavioural priming, theoretical amnesia and anonymous peer review)
The solutions to iterating the Linear Map and theLogistic Map in a spreadsheet can be found in this GoogleSheet.
R
Coding the difference equations in Matlab
and R
is always easier than using a spreadsheet. One obvious way to do it is to use a counter variable representing the iterations of time in a for ... next
loop. The iterations should run over a vector (which is the same concept as a row or a column in a spreadsheet: An indexed array of numbers or characters). The first entry should be the starting value, so the vector index $1$ represents $Y_0$.
The loop can be implemented a number of ways, for example as a function which can be called from a script or the command / console window. In R
working with functions is easy, and very much recommended, because it will speed up calculations considerably, and it will reduce the amount of code you need to write. You need to gain some experience with coding in R
before you'll get it right. In order to get it lean and clean (and possibly even mean as well) you'll need a lot of experience with coding in R
,therefore, we will (eventually) provide you the functions you'll need to complete the assignments. All you have to do is figure out how to use, or modify them to suit your specific needs.
To model the autocatalytic growth equations we provide a function growth.ac()
, which is able to simulate all of the processes discussed in the lectures. Using just a few lines of code, each of the 4 difference equations used in the assignments can be simulated. Basically the code block below contains the solutions to the Linear Map, the stylized Logisitc Map and the Van Geert model for cognitive growth.
growth.ac <- function(Y0 = 0.01, r = 1, k = 1, N = 100, type = c("driving", "damping", "logistic", "vanGeert")[1]){ # Create a vector Y of length N, which has value Y0 at Y[1] if(N>1){ Y <- as.numeric(c(Y0, rep(NA,N-2))) # Conditional on the value of type ... switch(type, # Iterate N steps of the difference function with values passed for Y0, k and r. driving = sapply(seq_along(Y), function(t) Y[[t+1]] <<- r * Y[t] ), damping = k + sapply(seq_along(Y), function(t) Y[[t+1]] <<- - r * Y[t]^2 / k), logistic = sapply(seq_along(Y), function(t) Y[[t+1]] <<- r * Y[t] * ((k - Y[t]) / k)), vanGeert = sapply(seq_along(Y), function(t) Y[[t+1]] <<- Y[t] * (1 + r - r * Y[t] / k)) )} return(ts(Y)) } # Call the function with default settings and r = 1.1 Y <- growth.ac(r = 1.1)
Some notes about this function:
type
is defined which takes the values driving
(default), damping
, logistic
and vanGeert
. switch(type, ...)
will iterate an equation based on the value of type
.time series
object is returned due to the function ts()
. This is a convenient way to represent time series data, it can also store the sample rate of the signal and start and end times.plot()
and summary()
will recognise a time series object when it is passed as an argument and use settings appropriate for time series data.sapply()
function iterates $t$ from $1$ to the number of elements in $Y$ (seq_along(Y)
) and then applies the function.<<-
is necessary because we want to update vector $Y$, which is defined outside the sapply()
environment.time series
objectThe time series object is expected to have a time-dimension on the x-axis. This is very convenient, because R
will generate the time axis for you by looking at the time series properties attribute of the object. Even though we are not working with measurement ourcomes, consider a value at a time-index in a time series object a sample:
Start
- The value of time at the first sample in the series (e.g., $0$, or $1905$)End
- The value of time at the last sample in the series (e.g., $100$, or $2005$)Frequency
- The amount of time that passed between two samples, or, the sample rate (e.g., $0.5$, or $10$)Examples of using the time series object.
# Get sample rate info tsp(Y) # Extract the time vector time(Y)
For now, these values are in principle all arbitrary units (a.u.
). These settings only make sense if they represent the parameters of an actual measurement procedure.
It is easy to adjust the time vector, by assigning new values using tsp()
(values have to be possible given the timeseries length). For example, suppose the sampling frequency was $0.1$ instead of $1$ and the Start time was $10$ and End time was $1000$.
# Assign new values tsp(Y) <- c(10, 1000, .1) # Time axis is automatically adjusted time(Y)
ts
object as a time seriesDepending on which packages you use, there will be different settings applied to time series objects created by ts()
. Below are some examples of differences between plotting routines.
require(lattice) # Needed for plotting require(latticeExtra) # Needed for plotting # stats::plot.ts plot(growth.ac(r = -.9), lwd = 2, main = "stats::plot.ts") # lattice::xyplot.ts xyplot(growth.ac(r = -.9), lwd = 2, main = "lattice::xyplot.ts")
Plot multiple timeseries in frames with plot.ts()
in package::stats
.
This function takes a matrix as input, here we use cbind( ... )
.
# stats::plot.ts plot(cbind(growth.ac(r = 0.9), growth.ac(r = 1.0), growth.ac(r = -0.8) ), yax.flip = TRUE, ann = FALSE, col = "blue", frame.plot = TRUE) title(main = expression(paste("Unrestricted Growth: ",Y[t+1]==r*Y[t])), ylab = "| r = -0.8 | r = 1 | r = 0.9 |", xlab = "time (a.u.)")
Plot multiple timeseries in one graph with ts.plot()
in package::graphics
.
This function can handle multiple ts
objects as arguments.
# graphics::ts.plot ts.plot(growth.ac(r = 0.9), growth.ac(r = 1), growth.ac(r = -.8), gpars = list(xlab = "time (a.u.)", ylab = expression(Y(t)), main = expression(paste("Unrestricted Growth: ",Y[t+1]==r*Y[t])), lwd = rep(2,3), lty = c(1:3), col = c("darkred","darkblue","darkgreen") ) ) legend(70, -0.015, c("r = 0.9","r = 1.0", "r = -0.8"), lwd = rep(2,3), lty = c(1:3), col = c("darkred","darkblue","darkgreen"), merge = TRUE)
Use xyplot()
in package::lattice
to create a plot with panels. The easiest way to do this is to create a dataset in so-called "long" format. This means the variable to plot is in 1 column and other variables indicate different levels, or conditions under which the variable was observed or simulated.
Function ldply()
is used to generate $Y$ for three different settings of $r$. The values of $r$ are passed as a list and after a function is applied the result is returned as a dataframe.
require(plyr) # Needed for function ldply() # Create a long format dataframe for various values for `r` data <- ldply(c(0.9,1,-0.8), function(r) cbind.data.frame(Y = as.numeric(growth.ac(r = r)), time = as.numeric(time(growth.ac(r = r))), r = paste0("r = ", r))) # Plot using the formula interface xyplot(Y ~ time | r, data = data, type = "l", main = expression(paste("Unrestricted Growth: ",Y[t+1]==r*Y[t])))
One can also have different panels represent different growth functions.
# Create a long format dataframe for combinations of `type` and `r` param <- list(driving = 1.1, damping = 0.9, logistic = 2.9, vanGeert = 1.9) # Use the `names()` function to pass the `type` string as an argument. data <- ldply(seq_along(param), function(p){ cbind.data.frame(Y = as.numeric(growth.ac(r = param[[p]], type = names(param[p]))), time = as.numeric(time(growth.ac(r = param[[p]], type = names(param[p])))), type = paste0(names(param[p]), " | r = ", param[p])) }) # Plot using the formula interface xyplot(Y ~ time | factor(type), data = data, type = "l", scales = c(relation = "free"), main = "Four Autocatalytic Growth Models")
To create a return plot the values of $Y$ have to be shifted by a certain lag. The functions lead()
and lag()
in package::dplyr
are excellent for this purpose (note that dplyr::lag()
behaves different from stats::lag()
).
# Function lag() and lead() require(dplyr) # Get exponential growth Y1 <- growth.ac(Y0 = .9, r = .9, N = 1000, type = "driving") # Get logistic growth in the chaotic regime Y2 <- growth.ac(r = 4, N = 1000, type = "logistic") # Use the `lag` function from package `dplyr` op <- par(mfrow = c(1,2), pty = "s") plot(lag(Y1), Y1, xy.labels = FALSE, pch = ".", xlim = c(0,1), ylim = c(0,1), xlab = "Y(t)", ylab = "Y(t+1)", main = expression(paste(Y[t+1]==r*Y[t]))) plot(lag(Y2), Y2, xy.labels = FALSE, pch = ".", xlim = c(0,1), ylim = c(0,1), xlab = "Y(t)", ylab = "Y(t+1)", main = expression(paste(Y[t+1]==r*Y[t]*(1-Y[t])))) par(op)
Use l_ply()
from package::plyr
to create return plots with different lags. The l_ before ply means the function will take a list as input to a function, but it will not expect any data to be returned, for example in the case of a function that is used to plot something.
# Explore different lags op <- par(mfrow = c(1,2), pty = "s") l_ply(1:4, function(l) plot(lag(Y2, n = l), Y2, xy.labels = FALSE, pch = ".", xlim = c(0,1), ylim = c(0,1), xlab = "Y(t)", ylab = paste0("Y(t+",l,")"), cex = .8)) par(op)
Matlab
For Matlab
we provide an example of a simple for ... next
loop, which should be easy to translate to R
if you want to.
%%%%%%%%%%%%%% COMPUTING TRAJECTORIES OF THE LOGISTIC MAP %%%%% %% Set these parameters to manipulate the logistic map r = 1,1; % Control parameter value Y0 = 0.01; % Initial condition N = 100; % Number of iterations %% Y = [Y0; NaN(length(1:(N-1)),1)]; % This creates a vector Y of length N % iterate values for t = 1:(N-1) Y(t+1) = r*Y(t); end %% Graphs subplot(2,1,1) % Create a graph the time series figure(1); set(gcf,'Color','white'); plot(Y,'k'); xlabel('Time (discrete)') ylabel('Time Evolution of Y') title([{'Linear Map'},{['Y_0 = ' num2str(Y0) ', r = ' num2str(r)]}]) subplot(2,1,2) % Create a graph the return plot set(gcf,'Color','white'); plot(Y(1:length(Y)-1),Y(2:length(Y)),'.k'); xlabel('Y(t)') ylabel('Y(t+1)') title([{'Return Plot'},{['Y_0 = ' num2str(Y0) ', r = ' num2str(r)]}]) axis square
%%%%%%%%%%%%%% COMPUTING TRAJECTORIES OF THE LOGISTIC MAP %%%%% %% Set these parameters to manipulate the logistic map r = 4; % Control parameter value Y0 = 0.08; % Initial condition N = 100; % Number of iterations %% Y = [Y0; NaN(length(1:(N-1)),1)]; % This creates a vector Y of length N % iterate values for t = 1:(N-1) Y(t+1) = r*Y(t)*(1-Y(t)); end %% Graphs subplot(2,1,1) % Create a graph the time series figure(1); set(gcf,'Color','white'); plot(Y,'k'); xlabel('Time (discrete)') ylabel('Time Evolution of Y') title([{'Logisitc Map'},{['Y_0 = ' num2str(Y0) ', r = ' num2str(r)]}]) subplot(2,1,2) % Create a graph the return plot set(gcf,'Color','white'); plot(Y(1:length(Y)-1),Y(2:length(Y)),'.k'); xlabel('Y(t)') ylabel('Y(t+1)') title([{'Return Plot'},{['Y_0 = ' num2str(Y0) ', r = ' num2str(r)]}]) axis square
R
Different values for r
:
# Parameters rs <- c(1.2, 2.2, 2.5, 2.7, 2.9, 3) # Plot op <- par(mfrow=c(1,2)) l_ply(rs,function(r){plot(growth.ac(r = r, Y0 = 0.01, type = "vanGeert"), ylim = c(0,1.4), ylab = "L(t)", main = paste("r =",r))}) par(op)
Different values for $k$ reveal that the dispersion of values (variance) increases if the carrying capacity increases. This occurs because we are dealing with nonlinear changes to the values of $Y$ and if larger values of $Y$ are allowed by a hihger $k$, these values will be amplified once they occur.
# Parameters ks <- c(0.5, 0.75, 1, 1.5) # Plot op <- par(mfrow=c(1,2)) l_ply(ks,function(k){plot(growth.ac(r = 2.9, k = k, Y0 = 0.01, type = "vanGeert"), ylim = c(0, 2), ylab = "L(t)", main = paste("k =",k))}) par(op)
growth.ac.cond <- function(Y0 = 0.01, r = 0.1, k = 2, cond = cbind.data.frame(Y = 0.2, par = "r", val = 2), N = 100){ # Create a vector Y of length N, which has value Y0 at Y[1] Y <- c(Y0, rep(NA, N-1)) # Iterate N steps of the difference equation with values passed for Y0, k and r. cnt <- 1 for(t in seq_along(Y)){ # Check if the current value of Y is greater than the threshold for the current conditional rule in cond if(Y[t] > cond$Y[cnt]){ # If the threshold is surpassed, change the parameter settings by evaluating: cond$par = cond$val eval(parse(text = paste(cond$par[cnt], "=", cond$val[cnt]))) # Update the counter if there is another conditional rule in cond if(cnt < nrow(cond)){cnt <- cnt + 1} } # Van Geert growth model Y[[t+1]] <- Y[t] * (1 + r - r * Y[t] / k) } return(ts(Y)) } # Plot with the default settings (same as first step in the assignment) xyplot(growth.ac.cond())
The 'trick' used here is to define the function such that it can take a set of conditional rules and apply them sequentially during the iterations. The conditiona rule is passed as a data.frame
, but one could also use a list
object.
(cond <- cbind.data.frame(Y = c(0.2, 0.6), par = c("r", "r"), val = c(0.5, 0.1))) xyplot(growth.ac.cond(cond=cond))
Or, combine a change of r
and a change of k
(cond <- cbind.data.frame(Y = c(0.2, 1.99), par = c("r", "k"), val = c(0.5, 3))) xyplot(growth.ac.cond(cond=cond)) # A fantasy growth process (cond <- cbind.data.frame(Y = c(0.1, 1.99, 1.999, 2.5, 2.9), par = c("r", "k", "r", "r","k"), val = c(0.3, 3, 0.9, 0.1, 1.3))) xyplot(growth.ac.cond(cond=cond))
Somewhat more realstic would be to model a change of r
as dependent on the values of another process. The proper 'dynamical' way to do this would be to define a coupled system of difference or differential equations in which the interaction dynamics regulate growth. An example is the predator-prey system discussed in the next assignment.
Using the 'conditional' rules on a number of seperate processes will 'work' as a model, but it isn't exactly what is meant by interaction dynamics, or multiplicative interactions. Basically, these processes will be independent and non-interacting. The conditional rules that change the parameters are 'given'.
# Generate 3 timeseries Y1 <- growth.ac(k = 2, r =.2, type = "vanGeert") # Y2 and Y3 start at r = 0.001 Y3 <- Y2 <- growth.ac(k = 2, r = 0.001, type = "vanGeert") # Y2 and Y3 start when k is approached c1 <- 1.6 c2 <- 2.2 Y2[Y1 > c1] <- growth.ac(r = .3, k = 3, type = "vanGeert", N = sum(Y1 > c1)) Y3[Y2 > c2] <- growth.ac(r = .5, k = 4, type = "vanGeert", N = sum(Y2 > c2)) # Make a nice plot ts.plot(Y1, Y2, Y3, gpars = list(xlab = "time (a.u.)", ylab = expression(Y(t)), main = expression(paste("'Connected' Growers ",Y[t+1]==Y[t]*(1 + r - r*Y[t]))), lwd = rep(2,3), lty = c(1:3), col = c("darkred","darkblue","darkgreen") ) ) legend(1, 3.8, c("Y1(0): r = .2", paste0("Y2(",which(Y1 > c1)[1],"): r = .3"), paste0("Y3(",which(Y2 > c2)[1],"): r = .5")), lwd = rep(2,3), lty = c(1:3), col = c("darkred","darkblue","darkgreen"), merge = TRUE)
In order to 'solve' a differential equation for time using a method of numerical integration, one could code it like in the spreadsheet assignment. For R
and Matlab
there are so-called solvers available, functions that will do the integration for you. Look at the Examples in package deSolve
.
The result of applying a method of numerical integration is called a numerical solution of the differential equation. The analytical solution is the equation which will give you a value of $Y$ for any point in time, given an initial value $Y_0$. Systems which have an analytical solution can be used to test the accuracy of numerical solutions.
Remember that the analytical solution for the logistic equation is:
$$ \frac{K}{1 + \left(\frac{K}{Y_0 - 1}\right) * e^{-r*t} } $$
We have the function growth.ac()
and could easily adapt all the functions to use Euler's method.
Below is a comparison of the analytic solution with Euler's method.
# Parameter settings d <- 1 N <- 100 r <- .1 k <- 1 Y0 <- 0.01 Y <- as.numeric(c(Y0, rep(NA,N-1))) # Numerical integration of the logistic differential equation Y.euler1 <- ts( sapply(seq_along(Y), function(t) Y[[t+1]] <<- (r * Y[t] * (k - Y[t])) * d + Y[t] )) Y.euler2 <- ts( sapply(seq_along(Y), function(t) Y[[t+1]] <<- (r * Y[t] * (k - Y[t])) * (d+.1) + Y[t] )) ## analytical solution Y.analytic <- ts( k / (1 + (k / Y0 - 1) * exp(-r*(time(Y.euler1)))) ) ts.plot(Y.analytic, Y.euler1, Y.euler2, gpars = list(xlab = "time (a.u.)", ylab = expression(Y(t)), main = expression(paste("Analytic vs. Numerical:",Y[t+1]==Y[t]*(1 + r - r*Y[t]))), lwd = rep(2,3), lty = c(1:3), col = c("darkred","darkblue","darkgreen") ) ) legend(50, 0.4, c("Analytic", "Euler: delta = 1.0", "Euler: delta = 1.1"), lwd = rep(2,3), lty = c(1:3), col = c("darkred","darkblue","darkgreen"), merge = TRUE)
The Euler setup: $$ \begin{align} R_{t+1} &= f_R(R_t,Ft) * \Delta + R_t \ F_{t+1} &= f_F(R_t,F_t) * \Delta + F_t \end{align} $$
With the equations: $$ \begin{align} R_{t+1} &= (a-bF_t)R_t * \Delta + R_t \ \ F_{t+1} &= (cR_t-d)F_t * \Delta + F_t \end{align} $$
# Parameters N <- 1000 a <- d <- 1 b <- c <- 2 R0 <- F0 <- 0.1 R <- as.numeric(c(R0, rep(NA,N-1))) F <- as.numeric(c(F0, rep(NA,N-1))) # Time constant delta <- 0.01 # Numerical integration of the logistic differential equation l_ply(seq_along(R), function(t){ R[[t+1]] <<- (a - b * F[t]) * R[t] * delta + R[t] F[[t+1]] <<- (c * R[t] - d) * F[t] * delta + F[t] }) # Note different behaviour when ts() is applied xyplot(cbind(ts(R),ts(F))) xyplot(R ~ F, pch = 16)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.