In AdrianTimpson/ADMUR: Ancient Demographic Modelling Using Radiocarbon

1. Overview

Introduction to ADMUR

The ADMUR package provides a robust framework to infer population dynamics from radiocarbon and other chronological datasets.

One common approach to modelling popuation dynamics is to build a Summed Probability Distribution (SPD) of radiocarbon dates, and test for statistically significant departures from a null population model using simulations. This tool is provided in ADMUR, but the method is inferior to model comparison approaches also provided in ADMUR, since there is no obvious null hypothesis.

Instead, model comparison methods calculate the probability of the data under various plausible models and their parameter ranges, and select the most probable (least wrong) model and best parameters.

ADMUR provides many different models of population dynamics to compare, from simple parameterised distributions such as the sigmoid or exponential; continuous piece-wise models which (if enough data is available) can accomodate huge complexity and detail; and even an independent timeseries (such as an enviromental reconstruction) that is hypothesised to predict population dynamics.

ADMUR also provides tools to model the timing of an archaeological site-phase, similar to some of the OxCal tools.

Bayesian inference

ADMUR discretises time to calculate model likelihoods analytically, and uses the Bayesian framework to obtain posterior probabilities by combining a prior with the likelihood. However, given the recent fetishization of the word 'Bayesian' in archaeology, it is worth reflecting on exactly what this means.

The Bayesian approach provides a superb framework for justifying knowledge as beliefs that are updated in the light of new evidence. This implies that objective knowledge does not exist. Instead, given a set of observations, each observer can form a different conclusion depending on their prior beliefs.

This makes intuitive sense - a child may believe they have witnessed a victim being sawn in half at a magic show, whilst a parent is unlikely to believe it was real given their strong prior beliefs formed over many years about what is anatomically possible, the concepts of showmanship, and the punishments for murder.

Therefore, the beauty and great power of the Bayesian approach is in providing a method for drawing inferences when there are strong prior beliefs and only a small amount of new data.

Priors (by their very nature) are subjective, whilst the likelihood is not (the likelihood is defined as the probability of the observed data, under the model). In Bayesian inference, the most challenging part is usually formulating the likelihood function. Once obtained as a mathematical function, making the inference Bayesian is trivial. Both the frequentist and Bayesians agree on the likelihood, but in the extreme case of no data, the frequentist cannot provide an estimate, whilst the Bayesian recovers her prior. In the opposite situation where there are a lot of data, the priors are 'washed out' by the new data, and the Bayesian posterior becomes indistinguishable from the frequentist estimate. Both become data-dominated estimates (i.e. dominated by the likelihood).

It is also worth noting that researchers often try to reduce any perceived subectivity by using a uniform prior, whilst simultaneously insisting on using a Bayesian framework. This is a contradiction, since the posterior becomes exactly equal to the normalised likelihood under a uniform prior.

Bayesian inference is perhaps better thought of as a method for integrating several different sources of evidence, since the posterior 'becomes' the prior in the next step. In fact, the words 'prior' and 'posterior' are misnomers - there is no temporal component to these Bayesian chains, and the order in which these sources of evidence are combined with the prior is irrelevant, since multiplication is commutative.

Installation

The ADMUR package can be installed directly from the CRAN in the usual way:

install.packages('ADMUR')

Alternatively it can be installed from GitHub, after installing and loading the 'devtools' package on the CRAN. However, this may be a beta version in development, so may not be as stable:

install.packages('devtools')
library(devtools)
install_github('UCL/ADMUR')

Either way, the ADMUR package can then be locally loaded:

library(ADMUR)

Data

The minimum requirements for a dataset is that it should be a data frame that includes at least the columns 'age' and 'sd'. Additional columns are permitted. ADMUR can also handle other dating types such as thermoluminescence or dendrochronology that are already provided in calendar time. This requires an additional column 'datingType' which should be specified as '14C' for radiocarbon dates. For any other text entry, ADMUR will assume the mean and SD are in calendar time, BP (before 1950 AD). If the column 'datingType' is omitted, ADMUR will assume all the dates are radiocarbon.

Various toy datasets are provided in ADMUR. For example, data6 comprises radiocarbon dates which have already been grouped into archaeological phases, based on the expert stratographical knowledge of the archaeologist. Therefore it also contains the columns 'site' and 'phase'. Site S001 contains only one phase YBX, whilst site S002 containes two phases, CQH and OGG, etc:

data6[1:10,]

2. radiocarbon date calibration and SPDs

Generating a single calibrated date distribution, or a Summed Probability Distribution (SPD) of several calibrated dates, requires either a two-step process to give the user full control of the date range and temporal resolution, or a simpler one step process using a wrapper function that automatically estimates a sensible date range and resolution from the dataset.

With the wrapper

Use the function summedCalibratorWrapper()

data <- data.frame( age=c(6562,7144), sd=c(44,51) )
x <- summedCalibratorWrapper(data)

The function assumes the data provided were all radiocarbon dates. However, if you have other kinds of date such as thermoluminescence you can specify this. Non-radiocarbon types are assumed to be in calendar time, BP. You can also specify a particular calibration curve:

data <- data.frame( age=c(6562,7144), sd=c(44,51), datingType=c('14C','TL') )
x <- summedCalibratorWrapper(data=data, calcurve=shcal20)

Without the wrapper

Generating the SPD without the wrapper gives you more control, and requires a two-step process:

Convert a calibration curve to a CalArray using the function makeCalArray()
Calibrate the radocarbon dates through the CalArray using the function summedCalibrator().

This is useful for improving computational times if generating many SPDs, for example in a simulation framework, since the CalArray needs generating only once.

data <- data.frame( age = c(9144), sd=c(151) )
CalArray <- makeCalArray( calcurve=intcal20, calrange=c(8000,13000) )
cal <- summedCalibrator(data, CalArray)
plotPD(cal)

The CalArray is essentially a two-dimensional probability array of the calibration curve, and can be viewed using the plotCalArray() function. Calibration curves vary in their temporal resolution, and the preferred resolution can be specified using the parameter inc which interpolates the calibration curve. It would become prohibitively time and memory costly if analysing the entire 50,000 year range of the calibration curve at a 1 year resolution (requiring a 50,000 by 50,000 array) and in practice the default 5 year resolution provides equivalent results to 1 year resolution for study periods wider than c.1000 years.

x <- makeCalArray( calcurve=shcal20, calrange=c(5500,7000), inc=1 )
plotCalArray(x)

The image arbitrarily uses only ten shades, but the resolution of the matrix is actually 1 year (inc=1).

Phasing

A naive approach to generating an SPD as a proxy for population dynamics would be to sum all calibrated date dstributions in the dataset. A more sensible approach is to sum the SPDs of each phase. I.e., to generate an SPD for each phase, normalise such that each is equally weighted, then sum again. The need to bin dates into phases is an important step in modelling population dynamics to adjust for the data ascertainment bias of some archaeological finds having more dates by virtue of a larger research interest or budget. Therefore normalising the SPD of each phase has a conservative effect on the overall SPD, since each phase contributes an equal amount of probability mass.

There may be occasions where it is not possible to phase the dates according to a proper evaluation of the stratigraphy. Provided the dates have been assigned to a site, ADMUR provides an automatic binning algorithm to phase the data, based broadly on empirical continuity / disconutinuity of the dates at each site, such that a phase comprises dates with a mean radiocarbon date within 200 radiocarbon years of any other date in that respective phase.

For example, the dataset SAAD has not already been phased (does not include a column 'phase'). We can explore a subset comprising the following 8 dates from 2 sites. After applying the binning algorithm Pacopampa.1 comprises samples 1207 and 1206, Pacopampa.2 comprises sample 1205, Carrizal.1 comprises samples 1196 and 1195 and 1194 and 1193, and Carrizal.2 comprises sample 1192:

The function phaseCalibrator() includes applies the binning algorithm if required, and generates an SPD for each phase in a dataset:

data <- subset( SAAD, site %in% c('Carrizal','Pacopampa') )
data[,2:7]

CalArray <- makeCalArray( calcurve=shcal20, calrange=c(2000,6000) )
x <- phaseCalibrator(data=data, CalArray=CalArray)
plotPD(x)

Finally, the distributions in each phase can be summed and normalised to unity. It is straight forward to achieve this directly from the data frame created above:

SPD <- as.data.frame( rowSums(x) )

# normalise
SPD <- SPD/( sum(SPD) * CalArray$inc )

Alternatively, the wrapper function summedPhaseCalibrator() will perform this entire workflow internally:

SPD <- summedPhaseCalibrator( data=data, calcurve=shcal20, calrange=c(2000,6000) )
plotPD(SPD)

Ideally, data has already been phased based on archaeological expertise, and a proper evaluation of the stratigraphy. No such binning algorithm is then required. The presence of the column 'phase' in the data means the wrapper function summedPhaseCalibrator() will use the provided phases. For example, consider the dataset data6, which has already been phased (already has a column 'phase'):

x <- summedPhaseCalibrator( data=data6, calcurve=shcal20, calrange=c(5000,8000) )
plotPD(x)

3. Modelling populaton dynamics

It is crucial to ensure that population dynamics are only modelled across a time period that we can be confident is well represented by data. Therefore, the start and end date of the period to be modelled is a key decision to be made by the modeller. As a dataset increases in size, the absence of evidence increasingly becomes evidence of absence. For example, absence of dates prior to first arrival on an island; or a hiatus period.

Theoretically a calibrated radiocarbon date or Summed Probability Distribution (SPD) of many radiocarbon dates should be a continuous Probability Density Function (PDF), however in practice they are represented as a discrete vector of probabilities corresponding to each calendar year, and therefore are Probability Mass Function (PMF). This discretisation provides the advantage that analytical methods can be used to easily calculate likelihoods, provided the model is also discretised to the same time points.

The general objective in modelling is to compare several plausible hypotheses, and find which is the least bad. In other words, we define our hypotheses as statistical models, and calculate the probability of the observed data, given a model. This is the model likelihood, given the data. More complex models (more free parameters) are typically able to be fitted more closely to the observed data, but this results in overfitting - a situation where the model only works well with this particular dataset, but performs badly with new data. Therefore the key part of modelling is to fairly compare competing models using a statistic (Information Criterion such as BIC) that balances the likelihood against the model complexity. Models with greater complexity are punished, whilst simpler models that fit the data well are favoured.

When calculating model likelihoods, we must be sensitive to the data problem that some archaeological phases have more radiocarbon dates than others, merely due to research bias and ascertainment bias. Discussed earlier in this manual, we argue that data should first be phased, so that an SPD is generated for each phase and normalised to 1. Therefore, ADMUR calculates the exact probability of each phase's SPD under a model, then multiplies each probability together (sum of logs). As such, model comparison using BIC is calculated using the effective sample size, which is the number of phases within the date range, and not the total number of radiocarbon dates. The rationale here is that it would be very wrong to assume each radiocarbon date is a random independent sample of the overall population, but it is reasonable to assume each radiocarbon date is a random independent sample of the phase it was assigned to, and that each phase is a random independent sample of the overall population.

ADMUR offers three 'types' of model for population dynamics, which essentially provide a hypothesis of the underlying population fluctuations, coded as a PMF. These are all 'truncated' models, because in every case their total PMF is constrained to a date range defined by the user, using the argument 'calrange'. In other words, the user defines the date range she wishes to model, and these models are then truncated to that date range.

Simple parametric models

Typically symmetric or unimodal truncated versions of very common curves. Easy to calculate the PMF with a short formula:

models <- getTruncatedModelChoices()
print(models$names)

See convertPars() for details. Care should be taken when considering a truncated Gaussian model. Data can often superficially appear to be normally distributed due to the tendency to unconsciously apply regression methods (minimising the residuals). However, contrary to appearances (and intuitions) a Gaussian does not 'flatten' towards the tails, but decreases at a greater and greater rate towards zero. As a consequence, small amounts of data that are several standard deviations away from the mean appear to fit a Gaussian quite well, but are in fact absurdly improbable. Instead a truncated Cauchy model may be better, given the phenomenon that real life data usually has fatter tails than a Gaussian.

The following code uses three toy datasets to demonstrate some simple parametric models. After calibration through intcal20, they retain effective sample sizes of a little under n = 100.

# generate SPDs
CalArray <- makeCalArray(intcal20, calrange = c(1000,4000))
spd1 <- summedCalibrator(data1, CalArray, normalise='full')
spd2 <- summedCalibrator(data2, CalArray, normalise='full')
spd3 <- summedCalibrator(data3, CalArray, normalise='full')

# calibrate phases
PD1 <- phaseCalibrator(data1, CalArray, remove.external = TRUE)
PD2 <- phaseCalibrator(data2, CalArray, remove.external = TRUE)
PD3 <- phaseCalibrator(data3, CalArray, remove.external = TRUE)

# effective sample sizes 
print(ncol(PD1))
print(ncol(PD2))
print(ncol(PD3))

# Search for the best fitting model parameters, using a Maximum Likelihood (ML) search.
# We can use any search oroptimisation algorithm.
# For example, JDEoptim from the DEoptimR package uses a very nice evolutionary algorithm. 
# Can be slow, but is very robust with complex multi-dimensional likelihood surfaces:

library(DEoptimR)
norm <- JDEoptim(lower=c(1000,1), upper=c(4000,5000), 
    fn=objectiveFunction, PDarray=PD1, type='norm', NP=40, trace=T)
cauchy <- JDEoptim(lower=c(1000,1), upper=c(4000,5000),
    fn=objectiveFunction, PDarray=PD1, type='cauchy', NP=40, trace=T)
sine <- JDEoptim(lower=c(0,0,0), upper=c(1/1000,2*pi,1),
    fn=objectiveFunction, PDarray=PD2, type='sine', NP=60, trace=T)
logistic <- JDEoptim(lower=c(0,0), upper=c(1,10000),
    fn=objectiveFunction, PDarray=PD3, type='logistic', NP=40, trace=T)
exp <- JDEoptim(lower=c(0), upper=c(1),
    fn=objectiveFunction, PDarray=PD3, type='exp', NP=20, trace=T)
power <- JDEoptim(lower=c(0,-10), upper=c(10000,0),
    fn=objectiveFunction, PDarray=PD3, type='power', NP=40, trace=T)

Note the upper boundaries for the truncated sinewave (see sinewavePDF() for details). The first parameter governs the frequency, so should be constrained by a wavelength no shorter than c. 1/10th of the date range.

Now the Maximum Likelihood parameters need to be converted into a PDF and plotted:

# convert parameters to model PDs
years <- 1000:4000
mod.norm <- convertPars(pars=norm$par, years, type='norm')
mod.cauchy <- convertPars(pars=cauchy$par, years, type='cauchy')
mod.sine <- convertPars(pars=sine$par, years, type='sine')
mod.uniform <- convertPars(pars=NULL, years, type='uniform')
mod.logistic <- convertPars(pars=logistic$par, years, type='logistic')
mod.exp <- convertPars(pars=exp$par, years, type='exp')
mod.power <- convertPars(pars=power$par, years, type='power')

# Plot SPDs and various fitted models:
par(mfrow=c(3,1), mar=c(4,4,1,1))
cols <- c('steelblue','firebrick','orange')

plotPD(spd1)
lines(mod.norm, col=cols[1], lwd=5)
lines(mod.cauchy, col=cols[2], lwd=5)
legend(x=4000, y=max(spd1)*1.2, lwd=5, col=cols, bty='n', legend=c('Gaussian','Cauchy'))

plotPD(spd2)
lines(mod.sine, col=cols[1], lwd=5)
lines(mod.uniform, col=cols[2], lwd=5)
legend(x=4000, y=max(spd2)*1.2,  lwd=5, col=cols, bty='n', legend=c('Sinewave','Uniform'))

plotPD(spd3)
lines(mod.logistic, col=cols[1], lwd=5)
lines(mod.exp, col=cols[2], lwd=5)
lines(mod.power, col=cols[3], lwd=5)
legend(x=4000, y=max(spd3)*1.2, lwd=5, col=cols, bty='n', legend=c('Logistic','Exponential','Power Law'))

Examples of some simple parametric models available in ADMUR {width=680px}

Continuous Piecewise Linear (CPL) Modelling

A CPL model lends itself well to the objectives of identifying specific demographic events. Its parameters are the (x,y) coordinates of the hinge points, which are the relative population size (y) and timing (x) of these events.

A toy() model is provided to demonstrate how this achieved. First, we simulate a plausible radiocarbon dataset and calibrate it. The function simulateCalendarDates() automatically covers a slightly wider date range to ensure simulated radiocarbon dates are well represented at the edges:

set.seed(12345) 
N <- 350

# randomly sample calendar dates from the toy model
cal <- simulateCalendarDates(toy, N)

# Convert to 14C dates. 
age <- uncalibrateCalendarDates(cal, shcal20)
data <- data.frame(age = age, sd = 50, phase = 1:N, datingType = '14C')

# Calibrate each phase, taking care to restrict to the modelled date range with 'remove.external'
CalArray <- makeCalArray(shcal20, calrange = range(toy$year))
PD <- phaseCalibrator(data, CalArray, remove.external = TRUE)

The argument 'remove.external = TRUE' ensures any calibrated phases with less than 50% of their probability mass within the modelled date range are excluded, reducing the effective sample size from 350 to 303. This is a crucial step to avoid mischievous edge effects of dates outside the date range. Similarly, notice we constrained the CalArray to the modelled date range.

These are important to ensure that we only model the population across a range that is well represented by data. To extend the model beyond the range of available data would be to assume the absence of evidence means evidence of absence. No doubt there may be occasions when this is reasonable (for example if modelling the first colonisation of an island that has been well excavated, and the period before arrival is evidenced by the absence of datable material), but more often the range of representative data is due to research interest, and therefore the logic of only including dates with at least 50% of their probability within the date range is that their true dates are more likely to be internal (within the date range) than external.

print( ncol(PD) )

Finally we calculate the overall relative log likelihood of the model using function loglik()

loglik(PD=PD, model=toy)

For comparison, we can calculate the overall relative likelihood of a uniform model given exactly the same data. Intuitively this should have a lower relative likelihood, since our dataset was randomly generated from the non-uniform toy population history:

uniform.model <- convertPars(pars=NA, years=5500:7500, type='uniform')
loglik(PD=PD, model=uniform.model)

And indeed the toy model is thirty nine million trillion times more likely than the uniform model:

exp( loglik(PD=PD, model=toy) - loglik(PD=PD, model=uniform.model) )

Crucially, loglik() calculates the relative likelihoods for each effective sample separately (each phase containing a few dates). The overall model likelihood is the overall product of these individual likelihoods. This means that even in the case where there is no ascertainment bias, each date should still be assigned to its own phase, to ensure phaseCalibrator() calibrates each date separately. In contrast, attempting to calculate a likelihood for a single SPD constructed from the entire dataset would be incorrect, as this would be treating the entire dataset as a single 'average' sample.

The anatomy of a CPL model

The Continuous Piecewise Linear (CPL) model, which is defined by the (x,y) coordinates of its hinge points. However, not all these coordinates are free parameters. If we consider a 3-CPL model, only five of these eight values are free parameters. The x-coordinates of the start and end (5500 BP and 7500 BP) are fixed by the choice of date range. Additionally, one of the y-coordinates must be constrained by the other parameters, since the total probability (area) must equal 1. As a result, an n-CPL model will have 2n-1 free parameters, which can be searched for using any search algorithm.

Illustration of the toy 3-CPL model PD, described using just four coordinate pairs (hinges).

Parameter space: The Area Breaking Process

We use the function convertPars() to map our search parameters to their corresponding PD coordinates. This allows us to propose independent parameter values from a uniform distribution between 0 and 1, and convert them into coordinates that describe a corresponding CPL model PD. This parameter-to-coordinate mapping is achieved using a modified stick breaking Dirichlet process, which we call the Area Breaking Process.

The Dirichlet Process (not to be confused with the Dirichlet distribution) is an algorithm that can break a stick (the x-axis date range) into a desired number of pieces, ensuring all lengths are sampled evenly. The length (proportion) of remaining stick to break is chosen by sampling from the Beta distribution, such that we use the Beta CDF (with $\alpha$ = 1 and $\beta$ = the number of pieces still to be broken) to convert an x-parameter into its equivalent x-coordinate value. We extend this algorithm for use with the CPL model by also converting y-parameters to y-coordinates as follows:

Fix the y-value of the first hinge (H1, x = 5500 BP) to any constant (y = 3 is arbitrarily chosen since the mapping function below gives 3 for an average y-parameter of 0.5).
Use the mapping function $f(y) = (1/(1-y))^2)-1$ to convert all remaining y-parameters (between 0 and 1) to y-values (between 0 and +$\infty$).
Calculate the total area, given the y-values and previously calculated x-coordinates.
Divide y-values by the total area, to give the y-coordinates of the final PDF.

The parameters must be provided as a single vector with an odd length, each between 0 and 1 (y,x,y,x,...y). For example, a randomly generated 6-CPL model will have 11 parameters and 7 hinges:

set.seed(12345)
CPLparsToHinges(pars=runif(11), years=5500:7500)

Note: The Area Breaking Algorithm is a heuristic that ensures all parameter space is explored and therefore the best parameters can found. However, unlike the one-dimensional stick-breaking process, its mapping of random parameters to PD coordinates is not perfectly even, and we welcome ideas for a more elegant algorithm.

Parameter search

Any preferred search algorithm can be used. For example, the JDEoptim function from DEoptimR uses a differential evolution optimisation algorithm that performs very nicely for this application. We recommend increasing the default NP parameter to at least 20 times the number of parameters, and repeating the search to ensure consistency:

library(DEoptimR)
best <- JDEoptim(lower = rep(0,5), 
    upper = rep(1,5), 
    fn = objectiveFunction, 
    PDarray = PD, 
    type = 'CPL', 
    NP = 100,
    trace = TRUE)

load('vignette.3CPL.JDEoptim.best.RData')

CPL <- CPLparsToHinges(pars=best$par, years=5500:7500)
SPD <- summedPhaseCalibrator( data=data, calcurve=shcal20, calrange=c(5500,7500) )
plotPD(SPD)
lines(CPL$year, CPL$pdf, lwd=2, col='firebrick')
legend(x=6300, y=max(CPL$pdf), cex=0.7, lwd=2, col='firebrick', bty='n', legend='best fitted 3-CPL')
text(x=CPL$year, y=CPL$pdf, pos=3, labels=c('H1','H2','H3','H4'))

Credible intervals using MCMC

The ADMUR function mcmc() uses the Metropolis-Hastings algorithm to search joint parameter values of an n-CPL model. In principle the starting parameters do not matter if burn is of an appropriate length, but in practice it is more efficient to start in a sensible place such as the bmaximum likelihood parameters:

chain <- mcmc(PDarray=PD, startPars=best$par, type='CPL', N=100000, burn=2000, thin=5, jumps =0.025)

The acceptance ratio (AR) and raw chain (before burn-in and thinning) can be sanity checked. Ideally we want the AR somewhere in the range 0.3 to 0.5 (this can be tuned with the 'jumps' argument), and the raw chain to resemble 'hairy caterpillars':

print(chain$acceptance.ratio)
par(mfrow=c(3,2), mar=c(4,3,3,1))
col <- 'steelblue'
for(n in 1:5){
    plot(chain$all.pars[,n], type='l', ylim=c(0,1), col=col, xlab='', ylab='', main=paste('par',n))
    }

Single chain of all 5 raw parameters {width=680px}

These parameters can then be converted to the hinge coordinates using the convertPars() function, and their marginal distributions plotted. Note, the maximum likelihood parameters (red lines) are close to peaks of these distributions, but may not exactly match because they are only marginal distributions. Note also the dates of hinge 1 and 2 are fixed at 5500 and 7500, so have no distribution:

hinges <- convertPars(pars=chain$res, years=5500:7500, type='CPL')
par(mfrow=c(3,2), mar=c(4,3,3,1))
c1 <- 'steelblue'
c2 <- 'firebrick'
lwd <- 3
pdf.brk <- seq(0,0.0015, length.out=40)
yr.brk <- seq(5500,7500,length.out=40)
names <- c('Date of H2','Date of H3','PD of H1','PD of H2','PD of H3','PD of H4')
hist(hinges$yr2,border=c1,breaks=yr.brk, main=names[1], xlab='');abline(v=CPL$year[2],col=c2,lwd=lwd)
hist(hinges$yr3, border=c1,breaks=yr.brk, main=names[2], xlab='');abline(v=CPL$year[3],col=c2,lwd=lwd)
hist(hinges$pdf1, border=c1,breaks=pdf.brk, main=names[3], xlab='');abline(v=CPL$pdf[1],col=c2,lwd=lwd)
hist(hinges$pdf2, border=c1,breaks=pdf.brk, main=names[4], xlab='');abline(v=CPL$pdf[2],col=c2,lwd=lwd)
hist(hinges$pdf3, border=c1,breaks=pdf.brk, main=names[5], xlab='');abline(v=CPL$pdf[3],col=c2,lwd=lwd)
hist(hinges$pdf4, border=c1,breaks=pdf.brk, main=names[6], xlab='');abline(v=CPL$pdf[4],col=c2,lwd=lwd)

Marginal distributions after conversion to hinge coordinates. Maximum likelihood (calculated separately) in red. {width=680px}

Some two-dimensional combinations of joint parameters may be preferred, but still these are 2D marginal representations of 5D parameters, again with maximum likelihood in red:

library(scales)
par( mfrow=c(1,2) , mar=c(4,4,1.5,2), cex=0.7 )
plot(hinges$yr2, hinges$pdf2, pch=16, col=alpha(1,0.02), ylim=c(0,0.0005))
points(CPL$year[2], CPL$pdf[2], col='red', pch=16, cex=1.2)
plot(hinges$yr3, hinges$pdf3, pch=16, col=alpha(1,0.02), ylim=c(0,0.0015))
points(CPL$year[3], CPL$pdf[3], col='red', pch=16, cex=1.2)

2D Marginal distributions. Maximum likelihood (calculated separately) in red. {width=680px}

Alternatively, the joint distributions can be visualised by plotting the CPL model for each iteration of the chain, with the maximum likelihood in red:

plot(NULL, xlim=c(7500,5500),ylim=c(0,0.0011), xlab='calBP', ylab='PD', cex=0.7)
for(n in 1:nrow(hinges)){
    x <- c(hinges$yr1[n], hinges$yr2[n], hinges$yr3[n], hinges$yr4[n])
    y <- c(hinges$pdf1[n], hinges$pdf2[n], hinges$pdf3[n], hinges$pdf4[n])
    lines( x, y, col=alpha(1,0.005) )
    }
lines(x=CPL$year, y=CPL$pdf, lwd=2, col=c2)

Joint distributions. Maximum Likelihood (calculated separately) in red. {width=680px}

Relative growth and decline rates

Percentage growth rates per generation provide an intuitive statistic to quantify and compare population changes through time. However there are two key issues to overcome when estimating growth rates for a CPL model.

CPL modelling allows for the possibility of hiatus periods, defined by pieces between hinges with a zero or near zero PD. Conventionally, the percentage decrease from any value to zero is 100%, however the equivalent percentage increase from zero is undefined.
Each section of the CPL is a straight line with a constant gradient. However, a straight line has a constantly changing growth/decline rate.

The first problem is an extreme manifestation of the asymmetry from conventionally reporting change always with respect to the first value. For example, if we consider a population of 80 individuals at time $t_1$, changing to 100 at $t_2$ and to 80 at $t_3$, this would be conventionally described as a 25% increase followed by a 20% decrease. This asymmetry is unintuitive and unhelpful in the context of population change, and instead we use a relative rate which is always calculated with respect to the larger value (e.g., a '20% relative growth' followed by '20% relative decline').

We overcome the second problem by calculating the expected (mean average) rate across the entire linear piece. This is achieved by notionally breaking the line into $N$ equal pieces, such that the coordinates of the ends of the $i^{th}$ piece are $(x_1,y_1)$ and $(x_2,y_2)$. The generational (25 yr) rate $r$ of this $i^{th}$ piece is: $$r_i=100\times exp[\ln(\frac{y_2}{y_1})/\frac{x_1-x_2}{25}]-100$$ and the expected rate across the entire line as $N$ approaches +$\infty$ is: $$\sum_{i=1}^{N}r_i/N$$

For example, a population decline from n=200 to n=160 across 100 years is conventionally considered to have a generational decline rate of $100\times exp[\ln(\frac{160}{200})/\frac{100}{25}]-100$ = 5.426% loss per generation. If partitioned into just $N=2$ equal sections (n=200, n=180, n=160), we require two generational decline rates: $100\times exp[\ln(\frac{180}{200})/\frac{50}{25}]-100$ = 5.132% and $100\times exp[\ln(\frac{160}{180})/\frac{50}{25}]-100$ = 5.719%, giving a mean of 5.425%. As the number of sections $N$ approaches +$\infty$, the mean rate asymptotically approaches 5.507%. The similarity to the conventional rate of 5.426% is because the total percentage loss is small (20%), therefore an exponential curve between n=200 and n=160 is similar to a straight line. In contrast, a huge percentage loss of 99.5% illustrates the importance of calculating the expected growth rate, averaged across the whole line: An exponential curve between n=200 and n=1 across the same 100 years has a decline rate of $100\times exp[\ln(\frac{1}{200})/\frac{100}{25}]-100$ = 73.409% loss per 25 yr generation. Meanwhile a linear model between n=200 and n=1 across the same 100 years has an expected decline rate of 47.835% loss per generation.

The relationship between the conventional rate and relative rate is almost identical for realistic rates of change (c. -10% to +10% per generation):

N <- 1000
x <- cbind(rep(5100,N),rep(5000,N))
y <- cbind(seq(1,100,length.out=N),seq(100,1,length.out=N))
conventional <- 100 * exp(log(y[,2]/y[,1])/((x[,1]-x[,2])/25))-100
relative <- relativeRate(x,y)
plot(conventional, relative, type='l')
rect(-100,-100,c(10,0,-10),c(10,0,-10), lty=2,border='grey')

Combination models

ADMUR can also combine multiple models into a more complex aggregated model. For example, we might hypothesise that an island's population dynamics are driven by the combination of some regular climatic oscillations (perhaps from orbital forcing) and growth to carrying capacity, and therefore wish to use a model that is the combination of sinusoidal and logistic. This can be achieved by using a vector of model types, and combining all the required parameters into a single vector, in their respective order. For example:

# generate SPD
CalArray <- makeCalArray(intcal20, calrange = c(1000,4000))
spd <- summedCalibrator(data4, CalArray, normalise='full')

# calibrate phases
PD <- phaseCalibrator(data4, CalArray, remove.external = TRUE)

# parameter search
combo.sine.logistic <- JDEoptim(lower=c(0,0,0,0,0), upper=c(1/1000,2*pi,1,1,10000),
    fn=objectiveFunction, PDarray=PD, type=c('sine','logistic'), NP=100, trace=T)

# Convert maximum likelihood parameters to a PDF
mod.combo <- convertPars(pars=combo.sine.logistic$par, years, type=c('sine','logistic'))

# Plot the SPD and the model
plotPD(spd)
lines(mod.combo, col=cols[1], lwd=5)
legend(x=4000, y=max(spd)*1.2,  lwd=5, col=cols[1], bty='n', legend=c('combo.sine.logistic'))

Example of a combination model {width=680px}

4. Taphonomy

Taphonomic loss has an important influence on the amount of datable material that can be recovered, with the obvious bias that older material is less likely to survive. This means that if a constant population deposited material at a constant rate, we should expect the archaeological record to show an increase in dates towards the present, rather than a uniform distribution of dates.

This taphonomic loss rate has been estimated by Surovell et al. and Bluhm and Surovell who make a compelling argument that a power function $a(x+b)^c$ provides a useful model of taphonomic loss through time ($x$), providing not only a good statistical fit to empirical data, but is also consistent with the mechanism that datable material is subject to greater initial environmental degradation when first deposited on the ground surface compared to the increasing protection through time as it becomes cocooned from these forces.

However, there are two important issues to consider when modelling taphonomy:

There is substantial uncertainty regarding the values of the parameters that determine the shape of the power function. We should expect different taphonomic rates in different locations due to variation in environmental and geological conditions. Indeed these studies have estimated different parameter values for the two datasets used.
There is a common misunderstanding that the taphonomic curve can be used to 'adjust' or 'correct' the data or an SPD to generate a more faithful representation of the true population dynamics. In fact the inclusion of taphonomy is achieved with additional appropriate model parameters, resulting in a more complex model. Whether or not this greater complexity is justified is moot - the decision to include or exclude cannot be resolved with model comparison and should be justified with an independent argument. When comparing models using BIC, all should be consistent in either including or excluding taphonomic curve parameters.

Taphonomic curve parameters

The above studies use regression methods to estimate the taphonomic curve parameters. These methods don't incorporate the full information of the calibrated 14C dates (instead a point estimate is used for each date), and therefore are not based on likelihoods. Nor do they provide confidence intervals for the curve parameters. Finally, the parameter $a$ is unnecessary for the purposes of population modelling, since we are not interested in estimating the absolute loss in material, merely the relative loss through time. Therefore we can consider the taphonomic curve as a PDF such that the total area across the study period equals 1. This results in the following formula, where $x$ is time, and $x_{min}$ and $x_{max}$ are the time boundaries of the study period: $$\frac{(c+1)(b+x)^c}{(b+x_{max})^{(c+1)} - (b+x_{min})^{(c+1)}}$$

This defines ADMUR's power model PDF which we apply within the MCMC framework to estimate the joint parameter distributions of $b$ and $c$ from the same two datasets used in the above studies, constraining the study period (as Bluhm and Surovell did) to between 1kyr and 40kyr BP as follows:

# generate an PD array for each dataset
years <- seq(1000,40000,by=50)
CalArray <- makeCalArray(intcal20, calrange = c(1000,40000),inc=50)
PD1 <- phaseCalibrator(bryson1848, CalArray, remove.external = TRUE)
PD2 <- phaseCalibrator(bluhm2421, CalArray, remove.external = TRUE)

# MCMC search
chain.bryson <- mcmc(PDarray=PD1, 
        startPars=c(10000,-1.5), 
        type='power', N=50000, 
        burn=2000, 
        thin=5, 
        jumps =c(250,0.075)) 

chain.bluhm <- mcmc(PDarray=PD2, 
    startPars=c(10000,-1.5), 
    type='power', N=50000, 
    burn=2000, 
    thin=5, 
    jumps =c(250,0.075))

# convert parameters to taphonomy curves
curve.bryson <- convertPars(chain.bryson$res, type='power', years=years)
curve.bluhm <- convertPars(chain.bluhm$res, type='power', years=years)

# plot
plot(NULL, xlim=c(0,12000),ylim=c(-2.5,-1), xlab='parameter b', ylab='parameter c')
points(chain.bryson$res, col=cols[1])
points(chain.bluhm$res, col=cols[2])

plot(NULL, xlim=c(0,40000),ylim=c(0,0.00025), xlab='yrs BP', ylab='PD')
N <- nrow(chain.bryson$res)
for(n in sample(1:N,size=1000)){
    lines(years,curve.bryson[n,], col=cols[1])
    lines(years,curve.bluhm[n,], col=cols[2])
    }

Joint taphonomic parameter estimates (and equivalent curves) from the MCMC chain generated in ADMUR using datasets from Surovell et al. 2009 and Bluhm and Surovell 2018 {width=680px}

Clearly the taphonomic parameters $b$ and $c$ are highly correlated, and although the curves superficially appear very similar, the parameters differ significantly between the two datasets.

Including taphonomy in a model

Maximum likelihood Search

Taphonomy can be included in any ADMUR model by including 'power' model parameters (taphonomic parameters $b$ and $c$). We suggest constraining these parameters to $0 < b < 20000$ and $-3 < c < 0$, but if there is better prior knowledge of this range (perhaps an independent dataset based on volcanic eruptions for the same study area) then this can be further constrained accordingly.

For example, we might perform a search using the previously generated PD array, to find the best 3-CPL model with and without taphonomy as follows:

best <- JDEoptim(lower=c(0,0,0,0,0),
    upper=c(1,1,1,1,1), 
    fn=objectiveFunction, 
    PDarray=PD, 
    type='CPL',
    trace=T, 
    NP=100) 

best.taph <- JDEoptim(lower=c(0,0,0,0,0,0,-3), 
    upper=c(1,1,1,1,1,20000,0), 
    fn=objectiveFunction, 
    PDarray=PD, 
    type=c('CPL','power'),
    trace=T, 
    NP=140)

These parameters can then be converted to model PDFs and plotted:

load('vignette.3CPL.JDEoptim.best.RData')
load('vignette.3CPL.JDEoptim.best.taph.RData')

CPL <- convertPars(pars=best$par, years=5500:7500, type='CPL')
CPL.taph <- convertPars(pars=best.taph$par, years=5500:7500, type=c('CPL','power'))

SPD <- summedPhaseCalibrator( data=data5, calcurve=shcal20, calrange=c(5500,7500) )
plotPD(SPD)
cols <- c('firebrick','orchid2')
lines(CPL$year, CPL$pdf, lwd=2, col=cols[1])
lines(CPL.taph$year, CPL.taph$pdf, lwd=2, col=cols[2])
legend(x=6300,y=0.001,legend=c('3-CPL','3-CPL with taphonomy'),bty='n',col=cols,lwd=2,cex=0.7)

The above 3-CPL with taphonomy model represents a conflation of two model components: the population dynamics and the taphonomic loss. Instead we are interested in separating these components:

pop <- convertPars(pars=best.taph$par[1:5], years=5500:7500, type='CPL')
taph <- convertPars(pars=best.taph$par[6:7], years=5500:7500, type='power')
plotPD(pop)
title('Population dynamics')
plotPD(taph)
title('Taphonomic loss')

Finally the hinge coordinates of the population dynamics can be extracted:

CPLparsToHinges(pars=best.taph$par[1:5], years=5500:7500)

MCMC search for credible intervals

We should always be cautious of assigning too much importance to point estimates. Smaller sample sizes will always result in larger uncertainties, and it is always better to estimate the plausible range of results. This is of particular concern with taphonomic parameters since the reanalysis of the volcanic datasets above illustrates how a large range of parameter combinations provide very similar taphonomic curves. Furthermore, when including taphonomy in the model, the taphonomic parameters have the potential to interact with the population dynamics parameters in vastly more parameter combinations that will give in many different combinations of similar overall radiocarbon date distributions. Therefore we can perform an MCMC parameter search as follows:

chain.taph <- mcmc(PDarray = PD, 
    startPars = c(0.5,0.5,0.5,0.5,0.5,10000,-1.5), 
    type=c('CPL','power'), 
    N = 30000, 
    burn = 2000, 
    thin = 5, 
    jumps = 0.025)

These can then be separated into population dynamics parameters and taphonomic parameters for either direct plotting, or converted to model PDFs and plotted:

# convert parameters into model PDFs
pop <- convertPars(pars=chain.taph$res[,1:5], years=5500:7500, type='CPL')
taph <- convertPars(pars=chain.taph$res[,6:7], years=seq(1000,30000,by=50), type='power')

# plot population dynamics PDF
plot(NULL, xlim=c(7500,5500),ylim=c(0,0.0013), xlab='calBP', ylab='PD', las=1)
for(n in 1:nrow(pop))lines(5500:7500, pop[n,],col=alpha(1,0.05))

# plot taphonomy PDF
plot(NULL, xlim=c(30000,0),ylim=c(0,0.00025), xlab='calBP', ylab='PD',las=1,)
for(n in 1:nrow(taph))lines(seq(1000,30000,by=50), taph[n,],col=alpha(1,0.02))

# plot taphonomic parameters
plot(NULL, xlim=c(0,20000),ylim=c(-3,0), xlab='parameter b', ylab='parameter c',las=1)
for(n in 1:nrow(chain.taph$res))points(chain.taph$res[n,6], chain.taph$res[n,7],col=alpha(1,0.2),pch=20)

Joint distributions of population dynamics only. {width=680px}

Joint distributions of taphonomy only. Clearly there is not enough information content in such a small toy dataset to narrow the taphonomic parameters better than the initial prior constraints {width=680px}

5. Statistical Inference

Model selection using BIC

A fundamentally important issue in modelling is the need to avoid overfitting an unjustifiably complex model to data, by using a formal model selection approach. In the example above we arbitrarily chose a 3-CPL model to fit to the data (since the data was randomly sampled from a 3-CPL toy population), however, given the small sample size (n = 303) it is possible a simpler model may have better predictive power. ADMUR achieves this using the so-called Bayesian Information Criterion (BIC) aka Schwarz Information Criterion, which balances the model likelihood against the number of parameters and sample size.

Therefore we should also find the Maximum Likelihood for other plausible models such as a 4-CPL, 2-CPL, 1-CPL, exponential and even a uniform:

# CPL parameters must be between 0 and 1, and an odd length.
CPL.1 <- JDEoptim(lower=0, upper=1, fn=objectiveFunction, PDarray=PD, type='CPL', NP=20)
CPL.2 <- JDEoptim(lower=rep(0,3), upper=rep(1,3), fn=objectiveFunction, PDarray=PD, type='CPL', NP=60)
CPL.3 <- JDEoptim(lower=rep(0,5), upper=rep(1,5), fn=objectiveFunction, PDarray=PD, type='CPL', NP=100)
CPL.4 <- JDEoptim(lower=rep(0,7), upper=rep(1,7), fn=objectiveFunction, PDarray=PD, type='CPL', NP=140)

# exponential has a single parameter, which can be negative (decay).
exp <- JDEoptim(lower=-0.01, upper=0.01, fn=objectiveFunction, PDarray=PD, type='exp', NP=20)

# uniform has no parameters so a search is not required.
uniform <- objectiveFunction(NA, PD, type='uniform')

load('vignette.model.comparison.RData')

The objective function returns the negative log-likelihood since the search algorithm seeks to minimise the objective function. It is therefore trivial to extract the log-likelihoods, and calculate the BIC scores using the formula $BIC=k\ln(n)-2L$ where $k$ is the number of parameters, $n$ is the effective sample size (i.e. the number of phases = 303), and $L$ is the maximum log-likelihood.

# likelihoods
data.frame(L1= -CPL.1$value,
    L2= -CPL.2$value,
    L3= -CPL.3$value,
    L4= -CPL.4$value,
    Lexp= -exp$value,
    Lunif= -uniform)
BIC.1 <- 1*log(303) - 2*(-CPL.1$value)
BIC.2 <- 3*log(303) - 2*(-CPL.2$value)
BIC.3 <- 5*log(303) - 2*(-CPL.3$value)
BIC.4 <- 7*log(303) - 2*(-CPL.4$value)
BIC.exp <- 1*log(303) - 2*(-exp$value)
BIC.uniform <- 0 - 2*(-uniform)
data.frame(BIC.1,BIC.2,BIC.3,BIC.4,BIC.exp,BIC.uniform)

Clearly the 4-CPL has the highest likelihood, however the 3-CPL model has the lowest BIC and is selected as the best. This tells us that the 4-CPL is overfitted to the data and is unjustifiably complex, whilst the other models are underfitted and lack explanatory power. Nevertheless for comparison we can plot all the competing models, illustrating that the 4-CPL fits the closest, but cannot warn us that it is overfit:

# convert parameters to model PDs
CPL1 <- convertPars(pars=CPL.1$par, years=5500:7500, type='CPL') 
CPL2 <- convertPars(pars=CPL.2$par, years=5500:7500, type='CPL')  
CPL3 <- convertPars(pars=CPL.3$par, years=5500:7500, type='CPL')  
CPL4 <- convertPars(pars=CPL.4$par, years=5500:7500, type='CPL')  
EXP <- convertPars(pars=exp$par, years=5500:7500, type='exp')  

# Plot SPD and five competing models:
plotPD(SPD)
cols <- c('firebrick','orchid2','coral2','steelblue','goldenrod3')
lines(CPL1$year, CPL1$pdf, col=cols[1], lwd=2)
lines(CPL2$year, CPL2$pdf, col=cols[2], lwd=2)
lines(CPL3$year, CPL3$pdf, col=cols[3], lwd=2)
lines(CPL4$year, CPL4$pdf, col=cols[4], lwd=2)
lines(EXP$year, EXP$pdf, col=cols[5], lwd=2)
legend <- c('1-CPL','2-CPL','3-CPL','4-CPL','exponential')
legend(x=6300, y=max(CPL$pdf), cex=0.7, lwd=2, col=cols, bty='n', legend=legend)

Goodness of fit (GOF) test

it is crucial to test if the selected model is plausible, or in other words, to test if the observed data is a reasonable outcome of the model. If the observed data is highly unlikely the model must be rejected, even if it was the best model selected.

Typically a GOF quantifies how unusual it would be for the observed data to be generated by the model. Of course the probability of any particular dataset being generated by any particular model is vanishingly small, so instead we estimate how probable it is for the model to produce the observed data, or data that are more extreme. This is the same concept as the p-value, but instead of using a null hypothesis we use the best selected model. We can generate many simulated datasets under this model, and calculate a summary statistic for each simulation. A one-tailed test will then establish the proportion of simulations that have a poorer summary statistic (more extreme) than the observed data's summary statistic.

For each dataset (simulated and observed) we generate an SPD and use a statistic that measures how divergent each SPD is from expectation, by calculating the proportion of the SPD that sits outside the 95% CI.

summary <- SPDsimulationTest(data, calcurve=shcal20, calrange=c(5500,7500), pars=CPL.3$par, type='CPL')

The test provides a p-value of 1.00 for the best model (3-CPL), since all of the 20,000 simulated SPDs were as or more extreme than the observed SPD, providing a sanity check that the data cannot be rejected under this model, and therefore is a plausible model:

load('vignette.3CPL.SPDsimulationTest.RData')

print(summary$pvalue)
hist(summary$simulated.stat, main='Summary statistic', xlab='')
abline(v=summary$observed.stat, col='red')
legend(0.3,6000, bty='n', lwd=c(1,3), col=c('red','grey'), legend=c('observed','simulated'))

SPD simulation testing

The previous section provided a framework to directly select the best model given a dataset. This contrasts with the SPD simulation methodology which requires the researcher to a priori specify a single null model, then generate many simulated datasets under this null model which are compared with the observed dataset to generate a p-value. Without the model selection framework, the SPD simulation approach alone has several inferential shortcomings:

Recent studies increasingly suggest population fluctuations are ubiquitous thought history, rendering the application of a null model inappropriate. In contrast this new model selection framework allows any number of models to be compared.
A low p-value merely allows us to reject (or fail to reject) the tested model, but does not provide us with a plausible alternative explanation. This leaves an inferential vacuum in which it is common for researchers to assign colourful demographic narratives to periods outside the 95% CI, which are not directly supported by the test statistic. Instead the CPL framework provides a single best explanation.
Fitting the null model (and therefore estimating its parameters) is commonly achieved by discretising the SPD, then incorrectly assuming these points somehow represent data points, to which the null model is fitted my minimising their residuals. In contrast, the CPL framework correctly calculates the relative likelihood of the proposed model parameters given the data, and therefore can correctly fit a model.

Nevertheless, the p-value from the SPD simulation framework is hugely useful in providing a Goodness of Fit test for the best selected model. Therefore the summary generated in the section 'Goodness of fit test' by the SPDsimulationTest() function provides a number of other useful outputs that can be plotted, including:

pvalue

the proportion of N simulated SPDs that have more points outside the 95%CI than the observed SPD has.

observed.stat

the summary statistic for the observed data (number of points outside the 95% CI).

simulated.stat

a vector of summary statistics (number of points outside the 95% CI), one for each simulated SPD.

n.dates.all

the total number of date in the whole data set. Trivially, the number of rows in data.

n.dates.effective

the effective number of dates within the date range. Will be non-integer since a proportion of some dates will be outside the date range.

n.phases.all

the total number of phases in the whole data set.

n.phases.effective

the effective number of phases within the date range. Will be non-integer since a proportion of some phases will be outside the date range.

n.phases.internal

an integer subset of n.phases.all that have more than 50% of their total probability mass within the date range.

timeseries

a data frame containing the following:

several vectors of various Confidence Intervals.

calBP

a vector of calendar years BP.

expected.sim

a vector of the expected simulation (mean average of all N simulations).

local.sd

a vector of the local (each year) standard deviation of all N simulations.

model

a vector of the model PDF.

SPD

a vector of the observed SPD PDF, generated from data.

index

a vector of -1,0,+1 corresponding to the SPD points that are above, within or below the 95% CI of all N simulations.

summary <- SPDsimulationTest(data, calcurve=shcal20, calrange=c(5500,7500), pars=exp$par, type='exp')

The function plotSimulationSummary() then represents these summary results in a single plot:

load('vignette.exp.SPDsimulationTest.RData')

plotSimulationSummary(summary, legend.y=0.0012)

6. Modelling phase ranges

new section??

7. Comparison with other calibration software

It is worth noting that the algorithm used by this package to calibrate ^14^C dates gives practically equivalent results to those from OxCal generated using oxcAAR and Bchron

Comparison of calibration software for the ^14^C date: 3000 +/- 70 BP calibrated through intcal20.

However, there are two fringe circumstances where these software programs differ substantially: at the border of the calibration curve; and if a date has a large error.

Edge effects

Consider the real ^14^C date [MAMS-13035] https://doi.org/10.1016/j.aeae.2015.11.003 age: 50524 +/- 833 BP calibrated through intcal13, which only extends to 46401BP. Bchron throws an error, whilst OxCal applies a one-to-one mapping between Conventional Radiocarbon (CRA) time and calendar time for any date (mean) beyond the range of the calibration curve. The latter is in theory a reasonable way to mitigate the problem, however OxCal applies this in a binary manner that can create peculiarities. Instead ADMUR gradually fades the calibration curve to a one-to-one mapping between the end of the curve and 60,000 BP.

Comparison of calibration software at the limits of intcal13. OxCal and Bchron produce a truncated distribution for date C. Bchron cannot calibrate date D, and OxCal suggests date D is younger than dates A, B and C. ADMUR performs a soft fade at the limit of the calibration curve.

Large errors

A ^14^C date is typically reported as a mean date with an error, which is often interpreted as representing a symmetric Gaussian distribution before calibration. However, a Gaussian has a non-zero probability at all possible years (between -$\infty$ and +$\infty$), and therefore cannot fairly represent the date uncertainty which must be skewed towards the past. Specifically, if we consider the date in CRA time, it must have a zero probability of occurring in the future. Alternatively, if we consider the date as a ^14^C/^12^C ratio, it cannot be smaller than 1 (the present). Therefore ADMUR assumes a ^14^C date error is lognormally distributed with a mean equal to the CRA date, and a variance equal to the CRA error squared. This naturally skews the distribution away from the present. In practice, this difference is undetectably trivial for typical radiocarbon errors since the lognormal distribution approximates a normal distribution away from zero. However, theoretically the differences can be large if considering dates with large errors that are close to the present.

8. Citations

Citations are available as follows:

citation('ADMUR')

{width=680px}

AdrianTimpson/ADMUR documentation built on Oct. 21, 2024, 5:40 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

AdrianTimpson/ADMUR
Ancient Demographic Modelling Using Radiocarbon

In AdrianTimpson/ADMUR: Ancient Demographic Modelling Using Radiocarbon

1. Overview

Introduction to ADMUR

Bayesian inference

Installation

Data

2. radiocarbon date calibration and SPDs

With the wrapper

Without the wrapper

Phasing

3. Modelling populaton dynamics

Simple parametric models

Continuous Piecewise Linear (CPL) Modelling

The anatomy of a CPL model

Parameter space: The Area Breaking Process

Parameter search

Credible intervals using MCMC

Relative growth and decline rates

Combination models

4. Taphonomy

Taphonomic curve parameters

Including taphonomy in a model

Maximum likelihood Search

MCMC search for credible intervals

5. Statistical Inference

Model selection using BIC

Goodness of fit (GOF) test

SPD simulation testing

6. Modelling phase ranges

7. Comparison with other calibration software

Edge effects

Large errors

8. Citations

R Package Documentation

Browse R Packages

We want your feedback!

AdrianTimpson/ADMUR Ancient Demographic Modelling Using Radiocarbon

In AdrianTimpson/ADMUR: Ancient Demographic Modelling Using Radiocarbon

1. Overview

Introduction to ADMUR

Bayesian inference

Installation

Data

2. radiocarbon date calibration and SPDs

With the wrapper

Without the wrapper

Phasing

3. Modelling populaton dynamics

Simple parametric models

Continuous Piecewise Linear (CPL) Modelling

The anatomy of a CPL model

Parameter space: The Area Breaking Process

Parameter search

Credible intervals using MCMC

Relative growth and decline rates

Combination models

4. Taphonomy

Taphonomic curve parameters

Including taphonomy in a model

Maximum likelihood Search

MCMC search for credible intervals

5. Statistical Inference

Model selection using BIC

Goodness of fit (GOF) test

SPD simulation testing

6. Modelling phase ranges

7. Comparison with other calibration software

Edge effects

Large errors

8. Citations

R Package Documentation

Browse R Packages

We want your feedback!

AdrianTimpson/ADMUR
Ancient Demographic Modelling Using Radiocarbon