rm(list = ls()) knitr::opts_chunk$set(comment = "") options(width = 120, max.print = 100) library(BayLum)
'BayLum'
provides a collection of various R functions for Bayesian analysis of luminescence data.
Amongst others, this includes data import, export, application of age models and palaeodose modelling.
Data can be processed simultaneously for various samples, including the input of multiple BIN/BINX-files per sample for single grain (SG) or multi-grain (MG) OSL measurements. Stratigraphic constraints and systematic errors can be added to constrain the analysis further.
For those who already know how to use R, 'BayLum'
won't be difficult to use,
for all others, this brief introduction may be of help to make the first steps with R and
the package 'BayLum'
as convenient as possible.
If you read this document before having installed R itself, you should first visit the R project website and download and install R. You may also consider installing Rstudio, which provides an excellent desktop working environment for R; however it is not a prerequisite.
You will also need the external software JAGS (Just Another Gibs Sampler). Please visit the JAGS webpage and follow the installation instructions. Now you are nearly ready to work with 'BayLum'.
If you have not yet installed 'BayLum', please run the following two R code lines to install 'BayLum' on your computer.
if(!require("BayLum")) install.packages("BayLum", dependencies = TRUE)
Alternatively, you can load an already installed R package (here 'BayLum') into your session by using the following R call.
library(BayLum)
Measurement data can be imported using two different options as detailed in the following:
Let us consider the sample named samp1, which is the example dataset coming with the package.
All information related to this sample is stored in a subfolder called also samp1.
To test the package example, first, we add the path of the example
dataset to the object path
.
path <- paste0(system.file("extdata/", package = "BayLum"), "/")
Please note that for your own dataset (i.e. not included in the package) you have to replace this call by something like:
path <- "Users/Master_of_luminescence/Documents/MyFamousOSLData"
In our example the folder contains the following subfolders and files:
knitr::kable(list.files(path, recursive = TRUE), col.names = "", row.names = TRUE)
See "What are the required files in each subfolder?" in the manual of Generate_DataFile()
function
for the meaning of these files.
To import your data, simply call the function Generate_DataFile()
:
DATA1 <- Generate_DataFile( Path = path, FolderNames = "samp1", Nb_sample = 1, verbose = FALSE)
The import may take a while, in particular for large BIN/BINX-files. This can become annoying if you want to play with the data. In such situations, it makes sense to save your imported data somewhere else before continuing.
To save the obove imported data on your hardrive use
save(DATA1, file = "YourPath/DATA1.RData")
To load the data use
load(DATA1, file = "YourPath/DATA1.RData")
To see the overall structure of the data generated from the BIN/BINX-file and the associated CSV-files, the following call can be used:
str(DATA1)
It reveals that DATA1
is basically a list with r length(DATA1)
elements:
Element | Content |
---------------------- |---------------
DATA1$LT
| $L_x$/$T_x$ values from each sample |
DATA1$sLT
| $L_x$/$T_x$ error values from each sample |
DATA1$ITimes
| Irradiation times |
DATA1$dLab
| The lab dose rate |
DATA1$ddot_env
| The environmental dose rate and its variance |
DATA1$regDose
| The regenerated dose points |
DATA1$J
| The number of aliquots selected for each BIN-file |
DATA1$K
| The number of regenerated dose points |
DATA1$Nb_measurement
| The number of measurements per BIN-file |
To get an impression on how your data look like, you can visualise them by
using the function LT_RegenDose()
:
LT_RegenDose( DATA = DATA1, Path = path, FolderNames = "samp1", SampleNames = "samp1", Nb_sample = 1, nrow = NULL )
Note that here we consider only one sample, and the name of the folder is the name of the sample.
For that reason the argumetns were set to FolderNames = samp1
and SampleNames = samp1
.
For a multi-grain OSL measurements, instead of Generate_DataFile()
, the function Generate_DataFile_MG()
should
be used with similar parameters. The functions differ by their expectations: Disc.csv instead of DiscPos.csv file for Single-grain OSL Measurements.
Please check type ?Generate_DataFile_MG
for further information.
create_DataFile()
With 'BayLum'
>= v0.3.2 we introduced a new function called create_DataFile()
, which
will at some point in time replace the function Generate_DataFile()
and Generate_DataFile_MG()
.
create_DataFile()
works conceptionally very different from the approach detailed above.
Key differences are:
create_DataFile()
, but also outside of the function
and then passed on the functions. This enables the possibility of extensive pre-processing
and selection of measurement data. The configuration follows the so-called YAML format specification. For single sample the file looks as follows:
- sample: "samp1" files: null settings: dose_source: { value: 0.1535, error: 0.00005891 } dose_env: { value: 2.512, error: 0.05626 } rules: beginSignal: 6 endSignal: 8 beginBackground: 50 endBackground: 55 beginTest: 6 endTest: 8 beginTestBackground: 50 endTestBackground: 55 inflatePercent: 0.027 nbOfLastCycleToRemove: 1
In the case above, the configuration file assumes that data for samp1
are
already imported and treated and a R object called samp1
is available in the global
environment. The following example code reproduces this case:
## get example file path from package yaml_file <- system.file("extdata/example.yml", package = "BayLum") samp1_file <- system.file("extdata/samp1/bin.bin", package = "BayLum") ## read YAML manually and select only the first record config_file <- yaml::read_yaml(yaml_file)[[1]] ## import BIN/BINX files and select position 2 and grain 32 only samp1 <- Luminescence::read_BIN2R(samp1_file, verbose = FALSE) |> subset(POSITION == 2 & GRAIN == 32) ## create the data file DATA1 <- create_DataFile(config_file, verbose = FALSE)
To compute the age of the sample samp1, you can run the following code:
Age <- Age_Computation( DATA = DATA1, SampleName = "samp1", PriorAge = c(10, 100), distribution = "cauchy", LIN_fit = TRUE, Origin_fit = FALSE, Iter = 10000 )
This also works if DATA1
is the output of Generate_DataFile_MG()
.
If MCMC trajectories did not converge, you can add more iteration with the parameter Iter
in the function Age_Computation()
,
for example Iter = 20000
or Iter = 50000
. If it is not desirable to re-run the model from scratch, read the
To increase the precision of prior distribution, if not specified before you can use the argument PriorAge
.
For example: PriorAge= c(0.01,10)
for a young sample and PriorAge = c(10,100)
for an old sample.
If the trajectories are still not convergering, you should whether the choice you made with the argument distribution
and
dose-response curves are meaningful.
LIN_fit
and Origin_fit
, dose-response curves optionLIN_fit
and Origin_fit
in the function.distribution
, equivalent dose dispersion optionBy default, a cauchy distribution is assumed, but you can choose another distribution by replacing the
word cauchy
by gaussian
, lognormal_A
or lognormal_M
for the argument distribution
.
The difference between the models: lognormal_A and lognormal_M is that the equivalent dose dispersion are distributed according to:
SavePdf
and SaveEstimates
optionThese two arguments allow to save the results to files.
SavePdf = TRUE
create a PDF-file with MCMC trajectories of parameters A
(age), D
(palaeodose), sD
(equivalent doses dispersion).
You have to specify OutputFileName
and OutputFilePath
to define name and path of the PDF-file.
SaveEstimates = TRUE
saves a CSV-file containing the Bayes estimates, the credible interval at 68\% and 95\% and the Gelman and Rudin test of
convergence of the parameters A
, D
, sD
. For the export the arguments OutputTableName
and OutputTablePath
have to be specified.
PriorAge
optionBy default, an age between 0.01 ka and 100 ka is expected.
If the user has more informations on the sample, PriorAge
should be modified accordingly.
For example, if you know that the sample is an older, you can set PriorAge=c(10,120)
. In contrast, if you know that the sample is younger,
you may want to set PriorAge=c(0.001,10)
. Ages of $<=0$ are not possible. The minimum bound is 0.001.
Please note that the setting of PriorAge
is not trivial, wrongly set boundaries are likely biasing
your results.
In the previous example we considered only the simplest case: one sample, and one BIN/BINX-file.
However, 'BayLum' allows to process multiple BIN/BINX-files for one sample.
To work with multiple BIN/BINX-files, the names of the subfolders need to beset in argument Names
and
both files need to be located unter the same Path
.
For the case
Names <- c("samp1", "samp2")
the call Generate_DataFile()
(or Generate_DataFile_MG()
) becomes as follows:
##argument setting nbsample <- 1 nbbinfile <- length(Names) Binpersample <- c(length(Names)) ##call data file generator DATA_BF <- Generate_DataFile( Path = path, FolderNames = Names, Nb_sample = nbsample, Nb_binfile = nbbinfile, BinPerSample = Binpersample, verbose = FALSE ) ##calculate the age Age <- Age_Computation( DATA = DATA_BF, SampleName = Names, BinPerSample = Binpersample )
The function Generate_DataFile()
(or Generate_DataFile_MF()
) can process multiple files
simultaneously including multiple BIN/BINX-files per sample.
We assume that we are interested in two samples named: sample1 and sample2. In addition, we have two BIN/BINX-files for the first sample named: sample1-1 and sample1-2, and one BIN-file for the 2nd sample named sample2-1. In such case, we need three subfolders named sample1-1, sample1-2 and sample2-1; which each subfolder containing only one BIN-file named bin.bin, and its associated files DiscPos.csv, DoseEnv.csv, DoseSourve.csv and rule.csv. All of these 3 subfolders must be located in path.
To fill the argument corectly BinPerSample
: $binpersample=c(\underbrace{2}{\text{sample 1: 2 bin files}},\underbrace{1}{\text{sample 2: 1 bin file}})$
Names <- c("sample1-1", "sample1-2", "sample2-1") # give the name of the folder datat nbsample <- 2 # give the number of samples nbbinfile <- 3 # give the number of bin files DATA <- Generate_DataFile( Path = path, FolderNames = Names, Nb_sample = nbsample, Nb_binfile = nbbinfile, BinPerSample = binpersample )
combine_DataFiles()
If the user has already saved informations imported with Generate_DataFile()
function (or Generate_DataFile_MG()
function)
these data can be concatenate with the function combine_DataFiles()
.
For example, if DATA1
is the output of sample named "GDB3", and DATA2
is the output of sample "GDB5",
both data can be merged with the following call:
data("DATA1", envir = environment()) data("DATA2", envir = environment()) DATA3 <- combine_DataFiles(L1 = DATA2, L2 = DATA1) str(DATA3)
The data structure should become as follows
list
s (1 list
per sample) for DATA$LT
, DATA$sLT
, DATA1$ITimes
and DATA1$regDose
matrix
with 2 columns (1 line per sample) for DATA1$dLab
, DATA1$ddot_env
integer
s (1 integer
per BIN files here we have 1 BIN-file per sample) for DATA1$J
, DATA1$K
, DATA1$Nb_measurement
.Single-grain and multiple-grain OSL measurements can be merged in the same way.
To plot the $L/T$ as a function of the regenerative dose the function LT_RegenDose()
can be
used again:
plot_RegDosePoints(DATA3)
Note: In the example DATA3
contains information from the samples 'GDB3' and 'GDB5', which are single-grain OSL measurements. For a correct treatment the argument SG
has to be manually set by the user. Please see the function manual for further details.
If no stratigraphic constraints were set, the following code can be used to analyse the age of the sample GDB5 and GDB3 simultaneously.
priorage = c(1, 10, 10, 100) Age <- AgeS_Computation( DATA = DATA3, Nb_sample = 2, SampleNames = c("GDB5", "GDB3"), PriorAge = priorage, distribution = "cauchy", LIN_fit = TRUE, Origin_fit = FALSE, Iter = 1000, jags_method = "rjags" )
Note: For an automated parallel processing you can set the argument jags_method = "rjags"
to
jags_method = "rjparallel"
.
As for the function Age_computation()
, the age for each sample is set by default between 0.01 ka and 100 ka.
If you have more informations on your samples it is possible to change PriorAge
parameters.
PriorAge
is a vector of size = 2*$Nb_sample
, the two first values of PriorAge
concern the 1st sample, the next two values the 2nd sample and so on.
For example, if you know that sample named GDB5 is a young sample whose its age is between 0.01 ka and 10 ka, and GDB3 is an old sample whose age is between 10 ka and 100 ka, $$PriorAge=c(\underbrace{0.01,10}{GDB5\ prior\ age},\underbrace{10,100}{GDB3\ prior\ age})$$
With the function AgeS_Computation()
it is possible to take the stratigraphic
relations between samples into account and define constraints.
For example, we know that GDB5 is in a higher stratigraphical position, hence it likely has a younger age than sample GDB3.
To take into account stratigraphic constraints, the information on the samples need to be ordered.
Either you enter a sample name (corresponding to subfolder names) in Names
parameter of the function Generate_DataFile()
, ordered by order of increasing ages or you enter saved .RData informations of each sample in combine_DataFiles()
, ordered by increasing ages.
# using Generate_DataFile function Names <- c("samp1", "samp2") nbsample <- 2 DATA3 <- Generate_DataFile( Path = path, FolderNames = Names, Nb_sample = nbsample, verbose = FALSE )
# using the function combine_DataFiles() data(DATA1, envir = environment()) # .RData on sample GDB3 data(DATA2, envir = environment()) # .RData on sample GDB5 DATA3 <- combine_DataFiles(L1 = DATA1, L2 = DATA2)
Let SC
be the matrix containing all information on stratigraphic relations for this two samples.
This matrix is defined as follows:
matrix dimensions: the row number of StratiConstraints
matrix is equal to Nb_sample+1
,
and column number is equal to $Nb_sample$.
first matrix row: for all $i$ in ${1,...,Nb_Sample}$, StratiConstraints[1,i] <- 1
,
means that the lower bound of the sample age given in PriorAge[2i-1]
for the sample whose number ID
is equal to $i$ is taken into account
sample relations: for all $j$ in ${2,...,Nb_Sample+1}\$ and all $i$ in ${j,...,Nb_Sample}$,
StratiConstraints[j,i] <- 1
if the sample age whose ID is equal to $j-1$ is lower than the sample age whose ID is equal to $i$.
Otherwise, StratiConstraints[j,i] <- 0
.
To the define such matrix the function SCMatrix() can be used:
SC <- SCMatrix(Nb_sample = 2, SampleNames = c("samp1", "samp2"))
In our case: 2 samples, SC
is a matrix with 3 rows and 2 columns. The first row contains c(1,1)
(because we take into account the prior ages), the second line contains c(0,1)
(sample 2, named samp2 is supposed to be older than sample 1, named samp1) and the third line contains c(0,0)
(sample 2, named samp2 is not younger than the sample 1, here named samp1). We can also fill the matrix with the stratigraphic relations as follow:
SC <- matrix( data = c(1, 1, 0, 1, 0, 0), ncol = 2, nrow = (2 + 1), byrow = T )
Age <- AgeS_Computation( DATA = DATA3, Nb_sample = 2, SampleNames = c("samp1", "samp2"), PriorAge = priorage, distribution = "cauchy", LIN_fit = TRUE, Origin_fit = FALSE, StratiConstraints = SC, Iter = 1000, jags_method = 'rjags')
Thee results can be also be used for an alternative graphical representation:
plot_Ages(Age, plot_mode = "density")
If MCMC trajectories did not converge, it means we should run additional MCMC iterations.
For AgeS_computation()
and Age_OSLC14()
models we can run additional iterations by supplying the function output back into the parent function.
In the following, notice we are using the output of the previous AgeS_computation()
example, namely Age
. The key argument to set/change is DATA
.
Age <- AgeS_Computation( DATA = Age, Nb_sample = 2, SampleNames = c("GDB5", "GDB3"), PriorAge = priorage, distribution = "cauchy", LIN_fit = TRUE, Origin_fit = FALSE, Iter = 1000, jags_method = "rjags" )
Combès, B., Philippe, A., Lanos, P., Mercier, N., Tribolo, C., Guerin, G., Guibert, P., Lahaye, C., 2015. A Bayesian central equivalent dose model for optically stimulated luminescence dating. Quaternary Geochronology 28, 62-70. doi: 10.1016/j.quageo.2015.04.001
Combès, B., Philippe, A., 2017. Bayesian analysis of individual and systematic multiplicative errors for estimating ages with stratigraphic constraints in optically stimulated luminescence dating. Quaternary Geochronology 39, 24–34. doi: 10.1016/j.quageo.2017.02.003
Philippe, A., Guérin, G., Kreutzer, S., 2019. BayLum - An R package for Bayesian analysis of OSL ages: An introduction. Quaternary Geochronology 49, 16-24. doi: 10.1016/j.quageo.2018.05.009
Robert and Casella, 2009. Introducing Monte Carlo Methods with R. Springer Science & Business Media.
Tribolo, C., Asrat, A., Bahain, J. J., Chapon, C., Douville, E., Fragnol, C., Hernandez, M., Hovers, E., Leplongeon, A., Martin, L., Pleurdeau, D., Pearson, O., Puaud, S., Assefa, Z., 2017. Across the Gap: Geochronological and Sedimentological Analyses from the Late Pleistocene-Holocene Sequence of Goda Buticha, Southeastern Ethiopia. PloS one, 12(1), e0169418. doi: 10.1371/journal.pone.0169418
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.