library(treeclim) suppressWarnings(RNGversion("3.5.0")) set.seed(42)
treeclim is an R package for the numerical calibration of proxy-climate relationships, developed with a focus on tree-ring data.
The workhorse function of treeclim is dcc
, shorthand for "dendroclimatic calibration". In
its most simplest case, a growth-climate analysis can be performed
like this:
dc1 <- dcc(muc_spruce, muc_clim)
For this first example, we use data that comes with treeclim: a
chronology of Picea abies growing near Munich (Bavaria, Germany)
together with corresponding monthly climate data (temperature and
precipitation). The default configuration of dcc
is to perform a
bootstrapped response function analysis (Guiot 1991), using monthly climate
variables from previous June to current September. The main difference
here to other software (e.g., Dendroclim2002, Biondi and Waikul 2004) is that treeclim
uses stationary bootstrapping (Politis and Romano 1994) by default, to account for potentially
autocorrelated tree-ring data.
The resulting coefficients can be viewed by calling:
dc1
We see a table listing the monthly climatic predictors that have been
generated by treeclim with the value of their corresponding
coefficients, significance flags and confidence intervals. In this
example, none of the climatic predictors turned out to be significant
at a level of p < 0.05. (One important notice: the object returned by
dcc
is actually more complex than the table we get when print it. To
access the coefficients, e.g. for writing your own plotting functions,
you could use coef(dc1)
.)
A basic plot is obtained by issueing:
plot(dc1)
We will see later how this plot can be styled to meet your personal preferences.
This was a very quick and basic introduction to the main functionality of treeclim. Everything else beyond that is refining the specification of your analysis. The next section details on the conceptual framework of treeclim and how it reflects in its design and usage.
Caution! Please make sure to have a recent version of treeclim (version >= 1.0.15) installed when you follow along the examples in this document.
I enjoy writing treeclim and helping users, and Franco Biondi (of Dendroclim2002 fame) enjoys mentoring my work and collaborating closely to set the direction for this software package. Additional to improving citation metrics, getting proper credits for our work through correct citation also helps us to get an idea about who uses treeclim and for what. So, please make sure to cite treeclim correctly when you use it in your analyses:
print(citation("treeclim"), style = "text")
The same applies to R in general and dplR, when you use it together with treeclim (which you definetively should, of course).
Everyone looking at tree-growth as a function of climate will have to consider at least five elements of the analysis to be carried out:
The design of treeclim allows to build an analytic strategy considering these five elements. In the following, we will look at each element in some detail.
treeclim expects a tree-ring chronology as input, ideally as
produced by the chron
function in dplR. For climate data, two
general formats are accepted:
data.frame
with columns year
, month
, and one column for each
climatic variable like so:head(muc_clim)
It is important to note that treeclim can only work with data for which all years are complete, which means you might have to truncate your climate data if this is not the case.
data.frame
or numerical matrix is supplied with year as first column and
monthly observations in columns 2 to 13 like so:head(rt_prec)
To use more than one variable with this data format, you can supply
them as a list to dcc like so: dcc(chrono = rt_spruce, climate =
list(rt_temp, rt_prec))
For both climate data formats, there is no limit on the number of variables that can be used for the analyses. Please note that no missing values are allowed for both formats, so if your climate data contains missing values, consider interpolation or other imputation methods.
For the 13-Column format, there are no obvious variable names for
the climatic variables. If you want to have names, you can supply them
using the var_names
argument like so: dcc(chrono = rt_spruce, climate =
list(rt_temp, rt_prec), var_names = c("tmp", "pre"))
. This also works
for renaming variables coming from the Year-Month format.
Being able to design a customized matrix of predictors is probably the
most interesting part about dcc
. The default design matrix will
select all months from previous June to current September from all
variables supplied to dcc
. Since this is most of the times a good
starting point, but not necessarily the ideal design matrix, you can
easily specify a custom design matrix using a set of simple commands.
In treeclim, there are three modifier functions for building the design matrix.
.range
: for selecting all months individually.mean
: for building the mean over the specified months.sum
: for summing up over the specified monthsAll modifiers have the same structure of MODIFIER(.months,
.variables)
, i.e. they take as arguments a number of months, and the
variables for which the modifier shall be applied. Both arguments can
be omitted; when .months
is omitted, the default set of months
(previous June to current September) is used, when .variables
is
omitted, the modifier will be applied to all available climate
variables.
Here are two examples:
dc2 <- dcc(muc_spruce, muc_clim, .range(-6:9, c("temp", "prec"))) rownames(dc2$coef)
This is identical with the default configuration and can be written more
concise as dcc(muc_spruce, muc_clim, -6:9)
. Another example:
dc3 <- dcc(muc_spruce, muc_clim, .sum(5:9, c("temp", "prec"))) dc3
In this example, we did use all individual months, but as a sum of
temperature and precipitation for the given period of time, resulting
in much smaller design matrix with only two predictor variables. Since
we only have the variables temp
and prec
in muc_clim
, we could
have written this more concise as dcc(muc_spruce, muc_clim,
.sum(5:9))
.
It can be desireable to exclude some time spans from analysis, for
example dormant seaons. For this task, treeclim has the function
exclude_from
(or exfr
for short) to exclude specific months from
being integrated into the predictors build using the modifier
functions.
dc4 <- dcc(muc_spruce, muc_clim, .range(exfr(-6:9, -10:2))) rownames(dc4$coef)
In this example, we excluded autumn and winter months October (previous year) to February (current year) from the analysis. The remaining months from .range(-6:9)
were selected as individual months for building the design matrix, because we used the .range
modifier. Often, as here, it is not necessary to explicitly use .range
, and dcc(..., exfr(-6:9, -10:2))
would have done the same thing.
The last component for building design matrix is introduced by the
ability to chain modifiers. This is done using the +
operator:
dc5 <- dcc(muc_spruce, muc_clim, .sum(-4:-9, "prec") + .sum(4:9, "prec") + .range(exfr(-6:9, -10:3), "temp")) dc5
Here, we are using two precipitation sums (corresponding to the previous and current growing seasons), and a range of month-wise temperatures excluding a dormant season. We can chain as many modifiers as we want. Using modifiers, exclusion and modifier chaining, we can create as complex (or simple) design matrices as needed.
While treeclim will check carefully that no two predictors are exactly the same, it can by no means check if the specification of the design matrix makes any sense at all.
By default, treeclim will use the maximum timespan available for both tree-ring and climate data. This can be changed using the timespan
argument: dcc(..., timespan = c(1965, 2000))
will run the analysis using the timespan from 1965 to 2000. When the number of years for the analysis becomes smaller than the number of predictors in the design matrix, treeclim will raise an error.
With dcc
, one may choose between response function analysis (the default) and correlation function analysis using the method
argument: dcc(..., method = "correlation")
will switch to the correlation method. A third method is available: seasonal correlations. The rationale behind this approach developed by Dave Meko (Meko et al. 2011) is somewhat different, which is why it deserves its own function seascorr
.
The temporal stability of climate-growth-relations can be analysed using moving response and correlation functions. By default, dcc
computes static analyses, but by setting the moving
argument to TRUE
, the analyses will be repeated in shifting time windows.
dc6 <- dcc(chrono = muc_spruce, climate = muc_clim, selection = 4:10, moving = TRUE, sb = FALSE) dc6 plot(dc6)
The stars in the plot indicate windows with significant correlations for the given variable. Like in the non-dynamic case, you are free to create a customized design matrix using the treeclim modifiers. Further options comprise the window size, the window offset (the gap between consecutive window starts), and whether to start with windowing from the most recent end (default), or from the least recent end.
If method
is set to correlation
for computing a moving correlation function, we can use g_test
to test the temporal fluctuations of the correlation coefficients for significance by comparing them to the magnitude of fluctuations that would occur by chance alone (using a multivariate extension of the approach by Gershunov et al. 2001):
dc7 <- dcc(chrono = muc_spruce, climate = muc_clim, selection = .mean(3:5) + .mean(6:8) + .mean(9:10), method = "cor", moving = TRUE, win_size = 30, sb = FALSE) plot(dc7) g_test(dc7, sb = FALSE)
In this example, only the fluctuations in the correlation coefficient for mean temperature from current June to current August are significant at a level of p < 0.05. This means, that the probability of an even larger fluctuation by chance is less than 0.05. For more details, look into ?g_test
.
treeclim provides similar functionality to Dave Meko's seascorr MATLAB program: partial correlations for a primary and a secondary climatic variable for seasons of different lengths. The tree-ring and climate input data is required in the same format as for dcc
.
sc1 <- seascorr(muc_spruce, muc_clim) sc1
This is the default configuration for seascorr
: considering three different season lengths (1, 3, and 6 months), and current September as month where tree-ring growth is assumed to have finished. There is a plot method available as well:
plot(sc1)
By default, the primary variable is the first one found in the climate data supplied to seascorr
. So with r names(muc_clim)
in muc_clim
, precipitation is chosen as primary variable. You can change this by either supplying names or positions, so to swap primary and secondary variables from the last example we can issue:
sc2 <- seascorr(muc_spruce, muc_clim, primary = "prec", secondary = "temp") plot(sc2)
treeclim offers some basic functionality for evaluating the reconstruction skills of tree-ring chronologies via the skills
function. skills
works on existing output from either dcc
or seascorr
and tests the prediction skills of the tree-ring data for a given climate target. For this example, we use a different built-in data set, a chronology of pines from Visdalen in Norway (consult ?norw015
for details) and associated climate data:
dc8 <- dcc(norw015, list(norway_prec, norway_temp), 3:9, var_names = c("prec", "temp")) plot(dc8)
Let us assume we are interested in reconstructing summer temperatures, and want to make use of the significant correlation with July temperature. We call skills
on the dc8
object, and specify .range(7, "temp")
as target to get July temperature as response variable (note that for skills
, like in many dendroclimatic applications, climate is modelled subject to tree-ring data):
sk8 <- skills(dc8, .range(7, "temp")) sk8
By default, skills
will take the maximum available timespan and divide it into equally sized calibration and verification periods. Reported are the parameter estimates and goodness of fit for the calibration and full models, plus some established verification statistics. Here, both RE and CE are positive, so the reconstruction has some skills at least. We can also look at the reconstruction:
plot(sk8)
The specification of the calibration period in skills
is quite flexible, you might want to look in to ?skills
for more details. Furthermore, you can chose between an OLS regression model (default) or an RMA regression.
treeclim uses ggplot2 (Wickham 2009) to build its plots. ggplot2-plots are themeable and easily be customized. This means, it only takes a few lines of code to adapt your treeclim output to your aesthetical preferences.
plot(dc1)
This is the default. If we feel the default style does not blend in nicely with our gothic inspired interior design, we can always change some things (we need to include ggplot2 for this):
library(ggplot2) plot(dc1) + theme_dark() + scale_color_manual(values = c("#FFFFFF", "#F62525")) + ylab("Response Coefficients") + ggtitle("Happy Worlddendro 2016!")
You may want to consult the ggplot2 documentation for further options (there are many).
treeclim tries to provide a flexible framework for the exploration of tree-climate-relationships. treeclim does not impose complexity on the user, and keeps simple things simple. When needed, however, functionality is available to adapt to various research questions. Most aspects of what treeclim can do has been touched in this vignette. There are few more options that are potentially interesting, such as the ability to use different bootstrapping schemes with dcc
.
Areas where treeclim is currently lacking include better support of southern hemisphere data by providing access to climate data from the year before the previous one. Also, testing reconstructing skills is only basic.
Some of the features currently implemented derive from requests by the user community (seascorr
being the most prominent example), and I am grateful for these and further suggestions.
Biondi, F., and K. Waikul. 2004. DENDROCLIM2002: A C++ program for statistical calibration of climate signals in tree-ring chronologies. Computers in Geosciences 30:303–311.
Gershunov, A., N. Schneider, and T. Barnett. 2001. Low-frequency modulation of the ENSO-Indian Monsoon rainfall relationship: Signal or noise? Journal of Climate 14:2486–2492.
Guiot, J. 1991. The bootstrapped response function. Tree Ring Bulletin 51:39–41.
Meko, D. M., R. Touchan, and K. J. Anchukaitis. 2011. Seascorr: A MATLAB program for identifying the seasonal climate signal in an annual tree-ring time series. Computers & Geosciences 37:1234–1241.
Politis, D. N., and J. P. Romano. 1994. The stationary bootstrap. Journal of the American Statistical Association 89:1303–1313.
H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2009.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.