Nothing
#' Partial Least Squares analysis of phenology vs. accumulated daily chill and
#' heat
#'
#' This function conducts a Partial Least Squares (PLS) regression analysis
#' relating an annual biological phenomenon, e.g. fruit tree flowering or leaf
#' emergence, to mean daily rates of chill (with three models) and heat
#' accumulation of the preceding 12 months. It produces figures that illustrate
#' statistical correlations between temperature variation during certain phases
#' and the timing of phenological events.
#'
#' PLS regression is useful for exploring the relationship between daily chill
#' and heat accumulation rates and biological phenomena that only occur once
#' per year. The statistical challenge is that a normally quite small number of
#' observations must be related to variation in a much larger number (730) of
#' daily chill and heat values, which are also highly autocorrelated. Most
#' regression approaches are not suitable for this, but PLS regression offers a
#' potential solution. The method is frequently used in chemometrics and
#' hyperspectral remote sensing, where similar statistical challenges are
#' encountered. The basic mechanism is that PLS first constructs latent factors
#' (similar to principal components) from the independent data (daily chill and
#' heat accumulation) and then uses these components for the regression. The
#' contribution of each individual variable to the PLS model is then evaluated
#' with two main metrics: the Variable Importance in the Projection statistic
#' (VIP) indicates how much variation in a given independent variable is
#' correlated with variation in the dependent variable. A threshold of 0.8 is
#' often used for determining importance. The standardized model coefficients
#' of the PLS model then give an indication of the direction and strength of
#' the effect, e.g. if coefficients are positive and high, high values for the
#' respective independent variable are correlated with high values of the
#' dependent variable (e.g. late occurrence of a phenological stage). This
#' procedure was inspired by the challenge of explaining variation in bloom and
#' leaf emergence dates of temperate fruit trees in Mediterranean climates.
#' These are generally understood to result from (more of less) sequential
#' fulfillment of a chilling and a forcing requirement. During the chilling
#' phase, cool temperatures are needed; during the forcing phase, trees need
#' heat. There is no easily visible change in tree buds that would indicate the
#' transition between these two phases, making it difficult to develop a good
#' model of these processes. Where long-term phenology data are available and
#' can be coupled with daily chill and heat records (derived from daily
#' temperature data), PLS regression allows detection of the chilling/forcing
#' transition. This procedure has not often been applied to biological
#' phenomena at the time of writing this, and there may be constraints to how
#' generally applicable it is. Yet is has passed the test of scientific peer
#' review a few times, and it has produced plausible results in a number of
#' settings. This package draws heavily from the pls package.
#'
#' Per default, chill metrics used are the ones given in the references below.
#' Chilling Hours are all hours with temperatures between 0 and 7.2 degrees C.
#' Units of the Utah Model are calculated as suggested by Richardson et al.
#' (1974) (different weights for different temperature ranges, and negation of
#' chilling by warm temperatures). Chill Portions are calculated according to
#' Fishman et al. (1987a,b). More honestly, they are calculated according to an
#' Excel sheet produced by Amnon Erez and colleagues, which converts the
#' complex equations in the Fishman papers into relatively simple Excel
#' functions. These were translated into R. References to papers that include
#' the full functions are given below. Growing Degree Hours are calculated
#' according to Anderson et al. (1986), using the default values they suggest.
#'
#' It is possible, however, for the user to specify other metrics to be
#' evaluated. These should be indicated by the chill_models and heat_models
#' parameters, which should contain the names of the respective columns of the
#' daily_chill_obj$daily_chill data frame.
#'
#' @param daily_chill_obj a daily chill object. This should be generated with
#' the daily_chill function.
#' @param bio_data_frame a data frame that contains information on the timing
#' of phenological events by year. It should consist of two columns called Year
#' and pheno. Data in the pheno column should be in Julian date (day of the
#' year).
#' @param split_month the procedure analyzes data by phenological year, which
#' can start and end in any month during the calendar year (currently only at
#' the beginning of a month). This variable indicates the last month (e.g. 5
#' for May) that should be included in the record for a given phenological
#' year. All subsequent months are assigned to the following phenological year.
#' @param expl.var percentage of the variation in the dependent variable that
#' the PLS model should explain. This is used as a threshold in finding the
#' appropriate number of components in the PLS regression procedure.
#' @param ncomp.fix fixed number of components for the PLS model. Defaults to
#' NULL, so that the number is automatically determined, but it can also be set
#' by the user.
#' @param return.all boolean variable indicating whether or not the full set of
#' PLS results should be returned by the function. If this is set to TRUE, the
#' function output is a list with two elements (besides the object_type
#' string): PLS_summary and PLS_output; if it is set to FALSE, only the
#' PLS_summary is returned.
#' @param crossvalidate character variable indicating what kind of validation
#' should be performed by the PLS procedure. This defaults to "none", but the
#' plsr function (of the pls package) also takes "CV" and "LOO" as inputs. See
#' the documentation for the plsr function for details.
#' @param end_at_pheno_end boolean variable indicating whether the analysis
#' should disregard temperatures after the last date included in the
#' bio_data_frame dataset. If set to TRUE, only temperatures up this date are
#' considered. Phenology data is extracted from the PLS output files. If this
#' parameter is assigned a numeric value, only data up to the Julian date
#' specified by this number are considered.
#' @param chill_models Character vector containing names of chill models that
#' should be considered in the PLS regression. These names should correspond to
#' column names of daily_chill. This defaults to c("Chilling_Hours",
#' "Utah_Chill_Units", "Chill_Portions").
#' @param heat_models Character vector containing names of heat models that
#' should be considered in the PLS regression. These names should correspond to
#' column names of daily_chill. This defaults to c("GDH").
#' @param runn_means numeric vector specifying whether inputs to the PLS calculation
#' should be processed by a running mean filter. This usually enhances the clarity of
#' results. This vector contains either one element (an integer), in which case the
#' same filter is applied to all input metrics, or one element for each model, which
#' allows specifying metric-specific running means. In this case the sequence of numbers
#' should correspond to the sequence specified in the function call, with chill models
#' listed first, followed by heat models.
#' @param metric_categories while the original application of this function is the
#' calculation of tree responses to chill and heat accumulation, it can also be applied
#' for other variables. In this case, you may not want the outputs to be called 'Chill'
#' and 'Heat' (the default). Here you can specify a character vector of length 2, which
#' contains the labels you want to appear in the output table.
#' @return \item{object_type}{ the character string "PLS_chillforce_pheno".
#' This is only needed for choosing the correct method for the plot_PLS
#' function.} \item{pheno}{ a data frame containing the phenology data used for
#' the PLS regression, with columns Year and pheno.}
#' \item{<chill_model>$<heat_model>}{ for each combination of elements from
#' chill_models and heat_models, a list element is generated, which contains a
#' list with elements PLS_summary and (if(return.all=TRUE) PLS_output. These
#' contain the results of the PLS analysis that used the respective chill and
#' heat metrics as independent variables.}
#' @note After doing extensive model comparisons, and reviewing a lot of
#' relevant literature, I do not recommend using the Chilling Hours or Utah
#' Models, especially in warm climates! The Dynamic Model (Chill Portions),
#' though far from perfect, seems much more reliable.
#' @author Eike Luedeling, with contributions from Sabine Guesewell
#' @references Model references, for the default option:
#'
#' Chilling Hours:
#'
#' Weinberger JH (1950) Chilling requirements of peach varieties. Proc Am Soc
#' Hortic Sci 56, 122-128
#'
#' Bennett JP (1949) Temperature and bud rest period. Calif Agric 3 (11), 9+12
#'
#' Utah Model:
#'
#' Richardson EA, Seeley SD, Walker DR (1974) A model for estimating the
#' completion of rest for Redhaven and Elberta peach trees. HortScience 9(4),
#' 331-332
#'
#' Dynamic Model:
#'
#' Erez A, Fishman S, Linsley-Noakes GC, Allan P (1990) The dynamic model for
#' rest completion in peach buds. Acta Hortic 276, 165-174
#'
#' Fishman S, Erez A, Couvillon GA (1987a) The temperature dependence of
#' dormancy breaking in plants - computer simulation of processes studied under
#' controlled temperatures. J Theor Biol 126(3), 309-321
#'
#' Fishman S, Erez A, Couvillon GA (1987b) The temperature dependence of
#' dormancy breaking in plants - mathematical analysis of a two-step model
#' involving a cooperative transition. J Theor Biol 124(4), 473-483
#'
#' Growing Degree Hours:
#'
#' Anderson JL, Richardson EA, Kesner CD (1986) Validation of chill unit and
#' flower bud phenology models for 'Montmorency' sour cherry. Acta Hortic 184,
#' 71-78
#'
#' Model comparisons and model equations:
#'
#' Luedeling E, Zhang M, Luedeling V and Girvetz EH, 2009. Sensitivity of
#' winter chill models for fruit and nut trees to climatic changes expected in
#' California's Central Valley. Agriculture, Ecosystems and Environment 133,
#' 23-31
#'
#' Luedeling E, Zhang M, McGranahan G and Leslie C, 2009. Validation of winter
#' chill models using historic records of walnut phenology. Agricultural and
#' Forest Meteorology 149, 1854-1864
#'
#' Luedeling E and Brown PH, 2011. A global analysis of the comparability of
#' winter chill models for fruit and nut trees. International Journal of
#' Biometeorology 55, 411-421
#'
#' Luedeling E, Kunz A and Blanke M, 2011. Mehr Chilling fuer Obstbaeume in
#' waermeren Wintern? (More winter chill for fruit trees in warmer winters?).
#' Erwerbs-Obstbau 53, 145-155
#'
#' Review on chilling models in a climate change context:
#'
#' Luedeling E, 2012. Climate change impacts on winter chill for temperate
#' fruit and nut production: a review. Scientia Horticulturae 144, 218-229
#'
#' The PLS method is described here:
#'
#' Luedeling E and Gassner A, 2012. Partial Least Squares Regression for
#' analyzing walnut phenology in California. Agricultural and Forest
#' Meteorology 158, 43-52.
#'
#' Wold S (1995) PLS for multivariate linear modeling. In: van der Waterbeemd H
#' (ed) Chemometric methods in molecular design: methods and principles in
#' medicinal chemistry, vol 2. Chemie, Weinheim, pp 195-218.
#'
#' Wold S, Sjostrom M, Eriksson L (2001) PLS-regression: a basic tool of
#' chemometrics. Chemometr Intell Lab 58(2), 109-130.
#'
#' Mevik B-H, Wehrens R, Liland KH (2011) PLS: Partial Least Squares and
#' Principal Component Regression. R package version 2.3-0.
#' http://CRAN.R-project.org/package0pls.
#'
#' Some applications of the PLS procedure:
#'
#' Luedeling E, Kunz A and Blanke M, 2013. Identification of chilling and heat
#' requirements of cherry trees - a statistical approach. International Journal
#' of Biometeorology 57,679-689.
#'
#' Yu H, Luedeling E and Xu J, 2010. Stronger winter than spring warming delays
#' spring phenology on the Tibetan Plateau. Proceedings of the National Academy
#' of Sciences (PNAS) 107 (51), 22151-22156.
#'
#' Yu H, Xu J, Okuto E and Luedeling E, 2012. Seasonal Response of Grasslands
#' to Climate Change on the Tibetan Plateau. PLoS ONE 7(11), e49230.
#'
#' The exact procedure was used here:
#'
#' Luedeling E, Guo L, Dai J, Leslie C, Blanke M, 2013. Differential responses
#' of trees to temperature variation during the chilling and forcing phases.
#' Agricultural and Forest Meteorology 181, 33-42.
#'
#' The chillR package:
#'
#' Luedeling E, Kunz A and Blanke M, 2013. Identification of chilling and heat
#' requirements of cherry trees - a statistical approach. International Journal
#' of Biometeorology 57,679-689.
#' @keywords phenology analysis chill and heat calculation
#' @examples
#'
#' weather<-fix_weather(KA_weather[which(KA_weather$Year>2004),])
#' #Plots look much better with weather<-fix_weather(KA_weather)
#' #but that takes too long to run for passing CRAN checks
#'
#' dc<-daily_chill(stack_hourly_temps(weather,50.4), 11)
#' plscf<-PLS_chill_force(daily_chill_obj=dc, bio_data_frame=KA_bloom, split_month=6)
#'
#' #PLS_results_path<-paste(getwd(),"/PLS_output",sep="")
#' #plot_PLS(plscf,PLS_results_path)
#' #plot_PLS(plscf,PLS_results_path,add_chill=c(307,19),add_heat=c(54,109))
#'
#'
#'
#'
#' @export PLS_chill_force
PLS_chill_force<-function (daily_chill_obj, bio_data_frame, split_month, expl.var = 30,
ncomp.fix = NULL, return.all = FALSE, crossvalidate = "none",end_at_pheno_end=TRUE,
chill_models=c("Chilling_Hours","Utah_Chill_Units","Chill_Portions"),
heat_models=c("GDH"),runn_means=1,metric_categories=c("Chill","Heat"))
{
if (!daily_chill_obj[[1]] == "daily_chill")
stop("Error: not a daily chill object; use function daily_chill to make one from hourly temperature data")
dc <- daily_chill_obj$daily_chill
weather_file <- daily_chill_obj$daily_chill
bio_data <- bio_data_frame
weather_file[which(weather_file$Month <= split_month),
"Season"] <- weather_file[which(weather_file$Month <=
split_month), "Year"]
weather_file[which(weather_file$Month > split_month),
"Season"] <- weather_file[which(weather_file$Month >
split_month), "Year"] + 1
weather_file[, "Date"] <- weather_file$Month * 100 +
weather_file$Day
weather_file[, "JDay"] <- strptime(paste(weather_file$Month,
"/", weather_file$Day, "/", weather_file$Year, sep = ""),
"%m/%d/%Y")$yday + 1
pls_ncomp_old <- function(indep, dep, threshold = 30) {
dat <- data.frame(dep)
dat$runn <- indep
if (length(dep) > 15) {
suppressWarnings(pls_out <- plsr(dep ~ runn,
data = dat, ncomp = 10, validation = "CV"))
suppressWarnings(pls_cv <- crossval(pls_out,
segments = 10))
res <- data.frame(ncomp = c(1:10), explvar = explvar(pls_out),
cumul = NA)
res$cumul[1] <- res$explvar[1]
for (i in 2:nrow(res)) {
res$cumul[i] <- res$cumul[i - 1] + res$explvar[i]
}
ncomp <- which(res$cumul > threshold)[1]
if (is.na(ncomp))
ncomp <- 10
}
else ncomp <- 2
return(ncomp)
}
pls_ncomp <- function(indep, dep, threshold) {
dat <- data.frame(dep)
dat$runn <- indep
if (length(dep) > 15) {
suppressWarnings(pls_out <- plsr(dep ~ runn,
data = dat, ncomp = 10, validation = "none"))
ncomp <- which(cumsum(explvar(pls_out)) > threshold)[1]
if (is.na(ncomp))
ncomp <- 10
}
else ncomp <- 2
return(ncomp)
}
seasons <- unique(weather_file$Season)
#chill_models <- c("Chilling_Hours", "Utah_Chill_Units",
# "Chill_Portions")
#heat_models <- c("GDH")
#apply running mean function
all_models<-c(chill_models,heat_models)
if(!length(runn_means) %in% c(1,length(all_models)))
stop("runn_means has to be either 1 or an integer vector with as many elements as
the number of models considered (chill and heat models combined)")
if(length(runn_means)==1) runner<-rep(runn_means,length(all_models))
if(length(runn_means)==length(all_models)) runner<-runn_means
for(i in 1:length(all_models))
weather_file[,all_models[i]]<-runn_mean(weather_file[,all_models[i]],runner[i])
all_outputs <- list(object_type = "PLS_chillforce_pheno")
for (CM in chill_models) for (HM in heat_models) {
for (yy in seasons) {
yearweather <- weather_file[which(weather_file$Season ==
yy), ]
weathervector <- c(yearweather[1:365, CM], yearweather[1:365,
HM])
if (yy == seasons[1])
year_res <- weathervector else year_res <- rbind(year_res, weathervector)
if (nrow(yearweather) == 365) {
labdates <- yearweather$Date
labJdates <- yearweather$JDay
}
}
colnames(year_res) <- c(paste("Chill_", 1:365, sep = ""),
paste("Heat_", 1:365, sep = ""))
year_res <- cbind(Season = seasons, year_res)
data_set <- year_res
full_seasons <- which(!is.na(rowMeans(data_set)))
data_set <- data_set[full_seasons, ]
newseasons <- data_set[, "Season"]
suppressWarnings(bio_data <- bio_data[which(bio_data[,
"Year"] %in% newseasons), ])
suppressWarnings(bio_data <- bio_data[which(!is.na(as.numeric(as.character(bio_data$pheno)))),
])
all_outputs[["pheno"]]<-bio_data
suppressWarnings(bio <- as.numeric(as.character(bio_data$pheno)))
indep <- as.matrix(data_set[which(data_set[, "Season"] %in%
bio_data$Year), ])
indep <- indep[, 2:ncol(indep)]
#pheno_end<-max(as.numeric(as.character(bio_data$pheno)))
pheno_end<-get_last_date(as.numeric(as.character(bio_data$pheno)))
if(is.numeric(end_at_pheno_end)) pheno_end<-end_at_pheno_end
if(end_at_pheno_end) if (pheno_end %in% labJdates) {dayskeep<-1:which(labJdates==pheno_end)
labJdates<-labJdates[dayskeep]
labdates<-labdates[dayskeep]
indep<-indep[,c(dayskeep,365+dayskeep)]}
if (is.null(ncomp.fix)) {
ncomp <- pls_ncomp(indep = indep, dep = bio,
threshold = expl.var)
} else { ncomp <- ncomp.fix }
sdindep <- apply(indep, 2, sd)
sdindep[which(sdindep == 0)] <- 1
PLS_output <- plsr(bio ~ indep, ncomp, validation = "none",
method = "oscorespls", scale = sdindep)
d1 <- switch(split_month, 31, 59, 89, 120, 151, 181,
212, 243, 274, 304, 335, 365)
labJdates[labJdates > d1] <- labJdates[labJdates >
d1] - 365
out <- data.frame()
tablength<-length(coef(PLS_output))
out[1:tablength, "Date"] <- labdates
out[1:tablength, "Type"] <- c(rep(metric_categories[1], tablength/2), rep(metric_categories[2],
tablength/2))
out[1:tablength, "JDay"] <- labJdates
out[1:tablength, "Coef"] <- coef(PLS_output)
if(ncomp==1) out[1:tablength,"VIP"] <- VIP(PLS_output)
if(ncomp>1) out[1:tablength,"VIP"] <- VIP(PLS_output)[ncomp, ]
out[1:tablength, "MetricMean"] <- colMeans(indep)
out[1:tablength, "MetricStdev"] <- apply(indep, 2, sd, na.rm = TRUE)
if (return.all)
all_outputs[[CM]][[HM]] <- list(PLS_summary = out,
PLS_output = PLS_output) else all_outputs[[CM]][[HM]] <- list(PLS_summary = out)
}
return(all_outputs)
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.