# Partial Least Squares analysis of phenology vs. accumulated daily chill and heat

### Description

This function conducts a Partial Least Squares (PLS) regression analysis relating an annual biological phenomenon, e.g. fruit tree flowering or leaf emergence, to mean daily rates of chill (with three models) and heat accumulation of the preceding 12 months. It produces figures that illustrate statistical correlations between temperature variation during certain phases and the timing of phenological events.

### Usage

1 2 3 4 |

### Arguments

`daily_chill_obj` |
a daily chill object. This should be generated with the daily_chill function. |

`bio_data_frame` |
a data frame that contains information on the timing of phenological events by year. It should consist of two columns called Year and pheno. Data in the pheno column should be in Julian date (day of the year). |

`split_month` |
the procedure analyzes data by phenological year, which can start and end in any month during the calendar year (currently only at the beginning of a month). This variable indicates the last month (e.g. 5 for May) that should be included in the record for a given phenological year. All subsequent months are assigned to the following phenological year. |

`expl.var` |
percentage of the variation in the dependent variable that the PLS model should explain. This is used as a threshold in finding the appropriate number of components in the PLS regression procedure. |

`ncomp.fix` |
fixed number of components for the PLS model. Defaults to NULL, so that the number is automatically determined, but it can also be set by the user. |

`return.all` |
boolean variable indicating whether or not the full set of PLS results should be returned by the function. If this is set to TRUE, the function output is a list with two elements (besides the object_type string): PLS_summary and PLS_output; if it is set to FALSE, only the PLS_summary is returned. |

`crossvalidate` |
character variable indicating what kind of validation should be performed by the PLS procedure. This defaults to "none", but the plsr function (of the pls package) also takes "CV" and "LOO" as inputs. See the documentation for the plsr function for details. |

`end_at_pheno_end` |
boolean variable indicating whether the analysis should disregard temperatures after the last date included in the bio_data_frame dataset. If set to TRUE, only temperatures up this date are considered. Phenology data is extracted from the PLS output files. If this parameter is assigned a numeric value, only data up to the Julian date specified by this number are considered. |

`chill_models` |
Character vector containing names of chill models that should be considered in the PLS regression. These names should correspond to column names of daily_chill. This defaults to c("Chilling_Hours", "Utah_Chill_Units", "Chill_Portions"). |

`heat_models` |
Character vector containing names of heat models that should be considered in the PLS regression. These names should correspond to column names of daily_chill. This defaults to c("GDH"). |

### Details

PLS regression is useful for exploring the relationship between daily chill and heat accumulation rates and biological phenomena that only occur once per year. The statistical challenge is that a normally quite small number of observations must be related to variation in a much larger number (730) of daily chill and heat values, which are also highly autocorrelated. Most regression approaches are not suitable for this, but PLS regression offers a potential solution. The method is frequently used in chemometrics and hyperspectral remote sensing, where similar statistical challenges are encountered. The basic mechanism is that PLS first constructs latent factors (similar to principal components) from the independent data (daily chill and heat accumulation) and then uses these components for the regression. The contribution of each individual variable to the PLS model is then evaluated with two main metrics: the Variable Importance in the Projection statistic (VIP) indicates how much variation in a given independent variable is correlated with variation in the dependent variable. A threshold of 0.8 is often used for determining importance. The standardized model coefficients of the PLS model then give an indication of the direction and strength of the effect, e.g. if coefficients are positive and high, high values for the respective independent variable are correlated with high values of the dependent variable (e.g. late occurrence of a phenological stage). This procedure was inspired by the challenge of explaining variation in bloom and leaf emergence dates of temperate fruit trees in Mediterranean climates. These are generally understood to result from (more of less) sequential fulfillment of a chilling and a forcing requirement. During the chilling phase, cool temperatures are needed; during the forcing phase, trees need heat. There is no easily visible change in tree buds that would indicate the transition between these two phases, making it difficult to develop a good model of these processes. Where long-term phenology data are available and can be coupled with daily chill and heat records (derived from daily temperature data), PLS regression allows detection of the chilling/forcing transition. This procedure has not often been applied to biological phenomena at the time of writing this, and there may be constraints to how generally applicable it is. Yet is has passed the test of scientific peer review a few times, and it has produced plausible results in a number of settings. This package draws heavily from the pls package.

Per default, chill metrics used are the ones given in the references below. Chilling Hours are all hours with temperatures between 0 and 7.2 degrees C. Units of the Utah Model are calculated as suggested by Richardson et al. (1974) (different weights for different temperature ranges, and negation of chilling by warm temperatures). Chill Portions are calculated according to Fishman et al. (1987a,b). More honestly, they are calculated according to an Excel sheet produced by Amnon Erez and colleagues, which converts the complex equations in the Fishman papers into relatively simple Excel functions. These were translated into R. References to papers that include the full functions are given below. Growing Degree Hours are calculated according to Anderson et al. (1986), using the default values they suggest.

It is possible, however, for the user to specify other metrics to be evaluated. These should be indicated by the chill_models and heat_models parameters, which should contain the names of the respective columns of the daily_chill_obj$daily_chill data frame.

### Value

`object_type` |
the character string "PLS_chillforce_pheno". This is only needed for choosing the correct method for the plot_PLS function. |

`pheno` |
a data frame containing the phenology data used for the PLS regression, with columns Year and pheno. |

`<chill_model>$<heat_model>` |
for each combination of elements from chill_models and heat_models, a list element is generated, which contains a list with elements PLS_summary and (if(return.all=TRUE) PLS_output. These contain the results of the PLS analysis that used the respective chill and heat metrics as independent variables. |

### Note

After doing extensive model comparisons, and reviewing a lot of relevant literature, I do not recommend using the Chilling Hours or Utah Models, especially in warm climates! The Dynamic Model (Chill Portions), though far from perfect, seems much more reliable.

### Author(s)

Eike Luedeling, with contributions from Sabine Guesewell

### References

Model references, for the default option:

Chilling Hours:

Weinberger JH (1950) Chilling requirements of peach varieties. Proc Am Soc Hortic Sci 56, 122-128

Bennett JP (1949) Temperature and bud rest period. Calif Agric 3 (11), 9+12

Utah Model:

Richardson EA, Seeley SD, Walker DR (1974) A model for estimating the completion of rest for Redhaven and Elberta peach trees. HortScience 9(4), 331-332

Dynamic Model:

Erez A, Fishman S, Linsley-Noakes GC, Allan P (1990) The dynamic model for rest completion in peach buds. Acta Hortic 276, 165-174

Fishman S, Erez A, Couvillon GA (1987a) The temperature dependence of dormancy breaking in plants - computer simulation of processes studied under controlled temperatures. J Theor Biol 126(3), 309-321

Fishman S, Erez A, Couvillon GA (1987b) The temperature dependence of dormancy breaking in plants - mathematical analysis of a two-step model involving a cooperative transition. J Theor Biol 124(4), 473-483

Growing Degree Hours:

Anderson JL, Richardson EA, Kesner CD (1986) Validation of chill unit and flower bud phenology models for 'Montmorency' sour cherry. Acta Hortic 184, 71-78

Model comparisons and model equations:

Luedeling E, Zhang M, Luedeling V and Girvetz EH, 2009. Sensitivity of winter chill models for fruit and nut trees to climatic changes expected in California's Central Valley. Agriculture, Ecosystems and Environment 133, 23-31

Luedeling E, Zhang M, McGranahan G and Leslie C, 2009. Validation of winter chill models using historic records of walnut phenology. Agricultural and Forest Meteorology 149, 1854-1864

Luedeling E and Brown PH, 2011. A global analysis of the comparability of winter chill models for fruit and nut trees. International Journal of Biometeorology 55, 411-421

Luedeling E, Kunz A and Blanke M, 2011. Mehr Chilling fuer Obstbaeume in waermeren Wintern? (More winter chill for fruit trees in warmer winters?). Erwerbs-Obstbau 53, 145-155

Review on chilling models in a climate change context:

Luedeling E, 2012. Climate change impacts on winter chill for temperate fruit and nut production: a review. Scientia Horticulturae 144, 218-229

The PLS method is described here:

Luedeling E and Gassner A, 2012. Partial Least Squares Regression for analyzing walnut phenology in California. Agricultural and Forest Meteorology 158, 43-52.

Wold S (1995) PLS for multivariate linear modeling. In: van der Waterbeemd H (ed) Chemometric methods in molecular design: methods and principles in medicinal chemistry, vol 2. Chemie, Weinheim, pp 195-218.

Wold S, Sjostrom M, Eriksson L (2001) PLS-regression: a basic tool of chemometrics. Chemometr Intell Lab 58(2), 109-130.

Mevik B-H, Wehrens R, Liland KH (2011) PLS: Partial Least Squares and Principal Component Regression. R package version 2.3-0. http://CRAN.R-project.org/package0pls.

Some applications of the PLS procedure:

Luedeling E, Kunz A and Blanke M, 2013. Identification of chilling and heat requirements of cherry trees - a statistical approach. International Journal of Biometeorology 57,679-689.

Yu H, Luedeling E and Xu J, 2010. Stronger winter than spring warming delays spring phenology on the Tibetan Plateau. Proceedings of the National Academy of Sciences (PNAS) 107 (51), 22151-22156.

Yu H, Xu J, Okuto E and Luedeling E, 2012. Seasonal Response of Grasslands to Climate Change on the Tibetan Plateau. PLoS ONE 7(11), e49230.

The exact procedure was used here:

Luedeling E, Guo L, Dai J, Leslie C, Blanke M, 2013. Differential responses of trees to temperature variation during the chilling and forcing phases. Agricultural and Forest Meteorology 181, 33-42.

The chillR package:

Luedeling E, Kunz A and Blanke M, 2013. Identification of chilling and heat requirements of cherry trees - a statistical approach. International Journal of Biometeorology 57,679-689.

### Examples

1 2 3 4 5 6 7 8 9 10 | ```
weather<-fix_weather(KA_weather[which(KA_weather$Year>2004),])
#Plots look much better with weather<-fix_weather(KA_weather)
#but that takes to long to run for passing CRAN checks
dc<-daily_chill(stack_hourly_temps(weather,50.4), 11)
plscf<-PLS_chill_force(daily_chill_obj=dc, bio_data_frame=KA_bloom, split_month=6)
#PLS_results_path<-paste(getwd(),"/PLS_output",sep="")
#plot_PLS(plscf,PLS_results_path)
#plot_PLS(plscf,PLS_results_path,add_chill=c(307,19),add_heat=c(54,109))
``` |