knitr::opts_chunk$set(echo = TRUE)

Kronos

If you use this software, please cite our work:

citation("kronos")

The following document is adapted from the supplementary materials for this manuscript.

0. Introduction

Here, we will demonstrate how to use the Kronos package to assess circadian rhythms in biological data collected over the day. For this demonstration we have adapted some data from our laboratory, currently undergoing peer review (reference to come).

We will demonstrate three common examples of experimental design currently encountered in circadian rhythms analysis:

  1. Circadian rhythm analysis of a single variable over a 24-hour period

  2. Circadian rhythm analysis of a single variable over a 24-hour period with two or more treatment groups

  3. Circadian rhythm analysis of omics (higher-dimensional) data over a 24-hour period with two or more treatment groups

All R code used to transform, reorganise and plot the data is also shown below to provide a toolkit for aspiring and veteran bioinformaticians alike. It should be noted that some variables in the demonstration data sets provided here have been manipulated to better demonstrate the functionality of this package.

At the end of this document, we have included a few excursions into more advanced subjects we find useful, but that did not necessarily fit in with the mainline analysis.

Code chunk: Install Kronos and load our libraries

#Install kronos
#install.packages("kronos") 

#Load relevant packages
library(kronos)
library(tidyverse)
library(ggplot2)

#Load prepared data stored in Kronos library
data("kronos_demo")

\newpage

Code chunk: Load our data tables

Data should be prepared in "long form" for use in Kronos; that is with values repeating in the "Timepoint" column, which defines when data was collected during the period, here a 24-hour cycle.

Our package includes example datasets that we will use in this tutorial that are pre-formatted. You can rearrange your data into long form using pivot_longer() or gather() from the tidyverse.

Since we're using prepared data, we already loaded it using data(kronos_demo). You can see how an example of how the data is prepared below:

head(groupdata)

Here we have prepared the omics data set with a separate metadata file as is common when working with omics data sets. A metadata file can be generated using select() in tidyr, or metadata and data can be combined using inner_join if they contain an identical column (both name and contents). The omics data set used here has been central-log transformed to account for its compositional nature (see our guide https://arxiv.org/abs/2207.12475 for easy centred-log transformation).

1. Analysing Rhythmicity in a Single Group

We will start with the most simple example: analysing circadian rhythmicity in a single experimental group for one outcome variable of interest. For this we use the kronos function:

output <- kronos(formula = Variable_1 ~ time(Timepoint), 
                 data = onevariable, 
                 period = 24, 
                 verbose = T, 
                 pairwise = F)

Here we use the formula Outcome Variable ~ time(Time Variable), which is the most simple model used by the kronos function. We specify the period as 24 (this can be adjusted as appropriate for the data analysed). By selecting verbose=T, you will be able to see the models run by the kronos function: this becomes increasingly useful when you run more complex models. Finally we select pairwise=F here, as there are no groups to compare for differences in rhythms.

The kronos function returns a kronosOut object, containing several pieces of data that can be accessed using handy 'getter' functions, which we will describe below:

1). The getKronos_input() function fetches the data that the model is based on, as well as the calculated cosine and sine components.

head(getKronos_input(output))

2). The getKronos_fit() function fetches the key details for the generated model that may be useful for prediction, modelling and other statistical applications.

getKronos_fit(output)

3). The getKronos_trace() function returns all the data required for graphing the sinusoid curve, which can either be used in our specialized ggplot2 functions, or can be used in other graphing packages. The y_hat column represents the predicted value of the outcome variable: this is essential for plotting the predicted sinusoid curve.

head(getKronos_trace(output))

4). The getKronos_groupwise() function arguably fetches the most useful output: this provides us with the p-value (p.val) and proportion of the variance in the data explained (r.sq) when we fit our sinusoid curve. Additionally we obtain the acrophase (acro) and amplitude of the predicted curve, which can be used in our graphics functions to visualise changes in curve with interventions (see gg_kronos_circle, explored further below).

getKronos_groupwise(output)

Figures

The package contains custom ggplot2 figure functions, that utilise the kronos output to rapidly produce figures that convey important information for circadian rhythms:

  1. gg_kronos_circle() generates a plot showing the acrophase and amplitude of the predicted curve, allowing the reader to rapidly access summary data regarding variables of interest, and to compare the summary data between groups in more complex models.At baseline, non-significant outcome measures are presented using dashed lines.
  2. gg_kronos_sinusoid() generates a x-y plot showing the outcome variable across the defined period. These graphs are useful for visualising the differences between specific timepoints assessed.
gg_kronos_circle(output)

gg_kronos_sinusoid(output)

2. Comparing Rhythmicity for More than Two Groups

Next we will demonstrate one of the unique features of the Kronos package: the ability to compare circadian rhythms between more than two groups. This is increasingly important as the use of complex experimental designs grows in biological science. This example comprises of three independent groups and is similar in setup to a one-way ANOVA. For examples of more complex designs, see Excursion 1.

output2 <- kronos(formula = Variable_1 ~ Treatment + time(Timepoint), 
                  data = groupdata, 
                  period = 24, 
                  verbose = T, 
                  pairwise = T)

gg_kronos_circle(output2)

gg_kronos_sinusoid(output2)

There are a few changes to the output generated by the kronos function:

  1. The getKronos_groupwise() output now contains a line for each of our groups.
getKronos_groupwise(output2)

In this example you can see that groups A and B exhibit statistically significant rhythms, while the model fitted to group C is non-significant.

  1. We can now generate pairwise models output using pairwise=T. This generates pairwise comparisons between each of the groups:
getKronos_pairwise(output2)

Above we can see that overall group A is significantly different between B and C, and that group B exhibits a significantly different rhythm from A and C.

  1. When including independent variables, kronos will also calculate an overall interaction with time which can be accessed using the code below:
getKronos_pairwise_p(output2)

This is calculated by performing a Bonferroni correction on the interactions between both the sine and cosine time components and the independent variable. The p-value reported is the lowest following correction.

3. 'Omics Analysis

Now we will demonstrate how to adapt the kronos package for 'omics analysis, where there are many outcome variables. We have written the fw_kronos (feature-wise) function specifically for this purpose. This function behaves very similar to the main kronos function.

It requires two core types of data: First, a table of data with rows as features and columns as samples as input. Make sure that the feature labels are row names rather than a column. Second, a metadata table with rows as samples and metadata entries as columns. Suitable input data looks like this:

#plot a little of how a big omics data should be formatted
head(bigdata, n = c(5, 5))

#plot a little of the metadata should be formatted
head(bigmeta)

From the user's perspective, fw_kronos reads very similar to the main kronos function. The core difference is that fw_kronos accepts data and metadata as separate objects. Furthermore, fw_kronos doesn't need a term on the left-hand side of the formula. It will automatically, sequentially apply the given formula with each feature in the input table as the response variable.

out_list = fw_kronos(x = bigdata, 
                     formula = ~ Group + time(Timepoint), 
                     metadata = bigmeta, 
                     period = 24,
                     verbose = F,
                     pairwise = T) 

Now we have a list of kronosOut objects, which contain all our results. This can be cumbersome to do manually, so we wrote the kronosListToTable function for this purpose:

fit_df = kronosListToTable(out_list)

write.csv(fit_df, "README_files/RhythmicityResults.csv")

This will generate a csv containing the individual rhythmicity calculations for the whole data set with an FDR correction to account for multiple tests.

The resulting csv can be found here.

#plot a little of fit_df
head(fit_df, n = c(5, 5))

We can use a similar approach to obtain other components of the kronosOut objects. Below we include the code to obtain the pairwise comparisons as a single csv, as this is slightly more difficult.

#Create an empty container list of the appropriate length
pairwise_list = vector(mode = "list", length = length(out_list)) 

#The for-loop below generates a list containing the pairwise test results
for(m in 1:length(out_list)){
  pairwise_list[[m]] <- out_list[[m]]@pairwise_models
  names(pairwise_list)[m] <- names(out_list)[m]

}

#Generate a single bound list
bound_list <- lapply(X = pairwise_list, FUN = function(x){do.call(rbind, x)})

#collapse the bound list for each variable into a single dataframe 
pairwise_df <- do.call(rbind, bound_list)

#separate the comparison and the effects to make results more readable
pairwise_csv = pairwise_df %>% 
  rownames_to_column("ID") %>%
  separate(col = ID, into = c("Feature","ID"), sep = "\\.", extra = "merge") %>%
  separate(col = ID, into = c("Comparison","Effect"), sep = "\\.")


write.csv(pairwise_csv, "README_files/PairwiseResults.csv") 

The resulting csv can be found here.

Figures

gg_kronos_acrogram() is a visualisation function we have designed specifically for omics datasets. This function provides a polar histogram of your dataset's acrophases. This allows you to compare overall rhythmicity between groups. Below we can see that a large proportion of the variables peak between ZT20-23 in Group A, while the variables in Groups B and C are less synchronous.

gg_kronos_acrogram(out_list)

We can also use automation to obtain individual plots for our omics data set. Here we will demonstrate how to obtain sinusoid curves for each outcome measure in the data set.

 #Create an empty container list of the appropriate length
plot_list <- vector(mode = "list", length = length(out_list))

for(q in 1:length(out_list)){

  #save plot into relevant position in list
  plot_list[[q]] <- gg_kronos_sinusoid(out_list[[q]]) 

}

#to plot & save the feature graphs to a pdf:
pdf("README_files/plots_circadian.pdf")
for (i in 1:length(plot_list)) {
  print(plot_list[[i]])
}
invisible(dev.off())

The resulting pdf can be found here. The same approach can be applied for obtaining individual circle figures.

4. Discussion

Here we have presented standard data for the analysis of time-of-day. Some points to consider for your data is whether you can assume a 24-hour period, and whether your data is evenly distributed over the period. Please note that you will require a minimum of three data points over your period to make use of these functions, and indeed any function using sinusoid models. Currently the kronos package is not able to estimate period; a wide range of packages are capable of determining your period if this is necessary for your research question. Please note that period estimation requires even more temporal resolution: some recommend a minimum of sampling every 2 hours over a 48-hr window (Hughes et al., 2017, doi: 10.1177/0748730417728663).

This tutorial is merely a template. Depending on your experimental set-up, findings and experimental questions you may need to adjust your approach. However, as complex statistical models become more frequent in the study of circadian rhythms, functions that can incorporate more complex design than two-group comparisons are essential for advancement of the field.

We have provided figure generation functions as clear communication of results is essential to producing good and useful science. Please see below for more details for figure customisation. We hope that both aspiring and veteran bioinformaticians and circadian rhythm biologists will find our guide helpful.

Excursion 1. Customising Figures

The two figure functions, gg_kronos_circle() and gg_kronos_sinusoid(), are designed to be fully compatible with ggplot2 and therefore are fully customisable. Below is an example of how one could customise kronos plots using ggplot2 syntax

gg_kronos_circle(output2) +
  scale_fill_manual(values = c("A" = "#169B62",
                               "B" = "#FFFFFF", 
                               "C" = "#FF883E")) +
  ggtitle("Figure title")

We encourage users to take advantage of the extensive range of online tutorials and add-on packages available for ggplot2.

Excursion 2. Assessing Rhythmicity in More Complex Models

One of the key features of this package is the use of a formula input, which allows for analysis of complex models. Below we will demonstrate how kronos performs when assessing data with two independent variables.

data3 <- twowaydata

two.way.data.long <- data3 %>%
  pivot_longer(cols=starts_with("Variable_"), names_to = "Variables", values_to = "Value") %>%
  as.data.frame()

#collect all outcome variables
two_way_data_names <- unique(two.way.data.long$Variables) 

#Create an empty container list of the appropriate length
data.list <- vector(mode = "list", length = length(two_way_data_names)) 

for(n in 1:length(two_way_data_names)){
  data.list[[n]] <- two.way.data.long %>% filter(Variables == two_way_data_names[n])
}

two_way_out_list = lapply(X = data.list, 
                          FUN = function(y){kronos(data = y, 
                                                   Value ~ Factor_A*Factor_B + time(Timepoint), 
                                                   period = 24, pairwise = T, verbose = F)
                            }
                          )

names(two_way_out_list) <- two_way_data_names


gg_kronos_sinusoid(two_way_out_list$Variable_1)

gg_kronos_sinusoid(two_way_out_list$Variable_2)

gg_kronos_sinusoid(two_way_out_list$Variable_3)

gg_kronos_sinusoid(two_way_out_list$Variable_4)

Here we have analysed 4 outcome variables which all show different interaction effects. Here we will go into depth examining the effects observed in Variable 1 as an example of how to interpret kronos output for more complex designs.

gg_kronos_sinusoid(two_way_out_list$Variable_1)

gg_kronos_circle(two_way_out_list$Variable_1)

1). As before, we can use the getKronos_groupwise() function to obtain individual rhythmicity for each group. Here you can see that 3/4 groups exhibit rhythmicity and that both conventional groups share a similar acrophase (which is illustrated in the figures above as well).

getKronos_groupwise(two_way_out_list$Variable_1)

2). With the getKronos_pairwise_p() function we can assess the interaction of each of our experimental factors with the time component of the model: here you can see that both main effects and the interaction significantly interact with the time component.

getKronos_pairwise_p(two_way_out_list$Variable_1)

3). Next we can use the getKronos_pairwise() function to obtain the pairwise group comparisons. This allows us to determine how each group differs from one another. For example, here you can see that Conventional+Stress and Antibiotics+Control only exhibit a significant group*Timepoint_sin interaction. This is unsurprising as the groups have the same average value but exhibit a rhythm shifted by 12 hours.

getKronos_pairwise(two_way_out_list$Variable_1)

Session Info

sessioninfo::session_info()


thomazbastiaanssen/kronos documentation built on May 1, 2023, 6:45 p.m.