This is the code that goes with the dispRity
workshop.
You can use it to follow along with the workshop instructions and modify the code to use your own data or adjust your own options.
Most of the information about the workshop comes from the dispRity
manual.
You can also always find more documentation about the functions used here using the R
inbuilt manual by typing ?function.name
.
R
levelIn this workshop I will assume you are already familiar with basic R
. The basic notions that I'll assume you know are:
ape
or dispRity
)this_object <- 1
)"matrix"
or "phylo"
)mean(c(1,2))
)?mean
)R
(e.g. using read.csv
)Let's get into it. First we'll want to download and install the package:
library(dispRity)
To follow this tutorial you can use the inbuilt dispRity
:
## Loading the inbuilt data data(BeckLee_mat50) data(BeckLee_tree) ## The matrix my_matrix <- BeckLee_mat50 ## The tree my_tree <- BeckLee_tree
But you can always use your own data. You'll need:
For making it easier to follow the tutorial you can name your matrix my_matrix
and your tree my_tree
. If you're missing either the tree or the matrix, you can find below two function to simulate either:
WARNING: the data generated by the functions
i.need.a.matrix
,i.need.a.tree
,i.need.node.data
andi.need.FADLAD
are used to SIMULATE data for this tutorial. This is not to be used for publications or analysing real data! If you need a data matrix, a phylogenetic tree or FADLAD data, (i.need.a.matrix
,i.need.a.tree
andi.need.FADLAD
), you will actually need to collect data from the literature or the field! If you need node data, you will need to use ancestral states estimations (e.g. usingestimate_ancestral_states
from theCladdis
package).
## Functions to get simulate a PCO looking like matrix from a tree i.need.a.matrix <- function(tree) { matrix <- space.maker(elements = Ntip(tree), dimensions = Ntip(tree), distribution = rnorm, scree = rev(cumsum(rep(1/Ntip(tree), Ntip(tree))))) rownames(matrix) <- tree$tip.label return(matrix) } ## Function to simulate a tree i.need.a.tree <- function(matrix) { tree <- rtree(nrow(matrix)) tree$root.time <- max(tree.age(tree)$age) tree$tip.label <- rownames(matrix) tree$node.label <- paste0("n", 1:(nrow(matrix)-1)) return(tree) } ## You can verify if both the matrix and the tree labels match using clean.data(my_matrix, my_tree)
This is the code for the simple disparity per group analysis. The hypothesis we will test will be: "Is the crown mammal's morphospace bigger than stem mammal's one?"
## Creating a list of groups my_groups <- crown.stem(my_tree, inc.nodes = FALSE) ## Creating a dispRity object that contains group information trait_space <- custom.subsets(my_matrix, group = my_groups) ## Bootstrapping trait_space_bs <- boot.matrix(trait_space)
This is a tricky one and depends on your question, but thankfully we can ask moms
to help us!
Check out the documentation here.
For that we can simply upload our trait space in moms
and test our different metrics to measure our different hypotheses.
## Saving our space as a matrix (with column names and row names) write.csv(my_matrix, file = "moms_space.csv")
Here for our hypotheses, we are interested in volume or size of the trait space occupied (i.e. "Is the crown mammal's morphospace bigger than stem mammal's one?") and we can see that the sum of variance metric works relatively well to capture changes in volume.
## Measuring disparity disparity_groups <- dispRity(trait_space_bs, metric = c(sum, variances)) ## Summarizing the results summary(disparity_groups) ## Plotting the results plot(disparity_groups) ## Testing the difference between both groups test.dispRity(disparity_groups, test = t.test)
If you're using a good text editor to write your research (e.g. LaTeX
, markdown, etc... not Word), you can use the the knitr::kable
function to directly print out the summary tables in a format for your research project. For example for this markdown vignette:
knitr::kable(summary(disparity_groups))
This section here will generate some data for your nodes in the tree. DO NOT TRY THIS AT HOME. Or, less patronisingly, we will use some within-PCA ancestral states estimations, there are some cases when this method is totally valid but this is not always the case. You can find our opinion on this paper Section 3-d. I will personally be really cautious about using this approach if you don't know what you are doing (but you should do what you think is best for your question and data - don't list to me).
## Wrapping function for running a quick and dirty ancestral character estimation quick.n.dirty.ace <- function(matrix, tree) { apply(matrix, 2, function(trait, tree) return(ape::ace(trait, phy = tree)$ace), tree = tree) } ## Running the ACE node_traits <- quick.n.dirty.ace(my_matrix, my_tree) ## Combining both matrices my_matrix_nodes <- rbind(my_matrix, node_traits) ## Adding node labels to the tree my_tree$node.label <- rownames(node_traits) ## Adding a root time to the tree (if missing) if(is.null(my_tree$root.time)) { my_tree$root.time <- max(tree.age(my_tree)$age) }
## Creating a time-sliced trait space trait_space <- chrono.subsets(my_matrix_nodes, tree = my_tree, method = "continuous", model = "proximity", time = 9) ## Bootstrapping the data trait_space_bs <- boot.matrix(trait_space) ## Measuring disparity disparity_time <- dispRity(trait_space_bs, metric = c(sum, variances)) ## Summarizing the results summary(disparity_time) ## Plotting the results plot(disparity_time) ## Testing the difference between both groups test.dispRity(disparity_time, test = wilcox.test, comparisons = "sequential", correction = "bonferroni")
The plot.dispRity
function has many, many different functions. Many of them are your normal plot
options (main
, etc...) so check out the manual to make some more fancy plots.
Alternatively, you can just extract the disparity data and
## Some fancier plot plot(disparity_time, main = "Rainbows!", cent.tend = sd, quantiles = seq(from = 1, to = 99, by = 1), col = c("black", rainbow(100)), las = 2, ylab = "My non descriptive disparity metric") legend("topleft", lty = 1, col = "black", legend = "The standard deviation")
If you are a ggplot
er, you can extract the data from the dispRity
object using the get.disparity
function. If you like ggplot
and the dispRity
package and feel like collaborating with me, please drop me an email so that we can figure out a way to make a ggplot
module to the package (and make you an author!).
dtt
analysis## Some disparity through time analysis dispRity_dtt <- dtt.dispRity(data = my_matrix, metric = c(sum, variances), tree = my_tree, nsim = 100) ## Print the data dispRity_dtt ## Plot the data plot(dispRity_dtt)
## Is there an effect of the factor "group" in the data matrix? adonis.dispRity(disparity_groups) ## Is there an effect of the factor "time" in the data matrix? adonis.dispRity(disparity_time)
## Is the data normally distributed in each group? results <- null.test(disparity_groups, replicates = 100, null.distrib = rnorm) plot(results) ## Is the data log-normally distributed through time? results <- null.test(disparity_time, replicates = 100, null.distrib = rlnorm) plot(results)
Note that this is an unpublished method. Please contact me (guillert@tcd.ie) if you are interested in using it in your study.
## Testing the fit of different modes of disparity changes through time model.test.wrapper(disparity_time, model = c("BM", "OU", "EB", "Trend"))
dispRity
utility functionsThe package also provides a range of functions for modifying dispRity
objects or extracting some useful information from it (e.g. on specific bootstrapped matrix, etc...). More info here.
The package also provides some other functions that I find useful in disparity analysis but are not totally linked to disparity analysis. For example, a function for removing zero branch lengths from trees (remove.zero.brlen
) or a function to calculate various matrix distance metrics (char.diff
). More info here.
Claddis
package with dispRity
Some of you might be familiar with Graeme Lloyd's rich Claddis
package and might want to use that in your analysis.
Great news, there is a simple function in dispRity
that can convert Claddis
objects into dispRity
objects Claddis.ordination
.
The way both me and Graeme see the synergy between both packages is that you can do a lot of analyses before measuring disparity in the Claddis
package (e.g. ordinating data, measuring distance matrices, looking at rates, etc...) and then export the trait space generated in Claddis
in dispRity
for the analyses involving disparity metrics.
Of course you do you and you can mix and match your analyses and your packages the way you want.
For those that don't use/know Claddis
, here is a quick pipeline on how to analyse your data starting from a cladistic matrix:
library(Claddis) # based on v0.6.3 or higher ## previous version of Claddis were quite different ## and won't work here
You can either input your own nexus matrix:
## Reading your cladistic (NEXUS) matrix my_nexus <- read_nexus_matrix(file_name)
Or a matrix already present in R
with species as rows and characters as columns:
## Create random 10-by-50 matrix containing NAs (inapplicable) ## polymorphisms (&) and missing data ("") character_taxon_matrix <- matrix(sample(c("0", "1", "0&1", NA, ""), 500,replace = TRUE ), nrow = 10, ncol = 50) rownames(character_taxon_matrix) <- LETTERS[1:10] ## Reformat for use elsewhere in Claddis: my_nexus <- build_cladistic_matrix(character_taxon_matrix)
You can then measure your distance matrix anyway you want (e.g. using Graeme's MORD distance) and then ordinate the distance matrix using a PCoA (NMDS):
## Calculate morphological distances mord_distances <- calculate_morphological_distances(my_nexus) ## Ordinating the distances ordinated_distances <- cmdscale(mord_distances$distance_matrix)
There is much more the the distance and ordination methods and I suggest you check out directly Graeme's paper for more info. Regardless you can then feed the ordinated matrix directly into dispRity:
## Calculating the sum of variances: dispRity(ordinated_distances, metric = c(sum, variances))
Alternatively, you can do all the pipeline automatically using the Claddis.ordination
function:
## Running the ordination automatically to create a dispRity object dispRity(Claddis.ordination(my_nexus), metric = c(sum, variances))
The dispRity
package is constantly updated with new functionalities, bug fixes and improved functions. You can follow the development of the package by following me on github (TGuillerme - I also develop other ecology-evolution packages) or twitter (@TGuillerme - I also tweet about other work related stuff).
You can check what will be happening in the next CRAN version of the package by checking the NEWS.md on the master branch.
And of course, you are more than welcome to contact me if you have suggestions, ideas or wishes for improvements!
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.