knitr::opts_chunk$set(echo = TRUE)
## master version
version_raw <- RCurl::getURL("https://raw.githubusercontent.com/TGuillerme/dispRity/master/DESCRIPTION")
version_master <- strsplit(strsplit(version_raw, split = "Version: ")[[1]][2], split = "\nDate:")[[1]][1]
version_master <- 1.7

## release version
version_raw <- RCurl::getURL("https://raw.githubusercontent.com/TGuillerme/dispRity/release/DESCRIPTION")
version_release <- strsplit(strsplit(version_raw, split = "Version: ")[[1]][2], split = "\nDate:")[[1]][1]
version_release <- 1.7

## CRAN version
version_raw <- RCurl::getURL("https://cran.r-project.org/web/packages/dispRity/index.html")
version_CRAN <- strsplit(strsplit(version_raw, split = "\n<td>Version:</td>\n<td>")[[1]][2], split = "</td>\n</tr>\n<tr>\n<td>Depends:")[[1]][1]
version_CRAN <- 1.7

dispRity

This is a package for measuring disparity (aka multidimensional space occupancy) in R. It allows users to summarise matrices as representations as multidimensional spaces into a single value or distribution describing a specific aspect of this multidimensional space (the disparity). Multidimensional spaces can be ordinated matrices from MDS, PCA, PCO, PCoA but the package is not restricted to any type of matrices! This manual is based on the version r version_release.

What is dispRity?

This is a modular package for measuring disparity in R. It allows users to summarise ordinated matrices (e.g. MDS, PCA, PCO, PCoA) to perform some multidimensional analysis. Typically, these analysis are used in palaeobiology and evolutionary biology to study the changes in morphology through time. However, there are many more applications in ecology, evolution and beyond.

Modular?

Because their exist a multitude of ways to measure disparity, each adapted to every specific question, this package uses an easy to modify modular architecture. In coding, each module is simply a function or a modification of a function that can be passed to the main functions of the package to tweak it to your proper needs! In practice, you will notice throughout this manual that some function can take other functions as arguments: the modular architecture of this package allows you to use any function for these arguments (with some restrictions explained for each specific cases). This will allow you to finely tune your multidimensional analysis to the needs of your specific question!

Installing and running the package

You can install this package easily, directly from the CRAN:

install.packages("dispRity")

Alternatively, for the most up to data version and some functionalities not compatible with the CRAN, you can use the package through GitHub using devtool (see to CRAN or not to CRAN? for more details):

## Checking if devtools is already installed
if(!require(devtools)) install.packages("devtools")

## Installing the latest released version directly from GitHub
install_github("TGuillerme/dispRity", ref = "release")
## loading the package and setting up the start seed.
library(dispRity)
## Setting a random seed for repeatability
set.seed(123)

Note this uses the release branch (r version_release). For the piping-hot (but potentially unstable) version, you can change the argument ref = release to ref = master. dispRity depends mainly on the ape package and uses functions from several other packages (ade4, geometry, grDevices, hypervolume, paleotree, snow, Claddis, geomorph and RCurl).

Which version do I choose? {#version}

There are always three version of the package available:

The differences between the CRAN one and the GitHub release or master ones is explained just above. For the the GitHub version, the differences are that the release one is more stable (i.e. more rarely modified) and the master one is more live one (i.e. bug fixes and new functionalities are added as they come).

If you want the latest-latest version of the package I suggest using the GitHub master one, especially if you recently emailed me reporting a minor bug or wanting a new functionality! Note however that it can happen that the master version can sometimes be bugged (especially when there are major R and R packages updates), however, the status of the package state on both the release and the master version is constantly displayed on the README page of the package with the nice badges displaying these different (and constantly tested) information.

dispRity is always changing, how do I know it's not broken?

This is a really common a legitimate question in software development. Like R itself:

dispRity is free software and comes with ABSOLUTELY NO WARRANTY.

So you are using it at your own risk.

HOWEVER, there are two points that can be used as objective-ish markers on why it's OK to use dispRity.

First, the package has been use in a number of peer reviewed publications (the majority of them independently) which could be taken as warranty.

Second, I spend a lot of time and attention in making sure that every function in every version actually does what I think it is supposed to do. This is done through CI; continuous integration development, the CRAN check, and unit testing. The two first checks (CRAN and CI) ensure that the version you are using is not bugged (the CRAN check if you are using the CRAN version and the Travis CI if you are using a GitHub version). The third check, unit testing, is checking that every function is doing what it is supposed to do. For a real basic example, it is testing that the following expression should always return the same thing no matter what changes in the package.

> mean(c(1,2,3))
[1] 2

Or, more formally:

testthat::expect_equal(object = mean(c(1,2,3)),
                       expected = 2)

You can always access what is actually tested in the test/testthat sub-folder. For example here is how the core function dispRity is tested (through > 500 tests!). All these tests are run every time a change is made to the package and you can always see for yourself how much a single function is covered (i.e. what percentage of the function is actually covered by at least one test). You can always see the global coverage here or the specific coverage for each function here.

Finally, this package is build on the shoulders of the whole open science philosophy so when bugs do occur and are caught by myself or the package users, they are quickly fixed and notified in the NEWS.md file. And all the changes to the package are public and annotated so there's that too...

Help

If you need help with the package, hopefully the following manual will be useful. However, parts of this package are still in development and some other parts are probably not covered. Thus if you have suggestions or comments on on what has already been developed or will be developed, please send me an email (guillert@tcd.ie) or if you are a GitHub user, directly create an issue on the GitHub page.

Citations

To cite the package, this manual or some specific functionalities, you can use the following references:

The package main paper:

Guillerme T. dispRity: A modular R package for measuring disparity. Methods Ecol Evol. 2018;9:1755–1763. doi.org/10.1111/2041-210X.13022.

The package manual (regularly updated!):

Guillerme, T. & Cooper, N. (2018): dispRity manual. figshare. Preprint. 10.6084/m9.figshare.6187337.v1.

The time-slicing method implemented in chrono.subsets (unfortunately not Open Access, but you can still get a free copy from here):

Guillerme, T. and Cooper, N. (2018), Time for a rethink: time sub-sampling methods in disparity-through-time analyses. Palaeontology, 61: 481-493. doi:10.1111/pala.12364.

Furthermore, don't forget to cite R:

R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

Bonus: you can also cite ape since the dispRity package heavily relies on it:

Paradis E. & Schliep K. 2019. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 35: 526-528.

Why is it important to cite us?

Aside from how science works (if you're using a method from a specific paper, cite that specific paper to refer to that specific method), why is it important to also cite the package and the manual?

All the people involve in making the dispRity package happened to do it enthusiastically, freely and most amazingly without asking anything in return! I created the package with this idea in mind and I am still sticking to it. However, academia (the institutions and people producing science around the globe) is unfortunately not optimal at many level (some might even say "broken"): high impact papers attract big grants that attract high impact papers and big grants again, all this along with livelihood, permanent position and job security. Unfortunately however, method development has a hard time to catch up with the current publish or perish system: constantly updating the dispRity package and this manual is hugely time consuming (but really fun!) and that is not even taking into account maintenance and helping users. Although I do truly believe that this time spent doing these things modestly help the scientific endeavour, it does not contribute to our paper list!

Therefore, by citing the package and this manual, you help provide visibility to other workers and you might help them in their work! And you directly contribute in making this project fun for all the people involved and most of all, free, updated and independent from the publish and perish system!

Thank you!



TGuillerme/dispRity documentation built on Dec. 21, 2024, 4:05 a.m.