The organization of functional biological infomation is well behind that of taxonomy. For example we have a good idea about the taxonomy of about 300,000 higher plant species, but only functional information about a much smaller number. In the case of woody-versus-herbaceous for plants this is about 50,000 species. Making the matter worse the information we do have is a biased sample--it's usually disproportionally about temperate, economically important, and pretty species. Until functional information catches up with taxonomy, this is a package designed to use taxonomic information to estimate the missing binary trait values.

For example there are 11 known species of the genus Alisma. We have documentation that 6 of them are herbaceous. The question is what is the probablity of the remaining, 5 un-documented species are indeed herbaceous. Because we know about the 6 species and because most genera are not polymorphic for this trait, the probability is close to, but not exactly equal to, 1.

This package takes two approaches to estimating those species (for methods details see FitzJohn et al. 2014), which here are accessed via the option with_replacement. The strong prior option described by FitzJohn et al. (2014) corresponds to with_replacement=TRUE; the weak prior corresponds to with_replacement=FALSE.

This gives us a taxonomically informed picture of this functional trait. The estimates can be examined at the genus, family, order, or global level. In many empirical cases (including the one discussed by FitzJohn et al. 2014) in turns out that due to sampling bias, the raw estimate for a trait is very different than the taxonmically informed one.

Data

Very often trait data has phylogenetic (and taxonomic) structure. This can be seen in the binary woodiness data. Here is a plot of the proportion of species within 465 families that are woody. You can see the strong bimodality with only a few families containing species that are both woody and herbaceous. This type of pattern may be stronger or weaker for different traits (and different taxonomies).

library(traitfill)
wood<-load_wood()
number.woody.species<-tapply(wood$W,wood$family,FUN=function(x)sum(x))
number.herb.species<-tapply(wood$H,wood$family,FUN=function(x)sum(x))
prop.woody<-number.woody.species/(number.herb.species+number.woody.species)
hist(prop.woody,xlab="Proportion of woody species within families",main="")

Example

To fill in the missing trait values using taxonomic information use the function traitfill

library(traitfill)
## Data from "How much of the world is woody?"
wood <- load_wood()
## Fill in missing woodiness data; this says that the "H" column
## is state 0 and the "W" column is state 1, so that the generated
## percentages are "percentage woody" and not "percentage herbacious.
res <- traitfill(wood, 50, names=c("H", "W"))
str(res)
## The overall estimate is:
res$overall$p_mean

The output is a list of dataframes. The first element of the list contains the genus-level estimate; the second element has the family-level estimates; the third element has the order-level estimates; and the fourth has the overall estimate for the entire taxon. In the case of the example this is vascular plants.

References

FitzJohn, R. G., Pennell, M. W., Zanne, A. E., Stevens, P. F., Tank, D. C., Cornwell, W. K. (2014), How much of the world is woody?. Journal of Ecology, 102: 1266–1272. doi: 10.1111/1365-2745.12260

Zanne, Amy E., et al. (2014), Three keys to the radiation of angiosperms into freezing environments. Nature 506.7486 (2014): 89-92.



traitecoevo/traitfill documentation built on May 31, 2019, 7:44 p.m.