knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
This is a demonstration of one use for some of the kewr resources, in calculating the proportion of species that have been assessed in a country.
To do this, we'll follow these steps:
In addition to kewr, we'll load:
library(kewr) library(dplyr) library(tidyr) library(stringr) library(ggplot2)
The first step in this problem is to get a list of assessed species for a country.
The most authoritative list of global species assessments is the IUCN Red List of Threatened Species. For this package, I used the rl_sp_country
function from rredlist to request all assessments for taxa found in Denmark from the IUCN Red List API. I then used the rl_search
function to request full assessment information for each taxon.
You can do the same by installing the rredlist package. You'll need an API key to use the IUCN Red List API, which you can register for here. Alternatively, you can download the information you need directly from the IUCN Red List website.
I've bundled up the Danish plant assessments in this package to make things easier for this analysis.
head(danish_plants)
In total, there are assessments for r format(nrow(danish_plants), big.mark=",")
vascular plants from Denmark.
To match these names to an IPNI ID, we'll use the Kew Names Matching Service. The first thing we'll do, is join the taxonomic authority to the scientific name, to hopefully reduce the number of matches for each name.
danish_plants <- danish_plants %>% unite("full_name", scientific_name, authority, sep=" ", remove=FALSE)
Then we put the full names through KNMS.
full_matches <- match_knms(danish_plants$full_name) full_matches
Most of our names had matches! Which makes things simpler. But we also got some names that returned multiple matches.
First we'll try putting just the scientific names of our missing matches through KNMS again.
full_matches <- tidy(full_matches) unmatched <- filter(full_matches, !matched) to_match <- danish_plants %>% filter(full_name %in% unmatched$submitted) part_matches <- match_knms(to_match$scientific_name) part_matches
Almost everything returned a match. We could attempt to manually match the final name, but as it's just one we'll leave it out.
Now we can join all of our matches together, link them to the IUCN taxon ID, and resolve any synonyms.
part_matches <- tidy(part_matches) full_matches <- full_matches %>% filter(matched) %>% left_join( danish_plants %>% select(taxonid, full_name, category), by=c("submitted"="full_name") ) part_matches <- part_matches %>% filter(matched) %>% left_join( danish_plants %>% select(taxonid, scientific_name, category), by=c("submitted"="scientific_name") ) matched_names <- bind_rows(full_matches, part_matches) head(matched_names)
Now that we have an IPNI ID attached to each assessment, we can look up the record for the taxa in WCVP. This will let us find out the taxonomic status of each name - the first step in resolving any synonymy issues.
# wrap up the lookup_wcvp function to make sure it comes back as a list f <- function(taxonid) { list(lookup_wcvp(taxonid)) } resolved_names <- matched_names %>% nest_by(taxonid, ipni_id) %>% mutate(wcvp_record=f(ipni_id)) resolved_names <- resolved_names %>% mutate(status=wcvp_record$status)
With the taxonomic status, we'll first remove any taxa that are unplaced. We'll also remove any non-homotypic synonyms - even if we resolve these to accepted species, we can't be sure that the assessment would be valid for the new concept, so that accepted species would not be assessed.
resolved_names <- resolved_names %>% filter(status %in% c("accepted", "homotypic synonym"))
Next we'll check if there are any taxa that still have multiple matches in WCVP.
resolved_names %>% ungroup() %>% add_count(taxonid) %>% summarise(multiple_matches=sum(n > 1))
There are not.
So the final step is to find the accepted names for all homotypic synonyms and remove anything that is a lower rank than species.
resolved_names <- resolved_names %>% mutate(accepted_id=ifelse(status != "accepted", wcvp_record$accepted$id, wcvp_record$id), accepted_name=ifelse(status != "accepted", wcvp_record$accepted$name, wcvp_record$name), accepted_author=ifelse(status != "accepted", wcvp_record$accepted$author, wcvp_record$authors), accepted_rank=ifelse(status != "accepted", wcvp_record$accepted$rank, wcvp_record$rank)) resolved_names <- resolved_names %>% select(-wcvp_record) %>% unnest(cols=c(data)) %>% rename(match_id=ipni_id) accepted_species <- resolved_names %>% filter(accepted_rank == "Species") %>% ungroup() nrow(accepted_species)
We are now left with r nrow(accepted_species)
accepted species from Denmark with assessments.
To calculate the number of species that are assessed in Denmark, we need a checklist of all accepted species.
We can get this from Plants of the World Online.
checklist <- search_powo(list(distribution="Denmark"), filters=c("accepted", "species"), limit=5000) checklist
Now we have this, we just need to join our assessments to our checklist.
checklist <- tidy(checklist) %>% mutate(ipni_id=str_extract(fqId, "[0-9\\-]+")) %>% select(ipni_id, family, name, author) %>% distinct(ipni_id, .keep_all=TRUE) %>% left_join( accepted_species %>% select(accepted_id, category), by=c("ipni_id"="accepted_id") )
And now we can calculate the proportion of species assessed in Denmark!
checklist %>% summarise(p_assessed=mean(! is.na(category)))
And make a simple bar chart of the number of species in each category.
iucn_colours <- c("NE"="#ffffff", "DD"="#d1d1d6", "LC"="#60c659", "NT"="#cce226", "VU"="#f9e814", "EN"="#fc7f3f", "CR"="d81e05", "EW"="#542344", "EX"="#000000") checklist %>% replace_na(list(category="NE")) %>% mutate(category=factor(category, levels=names(iucn_colours), ordered=TRUE)) %>% ggplot(mapping=aes(y=category, fill=category, colour=category == "NE")) + geom_bar() + scale_fill_manual(values=iucn_colours, drop=FALSE) + scale_colour_manual(values=c(`TRUE`="black", `FALSE`=NA)) + scale_y_discrete(drop=FALSE) + guides(fill=FALSE, colour=FALSE) + labs(x="Number of species", y="IUCN Red List category")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.