clean.MADtraits: Cleaning MADtraits objects

View source: R/cleaning.R

Description

A very useful, and strongly recommended, function for 'cleaning' MADtraits data before serious use. It offers three kinds of cleaning: of trait names (e.g., harmonising datasets so that "sla" and "specific_leaf_area" are recognised as the same trait), of species' names (correcting some typos and name changes using 'taxize'), and of trait units (e.g., harmonising across datasets so that all masses are in the same unit). It is *strongly recommended* that you perform some kind of cleaning of MADtraits data before using it. The logic of the MADworld is to make it easy for you to get data, and then to make it transparent how that data has been cleaned and managed downstream. We make no guarantee that the decisions we have made in terms of cleaning are the "best" - please feel free to use this code as a starting point, and improve from there!

Usage

clean.MADtraits(
  x,
  option = c("traits", "species", "units", "everything"),
  taxon.cache = TRUE,
  taxon.thresh = 0.8,
  unit.choices = NA
)

Arguments

x

MADtraits object

option

What cleaning to perform: trait variable names (DEFAULT; "traits"), species' taxonomic names ("species"), data units ("units"), or all three at once ("everything").

taxon.cache

Whether to use MADtraits' internal cache of taxonomic lookup information for cleaning species' names (default: TRUE) or to build one from scratch at run-time using taxize (set to FALSE). Building from scratch is a very slow process! You can also pass a 'lookup' character vector whose names are the species' current (messy) names and whose values are the clean (correct) names. If you look at the code (which is short), this allows clean.MADtraits to run the equivalent of lookup[raw_names] to get the new, 'clean' names.
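The lookup[raw_names] idea described above can be sketched in plain R. This is only an illustration of named-vector indexing, not the package's internal code; the species names and the objects raw_names and lookup are made up for the example.

```r
# Hypothetical messy names, as they might appear across datasets
raw_names <- c("Quercus albaa", "Acer rubrum", "Quercus albaa")

# Lookup vector: names are the messy names, values are the clean names
lookup <- c("Quercus albaa" = "Quercus alba",
            "Acer rubrum"   = "Acer rubrum")

# Indexing by name translates every raw name in one step;
# in plain R, a raw name missing from the lookup would come back as NA
clean_names <- unname(lookup[raw_names])
print(clean_names)

# Per the description above, such a vector could then be supplied as:
# clean.MADtraits(demo, "species", taxon.cache = lookup)
```

Because the lookup is an ordinary named character vector, you can inspect or correct individual entries before passing it in.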

taxon.thresh

Threshold of certainty to be used as a minimum when assigning new names to a species when building a lookup from scratch (see taxon.cache). The default, of 0.8, has not been chosen with any particular intelligence.

unit.choices

Named vector of units, where the names are variables and the values are the units you would like each variable in. See examples - this isn't as confusing as it sounds. Units should be given in standard scientific notation - see convert for more details.
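As a rough sketch of the shape such a vector takes: the trait names (seed_mass, plant_height) and unit strings below are assumptions chosen for illustration, not names or notation confirmed by the package. Check your own data and the convert documentation for the real ones.

```r
# Names are trait variables; values are the units wanted for each
# (both the trait names and the unit strings here are hypothetical)
unit.choices <- c(seed_mass = "g", plant_height = "cm")

# Which variables have a requested unit?
print(names(unit.choices))

# Such a vector would then be supplied as:
# clean.MADtraits(demo, "units", unit.choices = unit.choices)
```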

Value

MADtraits object

Author(s)

Will Pearse

See Also

convertunits.MADtraits, taxonlookup.MADtraits

Examples

# Grab some example data
# - note that you should work with the output from the MADtraits function
# - since "cleaning" a single dataset doesn't achieve very much!
demo <- .cavenderbares.2015a()
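# (In real use, build the object with MADtraits(), choosing datasets via its
# 'datasets' argument, rather than loading a single dataset as above.)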
# Clean trait names (the default)
clean.MADtraits(demo)
# Clean species' names
clean.MADtraits(demo, "species")
# Clean units
clean.MADtraits(demo, "units")
# Clean it all!
clean.MADtraits(demo, "everything")

willpearse/natdb documentation built on April 7, 2020, 8:33 a.m.