An overview of making rangeModelMetadata objects

knitr::opts_chunk$set(echo = TRUE)
library(rangeModelMetadata)
library(sf)
library(spocc)
library(dplyr)

What is an rmm (rangeModelMetadata) object?

An rmm object is a nested list structured to hold metadata about species range models. Here we make an empty rmm object containing only the obligate (required) set of fields.

rmm1 <- rmmTemplate(family=c('base')) 
str(rmm1)

A more complex rmm object includes all predefined fields. It may seem like a lot at first, but many of these fields won't be needed. For example, you'll typically use only one modeling algorithm, so you'll only be selecting one of the fields below under $model. Likewise, you probably won't use five different methods for cleaning geographic issues with your data under $dataPrep$geographic, so you'll end up omitting many of those fields. As we explore the hierarchy below, it'll seem simpler.

rmm2 <- rmmTemplate(family=NULL)
str(rmm2)

Populating an rmm object

rmm objects can be populated manually by entering data directly into the fields (see vignette('rmm_Multispecies',package='rangeModelMetadata')), or through the use of several helper functions. Although the rmm object template already contains a number of fields that depend on the specified family (see options with rmmFamilies()), users can also add new fields as needed. We provide suggestions of both common fields to add and common values for many fields.
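
Because an rmm object is an ordinary nested list, manual entry amounts to plain `$` assignment. The following is a minimal sketch that uses a hand-built toy list as a stand-in for a real template, so it runs without the package. The field data$environment$variableNames appears later in this vignette; myCustomNote is a hypothetical non-standard field invented purely for illustration.

```r
# Toy stand-in for an rmm object; a real one would come from rmmTemplate().
toy <- list(data = list(environment = list(variableNames = NULL)))

# Fill an existing field by assigning to its full path.
toy$data$environment$variableNames <- c("bio1", "bio2", "bio3")

# Add a new field by assigning to a path that does not exist yet.
# 'myCustomNote' is hypothetical and not part of the data dictionary.
toy$data$environment$myCustomNote <- "resampled to a common grid"

str(toy)
```

The same assignment pattern should apply to an object returned by rmmTemplate(), since it is also a list.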

Not sure which fields are available or what values to enter? We can suggest options.

rmmSuggest('dataPrep',fullFieldDepth=FALSE)
rmmSuggest('dataPrep',fullFieldDepth=TRUE) # for all fields below the specified one
rmmSuggest('dataPrep$biological$duplicateRemoval')
rmmSuggest('dataPrep$biological$duplicateRemoval$rule')

Here, it may help to learn the two pieces of unavoidable jargon we use. A field describes a level of the hierarchy. For example, in dataPrep$biological$duplicateRemoval, dataPrep is field 1, biological is field 2, and duplicateRemoval is field 3. An entity is identified by a complete set of fields and holds a particular value. In the example above, dataPrep$biological$duplicateRemoval$rule is an entity, and it can take one of the four suggested values: "Environmental duplicates", "coordinate duplicates", "other (specify in Notes)", "NA".

Note that when you ask for suggestions for a field (the first three lines above), you get suggestions of the relevant fields to consider. The last line, however, refers to the lowest level in the hierarchy, an entity, so values are suggested instead.
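
The distinction between fields and entities can be made concrete with a small base R sketch. It builds a toy nested list that mirrors the path queried above and collects the full path of every leaf, which is what we call an entity. The helper entityPaths is hypothetical and written only for this illustration; it is not part of rangeModelMetadata.

```r
# Toy nested list mirroring the path used above; the value is illustrative only.
toy <- list(
  dataPrep = list(
    biological = list(
      duplicateRemoval = list(rule = "coordinate duplicates")
    )
  )
)

# Hypothetical helper: recursively collect the '$'-separated path of every
# leaf (entity). Intermediate list levels are the fields.
entityPaths <- function(x, prefix = NULL) {
  if (!is.list(x)) return(paste(prefix, collapse = "$"))
  unlist(lapply(names(x), function(nm) entityPaths(x[[nm]], c(prefix, nm))),
         use.names = FALSE)
}

paths <- entityPaths(toy)
print(paths)

# The value stored at that entity
toy$dataPrep$biological$duplicateRemoval$rule
```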

Another more complex example:

rmmSuggest('model')
rmmSuggest('model$algorithm$maxent')
rmmSuggest('model$algorithm$maxent$featureSet')

To make it easier to fill some rmm fields, we provide autofill functions that extract relevant information from common R objects used in a range modeling workflow.

rmm <- rmmTemplate()
rmm <- rmmAutofillPackageCitation(rmm,c('terra','sf'))
# search GBIF for occurrence data to demonstrate the autofill function
bv <- spocc::occ('Bradypus variegatus', 'gbif', limit=50, has_coords=TRUE)
rmm <- rmmAutofillspocc(rmm,bv$gbif)
# get some env layers to demonstrate the autofill function
# (uses example rasters shipped with the 'dismo' package; 'dismo' and 'terra' must be installed)
rasterFiles <- list.files(path=system.file('ex', package='dismo'),
                       pattern='grd', full.names=TRUE)
# read the rasters into a single multi-layer object
env <- terra::rast(rasterFiles)
rmm <- rmmAutofillEnvironment(rmm,env,transfer=0) # for fitting environment
# just using the same rasters for demonstration; in practice these are different
rmm <- rmmAutofillEnvironment(rmm,env,transfer=1) # for transfer environment 1
rmm <- rmmAutofillEnvironment(rmm,env,transfer=2) # for transfer environment 2 

To see what fields you might've missed...

empties <- rmmCheckEmpty(rmm)

Checking an rmm object

To check the field names in your object, use rmmCheckName.

# Make an empty template
rmm1 <- rmmTemplate() 
# Add a new, non-standard field
rmm1$dataPrep$biological$taxonomicHarmonization$taxonomy_source <- "The Plant List"
# Checking the names identifies the new, non-standard field we've added ("taxonomy_source")
rmm1 <- rmmCheckName(rmm1) 

This check identifies the entity $dataPrep$biological$taxonomicHarmonization$taxonomy_source as non-standard. It's non-standard because it's not an entity in the data dictionary; we just added it. That's not a violation or anything bad. The check simply lets you know it's there. This check can also be useful for detecting misspellings of standardized field names.

To check the field values in your rmm object, use the function rmmCheckValue.

# First, we create an empty rmm template
rmm1 <- rmmTemplate()
# We add 3 of the bioclim layers, including a spelling error (an extra space) in bio2, and a word that is clearly not a climate layer, 'cromulent'.
rmm1$data$environment$variableNames <- c("bio1", "bio 2", "bio3", "cromulent")
# Now, when we check the values, we see that bio1 and bio3 are reported as exact matches, while 'bio 2' is flagged as a partial match with a suggested value of 'bio2', and 'cromulent' is flagged as not matched at all.
rmmCheckValue(rmm = rmm1)
# If we'd like to return a data frame containing this information in a perhaps more useful format:
rmmCheckValueOutput <- rmmCheckValue(rmm = rmm1, returnData = TRUE)

These 'check' functions work by comparing the values or field names within an rmm object to those in a data dictionary. These functions are designed to check for non-standard values and names, and DO NOT necessarily identify correct vs. incorrect values/names. Non-standard values may be perfectly valid, or they may be erroneous, and the user will have to make this distinction.
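
To give a feel for what comparing values against a dictionary can look like, here is a rough base R sketch that sorts values into exact, partial, and unmatched groups using edit distance. This is only one plausible approach and is not claimed to be how rmmCheckValue works internally. The helper classifyValues, the maxDist threshold, and the three-entry vocabulary are all assumptions made for illustration.

```r
# Hypothetical helper (not part of rangeModelMetadata): classify each value
# as an exact match, a partial match, or unmatched against a vocabulary.
classifyValues <- function(values, vocabulary, maxDist = 1) {
  d <- utils::adist(values, vocabulary)       # edit-distance matrix
  best <- apply(d, 1, which.min)              # closest vocabulary entry per value
  bestDist <- d[cbind(seq_along(values), best)]
  status <- ifelse(bestDist == 0, "exact",
                   ifelse(bestDist <= maxDist, "partial", "not matched"))
  data.frame(value = values,
             status = status,
             suggestion = ifelse(status == "partial", vocabulary[best], NA_character_),
             stringsAsFactors = FALSE)
}

vocab <- c("bio1", "bio2", "bio3")   # toy vocabulary, for illustration only
res <- classifyValues(c("bio1", "bio 2", "bio3", "cromulent"), vocab)
print(res)
```

As with the package's own checks, a "not matched" result in this sketch only means the value is absent from the vocabulary, not that it is wrong.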

To run all the available checks at once, we'll check the object that we filled in a few chunks back.

rmmCheckFinalize(rmm, family='base')

Outputting an rmm object

To make rmm objects portable to other interfaces, they can be written to CSV format.

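# write an empty template to a CSV file on the Desktop
outFile <- '~/Desktop/demo_rmmToCSV.csv'
rmmObj <- rmmTemplate()
rmmToCSV(rmmObj, filename=outFile)
# open the file in Excel (macOS only; on other systems, open the file manually)
system(paste0('open ', outFile, ' -a "Microsoft Excel"'))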

Miscellaneous

It can be helpful to simply view the data dictionary:

dd <- rmmDataDictionary()
str(dd)
# rmmDataDictionary(excel=TRUE) # try this if you have excel



rangeModelMetadata documentation built on Oct. 17, 2023, 1:11 a.m.