find_trait: Find fungal records with specific trait data

View source: R/find_trait.R

find_trait    R Documentation

Description

Find trait-relevant records (i.e., records with a particular substrate, host, or habitat association) in a data set of fungal collections or observations (e.g., MyCoPortal or GBIF data sets).

Usage

find_trait(
  data,
  metadata_cols = c("habitat", "occurrenceRemarks", "associatedTaxa"),
  pos_string,
  neg_string = NULL,
  string_clean = TRUE
)

Arguments

data

Data frame containing columns of trait-relevant metadata (e.g., data from a Darwin Core archive file).

metadata_cols

Character vector containing names of columns with trait-relevant metadata. Default (c("habitat", "occurrenceRemarks", "associatedTaxa")) is based on fields in Darwin Core archive files that typically contain trait-relevant metadata.

pos_string

Character string containing a regular expression (the "positive" search string) used to find trait-relevant keywords or phrases in the specified metadata columns of the input data frame.

neg_string

Character string containing a regular expression (the "negative" search string) used to remove records that the "positive" search string falsely identified as trait-relevant. This argument is optional.

string_clean

Logical. If TRUE (the default), strings in metadata_cols are "cleaned" prior to trait searching. Cleaning converts strings to lowercase and removes punctuation (e.g., periods, commas, and question marks) and extra white space.
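
The cleaning described for string_clean can be approximated in base R. This is only a sketch: clean_text is a hypothetical helper, not a fungarium function, and whether the package deletes punctuation outright or handles it some other way is an assumption here.

```r
# Hypothetical helper (not part of fungarium): approximates the cleaning
# that string_clean = TRUE is described as performing.
clean_text <- function(x) {
  x <- tolower(x)                  # convert to lowercase
  x <- gsub("[[:punct:]]", "", x)  # delete punctuation (assumed behaviour)
  x <- gsub("\\s+", " ", x)        # collapse runs of white space
  trimws(x)                        # strip leading/trailing white space
}

cleaned <- clean_text("  On CHARRED wood,   near a burn-site?  ")
print(cleaned)
```

If cleaning works this way, hyphenated terms such as "burn-site" lose their hyphen, so search strings are best written to tolerate a missing separator (as the `.?` in the Examples section does).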

Value

Returns a data.frame of records with trait-relevant metadata.
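
The interplay of the positive and negative search strings can be illustrated with a small base R sketch. It assumes that a record is kept when the positive pattern matches in at least one metadata column and the negative pattern matches in none; the actual find_trait logic may differ. The toy data frame and the match_any helper are hypothetical.

```r
# Toy data; column names follow the default metadata_cols.
toy <- data.frame(
  id = 1:4,
  habitat = c("on charred wood", "unburned forest floor", "mossy log", NA),
  occurrenceRemarks = c(NA, "control plot", "near a trail", "fruiting in burnt soil"),
  associatedTaxa = c("Pinus", "Quercus", NA, "Pinus"),
  stringsAsFactors = FALSE
)

cols <- c("habitat", "occurrenceRemarks", "associatedTaxa")
pos <- "charred|burn(t|ed)|scorched"  # "positive" search string
neg <- "un.?burn(t|ed)"               # "negative" search string

# Hypothetical helper: TRUE where the pattern matches in at least one column.
# grepl() returns FALSE for NA, so missing values never count as matches.
match_any <- function(data, cols, pattern) {
  per_col <- lapply(cols, function(col) grepl(pattern, data[[col]], perl = TRUE))
  Reduce(`|`, per_col)
}

# Keep positive hits, then drop those also flagged by the negative pattern.
keep <- match_any(toy, cols, pos) & !match_any(toy, cols, neg)
result <- toy[keep, ]
print(result$id)
```

In this toy data, record 2 ("unburned forest floor") matches the positive pattern through "burned" and is then excluded by the negative pattern, which is the kind of false positive neg_string is meant to remove.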

Note

Fields containing trait-relevant metadata may vary by data set, so metadata_cols should be set to the columns that actually hold such metadata in the data being searched.
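
One way to choose metadata_cols is to check which candidate columns exist in a data set and how often they are filled in. The sketch below uses base R only; the toy data frame and the "substrate" column are hypothetical and stand in for whatever fields a given data set provides.

```r
# Toy data set with one non-default field ("substrate", hypothetical).
toy <- data.frame(
  habitat = c("on charred wood", "", NA, "mossy log"),
  occurrenceRemarks = c(NA, NA, "burnt soil", ""),
  substrate = c("wood", "soil", "soil", "wood"),
  stringsAsFactors = FALSE
)

# Candidate columns: the find_trait defaults plus the extra field.
candidates <- c("habitat", "occurrenceRemarks", "associatedTaxa", "substrate")

# Keep only the candidates that are present in this data set.
present <- intersect(candidates, names(toy))

# Fraction of records with a non-missing, non-empty value in each column.
filled <- vapply(present, function(col) {
  x <- toy[[col]]
  mean(!is.na(x) & nzchar(trimws(x)))
}, numeric(1))

print(filled)
```

Columns that are absent or almost always empty add little to the search, while well-populated fields outside the defaults may be worth passing to metadata_cols.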

Examples

library(fungarium)
data(agaricales_updated) #import sample data set

#Finds fire-associated records
string1 <- "(?i)charred|burn(t|ed)|scorched|fire.?(killed|damaged|scarred)|killed.by.fire"

#Removes records falsely identified as fire-associated
string2 <- "(?i)un.?burn(t|ed)"

#find trait-relevant records
trait_rec <- find_trait(agaricales_updated, pos_string=string1, neg_string=string2)

hjsimpso/fungarium documentation built on Aug. 23, 2023, 3:59 p.m.