dataQC.eventStructure: check a dataset for an event-structure


View source: R/DataQC_Utils.R

Description

Checks whether an eventID is present in a (QC'd) dataset and generates one if it is not. Does the same for parentEventID if a parentEventID.col is provided, and checks the hierarchical relationships between eventID and parentEventID.

Usage

dataQC.eventStructure(dataset, eventID.col = "eventID",
                      parentEventID.col = NA, project.col = NA, project = NA,
                      event.prefix = NA, complete.hierarchy = FALSE)

Arguments

dataset

data.frame. The dataset for which the event structure should be checked

eventID.col

character. The column where the names of the events are given. Default "eventID". If NA, an event.prefix must be provided so that unique eventIDs can be created.

parentEventID.col

character. The column where the names of the parentEvents are given. If NA, parentEvents are not considered. Default NA

project.col

character. The column where the names of the projects are given. Projects are high-level parentEvents that group a large number of events. If NA, projects are not considered. If there is only one project, consider using the project parameter instead. Default NA. This parameter overrides the project parameter. Only effective if complete.hierarchy is TRUE.

project

character. The project that groups all samples. This parameter is overridden by the project.col parameter if that is not NA. Only effective if complete.hierarchy is TRUE.

event.prefix

character. A prefix to feature in the eventIDs if these have to be created from scratch. This parameter overrides the eventID.col parameter.

complete.hierarchy

logical. If TRUE, the event-parentEvent hierarchy will be completed up to a root (the project). Any parentEvents that are not listed among the eventIDs will also be created. If parentEventID.col is NA, new parentEventIDs will be created. Default FALSE.
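As an illustration of what creating eventIDs from event.prefix could look like, here is a minimal base R sketch. The naming scheme (prefix, underscore, row index) and the toy dataset are assumptions made for this example; the format that dataQC.eventStructure actually produces is not documented here and may differ.

```r
# Hypothetical dataset without an eventID column.
dataset <- data.frame(sample_name = c("a", "b", "c"))

# Assumed naming scheme (not confirmed by the package): prefix, underscore, row index.
event.prefix <- "event"
dataset$eventID <- paste(event.prefix, seq_len(nrow(dataset)), sep = "_")

# Generated eventIDs must be unique to be usable as identifiers.
stopifnot(!anyDuplicated(dataset$eventID))
```

Whatever the exact format, the point of the prefix is that every row ends up with a distinct, recognisable identifier.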

Details

An event structure is a hierarchical grouping of (sub-)samples (events, that is, things that occur at a place and time) into higher-level parentEvents, such as expeditions or projects. This interlinked structure is used, for instance, in the DarwinCore Event standard. The algorithm looks for specific column names that may indicate the event structure (e.g. a column with sample names or projects). If desired, the user can complete the event structure by rooting it in an overarching project.
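The hierarchy check and the rooting step described above can be sketched in base R as follows. This is not the package's implementation; the events table, the project name, and the variable names are hypothetical and only show the idea of finding parentEvents that are not listed as events and then attaching everything to one root.

```r
# Hypothetical event table: two samples under one expedition,
# plus one sample whose parent is not listed as an event.
events <- data.frame(
  eventID       = c("expedition_1", "sample_1", "sample_2", "sample_3"),
  parentEventID = c(NA, "expedition_1", "expedition_1", "expedition_2"),
  stringsAsFactors = FALSE
)

# Parents that are referenced but never defined as events themselves.
parents <- unique(events$parentEventID[!is.na(events$parentEventID)])
missing_parents <- setdiff(parents, events$eventID)

# Complete the hierarchy: add the missing parents as events,
# then root every parentless event in a single project.
project <- "project_1"  # assumed project name
if (length(missing_parents) > 0) {
  events <- rbind(
    events,
    data.frame(eventID = missing_parents,
               parentEventID = NA_character_,
               stringsAsFactors = FALSE)
  )
}
events$parentEventID[is.na(events$parentEventID)] <- project
events <- rbind(
  events,
  data.frame(eventID = project, parentEventID = NA_character_,
             stringsAsFactors = FALSE)
)
```

After these steps every event except the project itself has a parent, so the table forms a single tree rooted in the project.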

Value

A data.frame with two columns: eventID and parentEventID.

Author(s)

Maxime Sweetlove CC-0 2020

See Also

Other quality control functions: dataQC.LatitudeLongitudeCheck(), dataQC.TaxonListFromData(), dataQC.TermsCheck(), dataQC.completeTaxaNamesFromRegistery(), dataQC.dateCheck(), dataQC.findNames(), dataQC.generate.footprintWKT(), dataQC.guess.env_package.from.data(), dataQC.taxaNames()

Examples

# Minimal metadata table with five samples, each already carrying an eventID
test_metadata <- data.frame(sample_name=paste("sample", 1:5, sep="_"),
                            eventID=paste("sample", 1:5, sep="_"),
                            row.names=paste("sample", 1:5, sep="_"))

# Check the existing eventIDs without considering parentEvents or projects
dataQC.eventStructure(dataset=test_metadata, eventID.col = "eventID",
                      parentEventID.col = NA, project.col = NA,
                      project = NA, event.prefix = NA,
                      complete.hierarchy=FALSE)

# Same dataset, but complete the hierarchy by rooting all events in one project
dataQC.eventStructure(dataset=test_metadata, eventID.col = "eventID",
                      parentEventID.col = NA, project.col = NA,
                      project = "project_1", event.prefix = NA,
                      complete.hierarchy=TRUE)

biodiversity-aq/OmicsMetaData documentation built on Dec. 19, 2021, 9:44 a.m.