dataQC.findNames: Find the sample names in a dataset

View source: R/DataQC_Utils.R

Description

Finds the sample names in a given (meta-)dataset for which at least an attempt has been made to standardize the data following MIxS or DarwinCore.

Usage

dataQC.findNames(dataset, ask.input=TRUE, sample.names=NA)

Arguments

dataset

data.frame. The data.frame in which to look for the sample names.

ask.input

logical. If TRUE, the user is prompted for console input when a problem occurs. Default: TRUE.

sample.names

character. The name of the column that holds the sample names. Use "row.names" to take the sample names from the row names. If NA, the function tries to find the sample names itself. Default: NA.

Details

It is often not clear where the sample names are in a dataset. This function makes an educated guess, based on the row names or on tags that are often used to indicate sample names. If ask.input is TRUE, the process is supervised by the user.
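The kind of educated guess described above can be illustrated with a minimal base-R sketch. This is not the code used by dataQC.findNames: the helper name guess_sample_names and the list of candidate tags are assumptions made for illustration only, and the real function may use different tags and a different order of checks.

```r
# Hypothetical helper for illustration; NOT the OmicsMetaData implementation.
# The candidate tags below are assumed examples of commonly used column names.
guess_sample_names <- function(dataset,
                               candidate_tags = c("sample_name", "samplename",
                                                  "sample_id", "sampleid",
                                                  "eventid", "occurrenceid")) {
  stopifnot(is.data.frame(dataset))

  # 1. Take the first column whose lower-cased name matches a candidate tag.
  hits <- which(tolower(colnames(dataset)) %in% candidate_tags)
  if (length(hits) > 0) {
    col <- colnames(dataset)[hits[1]]
    return(list(names = as.character(dataset[[col]]), column = col))
  }

  # 2. Fall back to the row names, unless they are just the automatic 1..n.
  rn <- rownames(dataset)
  if (!identical(rn, as.character(seq_len(nrow(dataset))))) {
    return(list(names = rn, column = "row.names"))
  }

  # 3. Nothing plausible found; a supervised version could ask the user here.
  list(names = NULL, column = NA_character_)
}

meta <- data.frame(sample_name = c("s1", "s2"), depth = c(5, 10))
guess_sample_names(meta)$column
```

Checking column tags before row names is a design choice of this sketch; row names are only trusted when they differ from the default integer sequence.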

Value

A list of length 3 with: "$Names", a named vector with the most likely sample names; "$Names.column", the name of the column where the sample names were found; and "$warningmessages", a vector with warning messages.
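A short sketch of how a result with this structure might be handled. The list below is built by hand as a stand-in for a real return value, so the example runs without the package; only the three element names come from the description above, while the contents (including what the vector names look like) are assumptions.

```r
# Hand-built stand-in for a return value of dataQC.findNames().
# Element names follow the documented structure; the values are made up.
res <- list(
  Names = c(sample_1 = "sample_1", sample_2 = "sample_2"),
  Names.column = "sample_name",
  warningmessages = character(0)
)

# Report any warnings that were collected along the way.
if (length(res$warningmessages) > 0) {
  message(paste(res$warningmessages, collapse = "\n"))
}

# Plain character vector of sample names, without the vector names.
sample_ids <- unname(res$Names)

# Column in which the sample names were found.
res$Names.column
```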

Author(s)

Maxime Sweetlove CC-0 2020

See Also

Other quality control functions: dataQC.LatitudeLongitudeCheck(), dataQC.TaxonListFromData(), dataQC.TermsCheck(), dataQC.completeTaxaNamesFromRegistery(), dataQC.dateCheck(), dataQC.eventStructure(), dataQC.generate.footprintWKT(), dataQC.guess.env_package.from.data(), dataQC.taxaNames()

Examples

## Not run: 
test_metadata <- data.frame(sample_name=paste("sample", 1:5, sep="_"),
                           collection_date=c("2020-09-23", 
                                             "2020", 
                                             "16 Jan. 2020", 
                                             "November 1998", 
                                             "12/01/1999"),
                           latitude=c(23, 45, -56.44, "47.5",
                                      "-88° 4\' 5\""),
                           longitude=c(24, -57, -107.55, "33.5", 
                                       "-130° 26\' 9\""),
                           row.names=paste("sample", 1:5, sep="_"))
dataQC.findNames(dataset=test_metadata, ask.input=TRUE, sample.names="sample_name")

## End(Not run)

biodiversity-aq/OmicsMetaData documentation built on Dec. 19, 2021, 9:44 a.m.