OmicsMetaData: A package with tools to format and standardize...

Description

This package provides five categories of tools: formatting functions, standardization functions, quality control functions, data archiving functions, and data download functions. In addition, it includes libraries with terms from the MIxS and DarwinCore standards, together with term variants, synonyms, and translations.

Classes

MIxS.metadata – data formatted in the MIxS standard
DwC.event – data formatted in DarwinCore (DwC) with event core
DwC.occurrence – data formatted in DwC with occurrence core

Methods for classes

write.MIxS – write MIxS.metadata to a CSV file
check.valid.metadata.DwC – validator function for the DwC.event and DwC.occurrence classes
check.valid.MIxS.metadata – validator function for the MIxS.metadata class

Libraries

TermsLib – central library with mapped terms of DwC, MIxS and miscellaneous missing terms
TermsSyn – library with synonyms for standard terms
TermsSyn_DwC – library with synonyms for DwC terms
TaxIDLib – non-exhaustive library with some common INSDC taxon IDs
ENA_allowed_terms – terms accepted by ENA-EMBL
ENA_checklistAccession – checklist accessions accepted by ENA-EMBL
ENA_geoloc – geographic location names accepted by ENA-EMBL
ENA_instrument – instrument names accepted by ENA-EMBL

general functions

term.definition – get the definition of a term
combine.data – combine data
combine.data.frame – combine different dataframes
commonTax.to.NCBI.TaxID – get the NCBI taxID of a taxon
coordinate.to.decimal – convert any coordinate to decimal coordinates
eMoF.to.wideTable – convert DwC eMoF (long format) to a wide table
wideTable.to.eMoF – convert wide table to a long table formatted as eMoF
find.dataset – find a column in a dataframe
get.boundingBox – make a bounding box from coordinates
multi.warnings – collect warning messages while a function runs
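
To make the coordinate conversion and bounding box entries more concrete, here is a minimal base R sketch of the underlying idea. It does not call coordinate.to.decimal or get.boundingBox, whose arguments are not documented on this page; the helper dms_to_decimal and the example coordinates are hypothetical.

```r
# Hypothetical helper (NOT the package's coordinate.to.decimal):
# converts degrees, minutes and seconds plus a hemisphere letter
# into signed decimal degrees.
dms_to_decimal <- function(deg, min = 0, sec = 0, hemisphere = "N") {
  value <- abs(deg) + min / 60 + sec / 3600
  # South and West are negative by convention
  ifelse(toupper(hemisphere) %in% c("S", "W"), -value, value)
}

# Two made-up sampling positions
lat <- dms_to_decimal(c(62, 64), c(13, 46), c(30, 0), c("S", "S"))
lon <- dms_to_decimal(c(58, 64), c(40, 3), c(0, 0), c("W", "W"))

# A simple bounding box: the minimum and maximum of each coordinate.
# This sketch ignores the case where the points straddle the antimeridian.
bbox <- c(west = min(lon), south = min(lat),
          east = max(lon), north = max(lat))
print(bbox)
```

The package functions may accept other input formats and handle edge cases that this sketch leaves out.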

Quality Control functions

dataQC.completeTaxaNamesFromRegistery – get taxonomic information from WoRMS
dataQC.dateCheck – standardize dates to ISO
dataQC.DwC – automated check converting into DwC
dataQC.DwC_general – automated check converting into DwC
dataQC.eventStructure – create a hierarchical event structure
dataQC.findNames – detect sample names in a file
dataQC.generate.footprintWKT – generate a WKT from coordinates
dataQC.guess.env_package.from.data – guess the MIxS environmental package
dataQC.LatitudeLongitudeCheck – automated check for coordinates
dataQC.MIxS – automated check converting into MIxS
dataQC.taxaNames – clean up taxonomic names
dataQC.TaxonListFromData – find taxonomic names for samples
dataQC.TermsCheck – map a set of strings to standardized accepted terms
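
As an illustration of what date standardization and WKT generation involve, the following base R sketch normalizes a few date notations to ISO 8601 and builds a WKT point. It is not the implementation of dataQC.dateCheck or dataQC.generate.footprintWKT; the helpers to_iso_date and point_wkt, and the list of accepted input formats, are assumptions made for this example.

```r
# Hypothetical helper (NOT dataQC.dateCheck): try a fixed list of
# input formats per value and return the date as YYYY-MM-DD,
# or NA when none of the formats match.
to_iso_date <- function(x, formats = c("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")) {
  vapply(x, function(value) {
    for (fmt in formats) {
      parsed <- as.Date(value, format = fmt)
      if (!is.na(parsed)) return(format(parsed, "%Y-%m-%d"))
    }
    NA_character_
  }, character(1), USE.NAMES = FALSE)
}

# Hypothetical helper (NOT dataQC.generate.footprintWKT):
# WKT lists x before y, so longitude comes first.
point_wkt <- function(lon, lat) {
  sprintf("POINT (%s %s)", lon, lat)
}

iso <- to_iso_date(c("2021-12-19", "19/12/2021", "20211219", "unknown"))
wkt <- point_wkt(-58.5, -62.25)
print(iso)
print(wkt)
```

Trying each format per value, rather than once for the whole vector, lets a column with mixed notations be cleaned in a single pass. Ambiguous notations such as 03/04/2021 are read as day/month/year here, which is an assumption of the sketch.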

upload-download omics data functions

download.sequences.INSDC – download sequences from INSDC to R
FileNames.to.Table – get file names from a folder
get.insertSize – get the sequence lengths from a file
get.BioProject.metadata.INSDC – get metadata from INSDC (no API account required)
get.ENAName – get the ENA-EMBL variant of a MIxS term
get.sample.attributes.INSDC – get full list of metadata from INSDC (API account required)
prep.metadata.ENA – format metadata compliant to ENA-EMBL requirements
rename.sequenceFiles – automated renaming of sequence files
sync.metadata.sequenceFiles – check if samples in R correspond to sequence files
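
The idea of listing sequence files and checking them against the samples held in R can be sketched with base R alone. This is not the code of FileNames.to.Table or sync.metadata.sequenceFiles; the sample names, the demo folder, and the .fastq naming scheme are all hypothetical.

```r
# Hypothetical sample names held in R
samples <- c("sampleA", "sampleB", "sampleC")

# Create a throwaway folder with two empty FASTQ files for the demonstration
seq_dir <- file.path(tempdir(), "omics_seq_demo")
dir.create(seq_dir, showWarnings = FALSE)
invisible(file.create(file.path(seq_dir, c("sampleA.fastq", "sampleB.fastq"))))

# Table of file names, with the sample name taken as the
# file name minus its extension (an assumption of this sketch)
files <- list.files(seq_dir, pattern = "\\.fastq$")
file_table <- data.frame(
  file = files,
  sample = tools::file_path_sans_ext(files),
  stringsAsFactors = FALSE
)

# Samples without a sequence file, and files without a sample
missing_files <- setdiff(samples, file_table$sample)
orphan_files <- setdiff(file_table$sample, samples)
print(missing_files)
print(orphan_files)
```

Running a check like this before archiving helps catch metadata rows that have no matching sequence file, and the reverse.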


biodiversity-aq/OmicsMetaData documentation built on Dec. 19, 2021, 9:44 a.m.