sanitize_vft: Fix common problems in _ViewFullTable_ and _ViewTaxonomy_...

sanitize_vft    R Documentation

Fix common problems in ViewFullTable and ViewTaxonomy data.

Description

These functions fix common problems in ViewFullTable and ViewTaxonomy data:

  • Ensure that each column has the correct type.

  • Ensure that missing values are represented with NAs – not with the literal string "NULL".
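The two fixes listed above can be illustrated with readr::type_convert(), the function that receives the extra arguments of sanitize_vft() and sanitize_taxa(). This is only a minimal sketch on a made-up data frame, not the actual implementation: the object names toy and fixed and the values are hypothetical, and the real functions may do more than this (for example, enforce a specific type for each column rather than guessing it).

```r
library(readr)

# Hypothetical toy data showing both problems: every column is
# character, and missing values are stored as the string "NULL".
toy <- data.frame(
  PlotName = c("NULL", "NULL"),
  DBH = c("10.5", "NULL"),
  stringsAsFactors = FALSE
)

# Re-guess the type of each character column, treating "", "NA" and
# "NULL" as missing values.
fixed <- type_convert(toy, na = c("", "NA", "NULL"))

str(fixed)
```

After the conversion, DBH is numeric and every "NULL" string has become NA.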

Usage

sanitize_vft(.data, na = c("", "NA", "NULL"), ...)

sanitize_taxa(.data, na = c("", "NA", "NULL"), ...)

Arguments

.data

A dataframe; either a ForestGEO ViewFullTable (sanitize_vft()) or ViewTaxonomy (sanitize_taxa()).

na

Character vector of strings to interpret as missing values. Set this option to character() to indicate no missing values.
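The effect of na = character() can be sketched directly with readr::type_convert(), assuming the na argument is forwarded to it unchanged. The data frame toy and the result kept are hypothetical names used only for illustration.

```r
library(readr)

# Hypothetical toy column that mixes a number with the string "NULL".
toy <- data.frame(DBH = c("10.5", "NULL"), stringsAsFactors = FALSE)

# With no strings declared as missing, "NULL" is kept as ordinary text,
# so the column cannot be converted to a numeric type.
kept <- type_convert(toy, na = character())

str(kept)
```

In this case the column stays character, because "NULL" is treated as data rather than as a missing value.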

...

Arguments passed to readr::type_convert().

Value

A dataframe.

Acknowledgments

Thanks to Shameema Jafferjee Esufali for motivating these functions.

See Also

read_vft().

Examples

assert_is_installed("fgeo.x")

vft <- fgeo.x::vft_4quad

# Introduce problems to show how to fix them
# Bad column types
vft[] <- lapply(vft, as.character)
# Bad representation of missing values
vft$PlotName <- "NULL"

# "NULL" should be replaced by `NA` and `DBH` should be numeric
str(vft[c("PlotName", "DBH")])

# Fix
vft_sane <- sanitize_vft(vft)
str(vft_sane[c("PlotName", "DBH")])

taxa <- read.csv(fgeo.x::example_path("taxa.csv"))
# E.g. inserting bad column types
taxa[] <- lapply(taxa, as.character)
# E.g. inserting bad representation of missing values
taxa$SubspeciesID <- "NULL"

# "NULL" should be replaced by `NA` and `ViewID` should be integer
str(taxa[c("SubspeciesID", "ViewID")])

# Fix
taxa_sane <- sanitize_taxa(taxa)
str(taxa_sane[c("SubspeciesID", "ViewID")])

forestgeo/fgeo.utils documentation built on Sept. 12, 2022, 6:12 p.m.