setna.sp: Set values to 'NA' for sweetpotato data.

View source: R/setna_sp.R

setna.sp R Documentation

Set values to NA for sweetpotato data.

Description

Detect impossible values in sweetpotato data and set them to missing (NA) according to the rules listed in Details.

Usage

setna.sp(dfr, f = 10)

Arguments

dfr

The name of the data frame.

f

Factor for extreme value detection. See Details.

Details

The data frame must use the labels (lower or upper case) listed in function check.names.sp. Consider the following groups of traits:

  • pre (traits evaluated pre-harvest): vir, vir1, vir2, alt, alt1, alt2, and vv.

  • wvn (non-pre-harvest traits evaluated with vines): vw, biom, biom.d, vw.d, fytha, fytha.aj, dmvy, dmvy.aj, bytha, bytha.aj, dmby, dmby.aj, vpp, vpsp, dmvf, dmvd, hi, shi, and dmv.

  • cnn (continuous non-negative traits): vw, crw, ncrw, trw, trw.d, biom, biom.d, cytha, cytha.aj, rytha, rytha.aj, dmry, dmry.aj, vw.d, fytha, fytha.aj, dmvy, dmvy.aj, bytha, bytha.aj, dmby, dmby.aj, nrpp, nrpsp, ncrpp, ncrpsp, ypp, ypsp, vpp, vpsp, rtyldpct, rfr, bc, tc, fe, zn, ca, and mg.

  • cpo (continuous positive traits): dmf, dmd, dmvf, dmvd, acrw, ancrw, and atrw.

  • pnn (percentage non-negative traits): ci, hi, shi, fruc, gluc, sucr, and malt.

  • ppo (percentage positive traits): dm, dmv, prot, and star.

  • dnn (discrete non-negative traits): nops, nope, noph, nopr, nocr, nonc, and tnr.

  • ctg (categorical 1 to 9 traits): vir, vir1, vir2, alt, alt1, alt2, vv, scol, fcol, fcol2, rs, rf, rtshp, damr, rspr, alcdam, wed, stspwv, milldam, fraw, suraw, straw, coof, coosu, coost, coot, and cooap.
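The groups above can be thought of as sets of column labels that are matched against the data frame regardless of case. The following is a minimal sketch of that idea, not the package's internal code: the object `groups` and the helper `traits_in` are hypothetical names, and only three of the groups are reproduced.

```r
# Hypothetical subset of the trait groups listed above
# (these are not the package's internal objects)
groups <- list(
  pnn = c("ci", "hi", "shi", "fruc", "gluc", "sucr", "malt"),
  ppo = c("dm", "dmv", "prot", "star"),
  dnn = c("nops", "nope", "noph", "nopr", "nocr", "nonc", "tnr")
)

# Return the columns of dfr that belong to a group, ignoring case
traits_in <- function(dfr, group) {
  names(dfr)[tolower(names(dfr)) %in% groups[[group]]]
}

dfr <- data.frame(TRW = c(2.2, 5), dm = c(21, 23), tnr = c(1, 10))
traits_in(dfr, "ppo")  # "dm"
traits_in(dfr, "dnn")  # "tnr"
traits_in(dfr, "pnn")  # character(0): no pnn traits in this data frame
```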

Values are set to NA with the following rules:

  • cnn traits with negative values are set to NA.

  • cpo traits with non-positive values are set to NA.

  • pnn traits with values out of the [0, 100] interval are set to NA.

  • ppo traits with values out of the (0, 100] interval are set to NA.

  • dnn traits with negative or non-integer values are set to NA.

  • ctg traits with values outside the 1 to 9 scale are set to NA.

  • Beta-carotene values determined with RHS color charts that do not match any of the possible values in the chart are set to NA.

  • Extreme low and high values are detected using the interquartile range (IQR). Any value outside the interval [Q1 - f × (m/3 + IQR), Q3 + f × (m/3 + IQR)] is flagged, where Q1 and Q3 are the first and third quartiles and m is the mean. By default f = 10; if f is set below 10, a warning is shown. Values outside this interval are set to NA.

  • If nope == 0 and there is some data for any trait, then nope is set to NA.

  • If noph == 0 and there is some data for any non-pre-harvest trait, then noph is set to NA.

  • If nopr == 0 and there is some data for any trait evaluated with roots, then nopr is set to NA.

  • If noph > 0 and nocr, nonc, crw, ncrw, and vw are all 0, then vw is set to NA.

  • If nopr > 0 and nocr, nonc, crw, and ncrw are all 0, then ncrw and nonc are both set to NA.

  • If nocr == 0 and crw > 0, then nocr is set to NA.

  • If nocr > 0 and crw == 0, then crw is set to NA.

  • If nonc == 0 and ncrw > 0, then nonc is set to NA.

  • If nonc > 0 and ncrw == 0, then ncrw is set to NA.
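To make two of the rules above concrete, here is a minimal sketch of the extreme value interval and of the nocr/crw consistency check. It is an illustration under stated assumptions, not the code used by setna.sp: the helpers `extreme_bounds`, `set_extremes_na`, and `check_nocr_crw` are hypothetical, the quartiles use R's default `quantile` method, and the order in which the package applies its rules is not assumed.

```r
# Bounds of the interval [Q1 - f * (m/3 + IQR), Q3 + f * (m/3 + IQR)]
# Assumption: quartiles computed with R's default quantile type
extreme_bounds <- function(x, f = 10) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  m <- mean(x, na.rm = TRUE)
  iqr <- q[2] - q[1]
  c(lower = q[1] - f * (m / 3 + iqr), upper = q[2] + f * (m / 3 + iqr))
}

# Set values outside the interval to NA
set_extremes_na <- function(x, f = 10) {
  b <- extreme_bounds(x, f)
  x[which(x < b["lower"] | x > b["upper"])] <- NA
  x
}

# If nocr == 0 and crw > 0, nocr becomes NA;
# if nocr > 0 and crw == 0, crw becomes NA
check_nocr_crw <- function(dfr) {
  bad_nocr <- which(dfr$nocr == 0 & dfr$crw > 0)
  bad_crw <- which(dfr$nocr > 0 & dfr$crw == 0)
  dfr$nocr[bad_nocr] <- NA
  dfr$crw[bad_crw] <- NA
  dfr
}

# Illustrative values: 1600 lies above the upper bound (about 1177.9)
x <- set_extremes_na(c(2.2, 5.0, 3.6, 12, 1600))
x  # 2.2 5.0 3.6 12.0 NA

d <- check_nocr_crw(data.frame(nocr = c(0, 3, 2), crw = c(1.5, 0, 2.4)))
d$nocr  # NA 3 2
d$crw   # 1.5 NA 2.4
```

Because the mean m enters the interval, a single very large value widens the bounds; with the default f = 10 only values far from the bulk of the data are flagged.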

Value

Returns the data frame with all impossible values set to NA, together with a list of warnings identifying the rows that were modified.

Author(s)

Raul Eyzaguirre.

Examples

dfr <- data.frame(trw = c(2.2, 5.0, 3.6, 12, 1600, -4),
                  dm = c(21, 23, 105, 24, -3, 30),
                  tnr = c(1.3, 10, 11, NA, 2, 5),
                  scol = c(1, 0, 15, 5, 4, 7),
                  fcol.cc = c(1, 15, 12, 24, 55, 20))
setna.sp(dfr)
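As a rough guide to what the range rules in Details imply for this data, the checks below apply four of them by hand to the same columns. This is a hand-written sketch, not the output of setna.sp: it covers only the cnn, ppo, dnn, and ctg range rules, it assumes the ctg scale consists of the integers 1 to 9, and it leaves out the extreme value rule and the RHS color chart rule.

```r
dfr <- data.frame(trw = c(2.2, 5.0, 3.6, 12, 1600, -4),
                  dm = c(21, 23, 105, 24, -3, 30),
                  tnr = c(1.3, 10, 11, NA, 2, 5),
                  scol = c(1, 0, 15, 5, 4, 7))

# cnn rule: negative values are impossible for trw
trw <- dfr$trw
trw[which(trw < 0)] <- NA

# ppo rule: dm must lie in (0, 100]
dm <- dfr$dm
dm[which(dm <= 0 | dm > 100)] <- NA

# dnn rule: tnr must be a non-negative integer
tnr <- dfr$tnr
tnr[which(tnr < 0 | tnr %% 1 != 0)] <- NA

# ctg rule: scol must be on the 1 to 9 scale (integers assumed)
scol <- dfr$scol
scol[which(!scol %in% 1:9)] <- NA

trw   # 2.2 5.0 3.6 12.0 1600.0 NA
dm    # 21 23 NA 24 NA 30
tnr   # NA 10 11 NA 2 5
scol  # 1 NA NA 5 4 7
```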

reyzaguirre/st4gi documentation built on April 30, 2024, 5:45 a.m.