setna.sp | R Documentation |
NA for sweetpotato data.
Detect impossible values for sweetpotato data and set them to missing values (NA) according to some rules.
setna.sp(dfr, f = 10)
dfr | The name of the data frame. |
f | Factor for extreme values detection. See details. |
The data frame must use the labels (lower or upper case) listed in function check.names.sp.
Consider the following groups of traits:
pre (traits evaluated pre-harvest): vir, vir1, vir2, alt, alt1, alt2, and vv.
wvn (traits evaluated with vines non-pre-harvest): vw, biom, biom.d, vw.d, fytha, fytha.aj, dmvy, dmvy.aj, bytha, bytha.aj, dmby, dmby.aj, vpp, vpsp, dmvf, dmvd, hi, shi, and dmv.
cnn (continuous non-negative traits): vw, crw, ncrw, trw, trw.d, biom, biom.d, cytha, cytha.aj, rytha, rytha.aj, dmry, dmry.aj, vw.d, fytha, fytha.aj, dmvy, dmvy.aj, bytha, bytha.aj, dmby, dmby.aj, nrpp, nrpsp, ncrpp, ncrpsp, ypp, ypsp, vpp, vpsp, rtyldpct, rfr, bc, tc, fe, zn, ca, and mg.
cpo (continuous positive traits): dmf, dmd, dmvf, dmvd, acrw, ancrw, and atrw.
pnn (percentage non-negative traits): ci, hi, shi, fruc, gluc, sucr, and malt.
ppo (percentage positive traits): dm, dmv, prot, and star.
dnn (discrete non-negative traits): nops, nope, noph, nopr, nocr, nonc, and tnr.
ctg (categorical 1 to 9 traits): vir, vir1, vir2, alt, alt1, alt2, vv, scol, fcol, fcol2, rs, rf, rtshp, damr, rspr, alcdam, wed, stspwv, milldam, fraw, suraw, straw, coof, coosu, coost, coot, and cooap.
Values are set to NA with the following rules:
cnn traits with negative values are set to NA.
cpo traits with non-positive values are set to NA.
pnn traits with values out of the [0, 100] interval are set to NA.
ppo traits with values out of the (0, 100] interval are set to NA.
dnn traits with negative or non-integer values are set to NA.
ctg traits with out of scale values are set to NA.
Beta carotene values determined by RHS color charts with values different from the possible values in the RHS color chart are set to NA.
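The range rules above can be illustrated with a minimal base R sketch. It is not the package's implementation: `set_na_if` is a hypothetical helper introduced only for this illustration, and only one trait per group is checked.

```r
# Hypothetical helper (not part of the documented API): for each listed trait
# that exists in the data frame, set values that fail a validity test to NA.
set_na_if <- function(dfr, traits, is_bad) {
  for (tr in intersect(traits, names(dfr))) {
    bad <- !is.na(dfr[[tr]]) & is_bad(dfr[[tr]])
    dfr[[tr]][bad] <- NA
  }
  dfr
}

dfr <- data.frame(trw  = c(2.2, 5.0, -4),   # cnn: continuous non-negative
                  dm   = c(21, 105, -3),    # ppo: percentage positive
                  tnr  = c(1.3, 10, 5),     # dnn: discrete non-negative
                  scol = c(1, 0, 15))       # ctg: categorical 1 to 9

out <- set_na_if(dfr, "trw",  function(x) x < 0)               # negative
out <- set_na_if(out, "dm",   function(x) x <= 0 | x > 100)    # outside (0, 100]
out <- set_na_if(out, "tnr",  function(x) x < 0 | x %% 1 != 0) # negative or non-integer
out <- set_na_if(out, "scol", function(x) !(x %in% 1:9))       # out of scale
print(out)
```

Passing each rule as a predicate keeps the group definitions and the validity tests separate, so the same helper could be reused for every group of traits.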
Extreme low and high values are detected using the interquartile range. The rule is to detect any value out of the interval
$[Q_1 - f \times (m/3 + IQR);\ Q_3 + f \times (m/3 + IQR)]$
where $m$ is the mean, $Q_1$ and $Q_3$ are the first and third quartiles, and $IQR$ is the interquartile range. By default f = 10; if f is less than 10, a warning is shown. Values out of this range are set to NA.
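As a rough illustration of the extreme value rule, the interval can be computed in base R as follows. This is a sketch under assumptions: `extreme_bounds` is a hypothetical name, and the quartiles use R's default `quantile()` method, which may differ from what the function does internally.

```r
# Hypothetical helper: lower and upper bounds of the interval
# [Q1 - f * (m/3 + IQR); Q3 + f * (m/3 + IQR)] for one trait.
extreme_bounds <- function(x, f = 10) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  m <- mean(x, na.rm = TRUE)
  width <- f * (m / 3 + (q[2] - q[1]))
  c(lower = q[1] - width, upper = q[2] + width)
}

trw <- c(2.2, 5.0, 3.6, 12, 1600)
b <- extreme_bounds(trw)

# Values outside the interval are set to NA.
out_of_range <- !is.na(trw) & (trw < b[["lower"]] | trw > b[["upper"]])
trw[out_of_range] <- NA
print(trw)
```

With these five values only 1600 falls outside the interval. Because the mean enters the interval width, the rule is deliberately permissive and flags only values far from the bulk of the data.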
If nope == 0 and there is some data for any trait, then nope is set to NA.
If noph == 0 and there is some data for any non-pre-harvest trait, then noph is set to NA.
If nopr == 0 and there is some data for any trait evaluated with roots, then nopr is set to NA.
If noph > 0 and nocr, nonc, crw, ncrw, and vw are all 0, then vw is set to NA.
If nopr > 0 and nocr, nonc, crw, and ncrw are all 0, then ncrw and nonc are both set to NA.
If nocr == 0 and crw > 0, then nocr is set to NA.
If nocr > 0 and crw == 0, then crw is set to NA.
If nonc == 0 and ncrw > 0, then nonc is set to NA.
If nonc > 0 and ncrw == 0, then ncrw is set to NA.
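The last four consistency rules (counts versus weights of commercial and non-commercial roots) can be sketched in base R as below. This is only an illustration with made-up data. Evaluating all conditions on the original values before modifying anything is an assumption; the documentation does not state the order in which the function applies its rules.

```r
dfr <- data.frame(nocr = c(0, 3, 4, 0),
                  crw  = c(1.2, 0, 2.5, 0),
                  nonc = c(2, 0, 0, 1),
                  ncrw = c(0, 0.4, 0, 0.3))

# which() drops comparisons that evaluate to NA, so rows with missing
# counts or weights are never selected.
r1 <- which(dfr$nocr == 0 & dfr$crw > 0)   # weight recorded but zero roots counted
r2 <- which(dfr$nocr > 0 & dfr$crw == 0)   # roots counted but zero weight
r3 <- which(dfr$nonc == 0 & dfr$ncrw > 0)  # same checks, non-commercial roots
r4 <- which(dfr$nonc > 0 & dfr$ncrw == 0)

out <- dfr
out$nocr[r1] <- NA
out$crw[r2]  <- NA
out$nonc[r3] <- NA
out$ncrw[r4] <- NA

# Rows that were modified, which could feed a warning message.
modified <- sort(unique(c(r1, r2, r3, r4)))
print(out)
print(modified)
```

In each pair the value set to NA is the zero, since a zero count with a positive weight (or the reverse) suggests the zero was recorded by mistake.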
It returns the data frame with all impossible values set to NA and a list of warnings with all the rows that have been modified.
Raul Eyzaguirre.
dfr <- data.frame(trw = c(2.2, 5.0, 3.6, 12, 1600, -4),
                  dm = c(21, 23, 105, 24, -3, 30),
                  tnr = c(1.3, 10, 11, NA, 2, 5),
                  scol = c(1, 0, 15, 5, 4, 7),
                  fcol.cc = c(1, 15, 12, 24, 55, 20))
setna.sp(dfr)