na.tools has two sets of functions: dots and underscores. The dot functions have
names containing . (na.replace, na.rm, na.). Like their inspiration from the
stats package, they operate on atomic vectors. The underscore functions have
names containing _ (na_replace, na_rm, na_). They operate on recursive structures
such as lists and table-like structures. They follow dplyr/tidyverse names and
behaviors.
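The split between the two families can be sketched in base R. This is only an illustration under assumptions: the names na.replace_sketch and na_replace_sketch are hypothetical stand-ins, not the package's functions, and passing ... on to a function-valued .na is one possible reading of these notes.

```r
# Hypothetical sketch, not the package's implementation.

# Dot-style: accepts atomic vectors only.
na.replace_sketch <- function(x, .na, ...) {
  stopifnot(is.atomic(x))
  # If .na is a function, call it on x (with ...) to get the replacement value.
  value <- if (is.function(.na)) .na(x, ...) else .na
  x[is.na(x)] <- value
  x
}

# Underscore-style: recurse into list-like objects, element by element.
na_replace_sketch <- function(x, .na, ...) {
  if (is.list(x)) {
    x[] <- lapply(x, na_replace_sketch, .na = .na, ...)  # x[] keeps the container class
    return(x)
  }
  na.replace_sketch(x, .na, ...)
}

v <- na.replace_sketch(c(1, NA, 3), mean, na.rm = TRUE)
l <- na_replace_sketch(list(a = c(1, NA), b = c(NA, 2)), 0)
```

Because a data frame is a list, the underscore sketch also handles table-like input column by column.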
na_replace_all( .tbl, .funs )
Variants: _all, _if, _at
... with atomics: used as additional arguments to .na
... with recursive objects: used as name=...
So:
.tbl %>% na_replace_all( iris, 3, ... )
na_replace( iris, mean, na.rm=TRUE )
na_replace_if( iris, is.cont, mean, na.rm=TRUE )
na_replace_at( iris, ! Species, mean, na.rm=TRUE )
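A predicate-driven variant along the lines of na_replace_if might look like the following base R sketch. The function name is hypothetical, and is.numeric stands in for is.cont (from catcont) so that the example has no dependencies.

```r
# Hypothetical sketch of an _if variant; not the package's implementation.
na_replace_if_sketch <- function(.tbl, .predicate, .na, ...) {
  for (nm in names(.tbl)) {
    col <- .tbl[[nm]]
    if (!.predicate(col)) next          # leave non-matching columns untouched
    # A function-valued .na is applied to the column, with ... passed along.
    value <- if (is.function(.na)) .na(col, ...) else .na
    col[is.na(col)] <- value
    .tbl[[nm]] <- col
  }
  .tbl
}

df <- data.frame(x = c(1, NA, 3), g = c("a", NA, "c"), stringsAsFactors = FALSE)
out <- na_replace_if_sketch(df, is.numeric, mean, na.rm = TRUE)
```

Only the numeric column x is filled here; the character column g keeps its NA.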
Store the replaced indices, as na.omit does.
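One way to record the replaced positions is an attribute on the result, similar in spirit to the na.action attribute that na.omit attaches. In this sketch the function name and the attribute name "na.replaced" are assumptions chosen for illustration.

```r
# Hypothetical sketch: remember which positions were NA before replacement.
na_replace_tracked <- function(x, value) {
  idx <- which(is.na(x))            # integer positions of the missing values
  x[idx] <- value
  attr(x, "na.replaced") <- idx     # attribute name is an assumption
  x
}

y <- na_replace_tracked(c(4, NA, 6, NA), 0)
attr(y, "na.replaced")
```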
Should the package be reorganized as follows:
na.* functions operate only on atomic vectors and closely follow the stats package, perhaps even re-exporting the stats package's functions if necessary.
na_* functions would be higher-order functions that operate on atomic and recursive objects.
If we are using prototype replacement with a vector ... how does that work?
It cannot; perhaps there should be a na_mreplace function that specifically applies
How do we perform mutate_if on types/classes? See the catcont package, or maybe
provide na_explicit_if or na_replace_if functions.
Implement slow functions with Rcpp
Check the class of .na in na_replace; warn(?)
Might there be a clever way to allow something like:
NA_explicit_ <- . %>% mean(., na.rm=TRUE)
Consider having an option for the value of the na_level, e.g. options( na_explicit = "(Missing)" ) or options( na_explicit = mean )
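An option-driven default could be sketched as below. The function name is hypothetical, and applying a function-valued option with na.rm = TRUE is an assumption made for this example.

```r
# Hypothetical sketch: the default replacement comes from an option.
na_explicit_sketch <- function(x, .na = getOption("na_explicit", "(Missing)")) {
  # The option may hold a value or a function that computes one from x.
  value <- if (is.function(.na)) .na(x, na.rm = TRUE) else .na
  x[is.na(x)] <- value
  x
}

old <- options(na_explicit = "(Missing)")
a <- na_explicit_sketch(c("a", NA))

options(na_explicit = mean)
b <- na_explicit_sketch(c(1, NA, 3))

options(old)  # restore the previous setting
```

The default argument is evaluated when the function is called, so changing the option between calls takes effect immediately.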
The explicit value might depend on the class, the type (cat vs. cont), or on a
specific attribute, na_explicit or na, of the specific variable.
Test whether NA_explicit_ works when na.actions is imported.
Alias as na_mutate, na_transmute or mutate_na, transmute_na; cf. dplyr::mutate_if.
Consider how explicit NA will be treated in sorting.
Do we want exceptional values first or last?
What is the explicit value
na_ex
Functions such as na_replace, na_explicit, na_max, etc. should also
set the na attribute.
Change the values argument to na or .NA.
It cannot/should not be named NA, as this is a reserved word. .NA is kind of nice
because it is most like na and is very apparent. na is also good. .na changes
too many things.
.Inf can also exist, but .0 cannot, since this is a numeric.
.NA, .Inf, .NegInf, .zero or na, inf, neginf, zero?
Could we do this ....
tbl %>% na_replace( col1=max, col2=min, col3=0, ... )
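The per-column form suggested above could work roughly as in this base R sketch. The function name is hypothetical, and calling function-valued specs with na.rm = TRUE is an assumption made so that max and min return usable values.

```r
# Hypothetical sketch: named arguments select columns and give their replacements.
na_replace_cols_sketch <- function(.tbl, ...) {
  specs <- list(...)
  stopifnot(all(names(specs) %in% names(.tbl)))   # every name must be a column
  for (nm in names(specs)) {
    col  <- .tbl[[nm]]
    spec <- specs[[nm]]
    # A function spec is evaluated on the column; anything else is used as-is.
    value <- if (is.function(spec)) spec(col, na.rm = TRUE) else spec
    col[is.na(col)] <- value
    .tbl[[nm]] <- col
  }
  .tbl
}

tbl <- data.frame(col1 = c(1, NA, 5), col2 = c(NA, 2, 7), col3 = c(NA, NA, 1))
res <- na_replace_cols_sketch(tbl, col1 = max, col2 = min, col3 = 0)
```

Because the column names arrive through ..., any other named argument of the function could collide with a column of the same name.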
na_replace( x, .na=.values, ..., .values )
na_explicit( x, .na=.values, ..., .values=getOption('na_explicit.factor') )
na_replace( x, 0 )
na_replace( tbl, col= )
I can see it ... so na, inf, neginf or zero might conflict with column names. Best to prefix them: .na, .Inf, etc.
Now does it make sense to
Will there ever be a case where there might be additional arguments passed or these might get mistaken for column names?
This makes it non-parallel with inf_replace, zero_replace, etc., but this
might be alright, as it would better allow wrapper functions to use
multiple exception handlers.
A construction like na_replace <- function( x, .na=values, values=getOption(...) ) will allow the use of values with a default, or of .na.
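This construction relies on R evaluating default arguments lazily, so one default can refer to another argument. A minimal sketch, in which the function name and the option name "na_replace_sketch.default" are made up for illustration:

```r
# Hypothetical sketch: .na defaults to values, which defaults to an option.
na_replace_alias_sketch <- function(x,
                                    .na = values,
                                    values = getOption("na_replace_sketch.default", 0)) {
  x[is.na(x)] <- .na   # forcing .na evaluates values only if .na was not supplied
  x
}

d1 <- na_replace_alias_sketch(c(1, NA))              # falls back to the option default
d2 <- na_replace_alias_sketch(c(1, NA), .na = 9)     # new-style argument
d3 <- na_replace_alias_sketch(c(1, NA), values = 5)  # old-style argument still works
```

If both .na and values are supplied, .na wins in this sketch, since values is never consulted.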
Use the catcont package.
na_replace and na_explicit are getting very similar and should probably be
made aliases.
Devise a syntax for list-like/recursive objects,
applying to each element object. See na_explicit and na_implicit.
[NO] Rename package nada ("nothing"); or, na.tools?
Handle R/coredata?
[ ] Make na_replace accept a function or expression as well. If an expression, an environment should also be provided.
x %>% na_replace( function(x) ifexists(x,y,z) )
Which is faster, ifelse or replace?
ifelse( is.na(x), y, x )
replace( x, is.na(x), y[is.na(x)] )
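The question can be checked empirically. The sketch below first confirms that the two forms agree and then times each with system.time; the data size and repetition count are arbitrary choices, and the timings will vary by machine.

```r
# Sketch: compare the two replacement idioms on the same data.
set.seed(1)
x <- rnorm(1e5)
x[sample(length(x), 1e4)] <- NA   # introduce missing values
y <- rep(0, length(x))            # replacement values, same length as x

a <- ifelse(is.na(x), y, x)
b <- replace(x, is.na(x), y[is.na(x)])
same <- identical(a, b)           # both idioms should give the same vector

# Repeat each idiom to get a measurable elapsed time.
t_ifelse  <- system.time(for (i in 1:20) ifelse(is.na(x), y, x))
t_replace <- system.time(for (i in 1:20) replace(x, is.na(x), y[is.na(x)]))
print(rbind(ifelse = t_ifelse, replace = t_replace))
```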