(Multiply) complete dataset based on marginal properties of each column
Description
(Multiply) complete dataset based on marginal properties of each column
Usage
rCatsAndCntInDfr(dfr, maxFullNACatCols = 6, howManyIfTooMany = 1000, weightsName = "weights", orgriName = "orgri", reweightPerRow = FALSE, verbosity = 0, ...)
rCatsInDfr(dfr, maxFullNACatCols=6, howManyIfTooMany=1000, onlyCategorical=FALSE, weightsName="weights", orgriName="orgri", reweightPerRow=FALSE, verbosity=0,...)

Arguments
dfr 

maxFullNACatCols, howManyIfTooMany 
If a row from 
onlyCategorical 
if 
weightsName 
if not 
orgriName 
if not 
reweightPerRow 
If weights are returned, then for rows having more than 
verbosity 
The higher this value, the more levels of progress and debug information are displayed (note: in R for Windows, turn off buffered output) 
... 
Ignored for now 
Details
The 'random subset' is created by drawing the missing categorical values based
on their marginal probability in dfr.
Missing continuous values are simply filled in with the mean.
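The idea described above can be illustrated with a minimal base R sketch. It is not the package's implementation: fillMarginal is a hypothetical helper introduced only for illustration. It produces a single completion, ignores the weights and orgri columns, and assumes that every column with missing values has at least one observed value.

```r
# Hypothetical helper (not part of the package): complete a data.frame once,
# using only the marginal information of each column.
fillMarginal <- function(dfr) {
  for (nm in names(dfr)) {
    col <- dfr[[nm]]
    miss <- is.na(col)
    if (!any(miss)) next
    if (is.factor(col)) {
      # Marginal probabilities estimated from the observed values (NAs excluded)
      probs <- prop.table(table(col))
      # Draw one level per missing cell according to those probabilities
      col[miss] <- sample(names(probs), size = sum(miss), replace = TRUE,
                          prob = as.numeric(probs))
    } else if (is.numeric(col)) {
      # Continuous columns: fill in with the mean of the observed values
      col[miss] <- mean(col, na.rm = TRUE)
    }
    dfr[[nm]] <- col
  }
  dfr
}

set.seed(1)
d <- data.frame(g = factor(c("a", "b", NA, "a")), x = c(1, NA, 3, 5))
completed <- fillMarginal(d)
```

In this toy data the missing value of x becomes 3 (the mean of 1, 3 and 5), and the missing value of g is drawn as "a" with probability 2/3 or "b" with probability 1/3.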
Value
Object of the same class as dfr. Depending on onlyCategorical, it
may contain only the categorical columns. Apart from that, it has largely the same
structure as dfr, though it may contain two extra columns based on
weightsName and orgriName.
Author(s)
Nick Sabbe (nick.sabbe@ugent.be)
See Also
GLoMopackage, NumDfr
Examples
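# introduce missing values into iris
iris.md <- randomNA(iris, 0.1)
# convert to a numdfr object
iris.md.nd <- numdfr(iris.md)
# complete the data; orgriName=NULL omits the orgri column
iris.nd.rnd <- rCatsAndCntInDfr(iris.md.nd, orgriName=NULL, verbosity=1)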
