preproviz: Tools for Visualization of Interdependent Data Quality Issues

Data quality issues such as missing values and outliers are often interdependent, which makes preprocessing time-consuming and can lead to suboptimal performance in knowledge discovery tasks. This package supports preprocessing decisions by visualizing interdependent data quality issues by means of feature construction. Users can define their own domain-specific constructed features that express the quality of a data point, such as the number of missing values in the point, or use the nine default features. The results can be explored with the plot methods, and the feature-constructed data can be retrieved with the get methods.
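To make the idea of a constructed feature concrete, here is a minimal sketch in base R. It does not call the preproviz API and is not the package's own implementation; the data frame `toy` and the names `missing_count` and `missing_share` are assumptions made up for illustration. It only shows how a per-point quality measure, such as the number of missing values in a data point, can be derived from the raw data.

```r
# Toy data with a few missing values (hypothetical example data).
toy <- data.frame(
  a = c(1.0, NA, 3.0, 4.0),
  b = c(NA, NA, 0.5, 0.7),
  c = c(10, 20, NA, 40)
)

# Constructed feature: number of missing values in each data point (row).
missing_count <- rowSums(is.na(toy))

# Constructed feature: share of missing values in each data point.
missing_share <- rowMeans(is.na(toy))

# One row per data point, one column per constructed feature.
constructed <- data.frame(missing_count, missing_share)
print(constructed)
```

Each row of `constructed` describes the quality of one data point rather than its measured values, which is the kind of table that can then be normalized, plotted, or compared across features.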

Install the latest version of this package by entering the following in R:

    install.packages("preproviz")

Author: Markus Vattulainen [aut, cre]
Date of publication: 2016-07-09 10:10:07
Maintainer: Markus Vattulainen


Man pages

AnalysisClass-class: An S4 class representing analysis data

BaseClass-class: An abstract S4 class representing constructed features

computeValue: generic function for computing constructed feature vectors

constructfeature: constructor function for adding constructed features to the...

ControlClass-class: An S4 class representing setups to be executed

DataClass-class: An S4 class representing data objects

defaultParameters: defaultParameters

getbasedata: getbasedata

getclasslabels: getclasslabels

getcmdsdata: get classical multidimensional scaling from minmaxconstructed...

getcombineddata: get basedata and constructed data combined

getconstructeddata: getconstructeddata

getlofscores: getlofscores

getlofsumdata: getlofsumdata

getlongformatconstructeddata: get constructed data in long format

getlongformatminmaxconstructeddata: getlongformatminmaxconstructeddata

getminmaxconstructeddata: get constructed data that have been min-max normalized

getname: get name of an object

getnumericbasedata: getnumericbasedata

getnumericombineddata: get numeric columns of combined data

getparameters: getparameters

getvariableimportancedata: get random forest variable importance data

initializecontrolclassobject: constructor function for initializing a ControlClass object

initializedataobject: constructor function for initializing a DataClass object

initializeparameterclassobject: constructor function for initializing a ParameterClass object

initializesetupclassobject: constructor function for initializing a SetUpClass object

ParameterClass-class: An S4 class representing selected constructed features

plotCMDS: generic function for plotting classical multidimensional...

plotDENSITY: generic function for plotting density estimates of...

plotHEATMAP: generic function for plotting heatmap

plotLOFSUM: generic function for plotting LOF sum of constructed features

plotOUTLIERS: generic function for plotting density of LOF scores

plotVARCLUST: generic function for plotting variable clusters

plotVARIMP: generic function for plotting variable importance

preproviz: the MAIN execution function

ReportClass-class: An S4 class representing visualizations

RunClass-class: An S4 class representing preproviz output (data and...

SetUpClass-class: An S4 class representing setups
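
Several of the entries above refer to min-max normalized constructed data and to classical multidimensional scaling (CMDS). The following base R sketch illustrates those two steps on a small, made-up table of constructed features. It is an illustration under stated assumptions, not the package's internal code: the data frame `constructed`, its column names, and the helper `min_max` are hypothetical, and `cmdscale` and `dist` come from the standard `stats` package.

```r
# Hypothetical constructed-feature table: one row per data point.
constructed <- data.frame(
  missing_count = c(1, 2, 1, 0),
  distance_to_mean = c(0.4, 2.5, 0.9, 0.3)
)

# Min-max normalization: rescale a numeric vector to the range [0, 1].
min_max <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) return(rep(0, length(x)))  # constant column
  (x - rng[1]) / (rng[2] - rng[1])
}
normalized <- as.data.frame(lapply(constructed, min_max))

# Classical multidimensional scaling on Euclidean distances between points,
# projecting the data points onto two coordinates for plotting.
coords <- cmdscale(dist(normalized), k = 2)
print(coords)
```

Normalizing first keeps a feature with a large numeric range from dominating the distances that the scaling step is based on.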


AnalysisClass-class Man page
BaseClass-class Man page
computeValue Man page
constructfeature Man page
ControlClass-class Man page
DataClass-class Man page
defaultParameters Man page
getbasedata Man page
getbasedata,AnalysisClass-method Man page
getbasedata,RunClass-method Man page
getclasslabels Man page
getclasslabels,AnalysisClass-method Man page
getclasslabels,RunClass-method Man page
getcmdsdata Man page
getcmdsdata,AnalysisClass-method Man page
getcombineddata Man page
getcombineddata,AnalysisClass-method Man page
getcombineddata,RunClass-method Man page
getconstructeddata Man page
getconstructeddata,AnalysisClass-method Man page
getconstructeddata,RunClass-method Man page
getlofscores Man page
getlofscores,AnalysisClass-method Man page
getlofsumdata Man page
getlofsumdata,AnalysisClass-method Man page
getlongformatconstructeddata Man page
getlongformatconstructeddata,AnalysisClass-method Man page
getlongformatminmaxconstructeddata Man page
getlongformatminmaxconstructeddata,AnalysisClass-method Man page
getminmaxconstructeddata Man page
getminmaxconstructeddata,AnalysisClass-method Man page
getminmaxconstructeddata,RunClass-method Man page
getname Man page
getname,AnalysisClass-method Man page
getname,BaseClass-method Man page
getnumericbasedata Man page
getnumericbasedata,AnalysisClass-method Man page
getnumericombineddata Man page
getnumericombineddata,AnalysisClass-method Man page
getnumericombineddata,RunClass-method Man page
getparameters Man page
getparameters,ParameterClass-method Man page
getparameters,SetUpClass-method Man page
getvariableimportancedata Man page
getvariableimportancedata,AnalysisClass-method Man page
initializecontrolclassobject Man page
initializedataobject Man page
initializeparameterclassobject Man page
initializesetupclassobject Man page
ParameterClass-class Man page
plotCMDS Man page
plotCMDS,ReportClass-method Man page
plotCMDS,RunClass-method Man page
plotDENSITY Man page
plotDENSITY,ReportClass-method Man page
plotDENSITY,RunClass-method Man page
plotHEATMAP Man page
plotHEATMAP,ReportClass-method Man page
plotHEATMAP,RunClass-method Man page
plotLOFSUM Man page
plotLOFSUM,ReportClass-method Man page
plotLOFSUM,RunClass-method Man page
plotOUTLIERS Man page
plotOUTLIERS,ReportClass-method Man page
plotOUTLIERS,RunClass-method Man page
plotVARCLUST Man page
plotVARCLUST,ReportClass-method Man page
plotVARCLUST,RunClass-method Man page
plotVARIMP Man page
plotVARIMP,ReportClass-method Man page
plotVARIMP,RunClass-method Man page
preproviz Man page
ReportClass-class Man page
RunClass-class Man page
SetUpClass-class Man page
