dprep: Data Pre-Processing and Visualization Functions for Classification
Version 3.0.2

Data preprocessing techniques for classification. Functions for normalization, handling of missing values, discretization, outlier detection, feature selection, and data visualization are included.

Author: Edgar Acuna and the CASTLE research group at The University of Puerto Rico-Mayaguez
Date of publication: 2015-11-24 07:46:38
Maintainer: Edgar Acuna <edgar.acuna@upr.edu>
License: GPL
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:
install.packages("dprep")
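
A script that depends on dprep may want to check that the package is present before attaching it, since availability from a given repository can change. The following is a minimal sketch using base R only; the variable names `pkg` and `has_pkg` are illustrative and do not come from the package documentation.

```r
# Attach dprep only if it is already installed; otherwise print a hint.
pkg <- "dprep"
has_pkg <- requireNamespace(pkg, quietly = TRUE)  # TRUE if the package can be loaded

if (has_pkg) {
  library(pkg, character.only = TRUE)
} else {
  message("Package '", pkg, "' is not installed; try install.packages(\"", pkg, "\")")
}
```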

Man pages

acugow: Gower distance from a vector to a matrix
arboleje: Predicting a bank's decision to give a loan for buying a car.
arboleje1: Predicting a bank's decision to give a loan for buying a car.
autompg: The Auto MPG dataset
baysout: Outlier detection using Bay and Schwabacher's algorithm.
breastw: The Breast Wisconsin dataset
bupa: The Bupa dataset
ce.impute: Imputation in supervised classification
ce.mimp: Mean or median imputation
census: census
chiMerge: Discretization using the Chi-Merge method
circledraw: circledraw
clean: Dataset's cleaning
colon: Alon et al.'s colon dataset
combinations: Constructing distinct permutations
crossval: Cross validation estimation of the misclassification error
crx: crx
cv10knn2: Auxiliary function for sequential feature selection
cv10lda2: Auxiliary function for sequential forward selection
cv10log: 10-fold cross validation estimation error for the classifier...
cv10mlp: 10-fold cross validation error estimation for the multilayer...
cv10rpart2: Auxiliary function for sequential feature selection
cvnaiveBayesd: Crossvalidation estimation error for the naive Bayes...
decscale: Decimal Scaling
diabetes: The Pima Indian Diabetes dataset
disc.1r: Discretization using Holte's 1R method
disc2: Auxiliary function for performing discretization using equal...
disc.ef: Discretization using the method of equal frequencies
disc.ew: Discretization using the equal width method
disc.mentr: Discretization using the minimum entropy criterion
discretevar: Performs Minimum Entropy discretization for a given attribute
distancia: Vector-Vector Euclidean Distance Function
distancia1: Vector-Vector Manhattan Distance Function
dist.to.knn: Auxiliary function for the LOF algorithm.
dprep-package: Data Preprocessing for supervised classification
ec.knnimp: Imputation using k-nearest neighbors.
eje1dis: Basic example for discriminant analysis
finco: FINCO Feature Selection Algorithm
heartc: The Heart Cleveland dataset
hepatitis: The hepatitis dataset
imagmiss: Visualization of Missing Data
inconsist: Computing the inconsistency measure
ionosphere: The Ionosphere dataset
knneigh.vect: Auxiliary function for computing the LOF measure.
knngow: K-nn classification using Gower distance
landsat: The landsat Satellite dataset
lofactor: Local Outlier Factor
lvf: Las Vegas Filter
mahaout: Multivariate outlier detection through the boxplot of the...
mardia: Mardia's test of normality
maxlof: Detection of multivariate outliers using the LOF algorithm
midpoints1: Auxiliary function for computing minimum entropy...
mmnorm: Min-max normalization
mo3: The third moment of a multivariate distribution
mo4: The fourth moment of a multivariate distribution
moda: Calculating the Mode
near1: Auxiliary function for the reliefcont function
near3: Auxiliary function for the reliefcat function
nnmiss: Auxiliary function for knn imputation
outbox: Detecting outliers through boxplots of the features.
parallelplot: Parallel Coordinate Plot
radviz2d: Radial Coordinate Visualization
rangenorm: Range normalization
reachability: Function for computing the reachability measure in the LOF...
redundancy: Finding the unique observations in a dataset along with their...
relief: RELIEF Feature Selection
reliefcat: Feature selection by the Relief Algorithm for datasets...
reliefcont: Feature selection by the Relief Algorithm for datasets with...
robout: Outlier Detection with Robust Mahalanobis distance
row.matches: Finding rows in a matrix equal to a given vector
sbs1: One-step sequential backward selection
score: Score function used in Bay's algorithm for outlier detection
sffs: Sequential Floating Forward Method
sfs: Sequential Forward Selection
sfs1: One-step sequential forward selection
Shuttle: The Shuttle dataset
signorm: Sigmoidal Normalization
softmaxnorm: Softmax Normalization
sonar: The Sonar dataset
srbct: Khan et al.'s small round blood cells dataset
star3d: Data Visualization using star coordinates in three dimensions
starcoord: The star coordinates plot
surveyplot: Surveyplot
tchisq: Auxiliary function for the Chi-Merge discretization
top: Auxiliary function for Bay's Outlier Detection Algorithm
unor: Auxiliary function for performing Holte's 1R discretization
vehicle: The Vehicle dataset
vvalen: The Van Valen test for equal covariance matrices
vvalen1: Auxiliary function for computing the Van Valen's...
znorm: Z-score normalization
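
Several of the man pages above name standard preprocessing techniques (min-max normalization, z-score normalization, decimal scaling, equal-width discretization). As a rough illustration of what those techniques compute, here is a base-R sketch. The helper names ending in `_sketch` are hypothetical, and the code is not taken from dprep; the package's own functions (`mmnorm`, `znorm`, `decscale`, `disc.ew`) may use different arguments, defaults, and return types, so consult their man pages before relying on them.

```r
# Illustrative base-R helpers for a single numeric vector.
# Hypothetical names; these are NOT the dprep implementations.

# Min-max normalization: rescale values linearly to the interval [0, 1].
minmax_sketch <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) return(x - rng[1])  # constant vector: avoid dividing by zero
  (x - rng[1]) / (rng[2] - rng[1])
}

# Z-score normalization: subtract the mean, divide by the sample standard deviation.
zscore_sketch <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Decimal scaling: divide by the smallest power of 10 that makes max(|x|) < 1.
decimal_scale_sketch <- function(x) {
  m <- max(abs(x), na.rm = TRUE)
  if (m == 0) return(x)          # all zeros: nothing to scale
  j <- floor(log10(m)) + 1       # number of digits before the decimal point
  x / 10^j
}

# Equal-width discretization: split the range of x into k intervals of equal
# length and return the interval index (1..k) of each value.
equal_width_sketch <- function(x, k) {
  cut(x, breaks = k, labels = FALSE)
}

# Example usage on small vectors.
minmax_sketch(c(2, 4, 6))
zscore_sketch(c(1, 2, 3))
decimal_scale_sketch(c(-986, 120, 5))
equal_width_sketch(1:6, 3)
```

Each helper works on one column at a time; applying it to the feature columns of a data frame (for example with `lapply`) while leaving the class column untouched is one possible design, stated here as an assumption and not as the package's convention.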

Files

src
src/Discrete.cpp
NAMESPACE
data
data/Shuttle.rda
data/eje1dis.rda
data/crx.rda
data/hepatitis.rda
data/bupa.rda
data/ionosphere.rda
data/sonar.rda
data/vehicle.rda
data/srbct.rda
data/heartc.rda
data/arboleje.rda
data/datalist
data/landsat.rda
data/breastw.rda
data/diabetes.rda
data/autompg.rda
data/arboleje1.rda
data/census.rda
data/colon.rda
R
R/crossval.R
R/radviz2d.R
R/maxlof.R
R/knneigh.vect.R
R/inconsist.R
R/reachability.R
R/decscale.R
R/redundancy.R
R/parallelplot.R
R/chiMerge.R
R/disc2.R
R/softmaxnorm.R
R/mardia.R
R/clean.R
R/mmnorm.R
R/cv10rpart2.R
R/dist.to.knn.R
R/nnmiss.R
R/sfs1.R
R/ec.knnimp.R
R/tchisq.R
R/surveyplot.R
R/score.R
R/finco.R
R/reliefcont.R
R/sbs1.R
R/cv10mlp.R
R/distancia1.R
R/star3d.R
R/ce.mimp.R
R/disc.ef.R
R/reliefcat.R
R/sffs.R
R/combinations.R
R/mo4.R
R/vvalen.R
R/rangenorm.R
R/disc.ew.R
R/starcoord.R
R/lvf.R
R/baysout.R
R/unor.R
R/signorm.R
R/near3.R
R/knngow.R
R/top.R
R/cv10log.R
R/znorm.R
R/near1.R
R/mahaout.R
R/ce.impute.R
R/sfs.R
R/cvnaiveBayesd.R
R/outbox.R
R/acugow.R
R/moda.R
R/relief.R
R/mo3.R
R/disc.mentr.R
R/circledraw.R
R/disc.1r.R
R/imagmiss.R
R/robout.R
R/vvalen1.R
R/row.matches.R
R/midpoints1.R
R/distancia.R
R/lofactor.R
R/cv10knn2.R
R/cv10lda2.R
R/discretevar.R
MD5
DESCRIPTION
man
man/ec.knnimp.Rd
man/srbct.Rd
man/knngow.Rd
man/reliefcat.Rd
man/robout.Rd
man/colon.Rd
man/chiMerge.Rd
man/maxlof.Rd
man/outbox.Rd
man/baysout.Rd
man/softmaxnorm.Rd
man/score.Rd
man/sbs1.Rd
man/row.matches.Rd
man/cv10lda2.Rd
man/circledraw.Rd
man/clean.Rd
man/star3d.Rd
man/hepatitis.Rd
man/distancia.Rd
man/ionosphere.Rd
man/disc.1r.Rd
man/reachability.Rd
man/crx.Rd
man/cv10knn2.Rd
man/mahaout.Rd
man/mmnorm.Rd
man/sfs.Rd
man/lofactor.Rd
man/combinations.Rd
man/cv10mlp.Rd
man/distancia1.Rd
man/disc.ew.Rd
man/near1.Rd
man/ce.impute.Rd
man/mo3.Rd
man/unor.Rd
man/top.Rd
man/disc.mentr.Rd
man/discretevar.Rd
man/redundancy.Rd
man/starcoord.Rd
man/autompg.Rd
man/vvalen.Rd
man/reliefcont.Rd
man/cv10log.Rd
man/vehicle.Rd
man/znorm.Rd
man/tchisq.Rd
man/disc2.Rd
man/finco.Rd
man/sonar.Rd
man/dist.to.knn.Rd
man/ce.mimp.Rd
man/disc.ef.Rd
man/knneigh.vect.Rd
man/census.Rd
man/relief.Rd
man/midpoints1.Rd
man/inconsist.Rd
man/landsat.Rd
man/moda.Rd
man/parallelplot.Rd
man/dprep-package.Rd
man/mo4.Rd
man/signorm.Rd
man/imagmiss.Rd
man/eje1dis.Rd
man/breastw.Rd
man/cvnaiveBayesd.Rd
man/sffs.Rd
man/diabetes.Rd
man/sfs1.Rd
man/arboleje.Rd
man/crossval.Rd
man/vvalen1.Rd
man/bupa.Rd
man/radviz2d.Rd
man/lvf.Rd
man/decscale.Rd
man/mardia.Rd
man/acugow.Rd
man/Shuttle.Rd
man/rangenorm.Rd
man/near3.Rd
man/surveyplot.Rd
man/nnmiss.Rd
man/heartc.Rd
man/cv10rpart2.Rd
man/arboleje1.Rd
dprep documentation built on May 19, 2017, 8:13 a.m.