vtreat: A Statistically Sound 'data.frame' Processor/Conditioner
Version 0.5.31

A 'data.frame' processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. 'vtreat' prepares variables so that data has fewer exceptional cases, making it easier to safely use models in production. Common problems 'vtreat' defends against: 'Inf', 'NA', too many categorical levels, rare categorical levels, and new categorical levels (levels seen during application, but not during training). 'vtreat::prepare' should be used as you would use 'model.matrix'.

Author: John Mount, Nina Zumel
Date of publication: 2017-04-14 13:59:28 UTC
Maintainer: John Mount <jmount@win-vector.com>
License: GPL-3
Version: 0.5.31
URL: https://github.com/WinVector/vtreat
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:
install.packages("vtreat")
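
As a rough illustration of the description above (using 'vtreat::prepare' in the place where 'model.matrix' would otherwise go), the following sketch designs an outcome-free treatment plan on training data and applies it to new data. The data frames 'dTrain' and 'dApp' and their columns are hypothetical toy values made up for this example; 'pruneSig = NULL' is passed explicitly on the assumption that no significance-based pruning is wanted.

```r
library(vtreat)

# Hypothetical toy training data: a categorical column with a missing
# value and a numeric column containing NA and Inf.
dTrain <- data.frame(
  x = c("a", "a", "b", "b", NA, "c"),
  z = c(1, 2, NA, 4, Inf, 6),
  stringsAsFactors = FALSE
)

# Hypothetical application data with a level ("d") not seen in training.
dApp <- data.frame(
  x = c("a", "d", NA),
  z = c(NA, 3, 5),
  stringsAsFactors = FALSE
)

# Design a treatment plan without an outcome variable.
plan <- designTreatmentsZ(dTrain, c("x", "z"), verbose = FALSE)

# Apply the plan to new data; the result is an all-numeric data frame.
treated <- prepare(plan, dApp, pruneSig = NULL)

print(colnames(treated))
print(anyNA(treated))
```

When an outcome column is available, designTreatmentsC (categorical outcome) or designTreatmentsN (numeric outcome) would take the place of designTreatmentsZ in this sketch.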

Getting started

README.md
Grouping Example
Saving Treatment Plans
Variable Types
vtreat cross frames
vtreat overfit
vtreat package
vtreat Rare Levels
vtreat scale mode
vtreat significance
vtreat splitting

Man pages

buildEvalSets: Build set carve-up for out-of sample evaluation.
catScore: return significance of a one-variable logistic regression
designTreatmentsC: Build all treatments for a data frame to predict a...
designTreatmentsN: build all treatments for a data frame to predict a numeric...
designTreatmentsZ: Design variable treatments with no outcome variable.
format.vtreatment: Display treatment plan.
getSplitPlanAppLabels: read application labels off a split plan.
kWayCrossValidation: k-fold cross validation, a splitFunction in the sense of...
kWayStratifiedY: k-fold cross validation stratified on y, a splitFunction in...
linScore: Return in-sample linear stats and scaling.
makekWayCrossValidationGroupedByColumn: Build a k-fold cross validation splitter, respecting (never...
mkCrossFrameCExperiment: Run categorical cross-frame experiment.
mkCrossFrameNExperiment: Run numeric cross frame experiment.
oneWayHoldout: One way holdout, a splitFunction in the sense of...
prepare: Apply treatments and restrict to useful variables.
print.vtreatment: Print treatmentplan.
problemAppPlan: check if appPlan is a good carve-up of 1:nRows into nSplits...
vnames: New treated variable names from a treatmentplan$treatment...
vorig: Original variable name from a treatmentplan$treatment item.
vtreat: vtreat: A Statistically Sound 'data.frame'...
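
Several of the entries above (kWayCrossValidation, kWayStratifiedY, oneWayHoldout, problemAppPlan) concern split plans: carve-ups of the row indices into training and application sets. The sketch below shows one way these pieces might fit together, assuming the argument order (nRows, nSplits, dframe, y) for the split function and (nRows, nSplits, plan, strictCheck) for the checker; the sizes 7 and 2 are arbitrary illustration values.

```r
library(vtreat)

nRows <- 7    # arbitrary number of rows for illustration
nSplits <- 2  # arbitrary number of folds for illustration

# Build a k-fold split plan; dframe and y are not needed by this splitter.
plan <- kWayCrossValidation(nRows, nSplits, NULL, NULL)

# Each element of the plan carries training and application row indices.
for (split in plan) {
  print(list(train = split$train, app = split$app))
}

# problemAppPlan returns NULL when the plan is a valid carve-up.
problem <- problemAppPlan(nRows, nSplits, plan, TRUE)
print(is.null(problem))
```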

Files

inst
inst/doc
inst/doc/vtreatSplitting.Rmd
inst/doc/vtreatGrouping.Rmd
inst/doc/SavingTreamentPlans.html
inst/doc/vtreatCrossFrames.R
inst/doc/vtreat.html
inst/doc/vtreat.Rmd
inst/doc/vtreatRareLevels.html
inst/doc/vtreatSignificance.html
inst/doc/vtreatOverfit.R
inst/doc/vtreatRareLevels.R
inst/doc/vtreatVariableTypes.html
inst/doc/vtreatScaleMode.R
inst/doc/vtreatScaleMode.Rmd
inst/doc/vtreatVariableTypes.Rmd
inst/doc/vtreatCrossFrames.Rmd
inst/doc/SavingTreamentPlans.Rmd
inst/doc/vtreatOverfit.Rmd
inst/doc/vtreatScaleMode.html
inst/doc/vtreatGrouping.R
inst/doc/vtreatSplitting.html
inst/doc/vtreatGrouping.html
inst/doc/vtreatCrossFrames.html
inst/doc/vtreat.R
inst/doc/SavingTreamentPlans.R
inst/doc/vtreatSignificance.Rmd
inst/doc/vtreatSplitting.R
inst/doc/vtreatOverfit.html
inst/doc/vtreatRareLevels.Rmd
inst/doc/vtreatVariableTypes.R
inst/doc/vtreatSignificance.R
tests
tests/testthat.R
tests/testthat
tests/testthat/testWeirdTypes.R
tests/testthat/testW1.R
tests/testthat/testdplyr.R
tests/testthat/testParallel.R
tests/testthat/testExpmtDesign.R
tests/testthat/testNoY.R
tests/testthat/testPC.R
tests/testthat/testSig.R
tests/testthat/uci.car.data.Rdata
tests/testthat/testBO.R
tests/testthat/testCar.R
tests/testthat/testScale.R
tests/testthat/testDataTable.R
tests/testthat/testStability.R
tests/testthat/testZW.R
tests/testthat/testUniqValue.R
NAMESPACE
R
R/utils.R
R/deviationFact.R
R/prevalenceFact.R
R/vtreatImpl.R
R/outOfSample.R
R/indicatorTreatment.R
R/effectTreatmentN.R
R/vtreat.R
R/cleanTreatment.R
R/isBadTreatment.R
R/effectTreatmentC.R
vignettes
vignettes/vtreatSplitting.Rmd
vignettes/vtreatGrouping.Rmd
vignettes/vtreat.Rmd
vignettes/vtreatScaleMode.Rmd
vignettes/vtreatVariableTypes.Rmd
vignettes/vtreatCrossFrames.Rmd
vignettes/SavingTreamentPlans.Rmd
vignettes/vtreatOverfit.Rmd
vignettes/superX.png
vignettes/vtreatSignificance.Rmd
vignettes/vtreatX.png
vignettes/vtreatRareLevels.Rmd
README.md
MD5
build
build/vignette.rds
DESCRIPTION
man
man/designTreatmentsC.Rd
man/problemAppPlan.Rd
man/catScore.Rd
man/designTreatmentsZ.Rd
man/vtreat.Rd
man/mkCrossFrameCExperiment.Rd
man/makekWayCrossValidationGroupedByColumn.Rd
man/format.vtreatment.Rd
man/buildEvalSets.Rd
man/print.vtreatment.Rd
man/designTreatmentsN.Rd
man/vorig.Rd
man/prepare.Rd
man/getSplitPlanAppLabels.Rd
man/oneWayHoldout.Rd
man/kWayStratifiedY.Rd
man/linScore.Rd
man/vnames.Rd
man/kWayCrossValidation.Rd
man/mkCrossFrameNExperiment.Rd
tools
tools/vtreat.png
vtreat documentation built on May 19, 2017, 8:47 p.m.
