vtreat: A Statistically Sound 'data.frame' Processor/Conditioner

A 'data.frame' processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. 'vtreat' prepares variables so that data has fewer exceptional cases, making it easier to safely use models in production. Common problems 'vtreat' defends against: 'Inf', 'NA', too many categorical levels, rare categorical levels, and new categorical levels (levels seen during application, but not during training). Reference: "'vtreat': a data.frame Processor for Predictive Modeling", 'Zumel', 'Mount', 2016, DOI:10.5281/zenodo.1173314.
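The problems listed above can be illustrated with a minimal sketch, assuming the 'vtreat' package is installed and a numeric outcome. The toy data frames and the column names 'x', 'z', and 'y' are invented for illustration. The sketch designs a treatment plan on training data containing 'NA' values, then applies it to new data containing 'NA', 'Inf', and a categorical level ("c") not seen during training.

```r
library(vtreat)

# Toy training data (hypothetical): a categorical column with NAs,
# a numeric column with NAs, and a numeric outcome y.
d_train <- data.frame(
  x = c("a", "a", "a", "a", "b", "b", NA, NA),
  z = c(1, 2, 3, 4, 5, NA, 7, NA),
  y = c(0, 0, 0, 1, 0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# New data with a novel level ("c"), an Inf, and NAs.
d_new <- data.frame(
  x = c("a", "b", "c", NA),
  z = c(10, Inf, 30, NA),
  stringsAsFactors = FALSE
)

# Design a treatment plan for a numeric outcome.
plan <- designTreatmentsN(d_train,
                          varlist = c("x", "z"),
                          outcomename = "y",
                          verbose = FALSE)

# Apply the plan to the new data; the result is an all-numeric data frame.
d_new_treated <- prepare(plan, d_new)
print(d_new_treated)
```

The treated frame is intended to be free of 'NA' and 'Inf' values, so it can be passed to a model without special-case handling. In real use, the plan should be designed on data separate from the data used to fit the model, or cross-frame methods such as 'mkCrossFrameNExperiment()' should be used, to limit over-fitting.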

Package details

Author: John Mount [aut, cre], Nina Zumel [aut], Win-Vector LLC [cph]
Maintainer: John Mount <[email protected]>
URL: https://github.com/WinVector/vtreat/, https://winvector.github.io/vtreat/
Package repository: CRAN
Installation: install the latest version of this package by entering the following in R:
install.packages("vtreat")

vtreat documentation built on Jan. 3, 2019, 1:06 a.m.