In tnagler/jdify: Joint density classification

knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>",
    fig.path = "inst/README-"
)
set.seed(5)

jdify

jdify is an R package implementing classifiers based on the joint density of the predictors and the class variable. Several methods for joint density estimation can be used.

To install, open R and type

devtools::install_github("tnagler/jdify")

Functionality

The core functionality is illustrated below. For a detailed description of all functions and their arguments, see the API documentation.

Classification modeling

The core function in this package is jdify() which builds a classification model for a given data set. It estimates the joint density of the predictors and the class variable and derives conditional class probabilities from it.

library(jdify)

dat <- data.frame(
    cl = as.factor(rbinom(10, 1, 0.5)),
    x1 = rnorm(10),
    x2 = ordered(rbinom(10, 5, 0.3))
)
model <- jdify(cl ~ x1 + x2, data = dat, jd_method = "cctools")
probs <- predict(model, dat, what = "probs")  # conditional probabilities
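To make the step from a joint density to conditional class probabilities concrete, here is a minimal base R sketch. It is not jdify's implementation: it assumes a single continuous predictor, estimates the joint density as f(x, c) = P(C = c) * f(x | C = c) with class-wise kernel estimates from stats::density(), and then normalizes over the classes. The names cl_toy, x_toy, and joint_density are made up for illustration.

```r
# Illustrative only: base R, one continuous predictor, two classes.
set.seed(1)
cl_toy <- factor(rep(c(0, 1), each = 50))
x_toy  <- c(rnorm(50, mean = -1), rnorm(50, mean = 1))

# Joint density f(x, c) = P(C = c) * f(x | C = c), using a kernel estimate
# of each class-conditional density.
joint_density <- function(x_new, class) {
    prior <- mean(cl_toy == class)
    kde <- density(x_toy[cl_toy == class])
    # interpolate the kernel estimate at the new points (0 outside its grid)
    prior * approx(kde$x, kde$y, xout = x_new, yleft = 0, yright = 0)$y
}

x_new <- c(-2, 0, 2)
joint <- sapply(levels(cl_toy), function(k) joint_density(x_new, k))

# Conditional class probabilities: normalize the joint density over classes.
cond_probs <- joint / rowSums(joint)
cond_probs
```

Each row of cond_probs corresponds to one value in x_new and sums to one; each column corresponds to one class.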

jdify() can handle discrete predictors. They have to be declared as ordered (for ordered discrete variables) or factor (for unordered categorical variables). All other variables are treated as continuous.
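As a small sketch of how such declarations might look before fitting, the following base R code converts a count column and a character column. The data frame raw and its column names are hypothetical.

```r
# Hypothetical raw data: one continuous, one count, one categorical column.
raw <- data.frame(
    income     = c(2.3, 1.1, 4.0, 3.2),
    n_children = c(0L, 2L, 1L, 2L),
    region     = c("north", "south", "south", "east"),
    stringsAsFactors = FALSE
)

# Declare the discrete columns explicitly.
raw$n_children <- ordered(raw$n_children)  # ordered discrete
raw$region     <- factor(raw$region)       # unordered categorical

# Check the first class of each column.
sapply(raw, function(col) class(col)[1])
```

The income column is left as numeric, so it would be treated as continuous.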

Methods for joint density estimation

You can choose from three built-in methods for joint density estimation: "cctools" (default), "kdevine", and "np". The method name indicates the package that is used for joint density estimation.

You can also define a custom density estimation method with jd_method(). The following is another implementation of the method "kdevine".

my_fit <- function(x, ...)
   kdevine::kdevine(x, ...)
my_eval <- function(object, newdata, ...)
   kdevine::dkdevine(newdata, object)
my_method <- jd_method(fit_fun = my_fit, eval_fun = my_eval, cc = TRUE)
model <- jdify(cl ~ x1 + x2, data = dat, jd_method = my_method)

The option cc = TRUE indicates that the method does not naturally handle discrete data. In this case, jdify automatically invokes the continuous convolution trick (see Nagler, 2017).
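The general idea of making discrete data continuous by adding noise can be sketched in base R. This is a simplified illustration, not the package's code: it assumes uniform noise on [-0.5, 0.5], whereas the noise distribution and other details in Nagler (2017) and in jdify may differ. With uniform noise on that interval, the density of the jittered variable at an integer equals the probability mass at that integer, so a continuous estimator evaluated at the integers approximates the discrete distribution.

```r
# Illustrative only: jitter a discrete variable, then use a continuous estimator.
set.seed(1)
n <- 5000
x_disc <- rbinom(n, 5, 0.3)               # discrete variable on 0, ..., 5
x_cont <- x_disc + runif(n, -0.5, 0.5)    # add noise to make it continuous

# Kernel density estimate of the jittered data, evaluated at the integers.
kde <- density(x_cont)
dens_at_int <- approx(kde$x, kde$y, xout = 0:5, yleft = 0, yright = 0)$y

# Empirical probability mass function of the original discrete data.
pmf_emp <- as.numeric(table(factor(x_disc, levels = 0:5))) / n

round(cbind(value = 0:5, kde = dens_at_int, pmf = pmf_emp), 3)
```

The two columns should be close to each other, up to smoothing and sampling error.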

Cross validation and performance assessment

cv_jdify() is a convenience function that does k-fold cross validation for you. It splits the data, fits joint density models and evaluates the conditional class probabilities on the hold-out samples.

cv <- cv_jdify(cl ~ x1 + x2, data = dat, folds = 3)
cv$cv_probs
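The split, fit, and evaluate steps of k-fold cross validation can be sketched generically in base R. This is not how cv_jdify() is implemented: as a stand-in classifier it uses logistic regression via glm(), and dat_cv, fold_id, and cv_probs_toy are hypothetical names.

```r
# Illustrative only: generic k-fold cross validation with a stand-in classifier.
set.seed(1)
n <- 60
dat_cv <- data.frame(x = rnorm(n))
dat_cv$cl <- rbinom(n, 1, plogis(2 * dat_cv$x))

folds <- 3
# Randomly assign each observation to one of the folds (equal sizes here).
fold_id <- sample(rep(seq_len(folds), length.out = n))

cv_probs_toy <- numeric(n)
for (k in seq_len(folds)) {
    test <- fold_id == k
    # Fit on all folds except k ...
    fit <- glm(cl ~ x, family = binomial(), data = dat_cv[!test, ])
    # ... and predict class probabilities on the hold-out fold.
    cv_probs_toy[test] <- predict(fit, newdata = dat_cv[test, ], type = "response")
}
head(cv_probs_toy)
```

Every observation receives exactly one out-of-sample probability, from the model that did not see it during fitting.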

The function assess_clsfyr() calculates several performance measures from the conditional class probabilities. Its first argument is the vector of predicted probabilities for the target class; the second is a logical indicator of whether each observation belongs to that class.

assess_clsfyr(cv$cv_probs[, 1], dat[, 1] == 0, measure = c("ACC", "F1"))
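For intuition, accuracy and the F1 score can be computed by hand from class probabilities and a class indicator. The sketch below assumes a classification threshold of 0.5, which is not necessarily what assess_clsfyr() uses, and p_hat and is_class are made-up inputs.

```r
# Illustrative only: accuracy and F1 from probabilities and a class indicator.
p_hat    <- c(0.9, 0.8, 0.3, 0.6, 0.2, 0.1)          # predicted probabilities
is_class <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)  # true class membership

pred <- p_hat > 0.5  # assumed threshold

tp <- sum(pred & is_class)    # true positives
fp <- sum(pred & !is_class)   # false positives
fn <- sum(!pred & is_class)   # false negatives
tn <- sum(!pred & !is_class)  # true negatives

acc       <- (tp + tn) / length(pred)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

c(ACC = acc, F1 = f1)
```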

References

Nagler, T. (2017). A generic approach to nonparametric function estimation with mixed data. arXiv:1704.07457

tnagler/jdify documentation built on May 31, 2019, 4:41 p.m.
