kde.test: Kernel density based global two-sample comparison test
In ks: Kernel Smoothing

Description

Kernel density based global two-sample comparison test for 1- to 6-dimensional data.

Usage

kde.test(x1, x2, H1, H2, h1, h2, psi1, psi2, var.fhat1, var.fhat2, 
    binned=FALSE, bgridsize, verbose=FALSE)

Arguments

`x1`, `x2`	vector/matrix of data values
`H1`, `H2`, `h1`, `h2`	bandwidth matrices/scalar bandwidths. If these are missing, `Hpi.kfe` or `hpi.kfe` is called by default.
`psi1`, `psi2`	zeroth-order kernel functional estimates
`var.fhat1`, `var.fhat2`	sample variances of the kernel density estimates evaluated at `x1`, `x2`
`binned`	flag for binned estimation. Default is FALSE.
`bgridsize`	vector of binning grid sizes
`verbose`	flag to print out progress information. Default is FALSE.

Details

The null hypothesis is H_0: f_1 \equiv f_2, where f_1, f_2 are the respective density functions of the two samples. The measure of discrepancy is the integrated squared error (ISE) T = \int [f_1(\mathbf{x}) - f_2(\mathbf{x})]^2 \, d\mathbf{x}. Expanding the square gives T = \psi_{0,1} - \psi_{0,12} - \psi_{0,21} + \psi_{0,2}, where \psi_{0,uv} = \int f_u(\mathbf{x}) f_v(\mathbf{x}) \, d\mathbf{x} and \psi_{0,u} is shorthand for \psi_{0,uu}, so each term can be estimated with a kernel functional estimator. The test statistic has an asymptotically normal null distribution, so no bootstrap resampling is required to compute an approximate p-value.
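The step from the ISE to the four functionals is a direct expansion of the square. The block below only spells it out, using the same symbols as the paragraph above and writing \psi_{0,1} for \psi_{0,11}:

```latex
\begin{aligned}
T &= \int [f_1(\mathbf{x}) - f_2(\mathbf{x})]^2 \, d\mathbf{x} \\
  &= \int f_1(\mathbf{x})^2 \, d\mathbf{x}
   - \int f_1(\mathbf{x}) f_2(\mathbf{x}) \, d\mathbf{x}
   - \int f_2(\mathbf{x}) f_1(\mathbf{x}) \, d\mathbf{x}
   + \int f_2(\mathbf{x})^2 \, d\mathbf{x} \\
  &= \psi_{0,1} - \psi_{0,12} - \psi_{0,21} + \psi_{0,2}
\end{aligned}
```

The two cross terms are equal as population quantities (\psi_{0,12} = \psi_{0,21}); they are kept separate because each is estimated separately, as the `psi12` and `psi21` fields of the return value indicate.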

If H1, H2 are missing, kde.test automatically calls the plug-in selector Hpi.kfe to choose them, and the functionals are then estimated with kfe(deriv.order=0). Likewise, hpi.kfe is called if h1, h2 are missing.
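To make the idea of a zeroth-order kernel functional estimate concrete, here is a minimal base-R sketch for 1-dimensional data. It is not the ks implementation: the helper `psi0_hat` is a hypothetical name, the Gaussian kernel and the pairing of bandwidths with cross terms are assumptions, and the rule-of-thumb selector `bw.nrd` stands in for the plug-in selector `hpi.kfe`.

```r
# Base-R sketch (not the ks implementation): 1-d Gaussian-kernel estimates of
# the zeroth-order functionals psi_{0,uv} = integral of f_u * f_v.

# Hypothetical helper: mean of K_h(a_i - b_j) over all pairs (i, j),
# where K_h is the N(0, h^2) density.
psi0_hat <- function(a, b, h) {
  mean(dnorm(outer(a, b, "-"), mean = 0, sd = h))
}

set.seed(1)
x1 <- rnorm(200)             # sample from f_1
x2 <- rnorm(200, mean = 1)   # sample from f_2 (shifted by 1)

# Assumption: rule-of-thumb bandwidths stand in for the plug-in selectors.
h1 <- bw.nrd(x1)
h2 <- bw.nrd(x2)

psi1  <- psi0_hat(x1, x1, h1)   # estimates the integral of f_1^2
psi2  <- psi0_hat(x2, x2, h2)   # estimates the integral of f_2^2
psi12 <- psi0_hat(x1, x2, h2)   # average of the f_2 estimate at the x1 points
psi21 <- psi0_hat(x2, x1, h1)   # average of the f_1 estimate at the x2 points

# Plug the four estimates into the expansion of the ISE.
T_hat <- psi1 - psi12 - psi21 + psi2
T_hat
```

The sketch stops at the statistic itself; the null mean and variance needed to turn it into a z statistic and p-value are what kde.test supplies.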

For ks >= 1.8.8, kde.test(binned=TRUE) uses binned estimation only to compute the bandwidth selectors; the test statistic and p-value are computed without binning.

Value

A kernel two-sample global significance test is a list with fields:

`Tstat`	T statistic
`zstat`	z statistic - normalised version of Tstat
`pvalue`	p-value of the double-sided test
`mean`, `var`	mean and variance of null distribution
`var.fhat1`, `var.fhat2`	sample variances of KDE values evaluated at data points
`n1`, `n2`	sample sizes
`H1`, `H2`	bandwidth matrices
`psi1`, `psi12`, `psi21`, `psi2`	kernel functional estimates
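Read together, the `Tstat`, `mean`, `var` and `zstat` fields suggest the usual standardisation of a statistic by its null mean and variance. The relation below is an interpretation of the field descriptions above, not something checked against the ks source; \mu_T and \sigma_T^2 are illustrative symbols for the `mean` and `var` fields.

```latex
z = \frac{T - \mu_T}{\sqrt{\sigma_T^2}}
```

Under the asymptotic normality described in Details, z is then compared with a standard normal distribution to obtain the p-value.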

References

Duong, T., Goud, B. & Schauer, K. (2012) Closed-form density-based framework for automatic detection of cellular morphology changes. PNAS, 109, 8382-8387.

Examples

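library(ks)   ## provides kde.test and rnorm.mixt

## two samples drawn from the same N(0,1) density
set.seed(8192)
samp <- 1000
x <- rnorm.mixt(n=samp, mus=0, sigmas=1, props=1)
y <- rnorm.mixt(n=samp, mus=0, sigmas=1, props=1)
kde.test(x1=x, x2=y)$pvalue   ## fail to reject H0: f1=f2

## columns 4 and 6 (FL, CL) of the crabs data, split by species
data(crabs, package="MASS")
x1 <- crabs[crabs$sp=="B", c(4,6)]
x2 <- crabs[crabs$sp=="O", c(4,6)]
kde.test(x1=x1, x2=x2)$pvalue  ## reject H0: f1=f2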

ks documentation built on June 8, 2025, 9:38 p.m.