Kernel density based global two-sample comparison test

Description

Kernel density based global two-sample comparison test for 1- to 6-dimensional data.

Usage

kde.test(x1, x2, H1, H2, h1, h2, psi1, psi2, var.fhat1, var.fhat2, 
    binned=FALSE, bgridsize, verbose=FALSE, pilot="dscalar")

Arguments

x1,x2

vector/matrix of data values

H1,H2,h1,h2

bandwidth matrices/scalar bandwidths. If these are missing, Hpi.kfe (for H1,H2) or hpi.kfe (for h1,h2) is called by default.

psi1,psi2

zero-th order kernel functional estimates

var.fhat1,var.fhat2

sample variance of KDE estimates evaluated at x1, x2

binned

flag for binned estimation. Default is FALSE.

bgridsize

vector of binning grid sizes

verbose

flag to print out progress information. Default is FALSE.

pilot

"dscalar" = single pilot bandwidth (default)
"dunconstr" = single unconstrained pilot bandwidth

Details

The null hypothesis is H_0: f_1 = f_2, where f_1 and f_2 are the respective density functions. The measure of discrepancy is the integrated squared error (ISE) T = int [f_1(x) - f_2(x)]^2 dx. Expanding the square gives T = psi_0,1 - psi_0,12 - psi_0,21 + psi_0,2, where psi_0,uv = int f_u(x) f_v(x) dx, so T can be estimated with kernel functional estimators. The test statistic has an asymptotically normal null distribution, so no bootstrap resampling is required to compute an approximate p-value.
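To make the decomposition of T concrete, here is a minimal base R sketch for 1-dimensional data. It is not the ks implementation: psi_hat and T_hat are hypothetical helper names, the kernel is a normal kernel, and the bandwidth h = 0.3 is hand-picked for illustration rather than chosen by a plug-in selector.

```r
# Naive kernel functional estimate of psi_0,uv = int f_u(x) f_v(x) dx
# for 1-d samples, using a normal kernel with a fixed bandwidth h.
# psi_hat() and T_hat() are hypothetical helpers, not part of ks.
psi_hat <- function(xu, xv, h) {
  # average of K_h(xu_i - xv_j) over all pairs (i, j)
  mean(dnorm(outer(xu, xv, "-"), mean = 0, sd = h))
}

# Plug the four functional estimates into T = psi_1 - psi_12 - psi_21 + psi_2
T_hat <- function(x1, x2, h) {
  psi_hat(x1, x1, h) - psi_hat(x1, x2, h) -
    psi_hat(x2, x1, h) + psi_hat(x2, x2, h)
}

set.seed(1)
x <- rnorm(500)            # sample from N(0, 1)
y <- rnorm(500)            # second sample from the same density
z <- rnorm(500, mean = 2)  # sample from a shifted density

T_same  <- T_hat(x, y, h = 0.3)
T_shift <- T_hat(x, z, h = 0.3)
print(c(same = T_same, shifted = T_shift))
```

The estimate for the two samples drawn from the same density should be close to zero, while the shifted sample gives a clearly larger value. Turning T into a p-value additionally requires the mean and variance of its null distribution, which this sketch does not attempt.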

If H1,H2 are missing, kde.test automatically calls the plug-in selector Hpi.kfe to obtain the bandwidths used to estimate the functionals with kfe(, deriv.order=0). Likewise, hpi.kfe is called if h1,h2 are missing.
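The bandwidths can also be computed up front and passed in explicitly, for example to reuse them across several calls. The following is a sketch under the assumption that calling Hpi.kfe with deriv.order=0 and pilot="dscalar" is a reasonable stand-in for what kde.test does internally; the exact internal arguments are not stated here, so the result is not guaranteed to match the default call.

```r
library(ks)
library(MASS)
data(crabs)

# Two 2-d samples: columns 4 and 6 of crabs, split by species
x1 <- as.matrix(crabs[crabs$sp == "B", c(4, 6)])
x2 <- as.matrix(crabs[crabs$sp == "O", c(4, 6)])

# Plug-in bandwidth matrices for zero-th order kernel functional estimation
# (assumption: these settings approximate the defaults used inside kde.test)
H1 <- Hpi.kfe(x = x1, deriv.order = 0, pilot = "dscalar")
H2 <- Hpi.kfe(x = x2, deriv.order = 0, pilot = "dscalar")

# Supply the precomputed bandwidths instead of letting kde.test select them
res <- kde.test(x1 = x1, x2 = x2, H1 = H1, H2 = H2)
print(res$pvalue)
```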

As of ks 1.8.8, kde.test(,binned=TRUE) uses binned estimation only for computing the bandwidth selectors, not for the test statistic or the p-value.

Value

A kernel two-sample global significance test is a list with fields:

Tstat

T statistic

zstat

z statistic - normalised version of Tstat

pvalue

p-value of the two-sided test

mean,var

mean and variance of null distribution

var.fhat1,var.fhat2

sample variances of KDE values evaluated at data points

n1,n2

sample sizes

H1,H2

bandwidth matrices

psi1,psi12,psi21,psi2

kernel functional estimates
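As an illustration of how the returned list might be inspected, the sketch below runs the test on simulated 2-dimensional data and reads off a few fields. The recomputation of the z statistic as (Tstat - mean)/sqrt(var) is an assumption based on the field descriptions above ("normalised version of Tstat"), not a documented formula.

```r
library(ks)

set.seed(8192)
x <- matrix(rnorm(400), ncol = 2)            # 200 points from N(0, I)
y <- matrix(rnorm(400, mean = 1), ncol = 2)  # 200 points, mean shifted to (1, 1)

res <- kde.test(x1 = x, x2 = y)

# Core outputs of the test
print(res$Tstat)
print(res$zstat)
print(res$pvalue)

# Assumption: zstat standardises Tstat by the null mean and variance.
# If that holds, z_check should agree with res$zstat.
z_check <- (res$Tstat - res$mean) / sqrt(res$var)
print(c(zstat = res$zstat, z_check = z_check))
```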

References

Duong, T., Goud, B. & Schauer, K. (2012) Closed-form density-based framework for automatic detection of cellular morphology changes. PNAS, 109, 8382-8387.

See Also

kde.local.test

Examples

library(ks)
set.seed(8192)
samp <- 1000
x <- rnorm.mixt(n=samp, mus=0, sigmas=1, props=1)
y <- rnorm.mixt(n=samp, mus=0, sigmas=1, props=1)
kde.test(x1=x, x2=y)$pvalue   ## fail to reject H0: f1=f2

library(MASS)
data(crabs)
x1 <- crabs[crabs$sp=="B", c(4,6)]
x2 <- crabs[crabs$sp=="O", c(4,6)]
kde.test(x1=x1, x2=x2)$pvalue  ## reject H0: f1=f2
