Kernel density based global two-sample comparison test for 1- to 6-dimensional data.

| Argument | Description |
|---|---|
| `x1,x2` | vector/matrix of data values |
| `H1,H2,h1,h2` | bandwidth matrices/scalar bandwidths. If these are missing, they are chosen automatically by a plug-in selector (`Hpi.kfe` for `H1,H2`). |
| `psi1,psi2` | zero-th order kernel functional estimates |
| `var.fhat1,var.fhat2` | sample variance of KDE estimates evaluated at x1, x2 |
| `binned` | flag for binned estimation. Default is FALSE. |
| `bgridsize` | vector of binning grid sizes |
| `verbose` | flag to print out progress information. Default is FALSE. |
| `pilot` | "dscalar" = single pilot bandwidth (default) |

The null hypothesis is *H_0: f_1 = f_2* where *f_1, f_2*
are the respective density functions. The measure of discrepancy is
the integrated squared error (ISE)
*T = int [f_1(x) - f_2(x)]^2 dx*. Expanding the square gives
*T = psi_0,1 - psi_0,12 - psi_0,21 + psi_0,2*
where *psi_0,uv = int f_u(x) f_v(x) dx* (with *psi_0,1* and *psi_0,2*
denoting the cases *u = v = 1* and *u = v = 2*),
so each term can be estimated with a kernel functional estimator. The resulting
test statistic has a null distribution which is asymptotically normal, so no
bootstrap resampling is required to compute an approximate p-value.
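The construction above can be sketched in base R for one-dimensional data. This is only an illustration under stated assumptions: `psi0_hat` and `ise_stat` are hypothetical helper names, the kernel is Gaussian, and a single fixed bandwidth `h` is used for every term, which is not necessarily how `kde.test` chooses its bandwidths. The normalisation of *T* to a z statistic is not reproduced here.

```r
# Hypothetical helper: Gaussian-kernel estimate of
# psi_0,uv = int f_u(x) f_v(x) dx, averaging the kernel over all pairs.
psi0_hat <- function(u, v, h) {
  mean(dnorm(outer(u, v, "-"), mean = 0, sd = h))
}

# Hypothetical helper: T = psi_0,1 - psi_0,12 - psi_0,21 + psi_0,2
ise_stat <- function(x1, x2, h) {
  psi0_hat(x1, x1, h) - psi0_hat(x1, x2, h) -
    psi0_hat(x2, x1, h) + psi0_hat(x2, x2, h)
}

set.seed(1)
a <- rnorm(300)              # sample from N(0, 1)
b <- rnorm(300)              # second sample from the same density
d <- rnorm(300, mean = 1)    # sample from a shifted density
h <- 0.3                     # fixed illustrative bandwidth (assumption)

T_null  <- ise_stat(a, b, h) # both samples share one density
T_shift <- ise_stat(a, d, h) # densities differ by a location shift
c(T_null = T_null, T_shift = T_shift)
```

With a common Gaussian bandwidth the statistic is non-negative, and it should be noticeably larger for the shifted pair than for the pair drawn from the same density.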

If `H1,H2` are missing then the plug-in selector `Hpi.kfe` is automatically
called by `kde.test` to estimate the functionals with `kfe(, deriv.order=0)`.
Likewise for missing `h1,h2`.

As of ks 1.8.8, `kde.test(,binned=TRUE)` invokes binned
estimation for the computation of the bandwidth selectors, and not the
test statistic and p-value.
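The bandwidth options described here might be exercised as in the following sketch for one-dimensional data. It assumes the ks package is installed and that `hpi.kfe` accepts a data vector together with `deriv.order`, as in recent versions of the package; the simulated data and object names are illustrative only.

```r
library(ks)

set.seed(1)
x <- rnorm(500)
y <- rnorm(500, mean = 0.3)

# Bandwidths missing: kde.test selects them internally
res_auto <- kde.test(x1 = x, x2 = y)

# Bandwidths supplied explicitly, here from the scalar plug-in selector
# for zero-th order functionals (assumed signature, see lead-in)
h1 <- hpi.kfe(x, deriv.order = 0)
h2 <- hpi.kfe(y, deriv.order = 0)
res_manual <- kde.test(x1 = x, x2 = y, h1 = h1, h2 = h2)

# Binned estimation for the bandwidth selection step
res_binned <- kde.test(x1 = x, x2 = y, binned = TRUE)

c(auto = res_auto$pvalue, manual = res_manual$pvalue, binned = res_binned$pvalue)
```

Supplying `h1,h2` directly can be useful when the same bandwidths are to be reused across several comparisons.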

A kernel two-sample global significance test is a list with fields:

| Field | Description |
|---|---|
| `Tstat` | T statistic |
| `zstat` | z statistic - normalised version of Tstat |
| `pvalue` | p-value of the double sided test |
| `mean,var` | mean and variance of null distribution |
| `var.fhat1,var.fhat2` | sample variances of KDE values evaluated at data points |
| `n1,n2` | sample sizes |
| `H1,H2` | bandwidth matrices |
| `psi1,psi12,psi21,psi2` | kernel functional estimates |
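A short sketch of how the returned list might be inspected follows. It assumes the ks package is installed. The final check rests on an assumption that the field descriptions suggest but do not state: that `zstat` is `Tstat` centred by `mean` and scaled by the square root of `var`.

```r
library(ks)

set.seed(1)
x <- rnorm(300)
y <- rnorm(300, mean = 0.5)
res <- kde.test(x1 = x, x2 = y)

# Fields documented above
res$Tstat
res$zstat
res$pvalue

# Assumption (not stated in the source): zstat is Tstat centred by the
# null mean and scaled by the null standard deviation.
z_manual <- (res$Tstat - res$mean) / sqrt(res$var)
all.equal(z_manual, res$zstat)
```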

Duong, T., Goud, B. & Schauer, K. (2012) Closed-form density-based framework for automatic detection of cellular morphology changes. *PNAS*, **109**, 8382-8387.

```
library(ks)
set.seed(8192)
samp <- 1000
## two samples drawn from the same standard normal density
x <- rnorm.mixt(n=samp, mus=0, sigmas=1, props=1)
y <- rnorm.mixt(n=samp, mus=0, sigmas=1, props=1)
kde.test(x1=x, x2=y)$pvalue ## fail to reject H0: f1=f2

## bivariate example: two crab species from the MASS crabs data
library(MASS)
data(crabs)
x1 <- crabs[crabs$sp=="B", c(4,6)]
x2 <- crabs[crabs$sp=="O", c(4,6)]
kde.test(x1=x1, x2=x2)$pvalue ## reject H0: f1=f2
```
