gof_test: Tests for the multivariate goodness-of-fit problem

View source: R/gof_test.R

gof_test    R Documentation

Tests for the multivariate goodness-of-fit problem

Description

This function runs a number of goodness-of-fit tests using Rcpp and parallel computing.

Usage

gof_test(
  x,
  pnull,
  rnull,
  phat = function(x) -99,
  dnull = function(x) -99,
  TS,
  TSextra,
  rate = 0,
  nbins = c(5, 5),
  Ranges = matrix(c(-Inf, Inf, -Inf, Inf), 2, 2),
  minexpcount = 5,
  maxProcessor,
  doMethods,
  B = 5000,
  ReturnTSextra = FALSE
)

Arguments

x

a matrix with the data set

pnull

cdf under the null hypothesis

rnull

routine to generate data under the null hypothesis

phat

=function(x) -99, function to estimate parameters from the data, or -99 if no parameters are estimated

dnull

=function(x) -99, density function under the null hypothesis, if available, or -99 if missing

TS

user supplied function to find test statistics, if any.

TSextra

(optional) list passed to TS, if needed.

rate

=0, rate of the Poisson distribution if the sample size is random, 0 if the sample size is fixed

nbins

=c(5, 5) number of bins for chi-square tests

Ranges

=matrix(c(-Inf, Inf, -Inf, Inf),2,2), a 2x2 matrix with lower and upper bounds, if any, for chi-square tests

minexpcount

=5 minimal expected bin count required

maxProcessor

number of processors to use in parallel processing.

doMethods

a vector of codes for the methods to include. If ="all", all included tests are run. If missing, a default selection is run.

B

=5000 number of simulation runs. If B=0 the routine returns the test statistics.

ReturnTSextra

=FALSE, should setup info be returned?
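
The shapes expected for some of the arguments above can be sketched with base R only. This is a minimal illustration, not code from the package: pnull_demo, rnull_demo and Ranges_demo are hypothetical names, and two points are assumptions rather than documented behavior. The first is that a random sample size is obtained by letting the generator draw a Poisson sample size matching rate. The second is that each column of Ranges refers to one coordinate. That the rows hold the lower and upper bounds follows from the default matrix(c(-Inf, Inf, -Inf, Inf), 2, 2).

```r
# Hypothetical objects for illustration; only base R is used.

# cdf of two independent standard normal coordinates.
# Accepts either a single point (vector) or a matrix with one point per row.
pnull_demo <- function(x) {
  if (!is.matrix(x)) x <- rbind(x)  # treat a single point as a 1-row matrix
  pnorm(x[, 1]) * pnorm(x[, 2])
}

# Generator whose sample size is Poisson distributed.
# Assumption: this would be paired with rate = 100 in the call.
rate_demo <- 100
rnull_demo <- function() {
  n <- rpois(1, rate_demo)
  matrix(rnorm(2 * n), ncol = 2)
}

# Separate illustration: bounds for a model supported on the unit square.
# Row 1 holds the lower bounds, row 2 the upper bounds;
# one column per coordinate (assumed).
Ranges_demo <- matrix(c(0, 1, 0, 1), 2, 2)

x_demo <- rnull_demo()
dim(x_demo)            # the number of rows varies from call to call
pnull_demo(c(0, 0))    # single point
head(pnull_demo(x_demo))  # one cdf value per row
Ranges_demo
```

Writing the cdf so that it handles both a vector and a matrix mirrors the is.matrix check used by the pnull functions in the Examples section.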

Details

For details on the usage of this routine consult the vignette with vignette("MDgof","MDgof")

Value

A list with vectors of test statistics and p.values
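
To make the role of B concrete, the following is a generic sketch of how a simulation-based p value can be computed. It is not the package's implementation: ks_demo is a hypothetical Kolmogorov-Smirnov type statistic written for this illustration, the null model is assumed to be two independent standard normal coordinates, and the p value is taken as the simple proportion of simulated statistics at least as large as the observed one.

```r
set.seed(1)

B_demo <- 200  # number of simulation runs (kept small for illustration)

# Generate a sample of 100 points under the assumed null model
rnull_demo <- function() matrix(rnorm(200), ncol = 2)

# Hypothetical statistic: largest distance between the empirical cdf
# and the null cdf, evaluated at the observed points
ks_demo <- function(x) {
  Fhat <- sapply(seq_len(nrow(x)), function(i) {
    mean(x[, 1] <= x[i, 1] & x[, 2] <= x[i, 2])
  })
  F0 <- pnorm(x[, 1]) * pnorm(x[, 2])
  max(abs(Fhat - F0))
}

x_obs <- rnull_demo()                 # stands in for the observed data
ts_obs <- ks_demo(x_obs)              # observed test statistic

# Null distribution of the statistic, approximated by B_demo simulated data sets
ts_sim <- replicate(B_demo, ks_demo(rnull_demo()))

# Proportion of simulated statistics at least as large as the observed one
p_value <- mean(ts_sim >= ts_obs)
p_value
```

The resolution of such a p value is 1/B, which is why a value as small as B=10 is only suitable for quick checks.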

Examples

# All examples are run with a very small B (10 or 20) and maxProcessor=1 to pass CRAN checks.
# This is obviously MUCH TOO SMALL for any real usage.
# Tests to see whether data comes from a bivariate standard normal distribution, 
# without parameter estimation.
rnull=function() mvtnorm::rmvnorm(100, c(0, 0))
x=rnull()
pnull=function(x) {
  if(!is.matrix(x)) return(mvtnorm::pmvnorm(rep(-Inf, 2), x))
  apply(x, 1, function(x) mvtnorm::pmvnorm(rep(-Inf, 2), x))
}
gof_test(x, pnull, rnull, B=10, maxProcessor = 1)
# Same as above, but now with density included
dnull=function(x) {
  if(!is.matrix(x)) return(mvtnorm::dmvnorm(x))
  apply(x, 1, function(x) mvtnorm::dmvnorm(x))
}
gof_test(x, pnull, rnull, dnull=dnull, B=20, maxProcessor = 1)
# Tests to see whether data comes from a bivariate normal distribution
# with identity covariance matrix, with the mean vector estimated.
rnull=function(p) mvtnorm::rmvnorm(100, p)
x=rnull(c(0,1))
pnull=function(x,p) {
  if(!is.matrix(x)) return(mvtnorm::pmvnorm(rep(-Inf, 2), x, mean=p))
  apply(x, 1, function(x) mvtnorm::pmvnorm(rep(-Inf, 2), x, mean=p))
}
dnull=function(x, p) {
  if(!is.matrix(x)) return(mvtnorm::dmvnorm(x, mean=p))
  apply(x, 1, function(x) mvtnorm::dmvnorm(x, mean=p))
}
phat=function(x) apply(x, 2, mean)
gof_test(x, pnull, rnull, dnull=dnull, phat=phat,B=20, maxProcessor = 1)
# Example of a discrete model, with parameter estimation
# X~Bin(10, p1), Y|X=x~Bin(5, p2+x/100)
rnull=function(p) {
  x=rbinom(1000, 10, p[1])
  y=rbinom(1000, 5, p[2]+x/100)
  MDgof::sq2rec(table(x, y))
}
pnull=function(x, p) {
  f=function(x) sum(dbinom(0:x[1], 10, p[1])*pbinom(x[2], 5, p[2]+0:x[1]/100))
  if(!is.matrix(x)) x=rbind(x)
  apply(x, 1, f)
}
phat=function(x) {
  tx=tapply(x[,3], x[,1], sum)
  p1=mean(rep(as.numeric(names(tx)), times=tx))/10
  ty=tapply(x[,3], x[,2], sum)
  p2=mean(rep(as.numeric(names(ty)), times=ty))/5-p1/10
  c(p1, p2)
}
x=rnull(c(0.5, 0.5))
gof_test(x, pnull, rnull, phat=phat,B=10, maxProcessor = 1)

MDgof documentation built on Feb. 13, 2026, 1:06 a.m.