Function for creating frequency tables for random variate generators. A histogram is computed and the bin counts are stored in an array, which can be used to visualize possible defects of the pseudo-random variate generator and to run goodness-of-fit tests.
The function only works for generators for univariate distributions.
n |
sample size for one repetition (>=100). |
rep |
number of repetitions. |
rdist |
random variate generator for a univariate distribution. |
qdist |
quantile function for the distribution. |
pdist |
cumulative distribution function for the distribution. |
... |
parameters to be passed to |
breaks |
one of:
|
trunc |
boundaries of truncated domain. (optional) |
exactu |
logical.
If |
plot |
logical. If |
rvgt.ftable returns tables of bin counts similar to the hist
function. Bins can be specified either by the
number of break points between the cells of the histogram, or by a
list of break points in the u-scale.
In the former case the break points are constructed such that all bins
of the histogram have equal probability for the distribution under the
null hypothesis, i.e., the break points are equidistributed in the
u-scale using the formula u_i = i/(breaks-1), where
i = 0, …, breaks-1.
When the quantile function qdist is given, these points
are transformed into break points in the x-scale using
qdist(u_i). Thus the histogram can be computed directly
for random points X that are generated by means of rdist.
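The construction above can be sketched in base R. This is only an illustration under stated assumptions, not the internals of rvgt.ftable: the normal distribution with mean 1 and standard deviation 2 stands in for the distribution under test, and the use of findInterval and tabulate for binning is a choice made for this sketch.

```r
## Illustration only: base R, not the internals of rvgt.ftable.
set.seed(1)
breaks  <- 11                                 # 11 break points -> 10 bins
ubreaks <- (0:(breaks - 1)) / (breaks - 1)    # u_i = i/(breaks-1)
xbreaks <- qnorm(ubreaks, mean = 1, sd = 2)   # break points in x-scale (-Inf ... Inf)

x <- rnorm(1000, mean = 1, sd = 2)            # points from the generator under test

## bin index of every point, then the number of points per bin
counts <- tabulate(findInterval(x, xbreaks), nbins = breaks - 1)
```

Under the null hypothesis every bin has probability 1/(breaks-1), so the entries of counts should all be close to 100 here.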
Otherwise the cumulative distribution function pdist must be
given. If exactu is TRUE,
then all non-uniform random points X are first
transformed into uniformly distributed random numbers
U = pdist(X), for which the histogram is created.
This is slower than directly using X, but it is numerically more
robust because round-off errors in qdist have much more influence
than those in pdist.
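A minimal base R sketch of the exactu = TRUE idea, again using the normal distribution as a stand-in and not the actual code of rvgt.ftable: every point is mapped to the u-scale first and the histogram is built there.

```r
## Illustration only: histogram in the u-scale via U = pdist(X).
set.seed(2)
ubreaks <- seq(0, 1, length.out = 11)   # 10 equiprobable bins in u-scale
x <- rnorm(1000, mean = 1, sd = 2)      # non-uniform random points X
u <- pnorm(x, mean = 1, sd = 2)         # U = pdist(X), uniform under H0

## rightmost.closed = TRUE puts a value of exactly 1 into the last bin
counts <- tabulate(findInterval(u, ubreaks, rightmost.closed = TRUE),
                   nbins = length(ubreaks) - 1)
```

No quantile function is needed here, at the price of one pdist evaluation per generated point.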
If trunc is given, then the functions qdist and
pdist are rescaled to this given domain. It is recommended to
provide pdist even when qdist is given.
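The following sketch shows the textbook way to rescale a distribution function and a quantile function to a truncated domain (a, b). It is an assumption that rvgt.ftable rescales in this way; the helper names ptrunc and qtrunc are hypothetical and exist only for this illustration.

```r
## Illustration only: standard rescaling to a truncated domain.
## ptrunc and qtrunc are hypothetical helpers, not part of any package.
trunc <- c(1, Inf)
Fa <- pnorm(trunc[1])   # CDF at the lower boundary
Fb <- pnorm(trunc[2])   # CDF at the upper boundary

## CDF of the truncated distribution: 0 at trunc[1], 1 at trunc[2]
ptrunc <- function(x) (pnorm(x) - Fa) / (Fb - Fa)
## quantile function of the truncated distribution (inverse of ptrunc)
qtrunc <- function(u) qnorm(Fa + u * (Fb - Fa))

u <- c(0.25, 0.5, 0.75)
xq <- qtrunc(u)         # quantiles inside the truncated domain
```

With these helpers, ptrunc(qtrunc(u)) returns u up to round-off, and all quantiles lie above the lower boundary 1.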
If exactu is FALSE and the quantile function
qdist is missing, then the first sample of size n is
used to estimate the quantiles for the given break points using
function quantile. The break points in u-scale are
then recomputed using these quantiles by means of the given
probability function pdist.
This is usually (much) faster than calling pdist on each
generated point. However, the break points are slightly
perturbed (but this does not affect the correctness of the
frequency table).
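The procedure just described can be sketched in base R as follows. This is an illustration under assumptions: the standard normal distribution is the stand-in, the outermost break points are set to -Inf and Inf so that every point falls into some bin, and the counts are taken from a second sample; none of these details is confirmed by the text above.

```r
## Illustration only: break points estimated from a first sample.
set.seed(3)
n      <- 1000
breaks <- 11
u0     <- (0:(breaks - 1)) / (breaks - 1)   # requested break points in u-scale

first   <- rnorm(n)                                    # first sample of size n
xbreaks <- quantile(first, probs = u0, names = FALSE)  # empirical quantiles
xbreaks[c(1, breaks)] <- c(-Inf, Inf)                  # assumption: cover whole domain
ubreaks <- pnorm(xbreaks)                              # recomputed u-scale break points

x        <- rnorm(n)                                   # a further sample
counts   <- tabulate(findInterval(x, xbreaks), nbins = breaks - 1)
expected <- n * diff(ubreaks)                          # expected counts under H0
```

The bins are no longer exactly equiprobable, but since the expected counts are computed from the recomputed ubreaks, a goodness-of-fit test on counts against expected remains valid.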
The argument rep allows creating multiple such arrays of bin
counts and storing them in one table. This has two advantages:
It allows for huge total sample sizes that would otherwise exceed the available memory, and
it can be used to visualize test results for increasing sample sizes, or
allows for a two-level test.
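A rough sketch of how repeated bin counts might be collected and evaluated for increasing sample sizes, using base R only. The layout with one row per repetition and the variable names are assumptions of this sketch; the test used is the ordinary chi-square goodness-of-fit test from chisq.test.

```r
## Illustration only: one row of bin counts per repetition.
set.seed(4)
n <- 1000; nrep <- 5; nbins <- 10
ubreaks <- seq(0, 1, length.out = nbins + 1)

count <- matrix(0L, nrow = nrep, ncol = nbins)
for (r in seq_len(nrep)) {
  u <- pnorm(rnorm(n))   # only n points are held in memory at a time
  count[r, ] <- tabulate(findInterval(u, ubreaks, rightmost.closed = TRUE),
                         nbins = nbins)
}

## cumulative counts: row r holds the table for total sample size r*n
cumcount <- apply(count, 2, cumsum)

## chi-square p-value for each cumulative sample size
pvals <- apply(cumcount, 1,
               function(k) chisq.test(k, p = diff(ubreaks))$p.value)
```

Plotting pvals against the cumulative sample size shows how the test result evolves; the p-values of the individual rows of count could instead feed a two-level test.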
For discrete distributions the function pdist must be given,
and both arguments qdist and exactu are ignored.
Moreover, the given break points have to be adjusted according to the
probability function of the discrete distribution. In particular this
means that bins have to be collapsed when the probability of some
number is larger than the difference of break points in u-scale.
Thus the resulting tables may contain fewer break points than
requested.
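One plausible way to adjust break points for a discrete distribution is sketched below for the geometric distribution. This is an assumption made for illustration, not necessarily the adjustment rvgt.ftable performs: every interior break point is moved up to the next value that the distribution function actually attains, and duplicates are dropped.

```r
## Illustration only: collapsing bins for a discrete distribution.
prob   <- 0.35
breaks <- 11
u0     <- (0:(breaks - 1)) / (breaks - 1)   # requested break points

support <- 0:qgeom(1 - 1e-12, prob)         # numerically complete support
cdf     <- pgeom(support, prob)             # attainable values in u-scale

## move each interior break point up to the next attainable CDF value
interior <- u0[-c(1, breaks)]
adjusted <- sapply(interior, function(u) cdf[which(cdf >= u)[1]])

## several requested points collapse onto the same CDF value
ubreaks <- unique(c(0, adjusted, 1))
xbreaks <- support[match(ubreaks[-c(1, length(ubreaks))], cdf)]
```

Since P(X = 0) = 0.35 exceeds the requested bin width of 0.1, the break points 0.1, 0.2 and 0.3 all collapse onto 0.35, and ubreaks ends up shorter than u0.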
The type of distribution (continuous or discrete) is autodetected by the function.
An object of class "rvgt.ftable", which is a list with components:
n |
sample size. |
rep |
number of repetitions. |
ubreaks |
an array of break points in u-scale. |
xbreaks |
an array of break points in x-scale. |
count |
a matrix of |
dtype |
a string that contains the type of the distribution:
|
It is important that all given functions (rdist,
qdist, and pdist) accept the same arguments passed to
rvgt.ftable via the ... argument.
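Why this matters can be seen from a small sketch. The helper forward_dots is hypothetical and only mimics the pattern of passing the same ... to several functions; it is not taken from the package.

```r
## Illustration only: the same '...' is forwarded to every function.
## forward_dots is a hypothetical helper, not part of any package.
forward_dots <- function(rdist, pdist, n, ...) {
  x <- rdist(n, ...)   # generator receives mean and sd
  pdist(x, ...)        # distribution function receives the same arguments
}

set.seed(5)
u <- forward_dots(rnorm, pnorm, 100, mean = 1, sd = 2)   # consistent pair

## pexp() has no 'mean' or 'sd' argument, so the forwarded call fails
res <- tryCatch(forward_dots(rnorm, pexp, 10, mean = 1, sd = 2),
                error = function(e) e)
```

Here res is an error condition, because pexp does not accept the arguments that rnorm does. A mismatched pair of functions therefore needs a small wrapper that maps the arguments.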
The random variate generator rdist can alternatively be a
generator object from the Runuran package.
Sougata Chaudhuri sgtchaudhuri@gmail.com, Josef Leydold josef.leydold@wu.ac.at
W. Hörmann, J. Leydold, and G. Derflinger (2004): Automatic Nonuniform Random Variate Generation. Springer-Verlag, Berlin Heidelberg
See plot.rvgt.ftable for the syntax of the plotting method.
## Create a frequency table for normal distribution with mean 1 and
## standard deviation 2. Number of bins should be 50.
## Use a total sample size of 5 times 10^5 random variates.
ft <- rvgt.ftable(n=1e5,rep=5, rdist=rnorm,qdist=qnorm, breaks=51, mean=1,sd=2)
## Show histogram
plot(ft)
## Run a chi-square test
rvgt.chisq(ft)
## The following plots a histogram in a single call.
rvgt.ftable(n=1e5,rep=5, rdist=rnorm,qdist=qnorm, plot=TRUE)
## Use the cumulative distribution function when the quantile function
## is not available or if its round-off errors have serious impact.
ft <- rvgt.ftable(n=1e5,rep=5, rdist=rnorm,pdist=pnorm )
plot(ft)
## Create a frequency table for the normal distribution with
## non-equidistributed break points
ft <- rvgt.ftable(n=1e5,rep=5, rdist=rnorm,qdist=qnorm, breaks=1/(1:100))
plot(ft)
## A (naive) generator for a truncated normal distribution
rdist <- function(n) {
  x <- numeric(n)
  ## rejection sampling: redraw until the value falls into (1, Inf)
  for (i in seq_len(n)) {
    repeat { x[i] <- rnorm(1); if (x[i] > 1) break }
  }
  return(x)
}
ft <- rvgt.ftable(n=1e3,rep=5, rdist=rdist,
                  pdist=pnorm, qdist=qnorm, trunc=c(1,Inf))
plot(ft)
## An example for a discrete distribution
ft <- rvgt.ftable(n=1e5,rep=1, rdist=rgeom,pdist=pgeom, prob=0.123)
plot(ft)