pistar.uv: The Mixture Index of Fit for Univariate Distributions
In jmedzihorsky/pistar: Rudas, Clogg and Lindsay Mixture Index of Fit

pistar.uv

R Documentation

Description

pistar.uv estimates the pi* index of fit for any user-supplied univariate distribution. The user must supply a probability mass or density function that takes the data as its first argument and the parameters as the arguments that follow. See ‘Details’ for the estimation procedures. Standard errors are available via the jackknife, as suggested by Dayton (2003).

Usage

pistar.uv(data, dfn, n_par = NULL, inits = NULL, discrete = FALSE, 
          freq = FALSE, lower = NULL, upper = NULL, jack = FALSE, 
          method = "Nelder-Mead", control = list(maxit = 2000), 
          verbose = TRUE, npk = 1e3, eps = .Machine$double.neg.eps^0.5)

Arguments

`data`	a vector or a frequency table.
`dfn`	function: probability mass or density function that inputs the data as the first argument and the parameters as the arguments that immediately follow it.
`n_par`	numeric: number of parameters. Either `n_par` or `inits` must be supplied. If only `n_par` is supplied, initial values are generated internally and might not always be suitable.
`inits`	a vector or list of initial values of parameters supplied to `optim`. If named, the parameter names are preserved in the output.
`discrete`	logical: is the distribution discrete?
`freq`	logical: is the supplied data a frequency table? Relevant only if `discrete` is `TRUE`.
`lower`	numeric: a vector of lower bounds for parameters.
`upper`	numeric: a vector of upper bounds for parameters.
`jack`	logical: perform jackknife?
`method`	`method` argument for `optim`. Default `"Brent"` for mono-parameter functions and `"Nelder-Mead"` for multi-parameter functions. See `optim` for details on methods.
`control`	list supplied to `optim`, see `optim`.
`verbose`	logical: print during estimation?
`npk`	an integer indicating the number of points for `density`; used if `discrete = FALSE`.
`eps`	numeric: the smallest number practically indistinguishable from 0. Used only if `discrete = TRUE`.
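
As a minimal sketch of the calling convention that `dfn` is expected to follow (data first, parameters immediately after), the block below defines a zero-truncated Poisson mass function. The name `dztpois`, the parameter name `lambda`, and the starting value are illustrative assumptions, not part of the package.

```r
# Hypothetical example of a user-supplied `dfn`: a zero-truncated Poisson pmf.
# The data `x` come first; the single parameter `lambda` follows.
dztpois <- function(x, lambda) {
  z <- dpois(x, lambda) / (1 - dpois(0, lambda))
  z[x < 1] <- 0   # no probability mass at or below zero
  z
}

# Named initial values (assumed starting point); per the `inits` entry above,
# the names are preserved in the output.
inits <- c(lambda = 1)

# Sanity check: the pmf should sum to approximately 1 over its support.
total_mass <- sum(dztpois(0:200, 2))
```

A function of this shape could then be passed as `dfn = dztpois` together with `inits = inits` and `discrete = TRUE`.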

Details

The general procedure is the same for discrete and continuous distributions: a general-purpose optimization method searches for the parameter values of the supplied distribution that minimize the following quantity: 1 minus the inverse of the ratio of the model density to the observed density, taken at the point of their support where this ratio is highest. The minimized value of this quantity is pi*.
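
The quantity described above can be sketched for a discrete case as follows. This is an illustration under stated assumptions, not the package's internal code: the observed proportions are made-up numbers, the helper names `pistar_from_probs` and `pistar_fixed` are hypothetical, and the model is renormalised on the observed support for simplicity.

```r
# Made-up observed relative frequencies for the counts 0..5 (they sum to 1).
x     <- 0:5
p_obs <- c(0.15, 0.25, 0.25, 0.20, 0.10, 0.05)

# The quantity for given model and observed probabilities:
# 1 minus the inverse of the largest model-to-observed ratio.
pistar_from_probs <- function(p_mod, p_obs) {
  1 - 1 / max(p_mod / p_obs)
}

# The same quantity as a function of a single Poisson parameter.
pistar_fixed <- function(lambda) {
  p_mod <- dpois(x, lambda)
  p_mod <- p_mod / sum(p_mod)   # renormalise on the observed support (assumption)
  pistar_from_probs(p_mod, p_obs)
}

# Minimise over the parameter; "Brent" needs finite bounds, chosen arbitrarily here.
fit <- optim(par = 2, fn = pistar_fixed, method = "Brent",
             lower = 0.1, upper = 10)
fit$par     # parameter value at the minimum found
fit$value   # corresponding value of the quantity, i.e. the pi* estimate
```

Because the objective involves a maximum over support points, it is not smooth, so the optimizer may stop at a local minimum; trying several bounds or starting values is a reasonable precaution.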

The two cases differ only in how the observed density is obtained. In the discrete case the observed frequencies are used as the observed density. In the continuous case a kernel density is estimated using `density` with a Gaussian kernel.
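
The two ways of obtaining the observed density might look roughly like this in base R. The simulated data, the fixed normal parameters, and the variable names are assumptions made for illustration; how the package treats grid points where the kernel estimate is close to zero is not shown here.

```r
set.seed(1)

# Continuous case: Gaussian kernel density evaluated on a grid of points.
y  <- rnorm(200)
kd <- density(y, kernel = "gaussian", n = 1000)

# Model density at the same grid points, for fixed (not optimised) parameters.
f_mod <- dnorm(kd$x, mean = 0, sd = 1)

# Value of the criterion at these fixed parameter values.
pi_fixed <- 1 - 1 / max(f_mod / kd$y)

# Discrete case: observed relative frequencies taken from a frequency table.
e     <- rpois(200, 2)
p_obs <- prop.table(table(e))
```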

Value

An object of class "Pistar" and "PistarUV" and, depending on the `discrete` argument of the function, either "PistarDUV" or "PistarCUV", with the following slots:

`call`	the matched call.
`pistar`	a list of estimated values of the mixture index of fit: `est` for the supplied data; `jack` a vector of values from the jackknife.
`pred`	a list with three items: `model`, the model component multiplied by (1-pi*); `unres`, the unrestricted component multiplied by pi*; `combi`, the two-point mixture, i.e. (1-pi*)M + pi*U. If `discrete = TRUE` the items contain predicted values; if `discrete = FALSE` they contain the values of the scaled densities at `npk` (by default `1e3`) points.
`data`	the supplied data.
`param`	a list of parameter estimates of interest: `est` the estimated values; `jack` the estimates from each jackknife replication.
`meth`	`method` of `optim` used.
`conv`	a list of integer codes from `optim` that indicate convergence of the optimization algorithm: `est` from estimation with the supplied data; `jack` from jackknife replications. Any value that is not 0 suggests problems. See `optim` for details.
`mess`	a list of messages passed from `optim`: `est` from the main estimation; `jack` from jackknife replications. See `optim` for details.
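
The jackknife values can be turned into a standard error with the usual delete-one jackknife formula, sketched below. The vector `jack` here is a made-up stand-in for leave-one-out estimates of pi*, and whether the package's own summaries use exactly this formula is an assumption not confirmed by this page.

```r
# Hypothetical leave-one-out estimates of pi*, standing in for a `jack` vector.
jack <- c(0.21, 0.19, 0.22, 0.20, 0.18, 0.23, 0.20, 0.21)
n    <- length(jack)

# Standard delete-one jackknife standard error.
se_jack <- sqrt((n - 1) / n * sum((jack - mean(jack))^2))
```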

Note

The application of the mixture index of fit for discrete distributions was proposed by Dayton (2003).

Author(s)

Juraj Medzihorsky

References

Dayton, C. M. (2003) Applications and computational strategies for the two-point mixture index of fit. British Journal of Mathematical & Statistical Psychology, 56, 1-13.

Examples

	#	(1)	discrete
	#	simulate data
	set.seed(1989)
	e <- c(rpois(1e3, 2), rpois(2e2, 5))

	#	make a frequency table
	te <- freq.table(e)

	#	define a function for a slice from Poisson
	md <- function(x, l, lo=0, up=5){
		z <- dpois(x, l)
		z[x<lo] <- 0
		z[x>up] <- 0
		z <- z/sum(z)
		return(z)
	}

	#	find pi*
	pe <- pistar(proc="uv", data=te, dfn=md, n_par=1,
				 discrete=TRUE, freq=TRUE, jack=FALSE)

	pe

	summary(pe)

	plot(pe)


	#	(2)	continuous
	#	simulate data
	set.seed(1989)
	y <- c(rnorm(1e2, 0, 2), runif(2e1, -1, 1))

	#	find pi* and parameters for normal dist.
	py <- pistar(proc="uv", data=y, dfn=dnorm, n_par=2, discrete=FALSE, 
				 jack=FALSE)

	py

	summary(py)

	plot(py)

jmedzihorsky/pistar documentation built on June 4, 2022, 9:58 a.m.