pfactor.bernstein: Estimation of Optimal p-factor of Distributional Support...
In lmomco: L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

pfactor.bernstein

R Documentation

Estimation of Optimal p-factor of Distributional Support Estimation for Smoothed Quantiles from the Bernstein or Kantorovich Polynomials

Description

Compute the optimal p-factor through numerical integration of the smoothed empirical quantile function to estimate the L-moments of the distribution. This function attempts to report an optimal “p-factor” (author's term) for the given parent distribution in para based on estimating the crossing of the origin of an error between the given L-moment ratio \tau_r for 3, 4, and 5 that will come from either the distribution parameter object or given as an argument in lmr.dist. The estimated support of the distribution is that shown by Turnbull and Ghosh (2014) and is computed as follows

\biggl(x_{0:n},\: x_{n+1:n}\biggr) = \biggl(x_{1:n} - \frac{(x_{2:n} - x_{1:n})}{(1 - p)^{-2} - 1},\: x_{n:n} + \frac{(x_{n:n} - x_{n-1:n})}{(1 - p)^{-2} - 1}\biggr)\mbox{,}

where p is the p-factor. The support will honor natural bounds if given by either fix.lower or fix.upper. The polynomial type for smooth is provided in poly.type. These three arguments are the same as those for dat2bernqua and lmoms.bernstein. The statistic type used to measure central tendency of the errors for the nsim simulations per p. The function has its own hardwired p-factors to compute but these can be superseded by the pfactors argument. The p.lo and p.hi are the lower and upper bounds to truncate on immediately after the p-factors to use are assembled. These are made for three purposes: (1) protection against numerical problems for mathematical upper limits (unity), (2) to potentially provide for much faster execution if the user already knows the approximate optimal value for the p-factor, and (3) to potentially use this function in a direct optimization framework using the R functions optim or uniroot. It is strongly suggested to keep plot.em set so the user can inspect the computations.

Usage

pfactor.bernstein(para, x=NULL, n=NULL,
                        bern.control=NULL,
                        poly.type=c("Bernstein", "Kantorovich"),
                        stat.type=c("Mean", "Median"),
                        fix.lower=NULL, fix.upper=NULL,
                        lmr.dist=NULL, lmr.n=c("3", "4", "5"),
                        nsim=500, plot.em=TRUE, pfactors=NULL,
                        p.lo=.Machine$double.eps, p.hi=1)

Arguments

`para`	A mandatory “parent” distribution defined by a usual lmomco distribution parameter object for a distribution. The simulations are based on this distribution, although optimization for `p` can be set to a different L-moment value by `lmr.dist`.
`x`	An optional vector of data values.
`n`	An optional sample size to run the simulations on. This value is computed by `length(x)` if `x` is provided. If set by argument, then that size supersedes the length of the optional observed sample.
`bern.control`	A `list` that holds `poly.type`, `stat.type`, `fix.lower`, and `fix.upper`. And this list will supersede the respective values provided as separate arguments. There is an implicit `bound.type` of `"Carv"`.
`poly.type`	Same argument as for `dat2bernqua`.
`stat.type`	The central estimation statistic for each p-factor evaluated.
`fix.lower`	Same argument as for `dat2bernqua`.
`fix.upper`	Same argument as for `dat2bernqua`.
`lmr.dist`	This is the value for the `lmr.n` of the distribution in `para` unless explicitly set through `lmr.dist`.
`lmr.n`	The L-moment ratio number for p-factor optimization.
`nsim`	The number of simulations to run. Experiments suggest the default is adequate for reasonably small sample sizes—the simulation count can be reduced as `n` becomes large.
`plot.em`	A logical to trigger the diagnostic plot of the simulated errors and a smooth line through these errors.
`pfactors`	An optional vector of p-factors to loop through for the simulations. The vector computing internall is this is set to `NULL` seems to be more than adequate.
`p.lo`	An computational lower boundary for which the `pfactors` by argument or default are truncated to. The default for `lo` is to be quite small and does no truncate the default `pfactors`.
`p.hi`	An computational upper boundary for which the `pfactors` by argument or default are truncated to. The default for `hi` is unity, which is the true upper limit that results in a 0 slope between the `x_{0:n}` to `x_{1:n}` or `x_{n:n}` to `x_{n+1:n}` order statistics.

Value

An R list or real is returned. If pfactors is a single value, then the single value for the error statistic is returned, otherwise the list described will be. If the returned pfactor is NA, then likely the smooth line did not cross zero and the reason the user should keep plot.em=TRUE and inspect the plot. Perhaps revisions to the arguments will become evident. The contents of the list are

`pfactor`	The estimated value of `p` smoothed by `lowess` that has an error of zero, see `err.stat` as a function of `ps`.
`bounds.type`	`Carv`, which is the same bound type as needed by `dat2bernqua` and `lmoms.bernstein`.
`poly.type`	The given `poly.type`.
`stat.type`	The given `stat.type`. The “Mean” seems to be preferable.
`lmom.type`	A string of the L-moment type: “Tau3”, “Tau4”, “Tau5”.
`fix.lower`	The given fixed lower boundary, which could stay `NULL`.
`fix.upper`	The given fixed upper boundary, which could stay `NULL`.
`source`	An attribute identifying the computational source of the L-moments: “pfactor.bernstein”.
`ps`	The p-factors actually evaluated.
`err.stat`	The error statistic computed by `stat.type` of the simulated `\hat{\tau_r}` by integration provided by `lmoms.bernstein` minus the “true” value `\tau_r` provided by either `para` or given by `lmr.dist` where `r` is `lmr.n`.
`err.smooth`	The `lowess`-smoothed values for `err.stat` and the `pfactor` comes from a linear interpolation of this smooth for the error being zero.

Note

Repeated application of this function for various n would result in the analyst having a vector of n and p (pfactor). The analyst could then fit a regression equation and refine the estimated p(n). For example, a dual-logarithmic regression is suggested lm(log(p)~log(n)).

Also, symmetrical data likely see little benefit from optimizing on the symmetry-measuring L-moments Tau3 and Tau5; the analyst might prefer to optimize on peakedness measured by Tau4.

Note

This function is highly experimental and subject to extreme overhaul. Please contact the author if you are an interested party in Bernstein and Kantorovich polynomials.

Author(s)

W.H. Asquith

References

Turnbull, B.C., and Ghosh, S.K., 2014, Unimodal density estimation using Bernstein polynomials. Computational Statistics and Data Analysis, v. 72, pp. 13–29.

Examples

## Not run: 
pdf("pfactor_exampleB.pdf")
X <- exp(rnorm(200)); para <- parexp(lmoms(X))
# nsim is too small, but makes the following three not take too long
pfactor.bernstein(para, n=20, lmr.n="3", nsim=100, p.lo=.06, p.hi=.3)
pfactor.bernstein(para, n=20, lmr.n="4", nsim=100, p.lo=.06, p.hi=.3)
pfactor.bernstein(para, n=20, lmr.n="5", nsim=100, p.lo=.06, p.hi=.3)
dev.off()

## End(Not run)
## Not run: 
# Try intra-sample p-factor optimization from two perspectives. The 3-parameter
# GEV "over fits" the data and provides the parent.  Then use Tau3 of the fitted
# GEV for peakedness restraint and then use Tau3 of the data. Then repeat but use
# the apparent "exact" value of Tau3 for the true exponential parent.
pdf("pfactor_exampleB.pdf")
lmr <- vec2lmom(c(60,20)); paraA <- parexp(lmr); n <- 40
tr <- lmorph(par2lmom(paraA))$ratios[3]
X <- rlmomco(n, paraA); para <- pargev(lmoms(X))
F <- seq(0.001,0.999, by=0.001)
plot(qnorm(pp(X, a=0.40)), sort(X), type="n", log="y",
      xlab="Standard normal variate", ylab="Quantile",
      xlim=qnorm(range(F)), ylim=range(qlmomco(F,paraA)))
lines(qnorm(F), qlmomco(F, paraA), col=8, lwd=2)
lines(qnorm(F), qlmomco(F, para), lty=2)
points(qnorm(pp(X, a=0.40)), sort(X))

# Make sure to fill in the p-factor when needed!
bc <- list(poly.type = "Bernstein", bound.type="Carv",
           stat.type="Mean", fix.lower=0, fix.upper=NULL, p=NULL)
kc <- list(poly.type = "Kantorovich", bound.type="Carv",
           stat.type="Mean", fix.lower=0, fix.upper=NULL, p=NULL)

# Bernstein
A <- pfactor.bernstein(para,      n=n, nsim=100,              bern.control=bc)
B <- pfactor.bernstein(para, x=X, n=n, nsim=100,              bern.control=bc)
C <- pfactor.bernstein(para,      n=n, nsim=100, lmr.dist=tr, bern.control=bc)
D <- pfactor.bernstein(para, x=X, n=n, nsim=100, lmr.dist=tr, bern.control=bc)
plot(qnorm(pp(X, a=0.40)), sort(X), type="n", log="y",
      xlab="Standard normal variate", ylab="Quantile",
      xlim=qnorm(range(F)), ylim=range(qlmomco(F,paraA)))
lines(qnorm(F), qlmomco(F, paraA), col=8, lwd=2)
lines(qnorm(F), qlmomco(F, para), lty=2)
points(qnorm(pp(X, a=0.40)), sort(X))
      bc$p <- A$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=bc), col=2)
      bc$p <- B$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=bc), col=3)
      bc$p <- C$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=bc), col=2, lty=2)
      bc$p <- D$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=bc), col=3, lty=2)
# Kantorovich
A <- pfactor.bernstein(para,      n=n, nsim=100,              bern.control=kc)
B <- pfactor.bernstein(para, x=X, n=n, nsim=100,              bern.control=kc)
C <- pfactor.bernstein(para,      n=n, nsim=100, lmr.dist=tr, bern.control=kc)
D <- pfactor.bernstein(para, x=X, n=n, nsim=100, lmr.dist=tr, bern.control=kc)
plot(qnorm(pp(X, a=0.40)), sort(X), type="n", log="y",
      xlab="Standard normal variate", ylab="Quantile",
      xlim=qnorm(range(F)), ylim=range(qlmomco(F,paraA)))
lines(qnorm(F), qlmomco(F, paraA), col=8, lwd=2)
lines(qnorm(F), qlmomco(F, para), lty=2)
points(qnorm(pp(X, a=0.40)), sort(X))
      kc$p <- A$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=kc), col=2)
      kc$p <- B$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=kc), col=3)
      kc$p <- C$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=kc), col=2, lty=2)
      kc$p <- D$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=kc), col=3, lty=2)
dev.off()

## End(Not run)
## Not run: 
X <- exp(rnorm(200)); para <- parexp(lmoms(X))
"pfactor.root" <- function(para, p.lo, p.hi, ...) {
    afunc <- function(p, para=NULL, x=NULL, ...) {
      return(pfactor.bernstein(para=para, x=x, pfactors=p, ...)) }
    rt <- uniroot(afunc, c(p.lo, p.hi),
                  tol=0.001, maxiter=30, nsim=500, para=para, ...)
    return(rt)
}
pfactor.root(para, 0.05, 0.15, n=10, lmr.n="4")
pfactor.bernstein(para, n=10, lmr.n="4", nsim=200, p.lo=.05, p.hi=.15)

## End(Not run)

lmomco documentation built on May 29, 2024, 10:06 a.m.

lmomco index

Package overview

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

lmomco
L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

pfactor.bernstein: Estimation of Optimal p-factor of Distributional Support...
In lmomco: L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

Estimation of Optimal p-factor of Distributional Support Estimation for Smoothed Quantiles from the Bernstein or Kantorovich Polynomials

Description

Usage

Arguments

Value

Note

Note

Author(s)

References

See Also

Examples

Related to pfactor.bernstein in lmomco...

R Package Documentation

Browse R Packages

We want your feedback!

lmomco L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

pfactor.bernstein: Estimation of Optimal p-factor of Distributional Support... In lmomco: L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

Estimation of Optimal p-factor of Distributional Support Estimation for Smoothed Quantiles from the Bernstein or Kantorovich Polynomials

Description

Usage

Arguments

Value

Note

Note

Author(s)

References

See Also

Examples

Related to pfactor.bernstein in lmomco...

R Package Documentation

Browse R Packages

We want your feedback!

lmomco
L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

pfactor.bernstein: Estimation of Optimal p-factor of Distributional Support...
In lmomco: L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions