Description Usage Arguments Value Notes on Fleishman Method Notes on Headrick's Method Reasons for Function Errors References See Also Examples
View source: R/calc_lower_skurt.R
This function calculates the lower boundary of standardized kurtosis for Fleishman's Third-Order (method
= "Fleishman",
doi: 10.1007/BF02293811) or Headrick's Fifth-Order (method
= "Polynomial", doi: 10.1016/S0167-9473(02)00072-5), given values of skewness and standardized fifth and sixth
cumulants. It uses nleqslv
to search for solutions to the multi-constraint Lagrangean expression in either
fleish_skurt_check
or poly_skurt_check
. When Headrick's method
is used (method
= "Polynomial"), if no solutions converge and a vector of sixth cumulant correction values (Six
) is
provided, the smallest value is found that yields solutions. Otherwise, the function stops with an error.
Each set of constants is checked for a positive correlation with the underlying normal variable
(using power_norm_corr
) and a valid power method pdf (using pdf_check
).
If the correlation is <= 0, the signs of c1 and c3 are reversed (for method
= "Fleishman"),
or c1, c3, and c5 (for method
= "Polynomial"). It will return a kurtosis value with constants that yield in invalid pdf
if no other solutions can be found (valid.pdf
= "FALSE"). If a vector of kurtosis correction values (Skurt
) is provided, the function
finds the smallest value that produces a kurtosis with constants that yield a valid pdf. If valid pdf constants
still can not be found, the original invalid pdf constants (calculated without a correction) will be provided. If no solutions
can be found, an error is given and the function stops. Please note that this function can take
considerable computation time, depending on the number of starting values (n) and lengths of kurtosis (Skurt
) and sixth cumulant
(Six
) correction vectors. Different seeds should be tested to see if a lower boundary can be found.
1 2 3 |
method |
the method used to find the constants. "Fleishman" uses a third-order polynomial transformation and requires only a skewness input. "Polynomial" uses Headrick's fifth-order transformation and requires skewness plus standardized fifth and sixth cumulants. |
skews |
the skewness value |
fifths |
the standardized fifth cumulant (if |
sixths |
the standardized sixth cumulant (if |
Skurt |
a vector of correction values to add to the lower kurtosis boundary if the constants yield an invalid pdf,
ex: |
Six |
a vector of correction values to add to the sixth cumulant if no solutions converged,
ex: |
xstart |
initial value for root-solving algorithm (see |
seed |
the seed value for random starting value generation (default = 104) |
n |
the number of initial starting values to use (default = 50). More starting values require more calculation time. |
A list with components:
Min
a data.frame containing the skewness, fifth and sixth standardized cumulants (if method
= "Polynomial"), constants,
a valid.pdf column indicating whether or not the constants generate a valid power method pdf, and the minimum value of standardized
kurtosis ("skurtosis")
C
a data.frame of valid power method pdf solutions, containing the skewness, fifth and sixth standardized cumulants
(if method
= "Polynomial"), constants, a valid.pdf column indicating TRUE, and all values of standardized kurtosis ("skurtosis").
If the Lagrangean equations yielded valid pdf solutions, this will also include the lambda values, and for method
= "Fleishman", the
Hessian determinant and a minimum column indicating TRUE if the solutions give a minimum kurtosis. If the Lagrangean equations
yielded invalid pdf solutions, this data.frame contains constants calculated from find_constants
using the kurtosis correction.
Invalid.C
if the Lagrangean equations yielded invalid pdf solutions, a data.frame containing the skewness, fifth and sixth
standardized cumulants (if method
= "Polynomial"), constants, lambda values, a valid.pdf column indicating FALSE, and all values
of standardized kurtosis ("skurtosis"). If method
= "Fleishman", also the
Hessian determinant and a minimum column indicating TRUE if the solutions give a minimum kurtosis.
Time
the total calculation time in minutes
start
a matrix of starting values used in root-solver
SixCorr1
if Six is specified, the sixth cumulant correction required to achieve converged solutions
SkurtCorr1
if Skurt is specified, the kurtosis correction required to achieve a valid power method pdf (or the maximum value
attempted if no valid pdf solutions could be found)
The Fleishman method can not generate valid power method distributions with a ratio of skew^2/skurtosis > 9/14, where skurtosis is kurtosis - 3. This prevents the method from being used for any of the Chi-squared distributions, which have a constant ratio of skew^2/skurtosis = 2/3.
Symmetric Distributions: All symmetric distributions (which have skew = 0) possess the same lower kurtosis boundary. This is solved for
using optimize
and the equations in Headrick & Sawilowsky (2002, doi: 10.3102/10769986025004417).
The result will always be: c0 = 0, c1 = 1.341159,
c2 = 0, c3 = -0.1314796, and minimum standardized kurtosis = -1.151323. Note that this set of constants does NOT generate a valid
power method pdf. If a Skurt
vector of kurtosis correction values is provided, the function will find the smallest addition that yields a
valid pdf. This value is 1.16, giving a lower kurtosis boundary of 0.008676821.
Asymmetric Distributions: Due to the square roots involved in the calculation of the lower kurtosis boundary (see Headrick & Sawilowsky, 2002), this function uses the absolute value of the skewness. If the true skewness is less than zero, the signs on the constants c0 and c2 are switched after calculations (which changes skewness from positive to negative without affecting kurtosis).
Verification of Minimum Kurtosis: Since differentiability is a local property, it is possible to obtain a local, instead of a global, minimum.
For the Fleishman method, Headrick & Sawilowsky (2002) explain that since the equation for kurtosis is not "quasiconvex on the domain
consisting only of the nonnegative orthant," second-order conditions must be verified. The solutions for
lambda, c1, and c3 generate a global kurtosis minimum if and only if the determinant of a bordered Hessian is less than zero. Therefore,
this function first obtains the solutions to the Lagrangean expression in fleish_skurt_check
for a given
skewness value. These are used to calculate the standardized kurtosis, the constants c1 and c3, and the Hessian determinant
(using fleish_Hessian
). If this determinant is less than zero, the kurtosis is indicated as a minimum.
The constants c0, c1, c2, and c3 are checked to see if they yield a continuous variable with a positive correlation with the generating
standard normal variable (using power_norm_corr
). If not, the signs of c1 and c3 are switched.
The final set of constants is checked to see if they generate a valid power method pdf (using pdf_check
).
If a Skurt
vector of kurtosis correction values is provided, the function will find the smallest value that yields a
valid pdf.
The sixth cumulant correction vector (Six
) may be used in order to aid in obtaining solutions which converge. The calculation methods
are the same for symmetric or asymmetric distributions, and for positive or negative skew.
Verification of Minimum Kurtosis: For the fifth-order approximation, Headrick (2002, doi: 10.1016/S0167-9473(02)00072-5) states
"it is assumed that the hypersurface of the objective function [for the kurtosis
equation] has the appropriate (quasiconvex) configuration." This assumption alleviates the need to check second-order conditions.
Headrick discusses steps he took to verify the kurtosis solution was in fact a minimum, including: 1) substituting the constant solutions
back into the 1st four Lagrangean constraints to ensure the results are zero, 2) substituting the skewness, kurtosis solution, and
standardized fifth and sixth cumulants back into the fifth-order equations to ensure the same constants are produced
(i.e. using find_constants
), and 3) searching for values below the kurtosis solution that solve the
Lagrangean equation. This function ensures steps 1 and 2 by the nature of the root-solving algorithm of nleqslv
.
Using a sufficiently large n (and, if necessary, executing the function for different seeds) makes step 3 unnecessary.
The most likely cause for function errors is that no solutions to fleish_skurt_check
or
poly_skurt_check
converged. If this happens,
the simulation will stop. Possible solutions include: a) increasing the number of initial starting values (n
),
b) using a different seed, or c) specifying a Six
vector of sixth cumulant correction values (for method
= "Polynomial").
If the standardized cumulants are obtained from calc_theory
, the user may need to use rounded values as inputs (i.e.
skews = round(skews, 8)
). Due to the nature of the integration involved in calc_theory
, the results are
approximations. Greater accuracy can be achieved by increasing the number of subdivisions (sub
) used in the integration
process. For example, in order to ensure that skew is exactly 0 for symmetric distributions.
Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43, 521-532. doi: 10.1007/BF02293811.
Hasselman B (2018). nleqslv: Solve Systems of Nonlinear Equations. R package version 3.3.2. https://CRAN.R-project.org/package=nleqslv
Headrick TC (2002). Fast Fifth-order Polynomial Transforms for Generating Univariate and Multivariate Non-normal Distributions. Computational Statistics & Data Analysis, 40(4):685-711. doi: 10.1016/S0167-9473(02)00072-5. (ScienceDirect)
Headrick TC (2004). On Polynomial Transformations for Simulating Multivariate Nonnormal Distributions. Journal of Modern Applied Statistical Methods, 3(1), 65-71. doi: 10.22237/jmasm/1083370080.
Headrick TC, Kowalchuk RK (2007). The Power Method Transformation: Its Probability Density Function, Distribution Function, and Its Further Use for Fitting Data. Journal of Statistical Computation and Simulation, 77, 229-249. doi: 10.1080/10629360600605065.
Headrick TC, Sawilowsky SS (1999). Simulating Correlated Non-normal Distributions: Extending the Fleishman Power Method. Psychometrika, 64, 25-35. doi: 10.1007/BF02294317.
Headrick TC, Sawilowsky SS (2002). Weighted Simplex Procedures for Determining Boundary Points and Constants for the Univariate and Multivariate Power Methods. Journal of Educational and Behavioral Statistics, 25, 417-436. doi: 10.3102/10769986025004417.
Headrick TC, Sheng Y, & Hodis FA (2007). Numerical Computing and Graphics for the Power Method Transformation Using Mathematica. Journal of Statistical Software, 19(3), 1 - 17. doi: 10.18637/jss.v019.i03.
nleqslv
, fleish_skurt_check
,
fleish_Hessian
, poly_skurt_check
,
power_norm_corr
, pdf_check
,
find_constants
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 | # Normal distribution with Fleishman transformation
calc_lower_skurt("Fleishman", 0, 0, 0)
## Not run:
# This example takes considerable computation time.
# Reproduce Headrick's Table 2 (2002, p.698): note the seed here is 104.
# If you use seed = 1234, you get higher Headrick kurtosis values for V7 and V9.
# This shows the importance of trying different seeds.
options(scipen = 999)
V1 <- c(0, 0, 28.5)
V2 <- c(0.24, -1, 11)
V3 <- c(0.48, -2, 6.25)
V4 <- c(0.72, -2.5, 2.5)
V5 <- c(0.96, -2.25, -0.25)
V6 <- c(1.20, -1.20, -3.08)
V7 <- c(1.44, 0.40, 6)
V8 <- c(1.68, 2.38, 6)
V9 <- c(1.92, 11, 195)
V10 <- c(2.16, 10, 37)
V11 <- c(2.40, 15, 200)
G <- as.data.frame(rbind(V1, V2, V3, V4, V5, V6, V7, V8, V9, V10, V11))
colnames(G) <- c("g1", "g3", "g4")
# kurtosis correction vector (used in case of invalid power method pdf constants)
Skurt <- seq(0.01, 2, 0.01)
# sixth cumulant correction vector (used in case of no converged solutions for
# method = "Polynomial")
Six <- seq(0.1, 10, 0.1)
# Fleishman's Third-order transformation
F_lower <- list()
for (i in 1:nrow(G)) {
F_lower[[i]] <- calc_lower_skurt("Fleishman", G[i, 1], Skurt = Skurt,
seed = 104)
}
# Headrick's Fifth-order transformation
H_lower <- list()
for (i in 1:nrow(G)) {
H_lower[[i]] <- calc_lower_skurt("Polynomial", G[i, 1], G[i, 2], G[i, 3],
Skurt = Skurt, Six = Six, seed = 104)
}
# Approximate boundary from PoisBinOrdNonNor
PBON_lower <- G$g1^2 - 2
# Compare results:
# Note: 1) the lower Headrick kurtosis boundary for V4 is slightly lower than the
# value found by Headrick (-0.480129), and
# 2) the approximate lower kurtosis boundaries used in PoisBinOrdNonNor are
# much lower than the actual Fleishman boundaries, indicating that the
# guideline is not accurate.
Lower <- matrix(1, nrow = nrow(G), ncol = 12)
colnames(Lower) <- c("skew", "fifth", "sixth", "H_valid.skurt",
"F_valid.skurt", "H_invalid.skurt", "F_invalid.skurt",
"PBON_skurt", "H_skurt_corr", "F_skurt_corr",
"H_time", "F_time")
for (i in 1:nrow(G)) {
Lower[i, 1:3] <- as.numeric(G[i, 1:3])
Lower[i, 4] <- ifelse(H_lower[[i]]$Min[1, "valid.pdf"] == "TRUE",
H_lower[[i]]$Min[1, "skurtosis"], NA)
Lower[i, 5] <- ifelse(F_lower[[i]]$Min[1, "valid.pdf"] == "TRUE",
F_lower[[i]]$Min[1, "skurtosis"], NA)
Lower[i, 6] <- min(H_lower[[i]]$Invalid.C[, "skurtosis"])
Lower[i, 7] <- min(F_lower[[i]]$Invalid.C[, "skurtosis"])
Lower[i, 8:12] <- c(PBON_lower[i], H_lower[[i]]$SkurtCorr1,
F_lower[[i]]$SkurtCorr1,
H_lower[[i]]$Time, F_lower[[i]]$Time)
}
Lower <- as.data.frame(Lower)
print(Lower[, 1:8], digits = 4)
# skew fifth sixth H_valid.skurt F_valid.skurt H_invalid.skurt F_invalid.skurt PBON_skurt
# 1 0.00 0.00 28.50 -1.0551 0.008677 -1.3851 -1.1513 -2.0000
# 2 0.24 -1.00 11.00 -0.8600 0.096715 -1.2100 -1.0533 -1.9424
# 3 0.48 -2.00 6.25 -0.5766 0.367177 -0.9266 -0.7728 -1.7696
# 4 0.72 -2.50 2.50 -0.1319 0.808779 -0.4819 -0.3212 -1.4816
# 5 0.96 -2.25 -0.25 0.4934 1.443567 0.1334 0.3036 -1.0784
# 6 1.20 -1.20 -3.08 1.2575 2.256908 0.9075 1.1069 -0.5600
# 7 1.44 0.40 6.00 NA 3.264374 1.7758 2.0944 0.0736
# 8 1.68 2.38 6.00 NA 4.452011 2.7624 3.2720 0.8224
# 9 1.92 11.00 195.00 5.7229 5.837442 4.1729 4.6474 1.6864
# 10 2.16 10.00 37.00 NA 7.411697 5.1993 6.2317 2.6656
# 11 2.40 15.00 200.00 NA 9.182819 6.6066 8.0428 3.7600
Lower[, 9:12]
# H_skurt_corr F_skurt_corr H_time F_time
# 1 0.33 1.16 1.757 8.227
# 2 0.35 1.15 1.566 8.164
# 3 0.35 1.14 1.630 6.321
# 4 0.35 1.13 1.537 5.568
# 5 0.36 1.14 1.558 5.540
# 6 0.35 1.15 1.602 6.619
# 7 2.00 1.17 9.088 8.835
# 8 2.00 1.18 9.425 11.103
# 9 1.55 1.19 6.776 14.364
# 10 2.00 1.18 11.174 15.382
# 11 2.00 1.14 10.567 18.184
# The 1st 3 columns give the skewness and standardized fifth and sixth cumulants.
# "H_valid.skurt" gives the lower kurtosis boundary that produces a valid power method pdf
# using Headrick's approximation, with the kurtosis addition given in the "H_skurt_corr"
# column if necessary.
# "F_valid.skurt" gives the lower kurtosis boundary that produces a valid power method pdf
# using Fleishman's approximation, with the kurtosis addition given in the "F_skurt_corr"
# column if necessary.
# "H_invalid.skurt" gives the lower kurtosis boundary that produces an invalid power method
# pdf using Headrick's approximation, without the use of a kurtosis correction.
# "F_valid.skurt" gives the lower kurtosis boundary that produces an invalid power method
# pdf using Fleishman's approximation, without the use of a kurtosis correction.
# "PBON_skurt" gives the lower kurtosis boundary approximation used in the PoisBinOrdNonNor
# package.
# "H_time" gives the computation time (minutes) for Headrick's method.
# "F_time" gives the computation time (minutes) for Fleishman's method.
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.