| cnorm.shash | R Documentation |
This function fits a Sinh-Arcsinh (shash; Jones & Pewsey, 2009) regression model for continuous norm score modeling, where the distribution parameters vary smoothly as polynomial functions of age or other predictors. The shash distribution is well-suited for psychometric data as it can flexibly model skewness and tail weight independently, making it ideal for handling floor effects, ceiling effects, and varying degrees of individual differences across age groups. In a simulation study (Lenhard et al, 2019), the shash model demonstrated superior performance compared to other parametric approaches from the Box Cox family of functions. In contrast to Box Cox, Sinh-Arcsinh can model distributions including zero and negativ values.
cnorm.shash(
age,
score,
weights = NULL,
mu_degree = 3,
sigma_degree = 2,
epsilon_degree = 2,
delta_degree = 1,
delta = 1,
control = NULL,
scale = "T",
plot = TRUE
)
age |
A numeric vector of predictor values (typically age, but can be any continuous predictor). |
score |
A numeric vector of response values (raw test scores). Must be the same length as age. The value range is unresticted and it can include zeros and negative values. |
weights |
An optional numeric vector of weights for each observation. Useful for incorporating sampling weights. If NULL (default), all observations are weighted equally. |
mu_degree |
Integer specifying the degree of the polynomial for modeling the location parameter mu(age). Default is 3. Higher degrees allow more flexible modeling of how the central tendency changes with age, but may lead to overfitting with small samples. Common choices:
|
sigma_degree |
Integer specifying the degree of the polynomial for modeling the scale parameter sigma(age). Default is 2. This controls how the variability (spread) of scores changes with age. Lower degrees are often sufficient as variability typically changes more smoothly than location. |
epsilon_degree |
Integer specifying the degree of the polynomial for modeling the skewness parameter epsilon(age). Default is 2. This controls how the asymmetry of the distribution changes with age. |
delta_degree |
Integer specifying the plynomial for modelling the tail weight parameter delte(age). Default is 1. The tail weight can be fixed as well in case of numerical instability. In that case, set 'delta_degree' to NULL and specify a value for delta instead. Recommendation: Keep delta_degree low to avoid overfitting. |
delta |
Fixed tail weight parameter (must be > 0). Default is 1. This parameter controls the heaviness of the distribution tails and is kept constant across all ages in this implementation. It is only used, if 'delta_degree' is set to NULL. Common values:
|
control |
An optional list of control parameters passed to the
Increase |
scale |
Character string or numeric vector specifying the type of norm scale for output. This affects the scaling of derived norm scores but does not influence model fitting:
|
plot |
Logical indicating whether to automatically display a diagnostic plot of the fitted model. Default is TRUE. |
This implementation uses the Jones & Pewsey (2009) parameterization of the Sinh-Arcsinh distribution. Parameters are estimated using maximum likelihood via the L-BFGS-B algorithm. In case, optimization fails, try reducing model complexity by reducing polynomial degrees or fixing the delta parameter.
The shash distribution is defined by the transformation:
X = \mu + \sigma \cdot \sinh\left(\frac{\text{arcsinh}(Y) - \epsilon}{\delta}\right)
where Y is a standard normal variable, Y ~ N(0,1).
This transformation provides:
mu: Location parameter (similar to mean)
sigma: Scale parameter (similar to standard deviation)
epsilon: Skewness parameter (epsilon = 0 for symmetry)
delta: Tail weight parameter (delta = 1 for normal-like tails)
Choose polynomial degrees based on:
Sample size (higher degrees need more data)
Theoretical expectations about developmental trajectories
Model comparison criteria (AIC, BIC)
Visual inspection of fitted curves
For most applications, mu_degree = 3, sigma_degree = 2, epsilon_degree = 2, delta_degree = 1 provides a good balance of flexibility and parsimony.
An object of class "cnormShash" containing the fitted model results. This is a list with components:
mu_est |
Numeric vector of estimated coefficients for the location parameter mu(age). The first coefficient is the intercept, subsequent coefficients correspond to polynomial terms. |
sigma_est |
Numeric vector of estimated coefficients for the scale parameter log(sigma(age)). Note: These are coefficients for log(sigma) to ensure sigma > 0. |
epsilon_est |
Numeric vector of estimated coefficients for the skewness parameter epsilon(age). |
delta |
The fixed tail weight parameter value used in fitting. |
delta_est |
Numeric vector of estimated coefficients for the tail weight parameter delta(age) - in case, a degree has been set. |
se |
Numeric vector of standard errors for all estimated coefficients (if Hessian computation succeeds). |
mu_degree, sigma_degree, epsilon_degree |
The polynomial degrees used for each parameter. |
result |
Complete output from the |
The function requires the input data to have sufficient variability. Very small datasets or datasets with little age spread may cause convergence problems.
Polynomial models can exhibit edge effects at the boundaries of the age range. Predictions outside the observed age range should be made cautiously.
If convergence fails, try: (1) reducing polynomial degrees, (2) adjusting the delta parameter, (3) providing custom control parameters, or (4) checking for data quality issues.
The tail weight parameter delta is fixed across ages by default. For applications where tail behavior changes substantially with age, consider setting the delta_degree paramerer to 1 or 2.
Wolfgang Lenhard
Jones, M. C., & Pewsey, A. (2009). Sinh-arcsinh distributions. *Biometrika*, 96(4), 761-780.
Lenhard, A., Lenhard, W., Gary, S. (2019). Continuous norming of psychometric tests: A simulation study of parametric and semi-parametric approaches. *PLoS ONE*, 14(9), e0222279. https://doi.org/10.1371/journal.pone.0222279
plot for plotting fitted models,
predict for generating predictions,
cnorm.betabinomial2 for discrete beta-binomial alternative
## Not run:
# Basic usage with default settings
model <- cnorm.shash(age = children$age, score = children$raw_score)
# Custom polynomial degrees for complex developmental pattern
model_complex <- cnorm.shash(
age = adolescents$age,
score = adolescents$vocabulary_score,
mu_degree = 4, # Complex mean trajectory
sigma_degree = 3, # Changing variability pattern
epsilon_degree = 2, # Skewness shifts
delta_degree = NULL, # set to NULL to activate fixed delta
delta = 1.3 # Slightly heavy tails
)
# With sampling weights
model_weighted <- cnorm.shash(
age = survey$age,
score = survey$score,
weights = survey$sample_weight
)
# Custom optimization control for difficult convergence
model_robust <- cnorm.shash(
age = mixed$age,
score = mixed$score,
control = list(factr = 1e-6, maxit = 2000),
delta = 1.5
)
# Compare model fit
compare(model, model_complex)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.