cnorm.shash: Fit a Sinh-Arcsinh (shash) Regression Model for Continuous...
In cNORM: Continuous Norming

cnorm.shash

R Documentation

Fit a Sinh-Arcsinh (shash) Regression Model for Continuous Norming

Description

This function fits a Sinh-Arcsinh (shash; Jones & Pewsey, 2009) regression model for continuous norm score modeling, where the distribution parameters vary smoothly as polynomial functions of age or other predictors. The shash distribution is well-suited for psychometric data as it can flexibly model skewness and tail weight independently, making it ideal for handling floor effects, ceiling effects, and varying degrees of individual differences across age groups. In a simulation study (Lenhard et al, 2019), the shash model demonstrated superior performance compared to other parametric approaches from the Box Cox family of functions. In contrast to Box Cox, Sinh-Arcsinh can model distributions including zero and negativ values.

Usage

cnorm.shash(
  age,
  score,
  weights = NULL,
  mu_degree = 3,
  sigma_degree = 2,
  epsilon_degree = 2,
  delta_degree = 1,
  delta = 1,
  control = NULL,
  scale = "T",
  plot = TRUE
)

Arguments

`age`	A numeric vector of predictor values (typically age, but can be any continuous predictor).
`score`	A numeric vector of response values (raw test scores). Must be the same length as age. The value range is unresticted and it can include zeros and negative values.
`weights`	An optional numeric vector of weights for each observation. Useful for incorporating sampling weights. If NULL (default), all observations are weighted equally.
`mu_degree`	Integer specifying the degree of the polynomial for modeling the location parameter mu(age). Default is 3. Higher degrees allow more flexible modeling of how the central tendency changes with age, but may lead to overfitting with small samples. Common choices: 1: Linear change with age 2: Quadratic change (allows one inflection point) 3: Cubic change (allows two inflection points, suitable for most developmental curves) 4+: Higher-order changes (use cautiously, mainly for large samples)
`sigma_degree`	Integer specifying the degree of the polynomial for modeling the scale parameter sigma(age). Default is 2. This controls how the variability (spread) of scores changes with age. Lower degrees are often sufficient as variability typically changes more smoothly than location.
`epsilon_degree`	Integer specifying the degree of the polynomial for modeling the skewness parameter epsilon(age). Default is 2. This controls how the asymmetry of the distribution changes with age.
`delta_degree`	Integer specifying the plynomial for modelling the tail weight parameter delte(age). Default is 1. The tail weight can be fixed as well in case of numerical instability. In that case, set 'delta_degree' to NULL and specify a value for delta instead. Recommendation: Keep delta_degree low to avoid overfitting.
`delta`	Fixed tail weight parameter (must be > 0). Default is 1. This parameter controls the heaviness of the distribution tails and is kept constant across all ages in this implementation. It is only used, if 'delta_degree' is set to NULL. Common values: delta = 1: Normal-like tail behavior (baseline) delta > 1: Heavier tails, higher kurtosis (more extreme scores than normal distribution) delta < 1: Lighter tails, lower kurtosis (fewer extreme scores than normal distribution)
`control`	An optional list of control parameters passed to the `optim` function for maximum likelihood estimation. If NULL, sensible defaults are chosen automatically based on the model complexity. Common parameters to adjust: `factr`: Controls precision of optimization (default: 1e-8) `maxit`: Maximum number of iterations (default: n_parameters * 200) `lmm`: Memory limit for L-BFGS-B (default: min(n_parameters, 20)) Increase `maxit` or decrease `factr` if optimization fails to converge.
`scale`	Character string or numeric vector specifying the type of norm scale for output. This affects the scaling of derived norm scores but does not influence model fitting: "T": T-scores (mean = 50, SD = 10) - default "IQ": IQ-like scores (mean = 100, SD = 15) "z": z-scores (mean = 0, SD = 1) c(M, SD): Custom scale with specified mean M and standard deviation SD
`plot`	Logical indicating whether to automatically display a diagnostic plot of the fitted model. Default is TRUE.

Details

This implementation uses the Jones & Pewsey (2009) parameterization of the Sinh-Arcsinh distribution. Parameters are estimated using maximum likelihood via the L-BFGS-B algorithm. In case, optimization fails, try reducing model complexity by reducing polynomial degrees or fixing the delta parameter.

The Sinh-Arcsinh Distribution

The shash distribution is defined by the transformation:

X = \mu + \sigma \cdot \sinh\left(\frac{\text{arcsinh}(Y) - \epsilon}{\delta}\right)

where Y is a standard normal variable, Y ~ N(0,1).

This transformation provides:

mu: Location parameter (similar to mean)
sigma: Scale parameter (similar to standard deviation)
epsilon: Skewness parameter (epsilon = 0 for symmetry)
delta: Tail weight parameter (delta = 1 for normal-like tails)

Model Selection

Choose polynomial degrees based on:

Sample size (higher degrees need more data)
Theoretical expectations about developmental trajectories
Model comparison criteria (AIC, BIC)
Visual inspection of fitted curves

For most applications, mu_degree = 3, sigma_degree = 2, epsilon_degree = 2, delta_degree = 1 provides a good balance of flexibility and parsimony.

Value

An object of class "cnormShash" containing the fitted model results. This is a list with components:

`mu_est`	Numeric vector of estimated coefficients for the location parameter mu(age). The first coefficient is the intercept, subsequent coefficients correspond to polynomial terms.
`sigma_est`	Numeric vector of estimated coefficients for the scale parameter log(sigma(age)). Note: These are coefficients for log(sigma) to ensure sigma > 0.
`epsilon_est`	Numeric vector of estimated coefficients for the skewness parameter epsilon(age).
`delta`	The fixed tail weight parameter value used in fitting.
`delta_est`	Numeric vector of estimated coefficients for the tail weight parameter delta(age) - in case, a degree has been set.
`se`	Numeric vector of standard errors for all estimated coefficients (if Hessian computation succeeds).
`mu_degree`, `sigma_degree`, `epsilon_degree`	The polynomial degrees used for each parameter.
`result`	Complete output from the `optim` function, including convergence information, log-likelihood value, and other optimization details.

Note

The function requires the input data to have sufficient variability. Very small datasets or datasets with little age spread may cause convergence problems.
Polynomial models can exhibit edge effects at the boundaries of the age range. Predictions outside the observed age range should be made cautiously.
If convergence fails, try: (1) reducing polynomial degrees, (2) adjusting the delta parameter, (3) providing custom control parameters, or (4) checking for data quality issues.
The tail weight parameter delta is fixed across ages by default. For applications where tail behavior changes substantially with age, consider setting the delta_degree paramerer to 1 or 2.

Author(s)

Wolfgang Lenhard

References

Jones, M. C., & Pewsey, A. (2009). Sinh-arcsinh distributions. *Biometrika*, 96(4), 761-780.

Lenhard, A., Lenhard, W., Gary, S. (2019). Continuous norming of psychometric tests: A simulation study of parametric and semi-parametric approaches. *PLoS ONE*, 14(9), e0222279. https://doi.org/10.1371/journal.pone.0222279

Examples

## Not run: 
# Basic usage with default settings
model <- cnorm.shash(age = children$age, score = children$raw_score)

# Custom polynomial degrees for complex developmental pattern
model_complex <- cnorm.shash(
  age = adolescents$age,
  score = adolescents$vocabulary_score,
  mu_degree = 4,         # Complex mean trajectory
  sigma_degree = 3,      # Changing variability pattern
  epsilon_degree = 2,    # Skewness shifts
  delta_degree = NULL, # set to NULL to activate fixed delta
  delta = 1.3            # Slightly heavy tails
)

# With sampling weights
model_weighted <- cnorm.shash(
  age = survey$age,
  score = survey$score,
  weights = survey$sample_weight
)

# Custom optimization control for difficult convergence
model_robust <- cnorm.shash(
  age = mixed$age,
  score = mixed$score,
  control = list(factr = 1e-6, maxit = 2000),
  delta = 1.5
)

# Compare model fit
compare(model, model_complex)

## End(Not run)

cNORM documentation built on Feb. 27, 2026, 1:07 a.m.

cNORM index

README.md Demonstration for Creating Continuous Norms with cNORM" Modelling Norms with the Beta-Binomial Distribution" Modelling Norms with the Sinh-Arcsinh (shash) Distribution" Weighted Regression-Based Norming"

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

cNORM
Continuous Norming

cnorm.shash: Fit a Sinh-Arcsinh (shash) Regression Model for Continuous...
In cNORM: Continuous Norming

Fit a Sinh-Arcsinh (shash) Regression Model for Continuous Norming

Description

Usage

Arguments

Details

The Sinh-Arcsinh Distribution

Model Selection

Value

Note

Author(s)

References

See Also

Examples

Related to cnorm.shash in cNORM...

R Package Documentation

Browse R Packages

We want your feedback!

cNORM Continuous Norming

cnorm.shash: Fit a Sinh-Arcsinh (shash) Regression Model for Continuous... In cNORM: Continuous Norming

Fit a Sinh-Arcsinh (shash) Regression Model for Continuous Norming

Description

Usage

Arguments

Details

The Sinh-Arcsinh Distribution

Model Selection

Value

Note

Author(s)

References

See Also

Examples

Related to cnorm.shash in cNORM...

R Package Documentation

Browse R Packages

We want your feedback!

cNORM
Continuous Norming

cnorm.shash: Fit a Sinh-Arcsinh (shash) Regression Model for Continuous...
In cNORM: Continuous Norming