| ReadingSkills | R Documentation |
Data for assessing the contribution of non-verbal IQ to children's reading skills in dyslexic and non-dyslexic children. This is a classic dataset demonstrating beta regression with interaction effects and heteroscedasticity.
ReadingSkills
A data frame with 44 observations on 4 variables:
accuracy: numeric. Reading accuracy score scaled to the open unit interval (0, 1). Perfect scores of 1 were replaced with 0.99.
accuracy1: numeric. Unrestricted reading accuracy score in [0, 1], including boundary observations (perfect scores of 1).
dyslexia: factor. Is the child dyslexic? Levels: no (control group) and yes (dyslexic group). Sum contrast coding is employed.
iq: numeric. Non-verbal intelligence quotient transformed to z-scores (mean = 0, SD = 1).
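To illustrate what sum contrast coding and z-scoring mean for the model matrix, here is a minimal base R sketch. The toy data frame, its values, and the column name iq_raw are hypothetical and are not taken from ReadingSkills. Note that contr.sum() assigns +1 to the first level, so the sign convention stored in the actual dataset may differ and is worth checking with contrasts(ReadingSkills$dyslexia).

```r
# Hypothetical toy data; these are NOT the ReadingSkills values.
toy <- data.frame(
  dyslexia = factor(c("no", "no", "yes", "yes", "no"), levels = c("no", "yes")),
  iq_raw   = c(105, 98, 92, 110, 101)
)

# Sum-to-zero contrasts: with contr.sum(2), "no" is coded +1 and "yes" -1.
contrasts(toy$dyslexia) <- contr.sum(2)
contrasts(toy$dyslexia)

# z-score the raw IQ values (mean 0, SD 1 within this toy sample).
toy$iq <- as.numeric(scale(toy$iq_raw))

# Design matrix for a group-by-IQ interaction, as used in the mean formula.
X <- model.matrix(~ dyslexia * iq, data = toy)
X
```

With this coding the intercept is the average over both groups rather than the value for a reference group, and the dyslexia coefficient is half the difference between groups.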
The data were collected by Pammer and Kevan (2004) and employed by Smithson and Verkuilen (2006) in their seminal beta regression paper. The sample includes 19 dyslexic children and 25 controls recruited from primary schools in the Australian Capital Territory. Children's ages ranged from 8 years 5 months to 12 years 3 months.
Mean reading accuracy was 0.606 for dyslexic readers and 0.900 for controls. The study investigates whether dyslexia contributes to reading accuracy even when controlling for IQ (which is on average lower for dyslexics).
Transformation details: Smithson and Verkuilen (2006) transformed the original reading accuracy score to meet the requirements of beta regression:
First, the original accuracy was scaled using the minimal and maximal scores (a and b) that can be obtained in the test: accuracy1 = (original - a)/(b - a). The values of a and b are not provided.
Subsequently, accuracy was obtained from accuracy1 by replacing all observations with a value of 1 with 0.99, so that the response lies in the open interval (0, 1).
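The two transformation steps can be sketched in base R as follows. Because the real a and b are not documented, the bounds and raw scores below are hypothetical placeholders chosen only to make the arithmetic visible.

```r
# Hypothetical test bounds and raw scores (assumptions, not the study's values).
a <- 0
b <- 20
original <- c(9, 13, 17, 20, 20, 12)

# Step 1: rescale to the unit interval using the test's minimum and maximum.
accuracy1 <- (original - a) / (b - a)

# Step 2: move perfect scores off the boundary so values lie in (0, 1).
accuracy <- ifelse(accuracy1 == 1, 0.99, accuracy1)

cbind(original, accuracy1, accuracy)
```

Only the upper boundary is adjusted here, mirroring the description above; a raw score equal to a would still map to 0 and would need separate handling.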
The data clearly show asymmetry and heteroscedasticity (especially in the control group), making beta regression more appropriate than standard linear regression.
Data collected by Pammer and Kevan (2004).
Cribari-Neto, F., and Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24. doi:10.18637/jss.v034.i02
Grün, B., Kosmidis, I., and Zeileis, A. (2012). Extended Beta Regression in R: Shaken, Stirred, Mixed, and Partitioned. Journal of Statistical Software, 48(11), 1–25. doi:10.18637/jss.v048.i11
Kosmidis, I., and Zeileis, A. (2024). Extended-Support Beta Regression for [0, 1] Responses. arXiv:2409.07233. doi:10.48550/arXiv.2409.07233
Pammer, K., and Kevan, A. (2004). The Contribution of Visual Sensitivity, Phonological Processing and Nonverbal IQ to Children's Reading. Unpublished manuscript, The Australian National University, Canberra.
Smithson, M., and Verkuilen, J. (2006). A Better Lemon Squeezer? Maximum-Likelihood Regression with Beta-Distributed Dependent Variables. Psychological Methods, 11(1), 54–71.
library(gkwreg)
library(gkwdist)
data(ReadingSkills)
# Example 1: Standard Kumaraswamy with interaction and heteroscedasticity
# Mean: Dyslexia × IQ interaction (do groups differ in IQ effect?)
# Precision: Main effects (variability differs by group and IQ level)
fit_kw <- gkwreg(
  accuracy ~ dyslexia * iq |
    dyslexia + iq,
  data = ReadingSkills,
  family = "kw",
  control = gkw_control(method = "L-BFGS-B", maxit = 2000)
)
summary(fit_kw)
# Interpretation:
# - Alpha (mean): Interaction shows dyslexic children benefit less from
#   higher IQ compared to controls
# - Beta (precision): Controls show more variable accuracy (lower precision);
#   higher IQ is associated with more consistent performance
# Example 2: Simpler model without interaction
fit_kw_simple <- gkwreg(
  accuracy ~ dyslexia + iq |
    dyslexia + iq,
  data = ReadingSkills,
  family = "kw",
  control = gkw_control(method = "L-BFGS-B", maxit = 2000)
)
# Test if interaction is significant
anova(fit_kw_simple, fit_kw)
# Example 3: Exponentiated Kumaraswamy for ceiling effects
# Reading accuracy often shows ceiling effects (many perfect/near-perfect scores)
# Lambda parameter can model the resulting left-skewed asymmetry (mass piled up near 1)
fit_ekw <- gkwreg(
  accuracy ~ dyslexia * iq | # alpha
    dyslexia + iq |          # beta
    dyslexia,                # lambda: ceiling effect by group
  data = ReadingSkills,
  family = "ekw",
  control = gkw_control(method = "L-BFGS-B", maxit = 2000)
)
summary(fit_ekw)
# Interpretation:
# - Lambda varies by dyslexia status: Controls have stronger ceiling effect
# (more compression at high accuracy) than dyslexic children
# Test if ceiling effect modeling improves fit
anova(fit_kw, fit_ekw)
# Example 4: McDonald distribution alternative
# Provides different parameterization for extreme values
fit_mc <- gkwreg(
  accuracy ~ dyslexia * iq | # gamma
    dyslexia + iq |          # delta
    dyslexia * iq,           # lambda: interaction affects tails
  data = ReadingSkills,
  family = "mc",
  control = gkw_control(method = "L-BFGS-B", maxit = 2000)
)
summary(fit_mc)
# Compare 3-parameter models
AIC(fit_ekw, fit_mc)
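The nested-model comparisons in these examples can be understood through a likelihood ratio test. The sketch below reproduces the idea in base R on simulated data, fitting a two-parameter Kumaraswamy model by direct maximum likelihood. It is not gkwreg's implementation: the log links, the simulated data, and all names (dkw_log, negll_full, negll_null) are assumptions made for illustration, and whether anova() on gkwreg fits computes exactly this statistic should be checked in the package documentation.

```r
# Kumaraswamy log-density: f(y) = a * b * y^(a - 1) * (1 - y^a)^(b - 1), 0 < y < 1.
dkw_log <- function(y, alpha, beta) {
  log(alpha) + log(beta) + (alpha - 1) * log(y) + (beta - 1) * log1p(-y^alpha)
}

# Simulate data where log(alpha) depends on a covariate x (assumed log link).
set.seed(1)
n <- 200
x <- rnorm(n)
alpha_true <- exp(0.5 + 0.4 * x)
beta_true <- exp(0.3)
u <- runif(n)
# Inverse-CDF sampling: F(y) = 1 - (1 - y^alpha)^beta.
y <- (1 - (1 - u)^(1 / beta_true))^(1 / alpha_true)

# Negative log-likelihoods: full model (alpha varies with x) and null (constant alpha).
negll_full <- function(par) -sum(dkw_log(y, exp(par[1] + par[2] * x), exp(par[3])))
negll_null <- function(par) -sum(dkw_log(y, exp(par[1]), exp(par[2])))

fit_full <- optim(c(0, 0, 0), negll_full, method = "BFGS")
fit_null <- optim(c(0, 0), negll_null, method = "BFGS")

# Likelihood ratio statistic; one extra parameter, so 1 degree of freedom.
lr_stat <- 2 * (fit_null$value - fit_full$value)
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)
c(LR = lr_stat, p = p_value)
```

The chi-squared reference distribution applies only to nested models. Comparing non-nested families, such as the EKw and Mc fits, is why an information criterion like AIC is used instead.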