sdac: Autologous Peripheral Blood Stem Cell Transplants Data

sdac R Documentation

Autologous Peripheral Blood Stem Cell Transplants Data

Description

Data on Autologous Peripheral Blood Stem Cell Transplants from the Stem Cell Lab in the Cross Cancer Institute, Alberta Health Services. The dataset examines recovery rates of CD34+ cells after peripheral blood stem cell (PBSC) transplants.

Usage

sdac

Format

A data frame with 239 observations on 5 variables:

rcd

numeric. Recovery rate of CD34+ cells (proportion in (0, 1)). Response variable measuring the proportion of CD34+ cells recovered after PBSC transplant.

age

numeric. Patient age in years (range: 18-71 years).

ageadj

numeric. Age-adjusted covariate. Centered and scaled version of age for improved numerical stability in regression models.

chemo

factor. Type of chemotherapy protocol used for stem cell mobilization. Levels include: 1-day, 3-day, G-CSF only, and other.

gender

factor. Patient gender. Most patients in the study are male.
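
The column descriptions above can be turned into a quick sanity check. What follows is a minimal sketch in base R, assuming only the five documented column names and types. check_sdac_format() is a hypothetical helper, not part of gkwreg, and the demo rows (including the gender level labels) are made-up values, not patient records.

```r
# Hypothetical helper (not part of gkwreg): checks a data frame against
# the documented sdac format and returns one TRUE/FALSE flag per check.
check_sdac_format <- function(df) {
  expected <- c("rcd", "age", "ageadj", "chemo", "gender")
  missing_cols <- setdiff(expected, names(df))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  c(
    rcd_in_unit_interval = isTRUE(is.numeric(df$rcd) &&
                                    all(df$rcd > 0 & df$rcd < 1)),
    age_numeric          = is.numeric(df$age),
    ageadj_numeric       = is.numeric(df$ageadj),
    chemo_factor         = is.factor(df$chemo),
    gender_factor        = is.factor(df$gender)
  )
}

# Made-up rows for illustration only (NOT real patient data).
toy <- data.frame(
  rcd    = c(0.62, 0.81, 0.74),
  age    = c(45, 58, 33),
  ageadj = c(0.1, 1.2, -0.9),
  chemo  = factor(c("1-day", "3-day", "1-day")),
  gender = factor(c("M", "F", "M"))
)
checks <- check_sdac_format(toy)
print(checks)
```

With the package loaded, check_sdac_format(sdac) would run the same checks on the real data.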

Details

This dataset contains clinical data from autologous peripheral blood stem cell (PBSC) transplant patients treated at the Cross Cancer Institute, Alberta Health Services. CD34+ cells are hematopoietic stem and progenitor cells critical for successful transplantation and hematopoietic recovery.

Clinical context: Autologous PBSC transplantation is used to treat various hematological malignancies including multiple myeloma, non-Hodgkin's lymphoma, acute leukemia, and some solid tumors. The recovery rate of CD34+ cells is a crucial predictor of engraftment success and patient outcomes.

Chemotherapy protocols:

  • 1-day protocol: Single-day high-dose chemotherapy for mobilization

  • 3-day protocol: Multi-day chemotherapy regimen

  • G-CSF only: Granulocyte colony-stimulating factor without chemotherapy

  • Other: Alternative or combined protocols

The proportion of recovered CD34+ cells lies in the open interval (0, 1), so it is well suited to regression models for proportions, which respect those bounds. Age is a covariate of particular interest, since older patients may show different recovery patterns.

This dataset is particularly suitable for:

  • Simplex regression (original application by Zhang et al. 2016)

  • Beta regression with variable dispersion

  • Kumaraswamy regression for flexible distributional modeling
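
To make the last item concrete, here is a minimal base R sketch of the two-parameter Kumaraswamy distribution on (0, 1) and of a regression fitted by maximum likelihood. It is an illustration under stated assumptions, not a description of how gkwreg is implemented: the log links, the optim() call, and all function names with the _sketch suffix are assumptions, and the data are simulated rather than taken from sdac.

```r
# Kumaraswamy(alpha, beta) on (0, 1): density, CDF, and quantile function.
dkw_sketch <- function(y, alpha, beta) {
  alpha * beta * y^(alpha - 1) * (1 - y^alpha)^(beta - 1)
}
pkw_sketch <- function(y, alpha, beta) 1 - (1 - y^alpha)^beta
qkw_sketch <- function(p, alpha, beta) (1 - (1 - p)^(1 / beta))^(1 / alpha)

# Simulated data (NOT the sdac data): one covariate x drives alpha through
# an assumed log link; beta is constant. True coefficients are arbitrary.
set.seed(1)
n <- 500
x <- runif(n, -1, 1)
truth <- c(a0 = 0.5, a1 = 0.3, b0 = 1.0)
alpha_true <- exp(truth[["a0"]] + truth[["a1"]] * x)
beta_true <- exp(truth[["b0"]])
y <- qkw_sketch(runif(n), alpha_true, beta_true)  # inverse-CDF sampling

# Negative log-likelihood of the regression model.
negloglik <- function(par) {
  alpha <- exp(par[1] + par[2] * x)
  beta <- exp(par[3])
  -sum(log(alpha) + log(beta) + (alpha - 1) * log(y) +
         (beta - 1) * log1p(-y^alpha))
}

# Maximize the likelihood numerically, starting from all-zero coefficients.
fit <- optim(c(0, 0, 0), negloglik, method = "BFGS")
est <- setNames(fit$par, names(truth))
print(round(rbind(truth, est), 3))
```

Because both alpha and beta are shape parameters, the mean and the variance of the response each depend on both of them, which is worth keeping in mind when reading coefficient tables from such a model.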

Source

Stem Cell Lab, Cross Cancer Institute, Alberta Health Services, Canada.

References

Zhang, P., Qiu, Z., and Shi, C. (2016). simplexreg: An R Package for Regression Analysis of Proportional Data Using the Simplex Distribution. Journal of Statistical Software, 71(11), 1–21. doi:10.18637/jss.v071.i11

Examples


library(gkwreg)
library(gkwdist)

data(sdac)

# Example 1: Basic Kumaraswamy regression
# alpha submodel (before "|"): age-adjusted covariate and chemotherapy protocol
# beta submodel (after "|"): raw age, allowing variability to change with age
fit_kw <- gkwreg(
  rcd ~ ageadj + chemo |
    age,
  data = sdac,
  family = "kw"
)
summary(fit_kw)

# Interpretation:
# - alpha and beta are both shape parameters of the Kumaraswamy distribution;
#   calling them "mean" and "precision" is a loose shorthand, because the mean
#   and variance of rcd each depend on both parameters
# - alpha submodel: age-adjusted covariate and chemo protocol
#   Protocol coefficients capture differences in recovery between protocols,
#   e.g. G-CSF only versus multi-day chemotherapy
# - beta submodel: raw age
#   Hypothesis to check: older patients show more heterogeneous responses

# Example 2: Include gender effects
# Gender may influence stem cell recovery rates
fit_kw_gender <- gkwreg(
  rcd ~ ageadj + chemo + gender |
    age + gender,
  data = sdac,
  family = "kw"
)
summary(fit_kw_gender)

# Interpretation:
# - Gender effects in both mean and precision
# - Precision may differ between males and females

# Test gender significance
anova(fit_kw, fit_kw_gender)

# Example 3: Exponentiated Kumaraswamy for extreme recovery patterns
# Some patients show unusually high or low recovery (outliers)
# Lambda parameter captures tail heaviness
fit_ekw <- gkwreg(
  rcd ~ ageadj + chemo + gender | # alpha: mean model
    age + chemo | # beta: precision varies with age and protocol
    chemo, # lambda: protocol affects extremity
  data = sdac,
  family = "ekw"
)
summary(fit_ekw)

# Clinical interpretation (to be checked against the fitted coefficients):
# - A protocol effect on lambda would mean that some protocols produce more
#   extreme recovery patterns (very high or very low CD34+ recovery rates)
# - G-CSF-only vs multi-day protocols may differ in tail behavior
# - Potentially relevant for risk stratification and clinical decision-making

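# Compare with the Kumaraswamy fit. The two models are not nested (the beta
# submodel swaps gender for chemo), so a likelihood-ratio test via anova()
# does not apply here; compare information criteria instead (lower is better).
c(kw_gender = AIC(fit_kw_gender), ekw = AIC(fit_ekw))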

# Example 4: Interaction between age and protocol
# Protocol effectiveness may vary with patient age
fit_kw_interact <- gkwreg(
  rcd ~ ageadj * chemo |
    age * chemo,
  data = sdac,
  family = "kw"
)
summary(fit_kw_interact)

# Interpretation:
# - Interaction: Does protocol effectiveness decline with age?
# - Critical for personalized treatment selection
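
The interaction formula in Example 4 can be unpacked with base R alone. The sketch below uses a made-up toy design (not the sdac data) to show which columns `ageadj * chemo` generates under R's default treatment contrasts. That gkwreg builds its design matrices with these standard formula semantics is an assumption here.

```r
# Toy design (made-up values, not the sdac data) with two protocol levels.
toy <- data.frame(
  ageadj = c(0, 5, 12, 0, 8, 20),
  chemo  = factor(c("1-day", "1-day", "1-day", "3-day", "3-day", "3-day"))
)

# `ageadj * chemo` is shorthand for `ageadj + chemo + ageadj:chemo`.
X_star <- model.matrix(~ ageadj * chemo, data = toy)
X_long <- model.matrix(~ ageadj + chemo + ageadj:chemo, data = toy)

# Column names of the expanded design matrix.
print(colnames(X_star))

# The interaction column equals ageadj for 3-day rows and 0 otherwise.
print(X_star[, "ageadj:chemo3-day"])
```

The coefficient on the interaction column is therefore the difference in the age slope between the 3-day protocol and the reference protocol, which is the quantity the "does protocol effectiveness decline with age?" question is about.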


gkwreg documentation built on Nov. 27, 2025, 5:06 p.m.