Simulated Grouped Hyper Data Frame

library(knitr)
opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
options(rmarkdown.html_vignette.check_title = FALSE)

Introduction

This vignette of the package groupedHyperframe.random (GitHub, RPubs) documents the simulation of superimposed ppp.objects and of groupedHyperframe objects.

Note to Users

The examples in this vignette require that the package is attached to the search path:

library(groupedHyperframe.random)

Terms and Abbreviations

c(
  '', 'Forward pipe operator', '`?base::pipeOp` introduced in `R` 4.1.0', 
  '`CRAN`, `R`', 'The Comprehensive R Archive Network', 'https://cran.r-project.org',
  '`coords`', '$x$- and $y$-coordinates', '`spatstat.geom:::ppp`',
  '`diag`', 'Diagonal matrix', '`base::diag`',
  '`groupedHyperframe`', 'Grouped hyper data frame', '`groupedHyperframe::as.groupedHyperframe`',
  '`hypercolumns`, `hyperframe`', '(Hyper columns of) hyper data frame', '`spatstat.geom::hyperframe`',
  '`marks`, `marked`', '(Having) mark values', '`spatstat.geom::is.marked`',
  '`pmax`', 'Parallel maxima', '`base::pmax`',
  '`ppp`, `ppp.object`', 'Point pattern', '`spatstat.geom::ppp.object`',
  '`recycle`', 'Recycling', 'https://r4ds.had.co.nz/vectors.html#scalars-and-recycling-rules',  
  '`rlnorm`', 'Log normal random variable', '`stats::rlnorm`', 
  '`rMatClust`', 'Matern\'s cluster process', '`spatstat.random::rMatClust`',
  '`rmvnorm_`', 'Multivariate normal random variable', '`groupedHyperframe.random::rmvnorm_`; `MASS::mvrnorm`',
  '`rnbinom`', 'Negative binomial random variable', '`stats::rnbinom`',
  '`rpoispp`', 'Poisson point pattern', '`spatstat.random::rpoispp`',
#  '`rStrauss`', 'Strauss process', '`spatstat.random::rStrauss`',
  '`superimpose`', 'Superimpose', '`spatstat.geom::superimpose`',
  '`var`, `cor`, `cov`', 'Variance, correlation, covariance', '`stats::var`, `stats::cor`, `stats::cov`'
) |>
  matrix(nrow = 3L, dimnames = list(c('Term / Abbreviation', 'Description', 'Reference'), NULL)) |>
  t.default() |>
  as.data.frame.matrix() |> 
  kable()

Acknowledgement

This work is supported by NCI R01CA222847 (I. Chervoneva, T. Zhan, and H. Rui) and R01CA253977 (H. Rui and I. Chervoneva).

Simulated Point Pattern

Function .rppp() simulates superimposed ppp.objects, with vectorized parameters for both the random point patterns and the distributions of the marks.

Simulated Unmarked Point Pattern

The example below simulates two superimposed, coords-only (i.e., unmarked) Matern's cluster processes with parameters $(\kappa, \mu, s) = (10, 8, .15)$ and $(5, 4, .06)$.

set.seed(125); r = .rppp(rMatClust(kappa = c(10, 5), mu = c(8, 4), scale = c(.15, .06)))
# plot(r) # suppressed for aesthetics
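To make the vectorized parameterization concrete, here is a minimal base-R sketch of the idea: element $i$ of each parameter vector defines the $i$-th component pattern, and the components are then pooled. The helper `sim_matclust()` is hypothetical and deliberately simplified (unit-square window, no edge correction); it is not how .rppp() or `spatstat.random::rMatClust` is implemented.

```r
# Illustrative sketch only: a simplified Matern's cluster process on the unit
# square. Parents are generated inside the window only (no edge correction).
sim_matclust <- function(kappa, mu, scale) {
  n_parent <- rpois(1L, lambda = kappa)        # number of parents ~ Poisson(kappa)
  px <- runif(n_parent)
  py <- runif(n_parent)
  n_child <- rpois(n_parent, lambda = mu)      # offspring per parent ~ Poisson(mu)
  id <- rep(seq_len(n_parent), times = n_child)
  r <- scale * sqrt(runif(length(id)))         # uniform in a disc of radius `scale`
  theta <- runif(length(id), max = 2 * pi)
  x <- px[id] + r * cos(theta)
  y <- py[id] + r * sin(theta)
  keep <- x >= 0 & x <= 1 & y >= 0 & y <= 1    # clip offspring to the unit square
  data.frame(x = x[keep], y = y[keep])
}

set.seed(125)
kappa <- c(10, 5); mu <- c(8, 4); scale <- c(.15, .06)
# element i of each vector parameterizes the i-th component pattern
parts <- Map(sim_matclust, kappa = kappa, mu = mu, scale = scale)
# superimposing coords-only patterns amounts to pooling their points
pooled <- do.call(rbind, parts)
nrow(pooled) == sum(vapply(parts, nrow, 0L))
```

The point counts will differ from those of .rppp() even under the same seed, because the random numbers are consumed in a different order.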

Simulated Marked Point Pattern

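The example below simulates two superimposed Matern's cluster processes, each carrying a log-normal mark and a negative-binomial mark. The length-one parameters of the negative-binomial mark are recycled across both patterns.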

set.seed(125); r1 = .rppp(
  rMatClust(kappa = c(10, 5), mu = c(8, 4), scale = c(.15, .06)), 
  rlnorm(meanlog = c(3, 5), sdlog = c(.4, .2)),
  rnbinom(size = 4, prob = .3) # shorter parameter recycled
)

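The example below simulates two superimposed Poisson point patterns with intensities 3 and 6, each carrying a log-normal mark and a negative-binomial mark with pattern-specific parameters.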

set.seed(62); r2 = .rppp(
  rpoispp(lambda = c(3, 6)),
  rlnorm(meanlog = c(3, 5), sdlog = c(.4, .2)),
  rnbinom(size = c(4, 6), prob = c(.3, .1))
)
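The recycling of a shorter parameter in the examples above can be illustrated in base R. This is a sketch under assumptions: the point counts `n_pts` and the column names `lnorm` and `nbinom` are made up for illustration, and the code mirrors the recycling idea rather than the internals of .rppp().

```r
# One row per component pattern; length-1 values are recycled to both rows
par_marks <- data.frame(
  meanlog = c(3, 5), sdlog = c(.4, .2),  # pattern-specific log-normal parameters
  size = 4, prob = .3                    # shared negative-binomial parameters
)
par_marks

set.seed(125)
n_pts <- c(5L, 3L)  # assumed number of points in each component pattern
# draw both marks for every point of every component pattern
marks <- Map(function(n, meanlog, sdlog, size, prob) {
  data.frame(
    lnorm = rlnorm(n, meanlog = meanlog, sdlog = sdlog),
    nbinom = rnbinom(n, size = size, prob = prob)
  )
}, n = n_pts, meanlog = par_marks$meanlog, sdlog = par_marks$sdlog,
   size = par_marks$size, prob = par_marks$prob)
```

Each element of `marks` holds the mark values for one component pattern, so the two patterns share `size` and `prob` but differ in `meanlog` and `sdlog`.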

Simulating more than one type of point pattern in a single call to function .rppp() is not supported, and we do not plan to support it in the foreseeable future. End users may instead simulate each (marked) point pattern separately and then superimpose them manually.

spatstat.geom::superimpose(r1, r2)

Simulated groupedHyperframe

Now consider two superimposed Matern's cluster processes, each carrying a log-normal mark. The population parameters are

(p = data.frame(kappa = c(3,2), scale = c(.4,.2), mu = c(10,5), 
                meanlog = c(3,5), sdlog = c(.4,.2)))

We simulate 3 subjects (e.g., patients). The subject-specific parameters deviate from the population parameters under a multivariate normal distribution with variance-covariance matrix $\Sigma$. The matrix $\Sigma$ may be specified by a numeric scalar, which indicates equal variances on the diagonal and zero covariances. We also ensure that all subject-specific parameters satisfy $\kappa>1$, $\mu>1$ and $s>0$ for the Matern's cluster processes, and $\sigma>0$ for the log-normal distribution. Each matrix of subject-specific parameters has the subjects on the rows and the ppps to be superimposed on the columns.

set.seed(39); (p. = rmvnorm_(n = 3L, mu = p, Sigma = list(
  kappa = .2^2, scale = .05^2, mu = .5^2, 
  meanlog = .1^2, sdlog = .01^2)) |> 
    within.list(expr = {
      kappa = pmax(kappa, 1 + .Machine$double.eps)
      mu = pmax(mu, 1 + .Machine$double.eps)
      scale = pmax(scale, .Machine$double.eps)
      sdlog = pmax(sdlog, .Machine$double.eps)
    }))
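As a rough illustration of the scalar specification of $\Sigma$ and of the parameter constraints, the base-R sketch below draws subject-specific values of $\kappa$ only. The names `mu_kappa`, `n_subj` and `kappa_subj` are hypothetical, and the sampling step is a generic Cholesky-based draw, not necessarily what rmvnorm_() does internally.

```r
mu_kappa <- c(3, 2)                  # population kappa of the two patterns
Sigma <- diag(x = .2^2, nrow = 2L)   # scalar .2^2 read as a diagonal matrix
Sigma

set.seed(39)
n_subj <- 3L
z <- matrix(rnorm(n_subj * 2L), nrow = n_subj)             # independent N(0, 1) draws
kappa_subj <- sweep(z %*% chol(Sigma), 2L, mu_kappa, `+`)  # rows: subjects; columns: patterns
# enforce kappa > 1 elementwise, as in the chunk above
kappa_subj <- pmax(kappa_subj, 1 + .Machine$double.eps)
dim(kappa_subj)
```

Because `Sigma` is diagonal, the two columns are independent, which matches the zero-covariance reading of a scalar $\Sigma$.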

We simulate one to four ppps (e.g., medical images) per subject.

set.seed(37); (n = sample.int(n = 4L, size = 3L, replace = TRUE)) 

Function grouped_rppp() simulates a groupedHyperframe with a ppp-hypercolumn and one or more columns that define the grouping structure.

set.seed(76); (r = p. |> 
  with.default(expr = {
    grouped_rppp(
      rMatClust(kappa = kappa, scale = scale, mu = mu), 
      rlnorm(meanlog = meanlog, sdlog = sdlog),
      n = n
    )
  }))
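The grouping structure can be pictured as one row per simulated ppp, nested within subject. The base-R sketch below builds such a layout; the column names `subject` and `image` and the counts in `n` are assumptions for illustration, and the actual columns returned by grouped_rppp() may be named differently.

```r
n <- c(2L, 4L, 1L)  # assumed number of ppps for each of 3 subjects
grp <- data.frame(
  subject = rep(seq_along(n), times = n),  # subject id, repeated once per ppp
  image = sequence(n)                      # ppp index within each subject
)
grp
```

In a groupedHyperframe, each such row would additionally hold one ppp in the ppp-hypercolumn.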



groupedHyperframe.random documentation built on April 11, 2025, 6:14 p.m.