Randomly Break Ties in Data

Description

This is a generic function intended to randomly break tied data in a way similar to what jitter does: tie-breaking is performed by shifting all data points by a random amount. The surveillance package defines methods for matrices, "epidataCS", and a default method for numeric vectors.

Usage

untie(x, amount, ...)

## S3 method for class 'epidataCS'
untie(x, amount = list(t=NULL, s=NULL),
      minsep = list(t=0, s=0), direction = "left", keep.sources = FALSE,
      ..., verbose = FALSE)
## S3 method for class 'matrix'
untie(x, amount = NULL, minsep = 0,
      constraint = NULL, giveup = 1000, ...)
## Default S3 method:
untie(x, amount = NULL, minsep = 0,
      direction = c("symmetric", "left", "right"), sort = NULL,
      giveup = 1000, ...)

Arguments

x

the data to be untied.

amount

upper bound for the random amount by which data are shifted. NULL means to use a data-driven default, which equals the minimum separation of the data points for the non-symmetric default method and its half for the symmetric default method and the matrix method.

For numeric vectors (default method), the jittered version is the same as jitter(x, amount=amount) if direction="symmetric" (and amount is non-NULL). Otherwise, it is x - runif(length(x), 0, amount) for direction="left" and x + runif(length(x), 0, amount) for direction="right".
For matrices, a vector uniformly drawn from the disc with radius amount is added to each point (row).
For "epidataCS", amount is a list stating the amounts for the temporal and/or spatial dimension, respectively. The "epidataCS" method then calls the default method (temporal dimension) and the matrix method (spatial dimension) with arguments constraint=x$W, direction, and sort=TRUE.
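
The non-symmetric vector case described above can be sketched in base R. This is only an illustration of the documented behaviour, not the package's code: untie_sketch is a hypothetical helper, and it assumes that "minimum separation" means the smallest gap between distinct values of x (so x must contain at least two distinct values).

```r
# Hypothetical helper, NOT part of surveillance: mimics the documented
# behaviour of the default method for direction = "left" or "right".
untie_sketch <- function(x, amount = NULL, direction = c("left", "right")) {
  direction <- match.arg(direction)
  if (is.null(amount)) {
    # Assumed data-driven default: smallest gap between distinct values.
    # Requires at least two distinct values in x.
    amount <- min(diff(sort(unique(x))))
  }
  # One independent uniform shift per element, bounded by 'amount'
  shift <- runif(length(x), 0, amount)
  if (direction == "left") x - shift else x + shift
}

set.seed(123)
x <- c(rep(1, 3), rep(1.2, 4), rep(3, 3))
x_untied <- untie_sketch(x, direction = "left")
# Ties are broken and every value moved to the left
stopifnot(!anyDuplicated(x_untied), all(x_untied <= x))
```

Because each shift is bounded by the minimum separation, values that were distinct before jittering keep their relative order.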

minsep

minimum separation of jittered points. Can only be obeyed if much smaller than amount (also depending on the number of points). minsep>0 is currently only implemented for the spatial (matrix) method.

keep.sources

logical (FALSE). If TRUE, the original list of possible event sources in x$events$.sources will be preserved. For instance, events observed at the same time did not, by definition, trigger each other; however, after random tie-breaking one event will precede the other and be considered a potential source of infection for the latter, although it could just as well be the other way round. Enabling keep.sources will use the .sources list from the original (tied) "epidataCS" object. Note, however, that an update is forced within twinstim if a subset of the data is selected for model fitting or if a different qmatrix is supplied.

constraint

an object of class "SpatialPolygons" representing the domain which the points of the matrix should belong to – before and after jittering.

giveup

number of attempts after which the algorithm should stop trying to generate new points.
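
How amount, minsep and giveup might interact for matrices can be sketched as follows. This is a simplified illustration under stated assumptions, not the surveillance implementation: untie_matrix_sketch is a hypothetical helper, it redraws all shifts on every attempt, it ignores constraint, and it assumes at least two rows.

```r
# Hypothetical helper, NOT the surveillance implementation.
untie_matrix_sketch <- function(x, amount, minsep = 0, giveup = 1000) {
  n <- nrow(x)
  for (attempt in seq_len(giveup)) {
    # Uniform draw from a disc of radius 'amount':
    # radius amount * sqrt(U), angle uniform on [0, 2 * pi)
    r <- amount * sqrt(runif(n))
    theta <- runif(n, 0, 2 * pi)
    jittered <- x + cbind(r * cos(theta), r * sin(theta))
    # Accept once all pairwise distances exceed 'minsep'
    if (min(dist(jittered)) > minsep) return(jittered)
  }
  stop("could not obey 'minsep' within ", giveup, " attempts")
}

set.seed(1)
pts <- rbind(c(0, 0), c(0, 0), c(1, 1))
pts_untied <- untie_matrix_sketch(pts, amount = 0.1, minsep = 0.01)
stopifnot(min(dist(pts_untied)) > 0.01)
```

The square root on the radius is what makes the draw uniform over the disc's area rather than concentrated near the centre.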

direction

one of "symmetric" (default), "left", or "right", indicating in which direction vector elements should be shifted.

sort

logical indicating if the jittered vector should be sorted. Defaults to doing so if the original vector was already sorted.

...

For the "epidataCS"-method: arguments passed to the matrix- or default-method (giveup). Unused in other methods.

verbose

logical passed to as.epidataCS.

Value

the untied (jittered) data.

Author(s)

Sebastian Meyer

See Also

jitter

Examples

library("surveillance")  # provides untie(), multiplicity() and the imdepi data

# vector example
set.seed(123)
untie(c(rep(1,3), rep(1.2, 4), rep(3,3)), direction="left", sort=FALSE)

# spatial example
data(imdepi)
coords <- coordinates(imdepi$events)
table(duplicated(coords))
plot(coords, cex=sqrt(multiplicity(coords)))
set.seed(1)
coords_untied <- untie(coords)
stopifnot(!anyDuplicated(coords_untied))
points(coords_untied, col=2) # shifted by very small amount in this case