offsetX: Offset data using quasirandom noise to avoid overplotting

In vipor: Plot Categorical Data Using Quasirandom Noise and Density Estimates

Description

Arranges data points using quasirandom noise (van der Corput sequence), pseudorandom noise, or, alternatively, by positioning extreme values within a band to the left and right, to form beeswarm/one-dimensional scatter/strip chart style plots. The result resembles a cross between a violin plot (showing the density distribution) and a scatter plot (showing the individual points). This function returns a vector of the offsets to be used in plotting.
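To make the "quasirandom noise" above concrete, the following is a minimal base-R sketch of a base-2 van der Corput sequence. The name `vdcSketch` is hypothetical and chosen for illustration; it is not claimed to be the package's own implementation.

```r
# Hypothetical helper (not part of vipor): the first n terms of a
# van der Corput sequence, obtained by mirroring the base-`base`
# digits of 1, 2, ..., n about the radix point.
vdcSketch <- function(n, base = 2) {
  vapply(seq_len(n), function(i) {
    out <- 0
    denom <- 1
    while (i > 0) {
      denom <- denom * base
      out <- out + (i %% base) / denom  # lowest digit gets the largest weight
      i <- i %/% base
    }
    out
  }, numeric(1))
}

vdcSketch(7)
# 0.500 0.250 0.750 0.125 0.625 0.375 0.875
```

Successive values keep filling the largest remaining gaps in (0, 1), which is why such a sequence spreads points more evenly than pseudorandom jitter.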

Usage

```r
offsetX(y, x = rep(1, length(y)), width = 0.4, varwidth = FALSE, ...)

offsetSingleGroup(y, maxLength = NULL, method = c("quasirandom",
  "pseudorandom", "smiley", "maxout", "frowney", "minout", "tukey",
  "tukeyDense"), nbins = NULL, adjust = 1)
```

Arguments

- `y`: vector of data points
- `x`: a grouping factor for y (optional)
- `width`: the maximum spacing away from center for each group of points. Since points are spaced to left and right, the maximum width of the cluster will be approximately width*2 (0 = no offset, default = 0.4)
- `varwidth`: adjust the width of each group based on the number of points in the group
- `...`: additional arguments to offsetSingleGroup
- `maxLength`: multiply the offset by sqrt(length(y)/maxLength) if not NULL. The sqrt is to match boxplot (allows comparison of order of magnitude different ns, scale with standard error)
- `method`: method used to distribute the points:
  - quasirandom: points are distributed within a kernel density estimate of the distribution with offset determined by quasirandom Van der Corput noise
  - pseudorandom: points are distributed within a kernel density estimate of the distribution with offset determined by pseudorandom noise a la jitter
  - maxout: points are distributed within a kernel density with points in a band distributed with highest value points on the outside and lowest in the middle
  - minout: points are distributed within a kernel density with points in a band distributed with highest value points in the middle and lowest on the outside
  - tukey: points are distributed as described in Tukey and Tukey "Strips displaying empirical distributions: I. textured dot strips"
  - tukeyDense: points are distributed as described in Tukey and Tukey but are constrained with the kernel density estimate
- `nbins`: the number of points used to calculate density (defaults to 1000 for quasirandom and pseudorandom and 100 for others)
- `adjust`: adjust the bandwidth used to calculate the kernel density (smaller values mean tighter fit, larger values looser fit, default is 1)
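As a rough illustration of how `width`, `nbins`, `adjust` and `maxLength` could interact for the quasirandom method, here is a minimal base-R sketch. The function `quasiOffsetSketch` is hypothetical. It assumes the offset is a van der Corput value rescaled to (-1, 1), multiplied by the relative kernel density at each point and by `width`. The package's actual implementation may differ in its details.

```r
# Hypothetical sketch (not vipor's code) of a quasirandom, density-scaled offset.
quasiOffsetSketch <- function(y, width = 0.4, nbins = 1000, adjust = 1,
                              maxLength = NULL) {
  if (length(y) < 2) return(rep(0, length(y)))
  # Kernel density estimate evaluated on nbins grid points
  dens <- stats::density(y, n = nbins, adjust = adjust)
  # Relative density (between 0 and 1) interpolated at each data point
  relDens <- stats::approx(dens$x, dens$y / max(dens$y), xout = y)$y
  # Base-2 van der Corput value in (0, 1) for each point index
  vdc <- vapply(seq_along(y), function(i) {
    out <- 0
    denom <- 1
    while (i > 0) {
      denom <- denom * 2
      out <- out + (i %% 2) / denom
      i <- i %/% 2
    }
    out
  }, numeric(1))
  # Spread points left and right of center, wider where the data are dense
  offset <- (vdc * 2 - 1) * relDens * width
  # Optional scaling described for maxLength
  if (!is.null(maxLength)) offset <- offset * sqrt(length(y) / maxLength)
  offset
}

set.seed(1)
y <- rnorm(100)
range(quasiOffsetSketch(y))  # always within [-0.4, 0.4] for width = 0.4
```

With `maxLength = 400` and 100 points, every offset in this sketch shrinks by sqrt(100/400) = 0.5, so a smaller group is drawn narrower than the largest one.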

Value

a vector of x-offsets of the same length as y

Examples

```r
library(vipor)

## Generate fake data
dat <- list(rnorm(50), rnorm(500), c(rnorm(100), rnorm(100,5)), rcauchy(100))
names(dat) <- c("Normal", "Dense Normal", "Bimodal", "Extremes")

## Plot each distribution with a variety of parameters
par(mfrow=c(4,1), mar=c(2,4, 0.5, 0.5))
sapply(names(dat),function(label) {
  y<-dat[[label]]
  ## One set of offsets per parameter setting
  offsets <- list(
    'Default'=offsetX(y),
    'Smoother'=offsetX(y, adjust=2),
    'Tighter'=offsetX(y, adjust=0.1),
    'Thinner'=offsetX(y, width=0.1)
  )
  ## Center each set of offsets on its own x position (1 to 4)
  ids <- rep(1:length(offsets), sapply(offsets,length))
  plot(unlist(offsets) + ids, rep(y, length(offsets)),
       ylab=label, xlab='', xaxt='n', pch=21, las=1)
  axis(1, 1:4, c("Default", "Adjust=2", "Adjust=0.1", "Width=10%"))
})
```

vipor documentation built on May 29, 2017, 9:38 a.m.