plotgroups: Plot several groups of repeated observations.
In ilia-kats/imisc: miscellaneous functions


Plot several groups of repeated observations, e.g. the abundance or half-life of several proteins, each observed in several cell lines with several replicates. Observations can be grouped either by protein (in which case cell lines are annotated as X axis labels and proteins above the plot) or by cell line. Related parameters can be plotted in separate plots below each other, sharing the groupings and annotations (see examples).

plotgroups(data, names, colors = NULL, legend.text = NULL,
  legend.col = NULL, legend.pars = list(), legend.lwd = NULL,
  groups.spacing = 0, names.split = NULL, names.italicize = NULL,
  names.style = c("plain", "combinatorial"), names.pch.cex = 1,
  names.pch = 19, names.adj = NA, names.map.fun = NULL,
  names.margin = 0.5, names.rotate = NULL, names.placeholder = NA,
  features = NA, log = FALSE, range = 1.5, conf.level = 0.95,
  ci.fun = plotgroups.ci, cex.xlab = 1, ylim = NULL,
  legendmargin = NULL, plot.type = plotgroups.boxplot,
  plot.fun.pars = list(), barwidth = 0.8, main = NULL, ylab = NULL,
  ylab.line = NULL, signif.test = NULL, signif.test.fun = t.test,
  signif.test.text = plotgroups.pval, signif.test.col = "black",
  signif.test.lwd = legend.lwd, signif.test.pars = legend.pars,
  extrafun.before = NULL, extrafun.after = NULL, ...)

`data`	list. Each element is either a vector of replicates for one combination of parameters, or itself a list of such vectors (one list per data set), in which case the data sets will be plotted below each other in separate plots
`names`	character vector of X axis labels
`colors`	colors for plotting
`legend.text`	character vector of the same length as `data` giving the group names. A group of observations is identified by consecutive occurrence of the same name.
`legend.col`	colors for group annotations. Defaults to plotting colors
`legend.pars`	parameters for group annotation. Will be passed to `text`
`legend.lwd`	line width for grouping annotations. Defaults to `par("lwd")`
`groups.spacing`	extra space between the groups in user coordinates.
`names.split`	character by which to split the `names`. Only useful in combination with `names.italicize` or `names.style='combinatorial'`
`names.italicize`	if a part of a `name` is to be written in italic text, the part is identified by this character. That is, the name is first split by `names.split`, then each fragment containing `names.italicize` is rendered in italics
`names.style`	how the `names` are to be rendered. `plain`: each name will be written as-is below the plot. `combinatorial`: names will be split by `names.split`, unique strings will be printed at the bottom-left, and observations whose name contains the string will be identified by printing `names.pch` below the respective bar. Useful if e.g. assaying different combinations of single/double/triple knock-outs.
`names.pch.cex`	character expansion factor for `names.pch`
`names.pch`	character to be used for annotation of observations when `names.style='combinatorial'`
`names.adj`	text adjustment for `names` or `names.pch`, depending on `names.style`. See `text`. Defaults to 1 for `names.style = 'plain'`, unless `names.rotate = 0`, in which case it defaults to 0.5. Defaults to 0.5 for `names.style='combinatorial'`.
`names.map.fun`	Function mapping a names string to the pch/cex/adj/rotate settings for the respective combination. Useful for more complicated experimental layouts where different `names.pch` must be used for different genes (see examples). Must accept six arguments: `n`: string with the names combination to process; `split`: default pattern to split by, as given by `names.split`; `pch`: default pch, as given by `names.pch`; `cex`: default cex, as given by `names.pch.cex`; `rotate`: default rotate, as given by `names.rotate`; `adj`: default adj, as given by `names.adj`. Must return a named list, with the names being the split genes that should be used to label the rows, each element being itself a named list containing the plotting parameters for that particular annotation, i.e. `pch`, `cex`, `rotate`, `adj`.
`names.margin`	spacing between the bottom edge of the plot and the annotation, in inches
`names.rotate`	Degrees by which to rotate the annotation strings.
`names.placeholder`	Only used when `names.style='combinatorial'`. Placeholder character to use when no annotation is present for the current sample and row. See examples.
`features`	which features of the sample distributions to plot. Availability of features depends on `plot.type`. Can contain any combination of the following: `median`: the median; `box`: the first and third quartiles; `iqr`: the most extreme data point no more than `range` times the interquartile range away from the box; `mean`: the mean; `sd`: mean ± standard deviation; `sem`: mean ± standard error of the mean; `ci`: confidence interval at `conf.level`. Can be a list containing character vectors, in which case the specified feature set will apply to the corresponding plot if multiple data sets are plotted (see examples). Will be recycled to the number of plots.
`log`	Whether to plot the Y axis on log scale
`range`	determines how far the `iqr` whiskers will extend out from the box, if they are to be plotted. Will be recycled to the number of plots.
`conf.level`	Confidence level for plotting of confidence intervals. Will be recycled to the number of plots
`ci.fun`	Function to compute confidence intervals. Will be recycled to the number of plots. Must accept five arguments: `data`: numeric vector containing data for one group; `mean`: precomputed mean of the sample; `se`: precomputed standard error of the mean of the sample; `ndata`: number of observations; `conf.level`: confidence level. If `data` is given, `mean`, `se`, and `ndata` are not used, but calculated from the data. If `data` is omitted, all of `mean`, `se`, and `ndata` must be given. Must return a numeric vector of length 2, containing the lower and upper confidence bounds. Defaults to `plotgroups.ci`, which computes confidence intervals using the t statistic.
`cex.xlab`	character expansion factor for X axis annotation
`ylim`	Y axis limits. Will be determined automatically if `NULL`. If not `NULL` but only one limit is finite, the other will be determined automatically. Can be a list containing numeric vectors, in which case the limits will apply to the corresponding plot if multiple data sets are plotted. Will be recycled to the number of plots.
`legendmargin`	spacing between the upper-most data point/feature and the upper edge of the plot, required for group annotation. Will be determined automatically if `NULL`
`plot.type`	list containing three functions: `plot`: function to do the actual plotting, see `plotgroups.boxplot`, `plotgroups.beeswarm`, `plotgroups.barplot`, `plotgroups.vioplot`; `ylim`: function to calculate Y axis limits based on data and features. It takes three arguments (`data`: list of numeric vectors with data; `stats`: precomputed statistics; `features`: features to plot) and returns either a 2-element vector with Y limits or `NULL`, in which case Y limits will be computed based on sensible defaults; `features`: function to check user-supplied feature lists for correctness and compute default features, if necessary. It takes one argument (the user-supplied feature character vector) and returns a character vector with features to plot. Can be a list of lists, in which case the elements will apply to the corresponding plot.
`plot.fun.pars`	additional parameters to pass to `plot.type$plot`
`barwidth`	width of the individual bars/boxes etc. as fraction of 1
`main`	main title
`ylab`	Y axis label. Will be recycled to the number of plots.
`ylab.line`	The margin line for the Y axis label.
`signif.test`	list of 2-element integer vectors giving the elements of `data` to be tested for significant differences. Can be a list of lists, in which case each element will apply to the corresponding plot if multiple data sets are plotted.
`signif.test.fun`	function to perform the significance testing. Must accept 2 vectors and return a list containing at least the element `p.value`. Can be a list of functions, in which case each element will apply to the corresponding plot if multiple data sets are plotted.
`signif.test.text`	function accepting a p-value and returning a formatted string to be used for plotting or `NULL` if this p-value is not to be plotted (e.g. if it is not significant). Can be a list of functions, in which case each element will apply to the corresponding plot if multiple data sets are plotted.
`signif.test.col`	color of p-value annotations.
`signif.test.lwd`	line width for p-value annotations. Can be a list, in which case the lwd will apply to the corresponding plot if multiple data sets are plotted.
`signif.test.pars`	parameters for p-value annotation. Will be passed to `text`. Can be a list of lists, in which case each element will apply to the corresponding plot if multiple data sets are plotted.
`extrafun.before`	additional function to call after the coordinate system has been set up, but before plotting, e.g. to add a background grid to the plot. Can be a list of functions, in which case each element will apply to the corresponding plot if multiple data sets are plotted.
`extrafun.after`	additional function to call after plotting, e.g. to add additional elements to the plot. Can be a list of functions, in which case each element will apply to the corresponding plot if multiple data sets are plotted.
`...`	additional parameters passed to `par`
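
The `ci.fun` contract described above can be illustrated with a small replacement function. This is only a sketch under stated assumptions: `ci.normal` is a hypothetical name, it uses a normal approximation instead of the t-based default `plotgroups.ci`, and it treats a `NULL` value of `data` like an omitted one, since the exact calling convention is not spelled out here.

```r
# Hypothetical ci.fun replacement: normal-approximation confidence interval.
# Accepts the five documented arguments and returns c(lower, upper).
ci.normal <- function(data, mean, se, ndata, conf.level = 0.95) {
    if (!missing(data) && !is.null(data)) {
        # Raw observations given: derive the summary statistics from them.
        ndata <- length(data)
        mean <- base::mean(data)
        se <- stats::sd(data) / sqrt(ndata)
    }
    # Two-sided critical value of the standard normal distribution.
    z <- stats::qnorm(1 - (1 - conf.level) / 2)
    c(mean - z * se, mean + z * se)
}

# Both calling styles should agree for the same sample.
x <- c(1, 2, 3, 4, 5)
ci.from.data <- ci.normal(x, conf.level = 0.95)
ci.from.stats <- ci.normal(mean = 3, se = sd(x) / sqrt(5), ndata = 5,
                           conf.level = 0.95)
```

Such a function could then be supplied as `ci.fun = ci.normal` together with `features = 'ci'`, provided the chosen `plot.type` supports that feature.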

This is a wrapper function around plot.type$plot. It sets up the coordinate system, calls extrafun.before followed by plot.type$plot, which does the actual plotting, and extrafun.after. All three functions are passed the following arguments:

data

the data argument passed to plotgroups

at

X coordinates of the data. Particularly important when groups.spacing != 0

stats

summary statistics of the data. List with the following components:

means: means
sds: standard deviations
sems: standard errors of the mean
medians: medians
boxmax: third quartile
boxmin: first quartile
iqrmax: the maximal data point within range times the interquartile range of boxmax
iqrmin: the minimal data point within range times the interquartile range of boxmin
cimax: the upper confidence bound, computed by ci.fun according to conf.level
cimin: the lower confidence bound, computed by ci.fun according to conf.level
range: the range of the extreme data points within [iqrmin, iqrmax]
conf.level: the confidence level at which cimax, cimin apply

colors

the colors argument passed to plotgroups

features

the features argument passed to plotgroups

barwidth

the barwidth argument passed to plotgroups

plot.type$plot is additionally passed the arguments given by plot.fun.pars.
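
To make the `stats` components listed above more concrete, the following sketch computes them for a single group with base R. It is an illustration, not the package's code: `group.stats` is a hypothetical helper, the quartiles use the default method of `quantile()` (the package may use a different definition), and the confidence bounds assume a t-based interval as described for `plotgroups.ci`.

```r
# Hypothetical helper: summary statistics for ONE group of replicates,
# named after the documented components of `stats`.
group.stats <- function(x, range = 1.5, conf.level = 0.95) {
    n <- length(x)
    q <- stats::quantile(x, c(0.25, 0.75), names = FALSE)
    iqr <- q[2] - q[1]
    m <- mean(x)
    sem <- stats::sd(x) / sqrt(n)
    # Critical value of the t distribution with n - 1 degrees of freedom.
    tcrit <- stats::qt(1 - (1 - conf.level) / 2, df = n - 1)
    list(means = m, sds = stats::sd(x), sems = sem,
         medians = stats::median(x),
         boxmin = q[1], boxmax = q[2],
         # Most extreme points no more than range * IQR away from the box.
         iqrmin = min(x[x >= q[1] - range * iqr]),
         iqrmax = max(x[x <= q[2] + range * iqr]),
         cimin = m - tcrit * sem, cimax = m + tcrit * sem)
}

# The outlier 100 lies beyond the upper whisker, so iqrmax stays at 4.
s <- group.stats(c(1, 2, 3, 4, 100))
```

In `plotgroups` itself each component presumably holds one value per group, so applying such a helper over the elements of `data` would give a comparable structure.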

Significance testing is performed by calling signif.test.fun with two vector arguments containing the samples to be compared. signif.test.fun must return a list containing a p.value element. The p value is passed as single argument to signif.test.text which returns a character vector (or anything usable by text).
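
The two-step protocol described above (test function, then label function) can be sketched with a rank-based test in place of the default `t.test`. The names `signif.wilcox` and `signif.label` are hypothetical, and the 0.05 cut-off is an arbitrary choice for illustration.

```r
# Hypothetical signif.test.fun: takes two vectors and returns a list
# (an "htest" object) that contains a p.value element.
signif.wilcox <- function(x, y) stats::wilcox.test(x, y, exact = FALSE)

# Hypothetical signif.test.text: formats the p-value, or returns NULL
# so that non-significant comparisons are not annotated.
signif.label <- function(p) {
    if (p >= 0.05) return(NULL)
    sprintf("p = %.1e", p)
}

# Stand-alone check on two simulated samples.
set.seed(1)
a <- rnorm(50, 1, 0.5)
b <- rnorm(50, 3, 0.5)
res <- signif.wilcox(a, b)
label <- signif.label(res$p.value)
```

These would be passed as `signif.test.fun = signif.wilcox` and `signif.test.text = signif.label`, alongside a `signif.test` list naming the pairs of elements of `data` to compare.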

A list with the following components:

stats: summary statistics of the data.
features: Character vector of features actually plotted.
plotfun: Return value of plot.type$plot
at: X coordinates of the data.
annotation.height: Height of the annotation in inches.
annotation.width: Width of the annotation in inches. If names.style='combinatorial' this is the width of the left margin.
legendmargin: Top margin required for the legend, in user coordinates.

If significance testing was performed, also contains a component signiftest, which is a list with elements ordered by signif.test with the following components:

test: return value of the testing function
label: return value of signif.test.text

library(imisc)

## simulated data: 14 groups of 50 replicates each
data <- list()
for (i in 1:14) data[[i]] <- rnorm(50, i, 0.5)
## X axis labels: seven gene combinations, repeated for both proteins
names <- rep(c('gene1', 'gene2', 'gene3', 'gene1 gene2', 'gene1 gene3',
               'gene2 gene3', 'gene1 gene2 gene3'), times=2)
names2 <- as.character(rep(1:7, times=2))
names2[2] <- "abc\nefg"
colors <- c("green", "blue")
legend.text <- rep(c("protein1", "protein2"), each=7)
## beeswarm plot showing mean and standard deviation
plotgroups(data, names, colors, legend.text,
           plot.type=plotgroups.beeswarm, features=c('mean', 'sd'), ylim=c(0,Inf))
## violin plot with unrotated, centered labels (one of them spanning two lines)
plotgroups(data, names2, colors, legend.text, plot.type=plotgroups.vioplot, ylim=c(0,Inf),
           names.rotate=0, names.adj=c(0.5, 1))
## combinatorial annotation, log-scale Y axis
plotgroups(data, names, colors, legend.text, log=TRUE,
           plot.type=plotgroups.beeswarm, features=c('mean', 'sd'),
           names.style='combinatorial', names.split=" ", names.pch='\u0394',
           plot.fun.pars=list(palpha=0.5, bxpcols="black"))
## combinatorial annotation with a placeholder where no annotation is present
plotgroups(data, names, colors, legend.text,
           names.style='combinatorial', names.split=" ", names.pch='\u0394',
           names.placeholder='+')
## bar plot with a main title
plotgroups(data, names, colors, legend.text,
           names.style='combinatorial', names.split=" ", names.pch=19,
           main="test", plot.type=plotgroups.barplot, features=c("mean", "sd"),
           plot.fun.pars=list(whiskerswidth=0.6))

## custom mapping: label gene2 with rotated text instead of names.pch
map.fun <- function(n, split, pch, cex, rotate, adj) {
    n <- strsplit(n, split, fixed=TRUE)[[1]]
    nlist <- lapply(n, function(x) {
        if (x != "gene2") {
            list(pch=pch, cex=cex, rotate=rotate, adj=adj)
        } else {
            list(pch='S158T', cex=cex, rotate=90, adj=c(0,0.5))
        }
    })
    names(nlist) <- n
    nlist
}
plotgroups(data, names, colors, legend.text, names.style='combinatorial', names.split=" ",
           names.pch='\u0394', names.map.fun=map.fun)
## significance testing
plotgroups(data, names, colors, legend.text, names.style='combinatorial',
           names.split=" ", names.pch='\u0394',
           signif.test=list(c(1,3), c(2,5), c(5,8), c(3,10)))
## significance testing with star labels; non-significant p-values are omitted
plotgroups(data, names, colors, legend.text, names.style='combinatorial',
           names.split=" ", names.pch='\u0394',
           signif.test=list(c(1,3), c(2,5), c(5,8), c(3,10)),
           signif.test.text=function(p) {
               if (p < 0.001) {
                   return('***')
               } else if (p < 0.01) {
                   return('**')
               } else if (p < 0.05) {
                   return('*')
               } else {
                   return(NULL)
               }
           })
## multiple plots
plotgroups(list(data, rev(data)), names, colors, legend.text,
           names.style='combinatorial', names.split=" ", names.pch='\u0394',
           names.map.fun=map.fun, ylim=c(0,Inf), ylab=c("data1", "data2"), main="test",
           features=list(NULL, c("median", "box")),
           plot.type=list(plotgroups.boxplot, plotgroups.beeswarm),
           signif.test=list(NULL, list(c(1,3), c(2,5), c(5,8), c(3,10))))
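
The examples above do not exercise `extrafun.before`. The following sketch shows one way a background grid function might look. `grid.before` is a hypothetical name. Its signature lists the six documented arguments plus `...`, on the assumption that this covers them being passed either by name or by position. It is demonstrated here on a plain base-graphics plot rather than inside `plotgroups`.

```r
# Hypothetical extrafun.before: draw horizontal grid lines at "pretty"
# positions once the coordinate system exists. The documented arguments
# are accepted but not needed for a simple grid.
grid.before <- function(data, at, stats, colors, features, barwidth, ...) {
    usr <- graphics::par("usr")          # c(x1, x2, y1, y2) of the plot region
    graphics::abline(h = pretty(usr[3:4]), col = "grey85", lty = 3)
    invisible(NULL)
}

# Stand-alone check on an empty plot, drawn to a null PDF device.
grDevices::pdf(NULL)
plot(1:10, type = "n")
grid.result <- grid.before()
invisible(grDevices::dev.off())
```

It could then be supplied as `extrafun.before = grid.before`. On a log-scale Y axis `par("usr")` reports the limits in log10 units, so the grid positions would need to be transformed accordingly.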