plotgroups: Plot several groups of repeated observations.


View source: R/plotgroups.R

Description

Plot several groups of repeated observations, e.g. abundance/half-life of several proteins, each observed in several cell lines in several replicates. Observations can be grouped either by protein (in which case cell lines will be annotated as X axis labels and proteins above the plot) or by cell line. Related parameters can be plotted in separate plots below each other, sharing the groupings and annotations (see examples).

Usage

plotgroups(data, names, colors = NULL, legend.text = NULL,
  legend.col = NULL, legend.pars = list(), legend.lwd = NULL,
  groups.spacing = 0, names.split = NULL, names.italicize = NULL,
  names.style = c("plain", "combinatorial"), names.pch.cex = 1,
  names.pch = 19, names.adj = NA, names.map.fun = NULL,
  names.margin = 0.5, names.rotate = NULL, names.placeholder = NA,
  features = NA, log = FALSE, range = 1.5, conf.level = 0.95,
  ci.fun = plotgroups.ci, cex.xlab = 1, ylim = NULL,
  legendmargin = NULL, plot.type = plotgroups.boxplot,
  plot.fun.pars = list(), barwidth = 0.8, main = NULL, ylab = NULL,
  ylab.line = NULL, signif.test = NULL, signif.test.fun = t.test,
  signif.test.text = plotgroups.pval, signif.test.col = "black",
  signif.test.lwd = legend.lwd, signif.test.pars = legend.pars,
  extrafun.before = NULL, extrafun.after = NULL, ...)

Arguments

data

list, each element is a vector of replicates for one combination of parameters, or each element is a list containing a vector of replicates, in which case the data sets will be plotted below each other in separate plots

names

character vector of X axis labels

colors

colors for plotting

legend.text

character vector of the same length as data giving the group names. A group of observations is identified by consecutive occurrence of the same name.
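The grouping rule can be illustrated with base R's rle(), which collapses runs of identical consecutive values. This is only a sketch of the rule as described above, not necessarily how plotgroups computes the groups internally.

```r
# Group labels as in the Examples section: 7 observations per protein
legend.text <- rep(c("protein1", "protein2"), each = 7)

# rle() collapses runs of identical consecutive values
groups <- rle(legend.text)
groups$values   # one label per group
groups$lengths  # number of observations in each group

# A label that reappears after a different one starts a new run
rle(c("a", "a", "b", "a"))$values
```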

legend.col

colors for group annotations. Defaults to plotting colors

legend.pars

parameters for group annotation. Will be passed to text

legend.lwd

line width for grouping annotations. Defaults to par("lwd")

groups.spacing

extra space between the groups in user coordinates.

names.split

character by which to split the names. Only useful in combination with names.italicize or names.style='combinatorial'

names.italicize

if a part of a name is to be written in italic text, the part is identified by this character, i.e. the name is first split by names.split, and each fragment containing names.italicize is rendered in italics

names.style

how the names are to be rendered.

plain

each name will be written as-is below the plot

combinatorial

names will be split by names.split, unique strings will be printed at the bottom-left, and observations whose name contains the string will be identified by printing names.pch below the respective bar. Useful if e.g. assaying different combinations of single/double/triple knock-outs.

names.pch.cex

character expansion factor for names.pch

names.pch

character to be used for annotation of observations when names.style='combinatorial'

names.adj

text adjustment for names or names.pch, depending on names.style. See text. Defaults to 1 for names.style='plain', unless names.rotate=0, in which case it defaults to 0.5. Defaults to 0.5 for names.style='combinatorial'.

names.map.fun

Function mapping between names string and pch/cex/adj/rotate for the respective combination. Useful for more complicated experimental layouts where different names.pch must be used for different genes, see examples. Must accept six arguments:

n

String with the names combination to process

split

Default pattern to split by, as given by names.split

pch

Default pch, as given by names.pch

cex

Default cex, as given by names.pch.cex

rotate

Default rotate, as given by names.rotate

adj

Default adj, as given by names.adj

Must return a named list, with the names being the split genes that should be used to label the rows, each element being itself a named list containing the plotting parameters for that particular annotation, i.e. pch, cex, rotate, adj.

names.margin

spacing between the bottom edge of the plot and the annotation, in inches

names.rotate

Degrees by which to rotate the annotation strings.

names.placeholder

Only used when names.style='combinatorial'. Placeholder character to use when no annotation is present for the current sample and row. See examples.

features

which features of the sample distributions to plot. Availability of features depends on plot.type. Can contain any combination of the following:

median

the median

box

the first and third quartiles

iqr

the most extreme data point no more than range times the interquartile range away from the box

mean

the mean

sd

mean ± standard deviation

sem

mean ± standard error of the mean

ci

confidence interval at conf.level

Can be a list containing character vectors, in which case the specified feature set will apply to the corresponding plot if multiple data sets are plotted (see examples). Will be recycled to the number of plots.

log

Whether to plot the Y axis on log scale

range

determines how far the iqr whiskers will extend out from the box, if they are to be plotted. Will be recycled to the number of plots.

conf.level

Confidence level for plotting of confidence intervals. Will be recycled to the number of plots.

ci.fun

Function to compute confidence intervals. Will be recycled to the number of plots. Must accept five arguments:

data

Numeric vector containing data for one group

mean

Precomputed mean of the sample

se

Precomputed standard error of the mean of the sample

ndata

Number of observations

conf.level

Confidence level

If data is given, mean, se, and ndata are not used, but are calculated from the data. If data is omitted, all of mean, se, and ndata must be given. Defaults to plotgroups.ci, which computes confidence intervals using the t distribution. Must return a numeric vector of length 2, containing the lower and upper confidence bounds.
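As a rough sketch of the interface described above, a replacement for ci.fun might look like the following. The name ci.normal and the choice of a normal approximation are assumptions made for illustration; they are not part of the package.

```r
# Hypothetical replacement for ci.fun using a normal approximation
# instead of the default plotgroups.ci (illustration only).
ci.normal <- function(data, mean, se, ndata, conf.level = 0.95) {
    if (!missing(data)) {
        # When data is given, recompute the summary values from it
        ndata <- length(data)
        mean <- base::mean(data)
        se <- stats::sd(data) / sqrt(ndata)
    }
    z <- stats::qnorm(1 - (1 - conf.level) / 2)
    # Lower and upper confidence bounds, as a numeric vector of length 2
    c(mean - z * se, mean + z * se)
}

# Either pass the raw data ...
ci.normal(c(4.8, 5.1, 5.3, 4.9, 5.0))
# ... or the precomputed summary values
ci.normal(mean = 5, se = 0.1, ndata = 30, conf.level = 0.99)
```

Such a function could then presumably be supplied as ci.fun=ci.normal together with a features vector containing 'ci'.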

cex.xlab

character expansion factor for X axis annotation

ylim

Y axis limits. Will be determined automatically if NULL. If not NULL but only one limit is finite, the other will be determined automatically. Can be a list containing numeric vectors, in which case the limits will apply to the corresponding plot if multiple data sets are plotted. Will be recycled to the number of plots.

legendmargin

spacing between the upper-most data point/feature and the upper edge of the plot, required for group annotation. Will be determined automatically if NULL

plot.type

list containing three functions:

plot

function to do the actual plotting. See plotgroups.boxplot, plotgroups.beeswarm, plotgroups.barplot, plotgroups.vioplot.

ylim

Function to calculate Y axis limits based on data and features. Takes three arguments:

data

List of numeric vectors with data

stats

Precomputed statistics

features

Features to plot

Returns either a 2-element vector with Y limits or NULL, in which case Y limits will be computed based on sensible defaults.

features

Function to check user-supplied feature lists for correctness and compute default features, if necessary. Takes one argument (the user-supplied feature character vector) and returns a character vector with features to plot.

Can be a list of lists, in which case the elements will apply to the corresponding plot.

plot.fun.pars

additional parameters to pass to plot.type$plot

barwidth

width of the individual bars/boxes etc. as fraction of 1

main

main title

ylab

Y axis label. Will be recycled to the number of plots.

ylab.line

The margin line for the Y axis label.

signif.test

list of 2-element integer vectors giving the elements of data to be tested for significant differences. Can be a list of lists, in which case each element will apply to the corresponding plot if multiple data sets are plotted.

signif.test.fun

function to perform the significance testing. Must accept 2 vectors and return a list containing at least the element p.value. Can be a list of functions, in which case each element will apply to the corresponding plot if multiple data sets are plotted.

signif.test.text

function accepting a p-value and returning a formatted string to be used for plotting or NULL if this p-value is not to be plotted (e.g. if it is not significant). Can be a list of functions, in which case each element will apply to the corresponding plot if multiple data sets are plotted.

signif.test.col

color of p-value annotations.

signif.test.lwd

line width for p-value annotations. Can be a list, in which case the lwd will apply to the corresponding plot if multiple data sets are plotted.

signif.test.pars

parameters for p-value annotations. Will be passed to text. Can be a list of lists, in which case each element will apply to the corresponding plot if multiple data sets are plotted.

extrafun.before

additional function to call after the coordinate system has been set up, but before plotting, e.g. to add a background grid to the plot. Can be a list of functions, in which case each element will apply to the corresponding plot if multiple data sets are plotted.

extrafun.after

additional function to call after plotting, e.g. to add additional elements to the plot. Can be a list of functions, in which case each element will apply to the corresponding plot if multiple data sets are plotted.
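A minimal sketch of a function suitable for extrafun.before, drawing a background grid. The name grid.before is hypothetical, and the argument names are assumed to match the list given in the Details section; "..." absorbs anything else that might be passed.

```r
# Hypothetical extrafun.before: draw horizontal grid lines behind the data.
# Argument names are assumed from the Details section of this page.
grid.before <- function(data, at, stats, colors, features, barwidth, ...) {
    # nx = NA suppresses vertical lines, ny = NULL aligns with Y axis ticks
    graphics::grid(nx = NA, ny = NULL, col = "grey80", lty = "dotted")
}
```

It could then be supplied as extrafun.before=grid.before.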

...

additional parameters passed to par

Details

This is a wrapper function around plot.type$plot. It sets up the coordinate system, calls extrafun.before followed by plot.type$plot, which does the actual plotting, and extrafun.after. All three functions are passed the following arguments:

data

the data argument passed to plotgroups

at

X coordinates of the data. Particularly important when groups.spacing != 0

stats

summary statistics of the data. List with the following components:

means

means

sds

standard deviations

sems

standard errors of the mean

medians

medians

boxmax

third quartile

boxmin

first quartile

iqrmax

the maximal data point within range times the interquartile range of boxmax

iqrmin

the minimal data point within range times the interquartile range of boxmin

cimax

the upper confidence bound, computed by ci.fun according to conf.level

cimin

the lower confidence bound, computed by ci.fun according to conf.level

range

the range of the extreme data points within [iqrmin, iqrmax]

conf.level

the confidence level at which cimax, cimin apply

colors

the colors argument passed to plotgroups

features

the features argument passed to plotgroups

barwidth

the barwidth argument passed to plotgroups

plot.type$plot is additionally passed the arguments given by plot.fun.pars.
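To make the components of stats concrete, the following sketch computes most of them for a single vector of replicates in base R. The helper name summarize.group is hypothetical, and the use of quantile() for the quartiles is an assumption; plotgroups may compute them differently.

```r
# Sketch: summary statistics for one vector of replicates, named after the
# components of stats. quantile() is an assumption, not the package's code.
summarize.group <- function(x, range = 1.5) {
    q <- stats::quantile(x, c(0.25, 0.75), names = FALSE)
    iqr <- q[2] - q[1]
    list(
        means = mean(x),
        sds = stats::sd(x),
        sems = stats::sd(x) / sqrt(length(x)),
        medians = stats::median(x),
        boxmin = q[1],
        boxmax = q[2],
        # most extreme points no more than range * IQR away from the box
        iqrmin = min(x[x >= q[1] - range * iqr]),
        iqrmax = max(x[x <= q[2] + range * iqr])
    )
}

s <- summarize.group(c(1, 2, 3, 4, 5, 20))
s$iqrmax  # 20 lies beyond the upper whisker limit, so this is 5
```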

Significance testing is performed by calling signif.test.fun with two vector arguments containing the samples to be compared. signif.test.fun must return a list containing a p.value element. The p-value is passed as a single argument to signif.test.text, which returns a character vector (or anything usable by text), or NULL if the p-value is not to be plotted.
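The two steps can be sketched outside of plotgroups as follows. The data are made up and the formatter name stars is hypothetical; t.test is the documented default for signif.test.fun.

```r
# Two samples to compare (made-up data)
set.seed(1)
x <- rnorm(20, mean = 0)
y <- rnorm(20, mean = 3)

# Step 1: the test function receives two vectors and returns a list
# with a p.value element
res <- t.test(x, y)

# Step 2: hypothetical signif.test.text formatter; returning NULL means
# the p-value is not annotated
stars <- function(p) {
    if (p < 0.001) "***" else if (p < 0.01) "**" else if (p < 0.05) "*" else NULL
}
label <- stars(res$p.value)
```

Any function with the same contract as t.test, for example wilcox.test, should fit the first step, since it also returns a list with a p.value element.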

Value

list with the following components:

stats

summary statistics of the data.

features

Character vector of features actually plotted.

plotfun

Return value of plot.type$plot

at

X coordinates of the data.

annotation.height

Height of the annotation in inches.

annotation.width

Width of the annotation in inches. If names.style='combinatorial' this is the width of the left margin.

legendmargin

Top margin required for the legend, in user coordinates.

If significance testing was performed, also contains a component signiftest, which is a list with elements ordered by signif.test with the following components:

test

return value of the testing function

label

return value of signif.test.text

Examples

data <- list()
for (i in 1:14) data[[i]] <- rnorm(50, i, 0.5)
names <- rep(c('gene1', 'gene2', 'gene3', 'gene1 gene2', 'gene1 gene3', 'gene2 gene3', 'gene1 gene2 gene3'),
     times=2)
names2 <- as.character(rep(1:7,times=2))
names2[2] <- "abc\nefg"
colors <- c("green", "blue")
legend.text <- rep(c("protein1", "protein2"), each=7)
plotgroups(data, names, colors, legend.text,
           plot.type=plotgroups.beeswarm, features=c('mean', 'sd'), ylim=c(0,Inf))
plotgroups(data, names2, colors, legend.text,plot.type=plotgroups.vioplot, ylim=c(0,Inf),
           names.rotate=0, names.adj=c(0.5, 1))
plotgroups(data, names, colors, legend.text, log=TRUE,
           plot.type=plotgroups.beeswarm, features=c('mean', 'sd'),
           names.style='combinatorial', names.split=" ", names.pch='\u0394',
           plot.fun.pars=list(palpha=0.5, bxpcols="black"))
plotgroups(data, names, colors, legend.text,
           names.style='combinatorial', names.split=" ", names.pch='\u0394',
           names.placeholder='+')
plotgroups(data, names, colors, legend.text,
           names.style='combinatorial', names.split=" ", names.pch=19,
           main="test", plot.type=plotgroups.barplot, features=c("mean", "sd"),
           plot.fun.pars=list(whiskerswidth=0.6))

map.fun <- function(n, split, pch, cex, rotate, adj) {
               n <- strsplit(n, split, fixed=TRUE)[[1]]
               nlist <- lapply(n, function(x){
                                      if (x != "gene2") {
                                          list(pch=pch, cex=cex, rotate=rotate, adj=adj)
                                       } else {
                                           list(pch='S158T', cex=cex, rotate=90, adj=c(0,0.5))
                                       }
                                    })
                names(nlist) <- n
                nlist
}
plotgroups(data, names, colors, legend.text,names.style='combinatorial', names.split=" ",
           names.pch='\u0394', names.map.fun=map.fun)
## significance testing
plotgroups(data, names, colors, legend.text,names.style='combinatorial',
           names.split=" ", names.pch='\u0394',
           signif.test=list(c(1,3), c(2,5), c(5,8), c(3,10)))
plotgroups(data, names, colors, legend.text,names.style='combinatorial',
           names.split=" ", names.pch='\u0394',
           signif.test=list(c(1,3), c(2,5), c(5,8), c(3,10)),
           signif.test.text=function(p) {
                     if (p < 0.001) {
                         return('***')
                     } else if (p < 0.01) {
                         return('**')
                     } else if (p < 0.05) {
                         return('*')
                     } else {
                         return(NULL)
                     }})
## multiple plots
plotgroups(list(data, rev(data)), names, colors, legend.text,names.style='combinatorial',
           names.split=" ",names.pch='\u0394', names.map.fun=map.fun,
           ylim=c(0,Inf), ylab=c("data1", "data2"), main="test", features=list(NULL,
           c("median", "box")), plot.type=list(plotgroups.boxplot, plotgroups.beeswarm),
           signif.test=list(NULL,list(c(1,3), c(2,5), c(5,8), c(3,10))))

ilia-kats/imisc documentation built on May 18, 2019, 3:43 a.m.