plotDistribution | R Documentation |
Plot sample distribution
Description
The tooltip shows the median, variance, maximum, minimum and number of non-NA
samples of each data series, as well as sample names if available.
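The statistics listed for the tooltip can be reproduced in base R. The following is a minimal sketch on made-up values, not the package's own code: the helper name `tooltipStats` and the example data are hypothetical, and it assumes that NA values are dropped before each statistic is computed.

```r
# Hypothetical example data: values for six samples, one of them missing
values <- c(0.10, 0.25, NA, 0.40, 0.80, 0.95)
names(values) <- paste("Sample", seq_along(values))
groups <- c("A", "A", "A", "B", "B", "B")

# Hypothetical helper: summary of one data series, ignoring NA values
tooltipStats <- function(x) {
    c(median   = median(x, na.rm = TRUE),
      variance = var(x, na.rm = TRUE),
      max      = max(x, na.rm = TRUE),
      min      = min(x, na.rm = TRUE),
      n        = sum(!is.na(x)))   # number of non-NA samples
}

# One column of statistics per group (data series)
stats <- sapply(split(values, groups), tooltipStats)
print(stats)
```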
Usage
plotDistribution(
data,
groups = NULL,
rug = length(data) < 500,
vLine = TRUE,
...,
title = NULL,
subtitle = NULL,
type = c("density", "boxplot", "violin"),
invertAxes = FALSE,
psi = NULL,
rugLabels = FALSE,
rugLabelsRotation = 0,
legend = TRUE,
valueLabel = NULL
)
Arguments
data |
Numeric, data frame or matrix: gene expression data or
alternative splicing event quantification values (sample names are based on
their names or colnames)
|
groups |
List of sample names or vector containing the group name per
data value (read Details); if NULL or a character vector of
length 1, data values are considered from the same group
|
rug |
Boolean: show rug plot?
|
vLine |
Boolean: plot vertical lines (including descriptive statistics
for each group)?
|
... |
Arguments passed on to stats::density.default
bw: the smoothing bandwidth to be used. The kernels are scaled
such that this is the standard deviation of the smoothing kernel.
(Note this differs from the reference books cited below, and from S-PLUS.)
bw can also be a character string giving a rule to choose the
bandwidth. See bw.nrd. The default,
"nrd0", has remained the default for historical and
compatibility reasons, rather than as a general recommendation,
where e.g., "SJ" would rather fit, see also Venables and
Ripley (2002).
The specified (or computed) value of bw is multiplied by
adjust.
adjust: the bandwidth used is actually adjust*bw.
This makes it easy to specify values like ‘half the default’
bandwidth.
kernel, window: a character string giving the smoothing kernel
to be used. This must partially match one of "gaussian",
"rectangular", "triangular", "epanechnikov",
"biweight", "cosine" or "optcosine", with default
"gaussian", and may be abbreviated to a unique prefix (single
letter).
"cosine" is smoother than "optcosine", which is the
usual ‘cosine’ kernel in the literature and almost MSE-efficient.
However, "cosine" is the version used by S.
weights: numeric vector of non-negative observation weights,
hence of same length as x. The default NULL is
equivalent to weights = rep(1/nx, nx) where nx is the
length of (the finite entries of) x[]. If na.rm = TRUE
and there are NAs in x, they and the
corresponding weights are removed before computations. In that case,
when the original weights have summed to one, they are re-scaled to
keep doing so.
Note that weights are not taken into account for automatic
bandwidth rules, i.e., when bw is a string. When the weights
are proportional to true counts cn, density(x = rep(x, cn))
may be used instead of weights.
width: this exists for compatibility with S; if given, and
bw is not, will set bw to width if this is a
character string, or to a kernel-dependent multiple of width
if this is numeric.
give.Rkern: logical; if true, no density is estimated, and
the ‘canonical bandwidth’ of the chosen kernel is returned
instead.
subdensity: used only when weights are specified which do not sum
to one. When true, it indicates that a “sub-density”
is desired and no warning should be signalled. By default, when false,
a warning is signalled when the weights do not sum to one.
warnWbw: logical, used only when weights are specified and
bw is character, i.e., automatic bandwidth selection is
chosen (as by default). When true (as by default), a
warning is signalled to alert the user that automatic
bandwidth selection will not take the weights into account and hence
may be suboptimal.
n: the number of equally spaced points at which the density is
to be estimated. When n > 512, it is rounded up to a power
of 2 during the calculations (as fft is used) and the
final result is interpolated by approx. So it almost
always makes sense to specify n as a power of two.
from, to: the left and right-most points of the grid at which the
density is to be estimated; the defaults are cut * bw outside
of range(x).
cut: by default, the values of from and to are
cut bandwidths beyond the extremes of the data. This allows
the estimated density to drop to approximately zero at the extremes.
|
title |
Character: plot title
|
subtitle |
Character: plot subtitle
|
type |
Character: density, boxplot or violin plot
|
invertAxes |
Boolean: plot X axis as Y and vice versa?
|
psi |
Boolean: are data composed of PSI values? If NULL,
psi = TRUE if all data values are between 0 and 1
|
rugLabels |
Boolean: plot sample names in the rug?
|
rugLabelsRotation |
Numeric: rotation (in degrees) of rug labels; this
may present issues at different zoom levels and depending on the proximity
of data values
|
legend |
Boolean: show legend?
|
valueLabel |
Character: label for the value (by default, either
Inclusion levels or Gene expression)
|
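The density-related arguments described for `...` can be explored with `stats::density()` directly. The sketch below uses made-up data and only base R; it illustrates how `adjust`, `bw`, `kernel`, `n` and `cut` behave, and is not taken from the package itself.

```r
set.seed(1)
x <- runif(100)   # hypothetical values between 0 and 1

# Default settings: bw = "nrd0", adjust = 1, cut = 3
base <- density(x)

# adjust scales the bandwidth: here, half the default bandwidth
narrow <- density(x, adjust = 0.5)

# Bandwidth chosen by the "SJ" rule, with a non-default kernel
sj <- density(x, bw = "SJ", kernel = "epanechnikov")

# n as a power of two; cut = 0 keeps the grid within range(x)
fine <- density(x, n = 1024, cut = 0)

narrow$bw / base$bw   # ratio between the two bandwidths
length(fine$x)        # number of grid points
range(fine$x)         # grid limits, to compare with range(x)
```

Given that these arguments are documented as passed on to `stats::density.default`, a call such as `plotDistribution(data, groups, adjust = 0.5)` would be expected to narrow the plotted densities in the same way, although this is not verified here.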
Details
Argument groups can be either:
a list of sample names, e.g.
list("Group 1"=c("Sample A", "Sample B"), "Group 2"=c("Sample C"))
a character vector of group names with the same length as data (one group name per data value), e.g.
c("Group 1", "Group 1", "Group 2")
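The two accepted forms of groups carry the same information. As a minimal sketch in base R (the data and variable names below are hypothetical, and this is not the package's internal conversion), a list of sample names can be turned into a vector with one group name per data value:

```r
# Hypothetical named data: one value per sample
data <- c("Sample A" = 0.1, "Sample B" = 0.4, "Sample C" = 0.9)

# Form 1: list of sample names per group
groupList <- list("Group 1" = c("Sample A", "Sample B"),
                  "Group 2" = c("Sample C"))

# Build a lookup from sample name to group name
perSample <- rep(names(groupList), lengths(groupList))
names(perSample) <- unlist(groupList, use.names = FALSE)

# Form 2: one group name per data value, in the order of data
groupVector <- unname(perSample[names(data)])
print(groupVector)
```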
Value
highchart object with density plot
See Also
Other functions to perform and plot differential analyses:
diffAnalyses()
Examples
data <- sample(20, replace=TRUE)/20
groups <- paste("Group", c(rep("A", 10), rep("B", 10)))
names(data) <- paste("Sample", seq(data))
plotDistribution(data, groups)
# Using colours
attr(groups, "Colour") <- c("Group A"="pink", "Group B"="orange")
plotDistribution(data, groups)
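The defaults described under Arguments for rug and psi can be mimicked in base R for data like those in the examples. This is a rough sketch under stated assumptions, not the package's implementation: it assumes that "between 0 and 1" includes both bounds, that NA values are ignored, and that the default value label follows from psi; the variable names are hypothetical.

```r
set.seed(42)
data <- sample(20, replace = TRUE) / 20   # values from 0.05 to 1

# Default from the usage signature: show the rug for fewer than 500 values
showRug <- length(data) < 500

# Assumed rule when psi = NULL: all values within [0, 1] (bounds included)
isPSI <- all(data >= 0 & data <= 1, na.rm = TRUE)

# Assumed mapping from psi to the default value label
valueLabel <- if (isPSI) "Inclusion levels" else "Gene expression"

print(c(rug = showRug, psi = isPSI))
print(valueLabel)
```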