clrDR | R Documentation |
Computes centered log-ratios (CLR) on cluster/sample proportions across samples/clusters, and visualizes them in a lower-dimensional space, highlighting differences in composition between samples/clusters.
clrDR(
  x,
  dr = c("PCA", "MDS", "UMAP", "TSNE", "DiffusionMap"),
  by = c("sample_id", "cluster_id"),
  k = "meta20",
  dims = c(1, 2),
  base = 2,
  arrows = TRUE,
  point_col = switch(by, sample_id = "condition", "cluster_id"),
  arrow_col = switch(by, sample_id = "cluster_id", "condition"),
  arrow_len = 0.5,
  arrow_opa = 0.5,
  label_by = NULL,
  size_by = TRUE,
  point_pal = NULL,
  arrow_pal = NULL
)
x | a |
dr | character string specifying which dimension reduction to use. |
by | character string specifying across which IDs to compute CLRs |
k | character string specifying which clustering to use; valid values are |
dims | two numeric scalars indicating which dimensions to plot. |
base | integer scalar specifying the logarithm base to use. |
arrows | logical specifying whether to include arrows for PC loadings. |
point_col, arrow_col | character string specifying a non-numeric cell metadata column to color points and PC loading arrows by; valid values are |
arrow_len | non-zero single numeric specifying the length of loading vectors relative to the largest xy-coordinate in the embedded space; NULL for no re-sizing (see details). |
arrow_opa | single numeric in [0,1] specifying the opacity (alpha) of PC loading arrows when they are grouped; 0 will hide individual arrows. |
label_by | character string specifying a non-numeric sample metadata variable to label points by; valid values are |
size_by | logical specifying whether to scale point sizes by the number of cells in a given sample/cluster (for |
point_pal, arrow_pal | character string of colors to use for points and PC loading arrows. Arguments default to |
Let s be one of S samples, k one of K clusters, and p(s,k) be the proportion of cells from s in k.
The centered log-ratio (CLR) is defined as
clr(s,k) = \log p(s,k) - \frac{1}{K} \sum_{k'=1}^{K} \log p(s,k')
and analogously for clusters, replacing s by k and K by S. Thus, each sample (cluster) gives a vector of length K (S) with mean 0, and the CLRs computed across all instances can be represented as a matrix with dimensions S x K (or K x S for clusters) that we embed into a lower-dimensional space.
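To make the definition concrete, here is a minimal base-R sketch of the per-sample CLR on a toy count matrix. The counts, object names, and the choice to avoid zero counts are all assumptions made for illustration; this is not the code clrDR runs internally.

```r
# Toy cell counts: 3 samples (rows) x 4 clusters (columns); values are made up
# and contain no zeros, so that all logarithms are finite.
counts <- matrix(
    c(10, 20, 30, 40,
      25, 25, 25, 25,
       5, 15, 60, 20),
    nrow = 3, byrow = TRUE,
    dimnames = list(paste0("s", 1:3), paste0("k", 1:4)))

# p(s,k): proportion of cells from sample s that fall in cluster k
p <- prop.table(counts, margin = 1)

# CLR per sample: log-proportion minus the mean log-proportion across clusters
# (a vector of length nrow is recycled down columns, i.e. subtracted row-wise)
log_p <- log(p, base = 2)
clr <- log_p - rowMeans(log_p)

# each row (sample) is a vector of length K whose mean is 0 up to rounding
round(rowMeans(clr), 10)
```

A sample whose cells are spread evenly over all clusters (s2 here) gets a CLR of 0 in every cluster, since each log-proportion equals the row mean.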
In principle, clrDR allows any dimension reduction to be applied on the CLRs. The default method (dr = "PCA") will include the percentage of variance explained by each principal component (PC) in the axis labels.
Note that distances between points in the lower-dimensional space are meaningful only for linear DR methods (PCA and MDS), and results obtained from other methods should be interpreted with caution. Thus, the output plot's aspect ratio should be kept as is for PCA and MDS; non-linear DR methods can use aspect.ratio = 1, rendering a square plot.
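The percentage of variance explained can be reproduced by hand with stats::prcomp. The sketch below uses a made-up count matrix and hypothetical object names (clr, pct_var, axis_labs); it illustrates the idea and is not necessarily how clrDR computes or formats its axis labels.

```r
set.seed(1)

# Toy counts: 6 samples x 5 clusters (random values between 10 and 100, no zeros)
counts <- matrix(sample(10:100, 30), nrow = 6,
    dimnames = list(paste0("s", 1:6), paste0("k", 1:5)))

# per-sample CLR of the cluster proportions
log_p <- log2(prop.table(counts, margin = 1))
clr <- log_p - rowMeans(log_p)

# PCA on the S x K CLR matrix (columns centered, not scaled)
pca <- prcomp(clr, center = TRUE, scale. = FALSE)

# percentage of variance explained by each PC
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# axis labels of the kind described above
axis_labs <- sprintf("PC%d (%.1f%%)", seq_along(pct_var), pct_var)
axis_labs[1:2]
```

Because every CLR row sums to 0, the last PC carries essentially no variance; the percentages over all PCs still add up to 100.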
For dr = "PCA", PC loadings will be represented as arrows that may be interpreted as follows: 0° (180°) between vectors indicates a strong positive (negative) relation between them, while vectors that are orthogonal to each other (90°) are roughly independent.
When a vector points towards a given quadrant, the variability in proportions for the points within this quadrant is largely driven by the corresponding variable. Here, only the relative orientation of vectors to one another and to the PC axes is meaningful; in particular, the sign of loadings (i.e., whether an arrow points left or right) can be flipped when re-computing PCs.
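The angle-based reading of loading arrows can be checked numerically. The helper angle_deg and the loading values below are hypothetical and exist only for this illustration; they are not part of the package.

```r
# Hypothetical helper: angle (in degrees) between two 2D loading vectors
angle_deg <- function(u, v) {
    cos_uv <- sum(u * v) / sqrt(sum(u^2) * sum(v^2))
    # clamp to [-1, 1] to guard against rounding error before acos()
    acos(max(-1, min(1, cos_uv))) * 180 / pi
}

# Made-up loadings of four variables on PC1 and PC2
a      <- c(PC1 = 0.6,  PC2 = 0.3)
v_same <- c(PC1 = 1.2,  PC2 = 0.6)   # same direction as a
v_orth <- c(PC1 = -0.3, PC2 = 0.6)   # perpendicular to a
v_opp  <- -a                         # opposite direction to a

angle_deg(a, v_same)  # near 0: strong positive relation
angle_deg(a, v_orth)  # near 90: roughly independent
angle_deg(a, v_opp)   # near 180: strong negative relation

# flipping the sign of PC1 for every variable mirrors all arrows
# left/right but leaves the angles between them unchanged
flip <- c(-1, 1)
angle_deg(a * flip, v_orth * flip)
```

The last call shows why a sign flip between runs is harmless: the relative orientation of the arrows, which is what carries meaning, is preserved.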
When arrow_len is specified, PC loading vectors will be re-scaled to improve their visibility. Here, a value of 1 will stretch vectors such that the largest loading will touch the outermost point. Importantly, while absolute arrow lengths are not interpretable, their relative lengths are.
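One way to picture this re-scaling is a single common stretch factor applied to all loadings. The function rescale_loadings below is a hypothetical sketch under that assumption, with made-up scores and loadings; the package's internal rule may differ in detail.

```r
# Assumed re-scaling rule (an illustration, not the package's internal code):
# stretch all loadings by one common factor so that the largest absolute
# loading equals arrow_len times the largest absolute point coordinate.
rescale_loadings <- function(loadings, scores, arrow_len = 0.5) {
    if (is.null(arrow_len)) return(loadings)  # NULL: no re-sizing
    stopifnot(is.numeric(arrow_len), length(arrow_len) == 1, arrow_len != 0)
    f <- arrow_len * max(abs(scores)) / max(abs(loadings))
    loadings * f
}

# Made-up embedded points (rows) and loadings of two variables on PC1/PC2
scores <- cbind(PC1 = c(-4, 1, 3), PC2 = c(2, -1, -1))
loadings <- cbind(PC1 = c(0.8, -0.4), PC2 = c(0.2, 0.6))
rownames(loadings) <- c("k1", "k2")

# with arrow_len = 1, the largest loading reaches the outermost coordinate
scaled <- rescale_loadings(loadings, scores, arrow_len = 1)
max(abs(scaled)) == max(abs(scores))

# every loading was multiplied by the same factor,
# so relative arrow lengths are unchanged
scaled / loadings
```

Since one factor is shared by all arrows, their directions and length ratios survive the re-scaling, which is why relative lengths remain interpretable while absolute ones do not.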
a ggplot object.
Helena L Crowell helena.crowell@uzh.ch
# load the package that provides clrDR and the example data
library(CATALYST)

# construct a SingleCellExperiment from the example data & cluster it
data(PBMC_fs, PBMC_panel, PBMC_md)
sce <- prepData(PBMC_fs, PBMC_panel, PBMC_md)
sce <- cluster(sce)

# CLR on sample proportions across clusters
# (1st vs. 3rd PC; include sample labels)
clrDR(sce, by = "sample_id", k = "meta12",
    dims = c(1, 3), label_by = "sample_id")

# CLR on cluster proportions across samples
# (use custom colors for both points & loadings)
clrDR(sce, by = "cluster_id",
    point_pal = hcl.colors(10, "Spectral"),
    arrow_pal = c("royalblue", "orange"))