plot: Visualize results from coseq clustering


Description

Plot a coseqResults object.

Usage

plot(x, ...)

## S4 method for signature 'coseqResults'
plot(
  x,
  y_profiles = NULL,
  K = NULL,
  threshold = 0.8,
  conds = NULL,
  average_over_conds = FALSE,
  collapse_reps = "none",
  graphs = c("logLike", "ICL", "profiles", "boxplots", "probapost_boxplots",
    "probapost_barplots", "probapost_histogram"),
  order = FALSE,
  profiles_order = NULL,
  n_row = NULL,
  n_col = NULL,
  add_lines = TRUE,
  ...
)

coseqGlobalPlots(object, graphs = c("logLike", "ICL"), ...)

coseqModelPlots(
  probaPost,
  y_profiles,
  K = NULL,
  threshold = 0.8,
  conds = NULL,
  collapse_reps = "none",
  graphs = c("profiles", "boxplots", "probapost_boxplots", "probapost_barplots",
    "probapost_histogram"),
  order = FALSE,
  profiles_order = NULL,
  n_row = NULL,
  n_col = NULL,
  add_lines = TRUE,
  ...
)

Arguments

x

An object of class "coseqResults"

...

Additional optional plotting arguments (e.g., xlab, ylab, use_sample_names, facet_labels)

y_profiles

(n x q) matrix of observed profiles for n observations and q variables, used for graphing results. Optional for the logLike, ICL, probapost_boxplots, and probapost_barplots graphs; if NULL, defaults to x$tcounts

K

If desired, the specific model to use for plotting (or the specific cluster number(s) to use for plotting in the case of coseqModelPlots). If NULL, all clusters will be visualized, and the model chosen by ICL will be plotted

threshold

Threshold used for maximum conditional probability; only observations with maximum conditional probability greater than this threshold are visualized
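
As a rough illustration of the threshold rule described above, the following base R sketch filters observations by their maximum conditional probability. The matrix proba_post and its values are made up for this example, and the code is not taken from the coseq source; it only mirrors the rule as documented.

```r
# Toy matrix of conditional probabilities: 5 observations (rows) x 3 clusters (columns).
# All values are hypothetical and chosen for illustration only.
proba_post <- matrix(
  c(0.95, 0.03, 0.02,
    0.40, 0.35, 0.25,
    0.10, 0.85, 0.05,
    0.05, 0.15, 0.80,
    0.01, 0.01, 0.98),
  ncol = 3, byrow = TRUE
)
threshold <- 0.8

# Maximum conditional probability and most probable cluster for each observation
max_proba <- apply(proba_post, 1, max)
label <- apply(proba_post, 1, which.max)

# Keep only observations whose maximum probability is strictly greater than the threshold
keep <- max_proba > threshold

# Number of retained observations per cluster
kept_per_cluster <- table(factor(label[keep], levels = seq_len(ncol(proba_post))))
```

Note that an observation whose maximum probability equals the threshold exactly is dropped here, because the documented comparison is "greater than".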

conds

Condition labels, if desired

average_over_conds

If TRUE, average values of y_profiles within each condition identified by conds for the profiles and boxplots plots. This argument is redundant with collapse_reps = "sum"; collapse_reps should be used instead.

collapse_reps

If "none", display all replicates. If "sum", collapse replicates within each condition by summing their profiles. If "average", collapse replicates within each condition by averaging their profiles. For highly unbalanced experimental designs, using "average" will likely provide more easily interpretable plots.
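
To make the difference between "sum" and "average" concrete, here is a minimal base R sketch on a small unbalanced design. The matrix y, the labels in conds, and the helper variable names are assumptions made for this example; the code does not reproduce coseq's internal implementation.

```r
# Toy profiles: 3 observations x 5 samples, with an unbalanced design
# (3 replicates of condition "A", 2 of condition "B"). Values are hypothetical.
y <- matrix(c(1, 2, 3, 10, 20,
              2, 2, 2,  5,  7,
              0, 3, 6,  4,  4),
            nrow = 3, byrow = TRUE)
conds <- c("A", "A", "A", "B", "B")

# "sum": add up replicate columns within each condition.
# rowsum() groups rows, so transpose before and after.
collapsed_sum <- t(rowsum(t(y), group = conds))

# "average": divide each condition's sum by its number of replicates
n_reps <- as.vector(table(conds)[colnames(collapsed_sum)])
collapsed_avg <- sweep(collapsed_sum, 2, n_reps, "/")
```

With summing, condition "A" contributes three replicates and "B" only two, so the collapsed values are not on a comparable scale across conditions. Averaging removes that dependence on replicate count, which is why it tends to be easier to read for unbalanced designs.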

graphs

Graphs to be produced, one (or more) of the following: "logLike" (log-likelihood plotted versus number of clusters), "ICL" (ICL plotted versus number of clusters), "profiles" (line plots of profiles in each cluster), "boxplots" (boxplots of profiles in each cluster), "probapost_boxplots" (boxplots of maximum conditional probabilities per cluster), "probapost_barplots" (number of observations with a maximum conditional probability greater than threshold per cluster), "probapost_histogram" (histogram of maximum conditional probabilities over all clusters)

order

If TRUE, order clusters in probapost_boxplots by median and in probapost_barplots by number of observations with maximum conditional probability greater than threshold

profiles_order

If NULL or FALSE, line plots and boxplots of profiles are plotted sequentially by cluster number (K=1, K=2, ...). If TRUE, line plots and boxplots of profiles are plotted in an automatically calculated order (according to the Euclidean distance between cluster means) to plot clusters with similar mean profiles next to one another. Otherwise, the user may provide a vector (of length equal to the number of clusters in the given model) providing the desired order of plots.
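
One plausible way to obtain an ordering that places similar mean profiles next to one another is hierarchical clustering on the Euclidean distances between cluster means, sketched below in base R. This is an assumption for illustration; the exact procedure coseq uses when profiles_order = TRUE may differ. The matrix cluster_means and its values are hypothetical.

```r
# Hypothetical mean profiles for 4 clusters over 3 conditions
cluster_means <- rbind(
  c(0.10, 0.20, 0.70),
  c(0.60, 0.30, 0.10),
  c(0.15, 0.20, 0.65),
  c(0.50, 0.40, 0.10)
)
rownames(cluster_means) <- paste0("K=", 1:4)

# Pairwise Euclidean distances between cluster means
d <- dist(cluster_means, method = "euclidean")

# The leaf order of the dendrogram places similar clusters side by side
ord <- hclust(d)$order
```

A vector such as ord has one entry per cluster, which is the form the argument description requires for a user-supplied order.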

n_row

Number of rows for plotting layout of line plots and boxplots of profiles. Note that if n_row x n_col is less than the total number of clusters plotted, plots will be divided over multiple pages.

n_col

Number of columns for plotting layout of line plots and boxplots of profiles. Note that if n_row x n_col is less than the total number of clusters plotted, plots will be divided over multiple pages.
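
The paging behaviour described for n_row and n_col comes down to simple arithmetic, sketched here in base R. The variable names are hypothetical, and the assumption that clusters fill pages in plotting order is made for illustration only.

```r
n_clusters <- 7   # total number of clusters to plot (hypothetical)
n_row <- 2
n_col <- 2

# Each page holds n_row x n_col panels
panels_per_page <- n_row * n_col

# Number of pages needed when the layout cannot hold all clusters at once
n_pages <- ceiling(n_clusters / panels_per_page)

# Assumed assignment of clusters to pages, filling pages in order
page_of_cluster <- split(seq_len(n_clusters),
                         ceiling(seq_len(n_clusters) / panels_per_page))
```

Here a 2 x 2 layout holds four panels, so seven clusters spill onto a second page.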

add_lines

If TRUE, add red lines representing means to boxplots; if FALSE, these will be suppressed.

object

An object of class "RangedSummarizedExperiment" arising from a call to NormMixClus

probaPost

Matrix or data.frame of dimension (n x K) containing the conditional probabilities of cluster membership for n genes in K clusters arising from a mixture model

Value

Named list of plots of the coseqResults object.

Author(s)

Andrea Rau, Cathy Maugis-Rabusseau

Examples

library(coseq)

## Simulate toy data, n = 300 observations
set.seed(12345)
countmat <- matrix(runif(300*4, min=0, max=500), nrow=300, ncol=4)
countmat <- countmat[which(rowSums(countmat) > 0),]
## One condition label per column of countmat (4 columns: 2 conditions x 2 replicates)
conds <- rep(c("A","B"), each=2)

## Run the Normal mixture model for K = 2,3,4
run_arcsin <- coseq(object=countmat, K=2:4, iter=5, transformation="arcsin",
                    model="Normal", seed=12345)
run_arcsin

## Plot and summarize results
plot(run_arcsin)
summary(run_arcsin)

## Compare ARI values for all models (no plot generated here)
ARI <- compareARI(run_arcsin, plot=FALSE)

## Compare ICL values for models with arcsin and logit transformations
run_logit <- coseq(object=countmat, K=2:4, iter=5, transformation="logit",
                   model="Normal")
compareICL(list(run_arcsin, run_logit))

## Use accessor functions to explore results
clusters(run_arcsin)
likelihood(run_arcsin)
nbCluster(run_arcsin)
ICL(run_arcsin)

## Examine transformed counts and profiles used for graphing
tcounts(run_arcsin)
profiles(run_arcsin)

## Run the K-means algorithm for logclr profiles for K = 2,..., 20
run_kmeans <- coseq(object=countmat, K=2:20, transformation="logclr",
                    model="kmeans")
run_kmeans

coseq documentation built on Nov. 8, 2020, 5:18 p.m.