Plot a Gene Set Trends Heatmap for each Patient.
Description
This function plots a series of gene sets dynamic trends heatmaps. One heatmap is drawned for each patient. NOT IMPLEMENTED YET (TODO)
Usage
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15  plotPat.TcGSA(x, threshold = 0.05, myproc = "BY", nbsimu_pval = 1e+06,
expr, Subject_ID, TimePoint, baseline = NULL, only.signif = TRUE,
group.var = NULL, Group_ID_paired = NULL, ref = NULL,
group_of_interest = NULL, FUNcluster = NULL,
clustering_metric = "euclidian", clustering_method = "ward", B = 500,
max_trends = 4, aggreg.fun = "median", methodOptiClust = "firstSEmax",
verbose = TRUE, clust_trends = NULL, N_clusters = NULL,
myclusters = NULL, label.clusters = NULL, prev_rowCL = NULL,
descript = TRUE, plotAll = TRUE, color.vec = c("darkred", "#D73027",
"#FC8D59", "snow", "#91BFDB", "#4575B4", "darkblue"), legend.breaks = NULL,
label.column = NULL, time_unit = "", cex.label.row = 1,
cex.label.column = 1, margins = c(5, 25), heatKey.size = 1,
dendrogram.size = 1, heatmap.height = 1, heatmap.width = 1,
cex.clusterKey = 1, cex.main = 1, horiz.clusterKey = TRUE,
main = NULL, subtitle = NULL, ...)

Arguments
x 
a tcgsa object. 
threshold 
the threshold at which the FDR or the FWER should be controlled. 
myproc 
a vector of character strings containing the names of the
multiple testing procedures for which adjusted pvalues are to be computed.
This vector should include any of the following: " 
nbsimu_pval 
the number of observations under the null distribution to
be generated in order to compute the pvalues. Default is 
expr 
either a matrix or dataframe of gene expression upon which
dynamics are to be calculated, or a list of gene sets estimation of gene
expression. In the case of a matrix or dataframe, its dimension are n
x p, with the p sample in column and the n genes in row.
In the case of a list, its length should correspond to the number of gene
sets under scrutiny and each element should be an 3 dimension array of
estimated gene expression, such as for the list returned in the

Subject_ID 
a factor of length p that is in the same order as the
columns of 
TimePoint 
a numeric vector or a factor of length p that is in
the same order as 
baseline 
a character string which is the value of 
only.signif 
logical flag for plotting only the significant gene sets.
If 
group.var 
in the case of several treatment groups, this is a factor of
length p that is in the same order as 
Group_ID_paired 
a character vector of length p that is in the
same order as 
ref 
the group which is used as reference in the case of several
treatment groups. Default is 
group_of_interest 
the group of interest, for which dynamics are to be
computed in the case of several treatment groups. Default is 
FUNcluster 
the clustering function used to agglomerate genes in
trends. Default is 
clustering_metric 
character string specifying the metric to be used
for calculating dissimilarities between observations in the hierarchical
clustering when 
clustering_method 
character string defining the agglomerative method
to be used in the hierarchical clustering when 
B 
integer specifying the number of Monte Carlo ("bootstrap") samples
used to compute the gap statistics. Default is 
max_trends 
integer specifying the maximum number of different clusters
to be tested. Default is 
aggreg.fun 
a character string such as 
methodOptiClust 
character string indicating how the "optimal"" number
of clusters is computed from the gap statistics and their standard
deviations. Possible values are 
verbose 
logical flag enabling verbose messages to track the computing
status of the function. Default is 
clust_trends 
object of class ClusteredTrends containing
already computed trends for the plotted gene sets. Default is 
N_clusters 
an integer that is the number of clusters in which the
dynamics should be regrouped. The cutoff of the clustering tree is
automatically calculated accordingly. Default is 
myclusters 
a character vector of colors for predefined clusters of the
represented genesets, with as many levels as the value of 
label.clusters 
if 
prev_rowCL 
a hclust object, such as the one return by the
present plotting funstion (see Value) for instance. If not 
descript 
logical flag indicating that the description of the gene sets
should appear after their name on the right side of the plot if 
plotAll 
logical flag indicating wether a first heatmap with the median
over all the patients should be plotted, or not. Default is 
color.vec 
a character strings vector used to define the color
palette used in the plot. Default is

legend.breaks 
a numeric vector indicating the splitting points for
coloring. Default is 
label.column 
a vector of character strings with the labels to be
displayed for the columns (i.e. the time points). Default is 
time_unit 
the time unit to be displayed (such as 
cex.label.row 
a numerical value giving the amount by which row labels
text should be magnified relative to the default 
cex.label.column 
a numerical value giving the amount by which column
labels text should be magnified relative to the default 
margins 
numeric vector of length 2 containing the margins (see

heatKey.size 
the size of the color key for the heatmap fill. Default
is 
dendrogram.size 
the horizontal size of the dendrogram. Default is

heatmap.height 
the height of the heatmap. Default is 
heatmap.width 
the width of the heatmap. Default is 
cex.clusterKey 
a numerical value giving the amount by which the
clusters legend text should be magnified relative to the default 
cex.main 
a numerical value giving the amount by which title text
should be magnified relative to the default 
horiz.clusterKey 
a logical flag; if 
main 
a character string for an optionnal title. Default is

subtitle 
a character string for an optionnal subtitle. Default is

... 
other parameters to be passed through to plotting functions. 
Details
On the heatmap, each line corresponds to a gene set, and each column to a timepoint.
First a heatmap is computed on all the patients (see plot.TcGSA
and clustTrend
) to define the clustering. Then, the clustering
and coloring thus defined on all the patients are consistently used in the
separate heatmaps that are plotted by patient.
If expr
is a matrix or a dataframe, then the "original" data are
plotted. On the other hand, if expr
is a list returned in the
'Estimations'
element of TcGSA.LR
, then it is those
"estimations" made by the TcGSA.LR
function that are plotted.
If descript
is FALSE
, the second element of margins
can
be reduced (for instance use margins = c(5, 10)
), as there is not so
much need for space in order to display only the gene set names, without
their description.
The median shown in the heatmap uses the respectively standardized (reduced and centered) expression of the genes over the patients.
Value
An object of class hclust which describes the tree produced by the clustering process. The object is a list with components:
merge an n1 by 2 matrix. Row i of
merge
describes the merging of clusters at step i of the clustering. If an element j in the row is negative, then observation j was merged at this stage. If j is positive then the merge was with the cluster formed at the (earlier) stage j of the algorithm. Thus negative entries in merge indicate agglomerations of singletons, and positive entries indicate agglomerations of nonsingletons.height a set of n1 real values (nondecreasing for ultrametric trees). The clustering height: that is, the value of the criterion associated with the Ward clustering method.
order a vector giving the permutation of the original observations suitable for plotting, in the sense that a cluster plot using this ordering and matrix merge will not have crossings of the branches.
labels the gene sets name.
call the call which produced the result clustering:
hclust(d = dist(map2heat, method = "euclidean"), method = "ward.D2")
method "ward.D2", as it is the clustering method that has been used for clustering the gene set trends.
dist.method "euclidean", as it is the distance that has been used for clustering the gene set trends.
legend.breaks a numeric vector giving the splitting points used for coloring the heatmap. If
plot
isFALSE
, then it isNULL
.myclusters a character vector of colors for clusters of the represented genesets, with as many levels as the value of
N_clusters
. If no clusters were represented, than this isNULL
.ddr a dendrogram object with the reordering used for the heatmap. See
heatmap.2
.clustersExport a data frame with 2 variables containing the two following variables :

GeneSet
: the gene sets clustered. 
Cluster
: the cluster they belong to.
The data frame is order by the variable
Cluster
.
Author(s)
Boris P. Hejblum
References
Hejblum BP, Skinner J, Thiebaut R, (2015) TimeCourse Gene Set Analysis for Longitudinal Gene Expression Data. PLoS Computat Biol 11(6): e1004310. doi: 10.1371/journal.pcbi.1004310
See Also
plot.TcGSA
, heatmap.2
,
TcGSA.LR
, hclust
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19  data(data_simu_TcGSA)
tcgsa_sim_1grp < TcGSA.LR(expr=expr_1grp, gmt=gmt_sim, design=design,
subject_name="Patient_ID", time_name="TimePoint",
time_func="linear", crossedRandom=FALSE)
plotPat.TcGSA(x=tcgsa_sim_1grp, expr=expr_1grp,
Subject_ID=design$Patient_ID, TimePoint=design$TimePoint,
B=100,
time_unit="H"
)
plotPat.TcGSA(x=tcgsa_sim_1grp, expr=tcgsa_sim_1grp$Estimations,
Subject_ID=design$Patient_ID, TimePoint=design$TimePoint,
baseline=1,
B=100,
time_unit="H"
)
