get_signatures-ConsensusPartition-method | R Documentation |
Get signature rows
## S4 method for signature 'ConsensusPartition'
get_signatures(object, k,
col = if(scale_rows) c("green", "white", "red") else c("blue", "white", "red"),
silhouette_cutoff = 0.5,
fdr_cutoff = cola_opt$fdr_cutoff,
top_signatures = NULL,
group_diff = cola_opt$group_diff,
scale_rows = object@scale_rows, .scale_mean = NULL, .scale_sd = NULL,
row_km = NULL,
diff_method = c("Ftest", "ttest", "samr", "pamr", "one_vs_others", "uniquely_high_in_one_group"),
anno = get_anno(object),
anno_col = get_anno_col(object),
internal = FALSE,
show_row_dend = FALSE,
show_column_names = FALSE,
column_names_gp = gpar(fontsize = 8),
use_raster = TRUE,
plot = TRUE, verbose = TRUE, seed = 888,
left_annotation = NULL, right_annotation = NULL,
simplify = FALSE, prefix = "", enforce = FALSE, hash = NULL, from_hc = FALSE,
...)
object |
A ConsensusPartition-class object. |
k |
Number of subgroups. |
col |
Colors for the main heatmap. |
silhouette_cutoff |
Cutoff for silhouette scores. Samples with silhouette scores below this cutoff are not used for finding signature rows. For guidance on selecting a silhouette cutoff, see https://www.stat.berkeley.edu/~s133/Cluster2a.html#tth_tAb1. |
fdr_cutoff |
Cutoff for FDR of the difference test between subgroups. |
top_signatures |
The top signatures with the most significant FDRs. Because multiple rows can have the same FDR, the final number of signatures might not exactly match the number set here. |
group_diff |
Cutoff for the maximal difference between group means. |
scale_rows |
Whether to apply row scaling when making the heatmap. |
.scale_mean |
Internally used. |
.scale_sd |
Internally used. |
row_km |
Number of groups for performing k-means clustering on rows. By default it is automatically selected. |
diff_method |
Methods to get rows which are significantly different between subgroups, see 'Details' section. |
anno |
A data frame of annotations for the original matrix columns. By default it uses the annotations specified in |
anno_col |
A list of colors (color is defined as a named vector) for the annotations. If |
internal |
Used internally. |
show_row_dend |
Whether to show the row dendrogram. |
show_column_names |
Whether to show column names in the heatmap. |
column_names_gp |
Graphics parameters for column names. |
use_raster |
Internally used. |
plot |
Whether to make the plot. |
verbose |
Whether to print messages. |
seed |
Random seed. |
left_annotation |
Annotation put on the left of the heatmap. It should be a |
right_annotation |
Annotation put on the right of the heatmap. Same format as left_annotation. |
simplify |
Only used internally. |
prefix |
Only used internally. |
enforce |
The analysis is cached by default, so that an analysis with the same input is retrieved automatically without being rerun. Set enforce to TRUE to rerun the analysis. |
hash |
Used internally. |
from_hc |
Is the |
... |
Other arguments. |
The function applies a statistical test to every row for the difference between subgroups. The following methods test the significance of the difference:
ttest
First, it finds the subgroup with the highest mean value, compares it to each of the other subgroups with a t-test, and takes the maximum p-value. Second, it finds the subgroup with the lowest mean value, compares it to each of the other subgroups with a t-test, and again takes the maximum p-value. The smaller of these two p-values is the final p-value.
samr/pamr
Use the SAM (from the samr package) or PAM (from the pamr package) method to find rows that differ significantly between subgroups.
Ftest
Use the F-test to find rows that differ significantly between subgroups.
one_vs_others
For each subgroup i in each row, a t-test compares the samples in that subgroup to all other samples, giving a p-value p_i. The p-value for the row is min(p_i).
uniquely_high_in_one_group
A row is a signature uniquely up-regulated in subgroup A if it meets the following criteria: 1. in a two-group t-test of A ~ other_merged_groups, the statistic is > 0 (high in group A) and the p-value is significant, and 2. among the other groups (excluding A), the t-test for every pair of groups is not significant.
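The one_vs_others rule described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper name one_vs_others_p is hypothetical, the data are simulated, R's default Welch t-test is assumed, and the Benjamini-Hochberg adjustment is an assumption about how p-values become FDRs.

```r
# Hypothetical helper (not part of cola): one-vs-others p-value for each row.
one_vs_others_p = function(mat, class) {
    class = as.character(class)
    groups = unique(class)
    apply(mat, 1, function(x) {
        # p_i: t-test of subgroup i against all remaining samples
        p = sapply(groups, function(g) {
            t.test(x[class == g], x[class != g])$p.value
        })
        # the row-level p-value is the smallest p_i
        min(p)
    })
}

set.seed(123)
# simulated data: 20 rows, 3 subgroups of 5 samples each
mat = matrix(rnorm(20 * 15), nrow = 20)
class = rep(c("1", "2", "3"), each = 5)
# shift the first row upwards in subgroup "1" so it is a clear signature
mat[1, class == "1"] = mat[1, class == "1"] + 5

p = one_vs_others_p(mat, class)
# assumption: Benjamini-Hochberg adjustment to obtain FDRs
fdr = p.adjust(p, method = "BH")
```

In this toy setting the shifted first row should receive the smallest p-value, while the remaining rows are pure noise.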
diff_method can also be a user-defined function. The function needs two arguments: the matrix for the analysis and the predicted classes. It should return a vector of FDR values from the difference test.
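As a rough sketch of such a user-defined function, the example below runs a Kruskal-Wallis test on each row and returns BH-adjusted values. The function name kw_fdr, the choice of test, and the simulated data are all assumptions made for illustration; only the two-argument contract (matrix, predicted classes) and the FDR vector return value come from the description above.

```r
# Hypothetical user-defined diff_method: Kruskal-Wallis test per row.
# Arguments: the matrix for the analysis and the predicted classes.
kw_fdr = function(mat, class) {
    p = apply(mat, 1, function(x) {
        kruskal.test(x, factor(class))$p.value
    })
    # return one FDR value per row of the matrix
    p.adjust(p, method = "BH")
}

set.seed(123)
# simulated data: 10 rows, 3 classes of 4 samples each
mat = matrix(rnorm(10 * 12), nrow = 10)
class = rep(c(1, 2, 3), each = 4)
fdr = kw_fdr(mat, class)

# With a ConsensusPartition object `res`, the function would be passed as:
# get_signatures(res, k = 3, diff_method = kw_fdr)
```

The function can be checked on a small matrix like this one before it is used in a real analysis.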
A data frame with more than two columns:
which_row: row index corresponding to the original matrix.
fdr: the FDR.
km: the k-means groups if row_km is set.
other columns: the mean value (scaled or unscaled, depending on whether rows are scaled) in each subgroup.
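Because which_row indexes rows of the original matrix, the returned data frame can be used to subset that matrix directly. The sketch below uses a mock matrix and a mock data frame containing only the documented which_row and fdr columns; the values are invented for illustration and are not output from a real analysis.

```r
set.seed(123)
# mock "original matrix": 50 rows, 6 samples
mat = matrix(rnorm(50 * 6), nrow = 50)

# mock result holding only the documented `which_row` and `fdr` columns
tb = data.frame(which_row = c(3, 17, 42), fdr = c(0.001, 0.02, 0.04))

# rows of the original matrix that correspond to the signatures
sig_mat = mat[tb$which_row, , drop = FALSE]

# keep only signatures below a stricter FDR threshold
strong = tb[tb$fdr < 0.01, ]
```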
Zuguang Gu <z.gu@dkfz.de>
data(golub_cola)
res = golub_cola["ATC", "skmeans"]
tb = get_signatures(res, k = 3)
head(tb)
get_signatures(res, k = 3, top_signatures = 100)