
sp_pheatmap

R Documentation

Generating pheatmap plot

Description

Generating pheatmap plot

Usage

sp_pheatmap(
  data,
  filename = NA,
  renameDuplicateRowNames = F,
  top_n = 1,
  statistical_value_type = mad,
  logv = NULL,
  log_add = 0,
  scale = "none",
  annotation_row = NULL,
  annotation_col = NULL,
  cluster_rows = FALSE,
  cluster_cols = FALSE,
  display_numbers = F,
  cluster_cols_variable = NULL,
  cluster_rows_variable = NULL,
  remove_cluster_cols_variable_in_annocol = FALSE,
  remove_cluster_rows_variable_in_annorow = FALSE,
  clustering_method = "complete",
  clustering_distance_rows = "pearson",
  clustering_distance_cols = "pearson",
  label_row_cluster_boundary = FALSE,
  label_col_cluster_boundary = FALSE,
  label_every_n_rowitems = 1,
  label_every_n_colitems = 1,
  breaks = NA,
  breaks_mid = NULL,
  breaks_digits = 2,
  correlation_plot = "None",
  maximum = Inf,
  minimum = -Inf,
  xtics_angle = 0,
  manual_color_vector = NULL,
  fontsize = 14,
  manual_annotation_colors_sidebar = NULL,
  cutree_cols = NA,
  cutree_rows = NA,
  anno_cutree_cols = F,
  anno_cutree_rows = F,
  kclu = NA,
  ytics = TRUE,
  xtics = TRUE,
  width = 0,
  height = 0,
  title = "",
  debug = FALSE,
  saveppt = FALSE,
  ...
)

Arguments

`data`	Data file or data frame (with a header line; the first column holds the row names; tab-separated). Column names should normally be unique unless you know what you are doing.
`filename`	Filename for output files.
`renameDuplicateRowNames`	How to handle duplicate row names. Default FALSE: duplicate row names are not allowed. Accepts TRUE: duplicate row names are made unique by appending <.1>, <.2> to the second and third occurrences.
`top_n`	An integer larger than 1 selects the top x genes (like top 5000). A float less than 1 selects the top x fraction of genes (like the top 0.7 of all genes).
`statistical_value_type`	The statistic to compute. Default mad; accepts mean, var, sum, median.
`logv`	Log-transform the data before any other analysis. Accepts an R function, log2 or log10. Default NULL.
`log_add`	A value added before the log transformation to avoid taking the log of zero. Default 0, meaning the program automatically chooses the value to add.
`scale`	Whether to scale the data for clustering and visualization. Default 'none' means no scaling; accepts 'row' or 'column' to scale by row or column.
`annotation_row`	A file or data frame specifying row annotation, with its first column the same as the first column of `data`. Default NULL.
`annotation_col`	A file or data frame specifying column annotation, with its first column the same as the first row of `data`. Default NULL.
`cluster_rows`	Hierarchical clustering for rows. Default FALSE, accepts TRUE. When there are fewer than 3 rows or more than 5000 rows, this parameter is always set to FALSE.
`cluster_cols`	Hierarchical clustering for columns. Default FALSE, accepts TRUE. When there are fewer than 3 columns or more than 5000 columns, this parameter is always set to FALSE.
`display_numbers`	logical determining if the numeric values are also printed to the cells. If this is a matrix (with same dimensions as original matrix), the contents of the matrix are shown instead of original values.
`cluster_cols_variable`	Reorder branch order of clustered columns by given variable. (Test only)
`cluster_rows_variable`	Reorder branch order of clustered rows by given variable. (Test only)
`remove_cluster_cols_variable_in_annocol`	Do not show `cluster_cols_variable` in column annotation.
`remove_cluster_rows_variable_in_annorow`	Do not show `cluster_rows_variable` in row annotation.
`clustering_method`	Clustering method. Default "complete". Accepts "ward.D", "ward.D2", "single", "average" (=UPGMA), "mcquitty" (=WPGMA), "median" (=WPGMC) or "centroid" (=UPGMC).
`clustering_distance_rows`	Clustering distance method for rows. Default 'pearson'; accepts 'spearman', 'euclidean', "manhattan", "maximum", "canberra", "binary", "minkowski", "bray", "kulczynski", "jaccard", "gower", "altGower", "morisita", "horn", "mountford", "raup", "binomial", "chao", "cao", "mahalanobis". (Some need the vegan package.)
`clustering_distance_cols`	Clustering distance method for columns. Default 'pearson'; accepts 'spearman', 'euclidean', "manhattan", "maximum", "canberra", "binary", "minkowski", "bray", "kulczynski", "jaccard", "gower", "altGower", "morisita", "horn", "mountford", "raup", "binomial", "chao", "cao", "mahalanobis". (Some need the vegan package.)
`label_row_cluster_boundary`	Only display labels at row cluster boundaries (w) (the first item at the start of each cluster).
`label_col_cluster_boundary`	Only display labels at column cluster boundaries (x) (the first item at the start of each cluster).
`label_every_n_rowitems`	Label every n row items (n>1). (Default 1 means labeling all row items. Supply a larger number when there are many rows so that only a few rows are labeled. For a data matrix with 100 rows, giving 10 here will label only 10 genes: the 1st, 11th, 21st, ... 91st.) (y)
`label_every_n_colitems`	Label every n column items (n>1) (Z). (Default 1 means labeling all column items. Supply a larger number when there are many columns so that only a few columns are labeled. For a data matrix with 100 columns, giving 10 here will label only 10 columns: the 1st, 11th, 21st, ... 91st.)
`breaks`	A sequence of numbers that covers the range of values in mat and is one element longer than the color vector. Used for mapping values to colors. Useful when certain values need to be mapped to certain colors. If the value is NA, the breaks are calculated automatically. If the value is `quantile`, the breaks are computed from the quantiles of the data.
`breaks_mid`	Mid value for generating breaks when `quantile` is assigned to `breaks`.
`breaks_digits`	Number of digits kept for breaks. Default 2.
`correlation_plot`	First compute the correlation matrix of given `data`, then heatmap correlation data instead of raw data. Default "None", accept "row" or "col" for row correlation or column correlation.
`maximum`	The largest value to keep; any number larger than this is set to the given maximum. Default Inf. Optional.
`minimum`	The smallest value to keep; any number smaller than this is set to the given minimum. Default -Inf. Optional.
`xtics_angle`	Rotation angle for x-axis value. Default 0.
`manual_color_vector`	Manually set colors for each geom. Default NULL, meaning the ggplot2 default is used. Accepts colors like c('red', 'blue', '#6181BD') (the number of colors does not matter) or an RColorBrewer color set like "BrBG" "PiYG" "PRGn" "PuOr" "RdBu" "RdGy" "RdYlBu" "RdYlGn" "Spectral" "Accent" "Dark2" "Paired" "Pastel1" "Pastel2" "Set1" "Set2" "Set3" "Blues" "BuGn" "BuPu" "GnBu" "Greens" "Greys" "Oranges" "OrRd" "PuBu" "PuBuGn" "PuRd" "Purples" "RdPu" "Reds" "YlGn" "YlGnBu" "YlOrBr" "YlOrRd" (check http://www.sthda.com/english/wiki/colors-in-r for more).
`fontsize`	Font size. Default 14.
`manual_annotation_colors_sidebar`	Annotation colors. One can only specify colors for each column of the row annotation or column annotation. For example, 'class' (two values: C1, C2) and 'group' (two values: G1, G2) are two row annotations; 'type' (three values: T1, T2, T3) and 'size' (four values: 1, 2, 3, 4) are two column annotations. Colors can be specified in a string as `'class=c(C1="blue", C2="yellow"), size=c("white", "green"), type=c(T1="pink", T2="black", T3="cyan")'` or a list as `list(class=c(C1="blue", C2="yellow"),size=c("white", "green"))`. In R, the colors() function returns the names of all available colors.
`cutree_cols`	similar to `cutree_rows`, but for columns
`cutree_rows`	number of clusters the rows are divided into, based on the hierarchical clustering (using cutree), if rows are not clustered, the argument is ignored
`anno_cutree_cols`	Add column tree-cut results as column annotation.
`anno_cutree_rows`	Add row tree-cut results as row annotation.
`kclu`	Aggregate the rows using kmeans clustering. This is advisable if the number of rows is so large that R can no longer handle their hierarchical clustering, roughly more than 1000. Instead of showing all the rows separately, one can cluster the rows in advance and show only the cluster centers. The number of clusters can be tuned here. Default NA, which means no kmeans clustering; any positive integer runs kmeans clustering and is used as the number of expected clusters.
`ytics`	Display ytics.
`xtics`	Display xtics.
`width`	Picture width
`height`	Picture height
`title`	Title of picture. Default empty title
`saveppt`	Whether to also output in PPT format. Default FALSE (no PPT output). Accepts TRUE to output a PPT file.
`...`	Other parameters given to pheatmap.
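
The interplay of `top_n` and `statistical_value_type` can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `select_top_rows` is a hypothetical helper, it assumes the statistic is computed per row and used to rank rows, and it assumes that the default `top_n = 1` keeps every row.

```r
# Hypothetical helper (not part of ImageGP): keep the rows with the
# largest per-row statistic, mimicking the documented meaning of top_n.
select_top_rows <- function(mat, top_n = 1, stat_fun = mad) {
  n_rows <- nrow(mat)
  # Assumption: exactly 1 means "keep all rows".
  if (top_n == 1) return(mat)
  # Values below 1 are treated as a fraction, values above 1 as a count.
  keep <- if (top_n < 1) ceiling(n_rows * top_n) else min(floor(top_n), n_rows)
  score <- apply(mat, 1, stat_fun)
  mat[order(score, decreasing = TRUE)[seq_len(keep)], , drop = FALSE]
}

set.seed(1)
expr <- matrix(rnorm(40), nrow = 10,
               dimnames = list(paste0("gene", 1:10), paste0("s", 1:4)))

top3 <- select_top_rows(expr, top_n = 3)        # three most variable rows by MAD
top_half <- select_top_rows(expr, top_n = 0.5)  # top 50% of rows by MAD
```

Passing `stat_fun = var` or `stat_fun = mean` would correspond to the other accepted values of `statistical_value_type`.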

Value

Generates a PDF and a TXT file.

Examples

a = c(12,14,17,11,16)
b = c(4,20,15,11,9)
c = c(5,7,19,8,18)
d = c(15,13,11,17,16)
e = c(12,19,16,7,9)
pheatmap_data = as.data.frame(cbind(a,b,c,d,e))
sp_pheatmap(data = pheatmap_data)
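
To make the clustering arguments more concrete, the following base R sketch shows one common way a correlation-based distance, a linkage method, and a tree cut fit together. It assumes that the 'pearson' distance is computed as one minus the Pearson correlation between rows; the package may compute it differently. The names `mat`, `row_dist`, `row_tree` and `row_groups` are illustrative only.

```r
set.seed(42)
mat <- matrix(rnorm(60), nrow = 6,
              dimnames = list(paste0("gene", 1:6), paste0("s", 1:10)))

# Assumed definition of the 'pearson' distance: 1 - correlation between rows.
row_dist <- as.dist(1 - cor(t(mat), method = "pearson"))

# "complete" corresponds to the default of clustering_method.
row_tree <- hclust(row_dist, method = "complete")

# Cutting the tree into 2 groups is what cutree_rows = 2 asks for.
row_groups <- cutree(row_tree, k = 2)
```

The resulting `row_groups` vector assigns each row to a cluster, which is the kind of information `anno_cutree_rows` adds as a row annotation.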

## Not run:
pheatmap_data = "pheatmap.data"
sp_pheatmap(data = pheatmap_data)
## End(Not run)
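
The effect of `maximum`, `minimum` and `label_every_n_rowitems` can be approximated in a few lines of base R. This is a sketch of the documented behaviour, not code taken from the package; `clamp_values` and `label_every_n` are hypothetical helpers introduced here for illustration.

```r
# Hypothetical helper: values beyond the limits are replaced by the limits.
clamp_values <- function(mat, minimum = -Inf, maximum = Inf) {
  mat[mat > maximum] <- maximum
  mat[mat < minimum] <- minimum
  mat
}

# Hypothetical helper: keep the 1st, (n+1)th, (2n+1)th, ... label and
# blank out the rest; n = 1 keeps every label.
label_every_n <- function(labels, n = 1) {
  keep <- n == 1 | seq_along(labels) %% n == 1
  ifelse(keep, labels, "")
}

clamped <- clamp_values(matrix(c(-5, 0, 3, 12), nrow = 2),
                        minimum = -2, maximum = 10)

sparse_labels <- label_every_n(paste0("gene", 1:100), n = 10)
```

With 100 labels and `n = 10`, only positions 1, 11, 21 and so on up to 91 keep their text.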

Tong-Chen/ImageGP documentation built on April 14, 2025, 12:54 p.m.