plot_corr: Plots to explore the correlations between features
In brightchan/cjbmisc:

Plot the correlations between one variable x and all the remaining variables. The workflow is: 1. remove small groups if `min.group.size.x` or `min.group.size.y` is defined; 2. calculate the p-values for all pairs of variables; 3. select the pairs that pass the p-value threshold for plotting (p-values come from non-parametric tests by default, and adjusting them with p.adjust is optional); 4. calculate the adjusted p-values and save the plots and p-value tables.
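The four steps above can be sketched in base R for a categorical x. This is only an illustration of the idea, not the cjbmisc source: the toy data, the helper `drop_small_groups` and the choice of "bonferroni" for step 4 are assumptions made for the example.

```r
# Illustrative sketch only; this is not the cjbmisc implementation.
set.seed(999)
df <- data.frame(
  group = factor(rep(c("a", "b", "c", "d"), times = c(10, 10, 10, 2))),
  y1 = c(rnorm(10, 0), rnorm(10, 3), rnorm(10, 6), rnorm(2, 0)),
  y2 = rnorm(32)
)

# Hypothetical helper: drop rows whose group has fewer than `min.size` members.
drop_small_groups <- function(d, col, min.size) {
  counts <- table(d[[col]])
  keep <- names(counts)[counts >= min.size]
  d <- d[d[[col]] %in% keep, , drop = FALSE]
  d[[col]] <- droplevels(d[[col]])
  d
}

# Step 1: remove small groups of the categorical x variable (group "d" here).
df_f <- drop_small_groups(df, "group", min.size = 3)

# Step 2: one non-parametric p-value per y variable.
y_cols <- c("y1", "y2")
pvalues <- sapply(y_cols, function(y) kruskal.test(df_f[[y]], df_f$group)$p.value)

# Step 3: keep the variables that pass the significance cutoff.
signif_cols <- names(pvalues)[pvalues < 0.05]

# Step 4: adjusted p-values (the method is an assumption for this sketch).
padj <- p.adjust(pvalues, method = "bonferroni")
```

In the package itself, steps 3 and 4 also drive which plots are drawn and which tables are written to `outpdir`; the sketch stops at the numbers.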

plot_corr_one(
  plotdf,
  x.coln,
  y.coln,
  cat.num.test = "kruskal.test",
  num.num.test = "spearman",
  plot.it = F,
  plot.nrow = NULL,
  plot.ncol = NULL,
  signif.cutoff = 0.05,
  plot.stattest = "np",
  p.adj.method = NULL,
  plot.signif.only = F,
  plot.max = 40,
  min.group.size.x = 3,
  min.group.size.y = 3,
  seed = 999,
  outpdir = NULL,
  plot.w = 7,
  plot.h = 7.5,
  fn.suffix = "",
  ...
)

plot_corr(
  plotdf,
  x.coln,
  y.coln = NULL,
  cat.num.test = "kruskal.test",
  cat.cat.test = "both",
  num.num.test = "spearman",
  plot.it = F,
  plot.nrow = NULL,
  plot.ncol = NULL,
  signif.cutoff = 0.05,
  plot.stattest = "np",
  plot.signif.only = F,
  p.adj.method.each = NULL,
  p.adj.method.all = "bonferroni",
  plot.max = 50,
  seed = 999,
  outpdir = NULL,
  plot.w = 7,
  plot.h = 7.5,
  min.group.size.x = 3,
  min.group.size.y = 3,
  fn.suffix = "",
  ...
)

`plotdf`	dataframe with rows of samples and columns of features.
`x.coln`	column name of the x axis of the plot
`y.coln`	character vector of the column names of features to be plotted as y axis
`cat.num.test`	the significance test to be used for categorical vs numerical variables. Give the name of a base R test function (Default "kruskal.test").
`num.num.test`	the significance test to be used for numerical vs numerical variables. Should be "spearman" (Default), "pearson", "kendall", or "lm" (uses the p-value of the independent variable in lm).
`plot.it`	Whether to plot it out (T/F)
`plot.nrow, plot.ncol`	The number of rows and columns in the combined plot
`plot.stattest`	Passed to the "type" parameter of `ggbarstats`, `ggbetweenstats` and `ggscatterstats`, defining the statistical test to be used. Default "np" is non-parametric.
`plot.signif.only`	whether to plot only the significant items
`plot.max`	maximum number of plots to draw. If set to NULL, all are plotted.
`min.group.size.x`	for categorical x, remove groups that are smaller than this number
`min.group.size.y`	for categorical y, remove groups that are smaller than this number
`outpdir`	If not NULL, save all plots and p-values (as a table) to this directory
`plot.w, plot.h`	Width and height of each individual plot
`fn.suffix`	filename suffix
`...`	passed to `ggbarstats`, `ggbetweenstats`, `ggscatterstats`
`cat.cat.test`	the significance test to be used for categorical vs categorical variables. Should be "fisher", "chi" or "both" (Default)
`p.adj.method, p.adj.method.each, p.adj.method.all`	p-value adjustment method. Should follow `ggbetweenstats`. In plot_corr_one, if p.adj.method is specified, the p-values are adjusted and only those that pass the adjusted threshold are plotted. In plot_corr, p.adj.method.each is passed to plot_corr_one, while p.adj.method.all is used to generate the final padj table, adjusting across all p-values.
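As a rough illustration of how `cat.num.test`, `num.num.test` and `cat.cat.test` map onto base R tests, the sketch below picks a test from the types of the two columns. `pair_pvalue` is a hypothetical helper written for this example and is not part of cjbmisc; it handles "fisher" and "chi" but leaves out the "both" option, and it assumes the categorical vs numerical test accepts a formula, as `kruskal.test` does.

```r
# Hypothetical helper, not part of cjbmisc: choose a test from the column types.
pair_pvalue <- function(d, x, y,
                        cat.num.test = "kruskal.test",
                        num.num.test = "spearman",
                        cat.cat.test = "fisher") {
  xv <- d[[x]]
  yv <- d[[y]]
  x_num <- is.numeric(xv)
  y_num <- is.numeric(yv)
  if (x_num && y_num) {
    if (num.num.test == "lm") {
      # p-value of the slope, i.e. of the independent variable in y ~ x
      return(summary(lm(yv ~ xv))$coefficients[2, 4])
    }
    # "spearman", "pearson" or "kendall"; exact = FALSE avoids tie warnings
    return(cor.test(xv, yv, method = num.num.test, exact = FALSE)$p.value)
  }
  if (!x_num && !y_num) {
    tab <- table(xv, yv)
    if (cat.cat.test == "chi") return(chisq.test(tab)$p.value)
    return(fisher.test(tab)$p.value)
  }
  # One categorical and one numerical column: look up the base R test by name.
  test_fun <- match.fun(cat.num.test)
  num <- if (x_num) xv else yv
  grp <- factor(if (x_num) yv else xv)
  test_fun(num ~ grp)$p.value
}

d <- transform(mtcars, cyl = factor(cyl), am = factor(am))
p_cat_num <- pair_pvalue(d, "cyl", "mpg")                     # Kruskal-Wallis
p_num_num <- pair_pvalue(d, "wt", "mpg")                      # Spearman
p_lm      <- pair_pvalue(d, "wt", "mpg", num.num.test = "lm") # slope p-value
p_cat_cat <- pair_pvalue(d, "cyl", "am")                      # Fisher
```

Looking the test up by name with `match.fun` is what lets a string such as "kruskal.test" select the function; whether the package does it the same way is not stated in this page.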

plot_corr returns a list of the values returned by plot_corr_one. plot_corr_one returns a list of two elements: "plot", a ggarrange object that arranges all plots into one, and "pvalues", a named vector.
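To show the difference between adjusting within each plot_corr_one result and adjusting across all p-values, the sketch below works on a mock object. The mock's shape (a list named by x variable, each entry holding "plot" and "pvalues") is an assumption based on the description above, and the numbers are invented for illustration; real output may contain more than this.

```r
# Mock result shaped like the description above; names and values are made up.
res <- list(
  x1 = list(plot = NULL, pvalues = c(a = 0.001, b = 0.04)),
  x2 = list(plot = NULL, pvalues = c(a = 0.20, c = 0.03))
)

# Flatten the named p-value vectors into one long table.
pvals <- lapply(res, `[[`, "pvalues")
long <- data.frame(
  x = rep(names(pvals), lengths(pvals)),
  y = unlist(lapply(pvals, names), use.names = FALSE),
  p = unlist(pvals, use.names = FALSE)
)

# Adjust within each x (the role described for the "each" method) ...
long$padj.each <- ave(long$p, long$x,
                      FUN = function(p) p.adjust(p, method = "bonferroni"))
# ... and across every p-value at once (the role described for the "all" method).
long$padj.all <- p.adjust(long$p, method = "bonferroni")
```

With two tests per x and four tests in total, the Bonferroni factor is 2 for `padj.each` and 4 for `padj.all`, so the second column is the stricter one.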

brightchan/cjbmisc documentation built on Nov. 5, 2021, 4:12 p.m.