CorDiffViz: Visualization for Differential Correlation Matrices

viz	R Documentation

Description

Main function for estimating and writing self/differential correlation matrices to local files.

Usage

viz(
  run_name,
  dat1X,
  dat2X,
  dat1Y = NULL,
  dat2Y = NULL,
  name_dat1 = "1",
  name_dat2 = "2",
  cor_names = c("pearson", "kendall", "spearman", "sin_kendall", "sin_spearman"),
  permutation = TRUE,
  alpha = 0.05,
  sides = 2,
  B = 1000,
  adj_method = "BY",
  parallel = FALSE,
  verbose = TRUE,
  make_plot = TRUE,
  perm_seed = NULL,
  Cai_seed = NULL,
  layout_seed = NULL
)

Arguments

`run_name`	A string, a given name for this run/function call. Files for visualization will be saved under `file.path("dats", run_name)`. Examples include `MyFirstData_run1`, `MyFirstData_run2`, `MySecondData_run1`, where each call is run with different arguments to `viz()`, e.g. on different datasets or with different parameters.
`dat1X`	A data matrix for group X for the first sample; see Details. Must not be `NULL` and must have the same number of columns as `dat2X`.
`dat2X`	A data matrix for group X for the second sample; see Details. Must not be `NULL` and must have the same number of columns as `dat1X`.
`dat1Y`	Optional. A data matrix for group Y for the first sample; defaults to `NULL`; see Details. If not `NULL`, it must have the same number of rows as `dat1X` and the same number of columns as `dat2Y`, and `dat2Y` must also not be `NULL`.
`dat2Y`	Optional. A data matrix for group Y for the second sample; defaults to `NULL`; see Details. If not `NULL`, it must have the same number of rows as `dat2X` and the same number of columns as `dat1Y`, and `dat1Y` must also not be `NULL`.
`name_dat1`	A string, name for the first sample. Defaults to "1".
`name_dat2`	A string, name for the second sample. Defaults to "2".
`cor_names`	A string or a vector of strings, name(s) of correlation types to be estimated. Must be chosen from `"pearson"`, `"kendall"`, `"spearman"`, `"sin_kendall"`, and `"sin_spearman"`.
`permutation`	Logical, indicating whether permutation tests should be done in addition to parametric tests; defaults to `TRUE`.
`alpha`	Numerical, the significance level in hypothesis testing; defaults to 0.05. Used to produce the heat maps. This parameter does not affect the interactive visualization in the browser since the user can manually change the significance level there.
`sides`	A number `1`, `2`, or `3`, or a matrix containing only `1`, `2`, and `3`. If a matrix, it must be of size `ncol(dat1X) x ncol(dat1X)` if `dat1Y` is `NULL`, or `ncol(dat1X) x ncol(dat1Y)` otherwise. `2` stands for a two-sided test, `1` for a one-sided test whose null hypothesis is that the corresponding entry of the differential correlation matrix is >= 0 (the corresponding correlation for sample 1 is stronger than that for sample 2), and `3` for a one-sided test whose null hypothesis is that the corresponding entry is <= 0.
`B`	An integer, the number of bootstrapping samples in permutation tests; defaults to 1000.
`adj_method`	A string, the method passed to `stats::p.adjust` for adjusting the p values for multiple testing; defaults to "BY". Must be one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", or "none".
`parallel`	A logical, whether to use parallel computing; defaults to `FALSE`. Parallel execution may fail on some systems.
`verbose`	A logical, whether to print progress; defaults to `TRUE`.
`make_plot`	A logical, whether to make heat maps and static graphs; defaults to `TRUE`. Plots will be made under `file.path("plots", run_name)`.
`perm_seed`	A number, seed for permutation test; defaults to `NULL`.
`Cai_seed`	A number, seed for the method by Cai and Zhang; defaults to `NULL`.
`layout_seed`	A number, seed for the layout of the static graphs; defaults to `NULL`.
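
As a minimal sketch of the matrix form of `sides` for the self-correlation case (`dat1Y = NULL`), the following builds a matrix that requests two-sided tests everywhere except for one pair of variables. The dimension `pX` and the choice of pair are hypothetical, and keeping the matrix symmetric is an assumption made here for consistency, not a documented requirement.

```r
# Hypothetical number of variables in group X (i.e. ncol(dat1X)).
pX <- 4

# Start with two-sided tests (code 2) for every pair of variables.
sides_mat <- matrix(2, nrow = pX, ncol = pX)

# Request a one-sided test (code 1) for the pair (variable 1, variable 2).
# Both symmetric entries are set; symmetry is an assumption of this sketch.
sides_mat[1, 2] <- 1
sides_mat[2, 1] <- 1

# Sanity checks mirroring the documented constraints on `sides`.
stopifnot(all(sides_mat %in% c(1, 2, 3)),
          nrow(sides_mat) == pX, ncol(sides_mat) == pX)
```

The resulting `sides_mat` could then be passed as `sides = sides_mat`. When `dat1Y` is supplied, the matrix would instead need `ncol(dat1X)` rows and `ncol(dat1Y)` columns.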

Details

Files created can be found under the current working directory.

To estimate the differential correlations under two conditions (1 and 2), dat1X and dat2X should contain data for conditions 1 and 2, respectively. For both dat1X and dat2X, each row should contain the measurements for one sample/observation/subject, and each column corresponds to one variable/covariate. dat1Y and dat2Y should be set to NULL.

To estimate the differential cross-correlations between variables in group X and variables in group Y under two conditions, dat1X and dat2X should contain data for conditions 1 and 2, respectively, whose columns correspond to variables in group X. Likewise, dat1Y and dat2Y should be non-NULL and contain measurements for variables in the Y group, under conditions 1 and 2, respectively.

If dat1Y and dat2Y are NULL, the function estimates the difference cor(dat1X) - cor(dat2X) and truncates to 0 the entries that are below a certain threshold determined by parametric/permutation tests.
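
For intuition, the untruncated difference can be sketched with base R's `stats::cor` on simulated data. This is only an illustration of the quantity being estimated, not the package's internal code: it uses Pearson correlation only, the data and variable names are made up, and the testing and truncation steps are omitted.

```r
set.seed(1)  # simulated data, for illustration only

n1 <- 13; n2 <- 15; pX <- 4
var_names <- paste0("V", seq_len(pX))  # hypothetical variable names

# Rows are observations, columns are variables, as described above.
dat1X <- matrix(rnorm(n1 * pX), nrow = n1, ncol = pX,
                dimnames = list(NULL, var_names))
dat2X <- matrix(rnorm(n2 * pX), nrow = n2, ncol = pX,
                dimnames = list(NULL, var_names))

# Raw differential correlation matrix: condition 1 minus condition 2.
raw_diff <- cor(dat1X, method = "pearson") - cor(dat2X, method = "pearson")

# The result is pX x pX and symmetric; its diagonal is zero (up to
# floating point) because every variable has correlation 1 with itself.
print(round(raw_diff, 2))
```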

If dat1Y and dat2Y are not NULL, the difference in the cross-correlations cor(dat1X, dat1Y) - cor(dat2X, dat2Y) is estimated.
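
The cross-correlation case can be sketched the same way with the two-argument form of `stats::cor`. Again, this is a hedged illustration on simulated data with Pearson correlation only; the dimensions are hypothetical and no testing or truncation is performed.

```r
set.seed(2)  # simulated data, for illustration only

n1 <- 13; n2 <- 15; pX <- 4; pY <- 5

# Group X and group Y are measured on the same observations within each
# condition, so dat1X/dat1Y share n1 rows and dat2X/dat2Y share n2 rows.
dat1X <- matrix(rnorm(n1 * pX), nrow = n1, ncol = pX)
dat1Y <- matrix(rnorm(n1 * pY), nrow = n1, ncol = pY)
dat2X <- matrix(rnorm(n2 * pX), nrow = n2, ncol = pX)
dat2Y <- matrix(rnorm(n2 * pY), nrow = n2, ncol = pY)

# Raw difference in cross-correlations: entry (i, j) compares the
# correlation between X variable i and Y variable j across conditions.
raw_cross_diff <- cor(dat1X, dat1Y) - cor(dat2X, dat2Y)

# The result has pX rows and pY columns and is generally not symmetric.
print(dim(raw_cross_diff))
```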

The dimensions must be as follows: dat1X has dimension n1 x pX, dat2X n2 x pX, and if provided, dat1Y n1 x pY and dat2Y n2 x pY. The column names will be used as names for each variable/covariate, and the row names will be used as identifiers for each sample/observation/subject.
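
These dimension requirements can be checked before calling `viz()`. The helper below is a hypothetical sketch (`check_viz_dims` is not part of CorDiffViz) that simply encodes the constraints stated above; the checks performed inside the package may differ.

```r
# Hypothetical pre-flight check; not a CorDiffViz function.
check_viz_dims <- function(dat1X, dat2X, dat1Y = NULL, dat2Y = NULL) {
  # Group X must be supplied for both conditions with the same variables.
  stopifnot(!is.null(dat1X), !is.null(dat2X),
            ncol(dat1X) == ncol(dat2X))
  # Group Y must be supplied for both conditions or for neither.
  if (xor(is.null(dat1Y), is.null(dat2Y))) {
    stop("dat1Y and dat2Y must both be NULL or both be supplied")
  }
  if (!is.null(dat1Y)) {
    stopifnot(nrow(dat1Y) == nrow(dat1X),   # n1 rows in condition 1
              nrow(dat2Y) == nrow(dat2X),   # n2 rows in condition 2
              ncol(dat1Y) == ncol(dat2Y))   # same pY in both conditions
  }
  invisible(TRUE)
}

# Usage with placeholder matrices of the documented shapes.
check_viz_dims(matrix(0, 13, 4), matrix(0, 15, 4),
               matrix(0, 13, 5), matrix(0, 15, 5))
```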

Value

Returns nothing. Instead, it creates the relevant folders and files in the current working directory, under file.path("dats", run_name) and file.path("plots", run_name). The folder plots contains static heat maps for the user, while the folder dats contains data files used internally by the interactive visualization HTML file.
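
Because nothing is returned, one way to confirm what a run produced is to list the two output folders afterwards. The helper below is a hypothetical convenience (`list_run_outputs` is not part of CorDiffViz). It assumes only the folder layout described above and makes no claim about the names of the files inside.

```r
# Hypothetical helper; not a CorDiffViz function.
# Lists files under dats/<run_name> and plots/<run_name> inside `base_dir`.
list_run_outputs <- function(run_name, base_dir = getwd()) {
  dirs <- c(dats  = file.path(base_dir, "dats", run_name),
            plots = file.path(base_dir, "plots", run_name))
  lapply(dirs, function(d) {
    # A missing folder (e.g. make_plot = FALSE) yields an empty listing.
    if (dir.exists(d)) list.files(d, recursive = TRUE) else character(0)
  })
}

# Usage after a call such as viz(run_name = "exmp_self", ...):
outputs <- list_run_outputs("exmp_self")
str(outputs)
```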

Examples

dat0 <- read.csv(file.path(path.package("CorDiffViz"), "extdata/sample_data.csv"))
# First column of dat0 is the group ("AA" or "BB")
dat1 <- dat0[dat0$Group=="AA", 2:10][1:13,] # 13 x 9
dat2 <- dat0[dat0$Group=="BB", 2:10][1:15,] # 15 x 9
# Self correlations
viz(run_name="exmp_self", dat1X=dat1, dat2X=dat2, dat1Y=NULL, dat2Y=NULL,
    name_dat1="AA", name_dat2="BB", 
    cor_names=c("pearson","spearman", "kendall","sin_spearman","sin_kendall"),
    permutation=TRUE, alpha=0.05, sides=2, B=1000, adj_method="BY", verbose=TRUE,
    make_plot=TRUE, parallel=FALSE, perm_seed=1, Cai_seed=1, layout_seed=1)
# Correlations between variables in group X = {1:4} and variables in group Y = {5:9}
nX <- floor(ncol(dat1)/2) # 4; ncol(dat1) is odd, so round down to get integer column indices
viz(run_name="exmp_XY", dat1X=dat1[,1:nX], dat2X=dat2[,1:nX], 
    dat1Y=dat1[,(nX+1):ncol(dat1)], dat2Y=dat2[,(nX+1):ncol(dat1)], 
    name_dat1="AA", name_dat2="BB", 
    cor_names=c("pearson","spearman", "kendall","sin_spearman","sin_kendall"), 
    permutation=TRUE, alpha=0.05, sides=2, B=1000, adj_method="BY", verbose=TRUE, 
    make_plot=TRUE, parallel=FALSE, perm_seed=1, layout_seed=1)
    
# Remove folders for the examples generated above
unlink(c("dats/exmp_self", "dats/exmp_XY", "plots/exmp_self", "plots/exmp_XY"), recursive=TRUE)
setup_js_html()

sqyu/CorDiffViz documentation built on March 28, 2022, 5:47 a.m.