viz: Main function for estimating and writing self/differential...

View source: R/main.R

viz    R Documentation

Main function for estimating and writing self/differential correlation matrices to local files.

Description

Main function for estimating and writing self/differential correlation matrices to local files.

Usage

viz(
  run_name,
  dat1X,
  dat2X,
  dat1Y = NULL,
  dat2Y = NULL,
  name_dat1 = "1",
  name_dat2 = "2",
  cor_names = c("pearson", "kendall", "spearman", "sin_kendall", "sin_spearman"),
  permutation = TRUE,
  alpha = 0.05,
  sides = 2,
  B = 1000,
  adj_method = "BY",
  parallel = FALSE,
  verbose = TRUE,
  make_plot = TRUE,
  perm_seed = NULL,
  Cai_seed = NULL,
  layout_seed = NULL
)

Arguments

run_name

A string, a given name for this run/function call. Files for visualization will be saved under file.path("dats", run_name). Examples include MyFirstData_run1, MyFirstData_run2, MySecondData_run1, where each call is run with different arguments to viz(), e.g. on different datasets or with different parameters.

dat1X

A data matrix for group X for the first sample; see details. Must not be NULL and must have the same number of columns as dat2X.

dat2X

A data matrix for group X for the second sample; see details. Must not be NULL and must have the same number of columns as dat1X.

dat1Y

Optional; a data matrix for group Y for the first sample. Defaults to NULL; see details. If not NULL, it must have the same number of rows as dat1X and the same number of columns as dat2Y, and dat2Y must not be NULL.

dat2Y

Optional; a data matrix for group Y for the second sample. Defaults to NULL; see details. If not NULL, it must have the same number of rows as dat2X and the same number of columns as dat1Y, and dat1Y must not be NULL.

name_dat1

A string, name for the first sample. Defaults to "1".

name_dat2

A string, name for the second sample. Defaults to "2".

cor_names

A string or a vector of strings, name(s) of correlation types to be estimated. Must be chosen from "pearson", "kendall", "spearman", "sin_kendall", and "sin_spearman".
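
As a rough illustration of the rank-based options, the sketch below computes Kendall's tau and Spearman's rho with base R and then applies the standard sine transforms. It is an assumption, not something stated on this page, that "sin_kendall" and "sin_spearman" refer to these transforms; the simulated data and variable names are illustrative only.

```r
# Simulated data; x and y are illustrative, not part of the package.
set.seed(1)
x <- rnorm(50)
y <- 0.6 * x + rnorm(50, sd = 0.8)

# Rank-based correlations from base R
tau <- cor(x, y, method = "kendall")
rho <- cor(x, y, method = "spearman")

# Standard sine transforms.
# ASSUMPTION: this is what "sin_kendall" and "sin_spearman" denote in viz().
sin_kendall  <- sin(pi / 2 * tau)
sin_spearman <- 2 * sin(pi / 6 * rho)

print(c(kendall = tau, sin_kendall = sin_kendall,
        spearman = rho, sin_spearman = sin_spearman))
```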

permutation

Logical, indicating whether permutation tests should be done in addition to parametric tests; defaults to TRUE.

alpha

Numerical, the significance level in hypothesis testing; defaults to 0.05. Used to produce the heat maps. This parameter does not affect the interactive visualization in the browser since the user can manually change the significance level there.

sides

Either a single number (1, 2, or 3) or a matrix whose entries are 1, 2, or 3; defaults to 2. If a matrix, it must be of size ncol(dat1X) x ncol(dat1X) if dat1Y is NULL, or ncol(dat1X) x ncol(dat1Y) otherwise. 2 stands for a two-sided test, 1 for a one-sided test whose null hypothesis is that the corresponding entry of the differential correlation matrix is >= 0 (the corresponding correlation for sample 1 is stronger than that for sample 2), and 3 for a one-sided test whose null hypothesis is that the corresponding entry is <= 0.
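
A minimal sketch of building a matrix-valued sides argument for the self-correlation case (dat1Y is NULL). The number of variables pX and the choice of which entries are one-sided are hypothetical; the matrix is kept symmetric here only as a cautious choice, since this page does not say whether that is required.

```r
pX <- 4  # hypothetical number of columns in dat1X

# Start with two-sided tests everywhere
sides_mat <- matrix(2, nrow = pX, ncol = pX)

# One-sided test with null "entry >= 0" for the pair (1, 2)
sides_mat[1, 2] <- sides_mat[2, 1] <- 1

# One-sided test with null "entry <= 0" for the pair (3, 4)
sides_mat[3, 4] <- sides_mat[4, 3] <- 3

print(sides_mat)
```

The resulting matrix could then be passed as sides=sides_mat in a call to viz() whose dat1X has pX columns.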

B

An integer, the number of bootstrapping samples in permutation tests; defaults to 1000.
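
To give a feel for what B controls, here is a generic textbook permutation test for a difference in Pearson correlation between two groups. It is a sketch under stated assumptions, not the package's implementation: the data are simulated, and the helper diff_cor and all variable names are hypothetical.

```r
set.seed(1)
n1 <- 13
n2 <- 15

# Simulated two-variable data for each condition
g1 <- cbind(a = rnorm(n1), b = rnorm(n1))
g2 <- cbind(a = rnorm(n2), b = rnorm(n2))

# Hypothetical helper: difference in correlation between columns 1 and 2
diff_cor <- function(d1, d2) {
  cor(d1[, 1], d1[, 2]) - cor(d2[, 1], d2[, 2])
}

observed <- diff_cor(g1, g2)
pooled <- rbind(g1, g2)

B <- 1000  # number of random relabelings
perm_stats <- replicate(B, {
  idx <- sample(nrow(pooled), n1)  # rows assigned to "condition 1"
  diff_cor(pooled[idx, , drop = FALSE], pooled[-idx, , drop = FALSE])
})

# Two-sided permutation p value
p_two_sided <- mean(abs(perm_stats) >= abs(observed))
print(p_two_sided)
```

A larger B gives a finer-grained p value at the cost of proportionally more computation.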

adj_method

A string, the method passed to stats::p.adjust for adjusting the p values for multiple testing; defaults to "BY". Must be one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", or "none".
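
The adjustment itself is performed by stats::p.adjust from base R. The sketch below applies the "BY" and "BH" methods to a made-up vector of raw p values; the numbers are illustrative only.

```r
# Hypothetical raw p values from several tests
p_raw <- c(0.001, 0.012, 0.03, 0.2, 0.6)

# Benjamini-Yekutieli adjustment (the default in viz())
p_by <- stats::p.adjust(p_raw, method = "BY")

# Benjamini-Hochberg adjustment, for comparison
p_bh <- stats::p.adjust(p_raw, method = "BH")

# Which tests remain significant at level 0.05 after BY adjustment
significant <- p_by <= 0.05

print(data.frame(p_raw, p_bh, p_by, significant))
```

"BY" is never less conservative than "BH", so its adjusted p values are at least as large.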

parallel

A logical, whether to use parallel computing; defaults to FALSE. Parallel computing may fail on some systems.

verbose

A logical, whether to print progress; defaults to TRUE.

make_plot

A logical, whether to make heat maps and static graphs; defaults to TRUE. Plots will be made under file.path("plots", run_name).

perm_seed

A number, seed for permutation test; defaults to NULL.

Cai_seed

A number, seed for the method by Cai and Zhang; defaults to NULL.

layout_seed

A number, seed for the layout of the static graphs; defaults to NULL.

Details

Files created can be found under the current working directory.

To estimate the differential correlations under two conditions (1 and 2), dat1X and dat2X should contain data for conditions 1 and 2, respectively. For both dat1X and dat2X, each row should contain the measurements for one sample/observation/subject, and each column corresponds to one variable/covariate. dat1Y and dat2Y should be set to NULL.

To estimate the differential cross-correlations between variables in group X and variables in group Y under two conditions, dat1X and dat2X should contain data for conditions 1 and 2, respectively, whose columns correspond to variables in group X. Likewise, dat1Y and dat2Y should be non-NULL and contain measurements for variables in the Y group, under conditions 1 and 2, respectively.

If dat1Y and dat2Y are NULL, the function estimates the difference cor(dat1X) - cor(dat2X) and truncates to 0 the entries that are below a certain threshold determined by parametric/permutation tests.

If dat1Y and dat2Y are not NULL, the difference in the cross-correlations cor(dat1X, dat1Y) - cor(dat2X, dat2Y) is estimated.
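
The two raw quantities described above can be sketched with base R as follows. The data are simulated and the sizes are arbitrary; the sketch only shows the untruncated Pearson differences and does not reproduce the testing or thresholding that viz() performs.

```r
set.seed(1)
n1 <- 13; n2 <- 15  # observations under conditions 1 and 2
pX <- 3;  pY <- 2   # variables in groups X and Y

# Hypothetical helper to simulate an n x p matrix with named columns
sim <- function(n, p, prefix) {
  matrix(rnorm(n * p), nrow = n, ncol = p,
         dimnames = list(NULL, paste0(prefix, seq_len(p))))
}

dat1X <- sim(n1, pX, "X"); dat2X <- sim(n2, pX, "X")
dat1Y <- sim(n1, pY, "Y"); dat2Y <- sim(n2, pY, "Y")

# Self-correlation case: pX x pX differential correlation matrix
self_diff <- cor(dat1X) - cor(dat2X)

# Cross-correlation case: pX x pY differential cross-correlation matrix
cross_diff <- cor(dat1X, dat1Y) - cor(dat2X, dat2Y)

print(round(self_diff, 2))
print(round(cross_diff, 2))
```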

The dimensions must be as follows: dat1X is n1 x pX, dat2X is n2 x pX, and, if provided, dat1Y is n1 x pY and dat2Y is n2 x pY. The column names will be used as names for the variables/covariates, and the row names will be used as identifiers for the samples/observations/subjects.
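
The dimension requirements can be checked before calling viz() with a small helper such as the one below. The function check_viz_dims is hypothetical and not part of the package; it only encodes the rules stated above.

```r
# Hypothetical pre-flight check; not a CorDiffViz function.
check_viz_dims <- function(dat1X, dat2X, dat1Y = NULL, dat2Y = NULL) {
  # dat1X and dat2X are required and must share their columns (pX)
  stopifnot(!is.null(dat1X), !is.null(dat2X), ncol(dat1X) == ncol(dat2X))
  # dat1Y and dat2Y must be supplied together or not at all
  if (is.null(dat1Y) != is.null(dat2Y)) {
    stop("dat1Y and dat2Y must both be NULL or both be non-NULL")
  }
  if (!is.null(dat1Y)) {
    stopifnot(nrow(dat1Y) == nrow(dat1X),  # n1 rows under condition 1
              nrow(dat2Y) == nrow(dat2X),  # n2 rows under condition 2
              ncol(dat1Y) == ncol(dat2Y))  # same pY columns
  }
  invisible(TRUE)
}

# Usage with placeholder matrices of the right shapes
ok <- check_viz_dims(matrix(0, 13, 4), matrix(0, 15, 4),
                     matrix(0, 13, 5), matrix(0, 15, 5))
print(ok)
```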

Value

Does not return anything, but instead creates relevant folders and files under the current working directory under file.path("dats", run_name) and file.path("plots", run_name). The folder plots contains static heat maps for the user, while the folder dats contains data files internally used by the interactive visualization HTML file.
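
Since nothing is returned, one way to see what a run produced is to list the two output folders afterwards. This is a minimal sketch using base R only; the run name is the one used in the examples, and the names of the files inside the folders are not documented here, so none are assumed.

```r
run_name <- "exmp_self"  # the run_name passed to viz()

# Output folders, relative to the current working directory
out_dirs <- file.path(c("dats", "plots"), run_name)

for (d in out_dirs) {
  if (dir.exists(d)) {
    cat("Files under", d, ":\n")
    print(list.files(d, recursive = TRUE))
  } else {
    message("Folder not found (has viz() been run with this run_name?): ", d)
  }
}
```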

Examples

dat0 <- read.csv(file.path(path.package("CorDiffViz"), "extdata/sample_data.csv"))
# First column of dat0 is the group (AA or BB)
dat1 <- dat0[dat0$Group=="AA", 2:10][1:13,] # 13 x 9
dat2 <- dat0[dat0$Group=="BB", 2:10][1:15,] # 15 x 9
# Self correlations
viz(run_name="exmp_self", dat1X=dat1, dat2X=dat2, dat1Y=NULL, dat2Y=NULL,
    name_dat1="AA", name_dat2="BB", 
    cor_names=c("pearson","spearman", "kendall","sin_spearman","sin_kendall"),
    permutation=TRUE, alpha=0.05, sides=2, B=1000, adj_method="BY", verbose=TRUE,
    make_plot=TRUE, parallel=FALSE, perm_seed=1, Cai_seed=1, layout_seed=1)
# Correlations between variables in group X = {1:4} and variables in group Y = {5:9}
half <- ncol(dat1) %/% 2 # integer division, so that column indices are whole numbers
viz(run_name="exmp_XY", dat1X=dat1[,1:half], dat2X=dat2[,1:half], 
    dat1Y=dat1[,(half+1):ncol(dat1)], dat2Y=dat2[,(half+1):ncol(dat2)], 
    name_dat1="AA", name_dat2="BB", 
    cor_names=c("pearson","spearman", "kendall","sin_spearman","sin_kendall"), 
    permutation=TRUE, alpha=0.05, sides=2, B=1000, adj_method="BY", verbose=TRUE, 
    make_plot=TRUE, parallel=FALSE, perm_seed=1, layout_seed=1)
    
# Remove folders for the examples generated above
unlink(c("dats/exmp_self", "dats/exmp_XY", "plots/exmp_self", "plots/exmp_XY"), recursive=TRUE)
setup_js_html()

sqyu/CorDiffViz documentation built on March 28, 2022, 5:47 a.m.