CV.Signature.TCP: Cardiovascular Signature Temporal Clustering Platform...



Description

Denoise, classify, and evaluate variables (biomarkers) from time course data such as proteomics and other high-throughput technologies.

Usage

CV.Signature.TCP(
  dat,
  timepoints = NULL,
  center.dat = TRUE,
  scale.dat = FALSE,
  denoise = c("smooth.spline", "pca", "none"),
  denoise.parameter = c("cv", "cv.global"),
  dist.method = c("euclidean", "cor.diss", "dtw"),
  cluster.method = c("kmeans", "hclust"),
  K,
  evaluate = TRUE,
  verbose = FALSE,
  seed = NULL
)

Arguments

dat

a data matrix with m biomarkers as rows, over n time points (columns).

timepoints

a vector of time points for columns of dat.

center.dat

a logical specifying whether to center the input and denoised data. By default, TRUE.

scale.dat

a logical specifying whether to scale the input and denoised data. By default, FALSE.

denoise

a denoising method. By default, "smooth.spline", which fits a cubic smoothing spline.

denoise.parameter

a parameter for the denoising method, such as the degrees of freedom in smooth.spline or the number of significant PCs in PCA.

dist.method

a distance method for time course data, resulting in an m * m distance matrix for rows. 'dtw' for dynamic time warping or 'cor.diss' for correlation-based dissimilarities.

cluster.method

a clustering method.

K

the number of clusters.

evaluate

a logical specifying whether to evaluate the cluster membership with the jackstraw tests. By default, TRUE.

verbose

a logical specifying whether to print the computational progress. By default, FALSE.

seed

a seed for the random number generator.

...

optional arguments.
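
To make dist.method and cluster.method more concrete, the following is a minimal base R sketch of a correlation-based dissimilarity between rows, followed by hierarchical clustering. It assumes the dissimilarity is 1 minus the Pearson correlation and uses average linkage; both are illustrative choices, and the package's own definitions may differ. The toy data and all object names are hypothetical.

```r
# Illustrative sketch only (base R); not the package's implementation.
set.seed(1)
tp <- 1:6  # hypothetical time points

# Toy data: 4 rising and 4 falling profiles with small noise (rows = biomarkers).
up   <- t(replicate(4,  tp + rnorm(6, sd = 0.2)))
down <- t(replicate(4, -tp + rnorm(6, sd = 0.2)))
toy  <- rbind(up, down)

# Assumed form of a correlation-based dissimilarity between rows:
# 1 - Pearson correlation, giving an m * m matrix stored as a 'dist' object.
cor_diss <- as.dist(1 - cor(t(toy)))

# Hierarchical clustering on the dissimilarities, cut into K = 2 clusters.
hc <- hclust(cor_diss, method = "average")
membership <- cutree(hc, k = 2)
```

Here membership is a vector of length m, one cluster label per row, analogous to the membership element described under Value.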

Details

This function combines multiple steps. For more options and fine-tuning, please use the individual functions in the 'CV.Signature.TCP' package. It attempts to identify temporal dynamics by clustering denoised and/or time-warped data. The user must supply the data (dat), in which rows are variables (e.g., genes, proteins) and columns are observations taken at different time points. Correspondingly, timepoints is a vector of actual time points (e.g., hours, days) for the columns of dat.

This function goes through the following steps:
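
As a rough illustration of the kind of pipeline the arguments describe (center, denoise, cluster), the sketch below uses only base R. It is not the package's implementation: the toy data, the fixed df = 3 for the spline (instead of cross-validation), and all object names are assumptions made for the example.

```r
# Illustrative sketch only (base R); not the package's implementation.
set.seed(1)
tp <- c(0, 1, 3, 5, 7, 14)  # hypothetical time points (days)
m  <- 10                    # number of biomarkers (rows)

# Toy data: 5 rising and 5 falling profiles plus Gaussian noise.
signal <- rbind(matrix(rep( tp / 14, 5), nrow = 5, byrow = TRUE),
                matrix(rep(-tp / 14, 5), nrow = 5, byrow = TRUE))
toy <- signal + matrix(rnorm(m * length(tp), sd = 0.1), nrow = m)

# Step 1: center each row (biomarker) around its own mean.
centered <- toy - rowMeans(toy)

# Step 2: denoise each row with a cubic smoothing spline.
# df is fixed at 3 here for simplicity; this is an assumption of the sketch.
denoised <- t(apply(centered, 1, function(y) {
  fit <- smooth.spline(x = tp, y = y, df = 3)
  predict(fit, x = tp)$y
}))

# Step 3: cluster the denoised profiles with k-means (K = 2).
km <- kmeans(denoised, centers = 2, nstart = 10)
membership <- km$cluster
```

The result mirrors the shape of the documented return values: denoised is an m * n matrix and membership is a vector of length m.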

This work is motivated by identifying reliable molecular signatures from time-series proteomics data of O-PTM (oxidative post-translational modification) occupancies in the cardiovascular mouse model (see Wang et al. (2018)). Modeling and classifying high-dimensional temporal data is notoriously challenging. This package aims to provide an analysis pipeline that is relatively robust and non-parametric, while accounting for typical -omic studies involving complex phenotypes. For further implementations of related methods, see TSclust and TSdist.

Value

CV.Signature.TCP returns a list consisting of

denoised

m * n denoised data.

dat.dist

m * m distance matrix. Only returned when cluster.method is "hclust".

cluster.obj

an object returned from clustering the denoised data.

membership

a vector of length m, identities of clusters.

evaluated

an object returned from applying the jackstraw tests for clusters.

Author(s)

Neo Christopher Chung nchchung@gmail.com

References

Identifying temporal molecular signatures underlying cardiovascular diseases. In preparation.

J Wang, H Choi, NC Chung, Q Cao, DCM Ng, B Mirza, SB Scruggs, D Wang, AO Garlid, P Ping (2018). Integrated dissection of the cysteine oxidative post-translational modification proteome during cardiac hypertrophy. Journal of Proteome Research.

NC Chung (2020). Statistical significance of cluster membership for unsupervised evaluation of single cell identities. Bioinformatics.

Examples

## Not run: 
data(cys_optm)
meta <- cys_optm[,1:4]
optm <- log(cys_optm[meta$Select,5:10])
optm <- t(scale(t(optm), scale=TRUE, center=TRUE))
days <- as.numeric(colnames(optm))

output <- CV.Signature.TCP(optm,
                timepoints = days,
                center.dat = TRUE,
                scale.dat = TRUE,
                denoise = c("smooth.spline"),
                denoise.parameter=c("cv"),
                dist.method = "cor.diss",
                cluster.method = c("kmeans"),
                K = 5,
                evaluate = TRUE,
                verbose = TRUE,
                seed = 1
               )

# see the elbow plot
cluster.elbow(dat=output$denoised, FUNcluster=kmeans, method="wss", k.max=10, linecolor="black")

# make the cluster figure
optm.fig <- vis_cluster(output$denoised, group=output$membership)

# to modify/polish the figure (ggplot2 object); labs(), ylim(), and facet_wrap() come from ggplot2
library(ggplot2)
optm.fig <- optm.fig + labs(y="Log-transformed Occupancy Ratio", x="Time (day)", title="All O-PTMs") + ylim(-2,2) + facet_wrap(~ cluster,nrow=1,ncol=6)

# filter the data based on jackstraw PIP and make a figure
library(jackstraw)
optm.pip <- pip(output$evaluated$p.F, pi0=sum(output$evaluated$p.F > .05)/length(output$evaluated$p.F))
hist(optm.pip,100,col="black")
optm.pip.fig <- vis_cluster(output$denoised[optm.pip > .9,], group=output$membership[optm.pip > .9])
optm.pip.fig <- optm.pip.fig + labs(y="Log-transformed Occupancy Ratio", x="Time (day)", title="O-PTMs with PIP > 0.9") + ylim(-2,2) + facet_wrap(~ cluster,nrow=1,ncol=6)

# stack the two figures vertically with cowplot
library(cowplot)
plot_grid(optm.fig, optm.pip.fig, nrow = 2)

## End(Not run)

UCLA-BD2K/CV.Signature.TCP documentation built on May 15, 2020, 11:27 p.m.