xina_clustering: xina_clustering
In XINA: Multiplexes Isobaric Mass Tagged-based Kinetics Data for Network Analysis


Clustering multiplexed time-series omics data to find co-abundance profiles

xina_clustering(f_names, data_column, out_dir = getwd(), nClusters = 20, norm = "sum_normalization", chosen_model = "")

`f_names`	A vector of input file (.csv) paths
`data_column`	A vector of column names (taken from the first row of the input file) that make up the data matrix
`out_dir`	A directory path for saving clustering results (default: out_dir=getwd())
`nClusters`	The desired maximum number of clusters (default: 20)
`norm`	The normalization method. Default is "sum_normalization", which divides each row of the data matrix by its row sum. For more about sum-normalization, see https://www.ncbi.nlm.nih.gov/pubmed/19861354. "zscore" calculates a Z score for each protein; see scale.
`chosen_model`	A specific covariance model to use instead of testing all the models available in mclust; see mclustModelNames for the model names. To run k-means clustering instead of model-based clustering, use "kmeans" here.
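
The two `norm` options can be illustrated with base R alone. This is a minimal sketch on a made-up matrix, assuming rows are proteins and columns are time points; it is not XINA's internal code, and the names `m`, `sum_norm`, and `z_norm` are hypothetical.

```r
# Toy data matrix (made-up values): rows are proteins, columns are time points
m <- matrix(c(1, 2, 3,
              2, 2, 4),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("protA", "protB"), c("t0", "t1", "t2")))

# Sum-normalization: divide every value by the sum of its row,
# so each protein profile sums to 1
sum_norm <- m / rowSums(m)

# Z score per protein: scale() standardizes columns, so transpose,
# scale, and transpose back to standardize each row instead
z_norm <- t(scale(t(m)))

print(sum_norm)
print(z_norm)
```

After sum-normalization every row sums to 1, so profiles are compared by shape rather than absolute abundance. After Z-scoring every row has mean 0 and standard deviation 1.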

A BIC plot saved in the current working directory, and a list containing the following information:

Item	Description
clusters	XINA clustering results
aligned	XINA clustering results aligned by ID
data_column	Data matrix column names
out_dir	The directory path containing XINA results
nClusters	The number of clusters desired by user
max_cluster	The number of clusters optimized by BIC
chosen_model	The used covariance model for model-based clustering
optimal_BIC	BIC of the optimized covariance model
condition	Experimental conditions of the user input data
color_for_condition	Colors assigned to each experimental condition, used for the condition composition plot
color_for_clusters	Colors assigned to each cluster, used for the XINA clustering plot
norm_method	Used normalization method
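
To make `max_cluster`, `chosen_model`, and `optimal_BIC` more concrete, the sketch below runs BIC-based model selection directly with mclust on made-up data. It assumes the mclust package is installed and only illustrates the general technique; how XINA calls mclust internally is not shown in this page, and the names `profiles` and `fit` are hypothetical.

```r
library(mclust)  # attach the package; Mclust() relies on it being loaded

set.seed(0)

# Made-up profiles over three time points: one rising group, one falling group
rising  <- cbind(rnorm(30, 0.2, 0.02), rnorm(30, 0.3, 0.02), rnorm(30, 0.5, 0.02))
falling <- cbind(rnorm(30, 0.5, 0.02), rnorm(30, 0.3, 0.02), rnorm(30, 0.2, 0.02))
profiles <- rbind(rising, falling)

# Fit Gaussian mixture models with 1 to 5 components across the available
# covariance models, then keep the combination with the best BIC
fit <- Mclust(profiles, G = 1:5)

fit$G          # number of clusters selected by BIC
fit$modelName  # selected covariance model, e.g. a name listed by mclustModelNames
fit$bic        # BIC of the selected model
```

Here `G = 1:5` plays the role of an upper bound on the number of clusters, while `fit$G` is the number actually selected, which mirrors the distinction between `nClusters` and `max_cluster` in the table above.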

# Generate random multiplexed time-series data
random_data_info <- make_random_xina_data()

# Data files
data_files <- paste(random_data_info$conditions, ".csv", sep='')

# time points of the data matrix
data_column <- random_data_info$time_points

# mclust requires a fixed random seed to reproduce the clustering results
set.seed(0)

# Run the model-based clustering to find co-abundance profiles
example_clusters <- xina_clustering(data_files, data_column=data_column,
                                    nClusters=30)

# Run k-means clustering to find co-abundance profiles
example_clusters <- xina_clustering(data_files, data_column=data_column,
                                    nClusters=30,
                                    chosen_model="kmeans")