GrammRServ: Graphical Representation without a GUI


Description

A non-GUI method for constructing graphical representations of metagenomic count data. This function is recommended for large data sets and can be run as a background job when a user interface is not available.

Usage

GrammRServ(Data = NULL, Cluster = NULL, DataType = "Counts",
           DistType = "Kendall's tau-distance", PhyTree = NULL,
           GunifType = NULL, GunifWeight = 0, Dim = c(2, 3, 4),
           LpNorm = c(1), Penalty = 0.5, MinClust = 2)

Arguments

Data

A data matrix containing one of the following two types of values:

  • (1) metagenomic counts, with the rows of the matrix representing the attributes to be clustered (these can be samples or taxa).

  • (2) a measure of dissimilarity between samples or taxa.

Cluster

(Optional) A vector whose length is equal to the number of rows of Data. The values in the vector give the cluster membership of the samples, determined using their attributes.

DataType

A character variable corresponding to the type of values in Data. It takes values in c("Counts", "Distance").

DistType

Measure of dissimilarity between samples used to calculate the distance matrix. It takes values in c("Kendall's tau-distance", "UniFrac") and is used only when DataType is equal to "Counts". The default value is "Kendall's tau-distance".

PhyTree

A phylogenetic tree of class phylo to be used for calculating the UniFrac distance. This is to be provided only when DistType is set equal to "UniFrac".

GunifType

The type of UniFrac distance to be calculated using the GUniFrac package. It takes values in c("Unweighted", "Variance Adjusted", "Generalized").

GunifWeight

The weight parameter used in the calculation of the Generalized UniFrac distance. The parameter takes values between 0 and 1. For more details, see Chen et al. (2012).

Dim

Dimension of the multidimensional scaling model to be constructed. Default value is c(2,3,4).

LpNorm

A vector-valued variable which determines the norm(s) to be used in the multidimensional scaling model calculation. The default value (equal to 1) corresponds to the l_1-MDS model. Principal coordinate analysis (PCoA) is performed when the value is set to 2.
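To illustrate what the LpNorm = 2 case refers to, the sketch below runs a principal coordinate analysis with base R's cmdscale() on a small made-up matrix. This is only a stand-in: the toy data and the use of Euclidean distances are assumptions for illustration, and this page does not say which routine GrammR calls internally for either norm.

```r
# Toy data (made up for illustration): 5 points in 3 dimensions.
toy <- rbind(c(0, 0, 0),
             c(1, 0, 0),
             c(0, 2, 0),
             c(0, 0, 3),
             c(1, 2, 3))

# Euclidean distances stand in for the dissimilarity GrammR would compute.
d <- dist(toy)

# Classical multidimensional scaling (PCoA) into 2 dimensions.
pcoa2 <- cmdscale(d, k = 2)
dim(pcoa2)   # one row per point, one column per retained dimension
```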

Penalty

A numeric value between 0 and 1 which is used as penalty for ties in calculation of Kendall's tau-distance. Default value is 0.5.
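The role of the tie penalty can be sketched in base R. The helper below, kendall_tau_dist(), is hypothetical and not part of GrammR: it counts pairs that are ordered oppositely in two vectors and charges the penalty for each pair tied in exactly one of them. Whether GrammR normalizes the result or treats ties exactly this way is not stated on this page, so treat it as one plausible reading.

```r
# Hypothetical helper, not part of GrammR: unnormalized Kendall's tau-distance
# between two vectors, charging `penalty` for pairs tied in exactly one vector.
kendall_tau_dist <- function(x, y, penalty = 0.5) {
  stopifnot(length(x) == length(y), penalty >= 0, penalty <= 1)
  pairs <- combn(length(x), 2)                 # all index pairs (i, j), i < j
  sx <- sign(x[pairs[1, ]] - x[pairs[2, ]])    # ordering of each pair in x
  sy <- sign(y[pairs[1, ]] - y[pairs[2, ]])    # ordering of each pair in y
  discordant <- sx * sy == -1                  # opposite order in x and y
  one_tie <- xor(sx == 0, sy == 0)             # tied in exactly one vector
  sum(discordant) + penalty * sum(one_tie)
}

kendall_tau_dist(c(1, 2, 3), c(3, 2, 1))   # fully reversed order: 3 discordant pairs
kendall_tau_dist(c(1, 1, 3), c(1, 2, 3))   # one pair tied in x only: 0.5
```

With penalty = 0 ties are ignored, and with penalty = 1 a pair tied in one vector counts as much as a fully discordant pair.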

MinClust

Minimum number of clusters to be considered by the PAM (partitioning around medoids) method when estimating the optimal number of clusters. Default value is 2.
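A common way to estimate the number of clusters with PAM is to pick the value that maximizes the average silhouette width. The sketch below shows that idea using pam() from the cluster package on made-up data. The upper bound max_clust is a hypothetical choice for this example, and the page does not confirm that GrammR searches in exactly this way.

```r
library(cluster)  # recommended package shipped with R; provides pam()

# Toy data (made up): two well-separated groups of three points in the plane.
toy <- matrix(c(0, 0.1, 0.2, 5, 5.1, 5.2,
                0, 0.1, 0,   5, 5,   5.1), ncol = 2)
d <- dist(toy)

min_clust <- 2   # plays the role of MinClust
max_clust <- 4   # hypothetical upper bound chosen for this sketch
ks <- min_clust:max_clust

# Average silhouette width of the PAM solution for each candidate k.
avg_sil <- sapply(ks, function(k) pam(d, k)$silinfo$avg.width)

# Keep the k with the largest average silhouette width.
best_k <- ks[which.max(avg_sil)]
best_k
```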

Value

Separate directories are created in the current working directory for each model constructed using all possible combinations of dimension and l_p norm specified.

  1. Directories for the two-dimensional models contain the average silhouette plot, true estimated model, model showing estimated clusters and (optional) model showing true clusters.

  2. Directories for models of dimension greater than two contain the average silhouette plot and subdirectories for the true model, estimated clusters model and (optional) model showing true clusters.

For all models, a text file containing the estimated cluster membership is saved in the subdirectory corresponding to the model for future validation.
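The number of models, and hence of output directories, follows from the combinations of Dim and LpNorm. As a quick check, the combinations for Dim = c(2, 3, 4) and LpNorm = c(1, 2) can be enumerated in base R. The variable names below are illustrative only, and the directory names GrammR actually uses are not given on this page.

```r
dims  <- c(2, 3, 4)   # values passed as Dim
norms <- c(1, 2)      # values passed as LpNorm

# One row per (dimension, norm) combination, i.e. one model each.
models <- expand.grid(Dim = dims, LpNorm = norms)
models
nrow(models)   # 3 dimensions x 2 norms = 6 models
```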

Author(s)

Deepak Nag Ayyala <deepaknagayyala@gmail.com>

References

Chen, J., et al. (2012) Associating microbiome composition with environmental covariates using generalized UniFrac distance, Bioinformatics, 28(16).

See Also

GrammRGUI

Examples

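library(GrammR)

# Load the example count data supplied with the package.
data(metagencounts)

# Fit l_1 and l_2 models in 2, 3 and 4 dimensions; output is written to the working directory.
GrammRServ(Data = metagencounts$Counts, Cluster = metagencounts$CommMemshp,
           DataType = "Counts", DistType = "Kendall's tau-distance",
           Dim = c(2, 3, 4), LpNorm = c(1, 2), Penalty = 0.5, MinClust = 2)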

GrammR documentation built on May 1, 2019, 8:46 p.m.