runGimm: This function prepares parameters for running executable...

Description

Runs all of the processing steps related to gimm. Calling this function performs every gimm operation, but none of the posthoc or post-posthoc steps are executed.

The user must provide values for "M" and "T". Either "tableData" or "dataFile", or both, must also be provided.

Usage

runGimm (tableData, dataFile, M, T, nChip=0, nIter=10000, nreplicates=1,
				 contextSpecific="n", nContexts=1, contextLengths=NA,
				 estimate_contexts="n", clusterShape="v", burnIn=5000, 
				 elipticalWithin="n", verbose=FALSE, intFiles=FALSE,
				 clientID=-1, host=-1, port=-1, priorFile=NULL)
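
As an illustration of the signature above, the sketch below assembles arguments for a context-specific run and calls runGimm only when it is available. The file name "toyData.txt", the toy dimensions, and the variable names gimmArgs and gimmResult are hypothetical; the guard assumes the package has already been attached and that the ".GimmPath" object points at the executables.

```r
# Hypothetical argument set for a context-specific run: 3 genes, 4 arrays,
# two contexts of 2 arrays each. None of these values come from the package.
gimmArgs <- list(
  dataFile        = "toyData.txt",  # hypothetical tab-delimited input file
  M               = 4,              # number of microarrays
  T               = 3,              # number of genes
  nIter           = 10000,
  burnIn          = 5000,
  contextSpecific = "y",
  nContexts       = 2,
  contextLengths  = c(2, 2)
)

# The documented constraint on contextLengths.
stopifnot(sum(gimmArgs$contextLengths) == gimmArgs$M)

# Call runGimm only if it is on the search path and the input file exists;
# otherwise the sketch does nothing beyond building the argument list.
if (exists("runGimm") && file.exists(gimmArgs$dataFile)) {
  gimmResult <- do.call(runGimm, gimmArgs)
}
```

Building the arguments as a list and using do.call keeps the parameter choices in one place where they can be inspected before the executable is launched.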

Arguments

tableData

Data frame with gene expression data to be clustered. The data frame is assumed to have T*nreplicates rows and M+2 columns. Rows represent genes and columns represent microarray experiments. The first two columns are assumed to provide gene annotations such as the GeneIDs and GeneNames. When replicated data is clustered (nreplicates>1), experimental replicates are placed in consecutive rows. For example, rows 1 to nreplicates correspond to gene 1, rows nreplicates+1 to 2*nreplicates to gene 2, etc. Annotations in the first two columns should be identical for replicated observations on the same gene. When using context-specific clustering, experiments should be organized so that columns 3 to contextLengths[1]+2 correspond to experiments within context 1, columns contextLengths[1]+3 to contextLengths[1]+contextLengths[2]+2 correspond to context 2, etc. When nChip>0, the program expects nChip columns of ChIP-chip data to follow the expression data. Each such column holds the binding affinities for one transcription factor. In this case, the program fits the Expression-ChIP Infinite Mixture model.
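
The row layout for replicated data can be sketched with a toy data frame. The names nGenes and nArrays are illustrative stand-ins for the T and M arguments (chosen to avoid masking R's T shorthand for TRUE), and the expression values are random numbers, not real data.

```r
# Toy dimensions (hypothetical): 3 genes, 4 arrays, 2 replicates per gene.
nGenes      <- 3   # corresponds to the T argument
nArrays     <- 4   # corresponds to the M argument
nreplicates <- 2

set.seed(1)
# Replicates of the same gene occupy consecutive rows: g1, g1, g2, g2, ...
geneIndex <- rep(seq_len(nGenes), each = nreplicates)
expr <- matrix(rnorm(nGenes * nreplicates * nArrays), ncol = nArrays)
colnames(expr) <- paste0("array", seq_len(nArrays))

tableData <- data.frame(
  GeneID   = paste0("g", geneIndex),     # identical across replicates
  GeneName = paste0("gene", geneIndex),  # identical across replicates
  expr,
  stringsAsFactors = FALSE
)

# The documented shape: T*nreplicates rows and M+2 columns.
stopifnot(nrow(tableData) == nGenes * nreplicates,
          ncol(tableData) == nArrays + 2)
```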

dataFile

When tableData is not specified, this variable gives the path to the tab-delimited text file containing the data to be clustered. The data should be organized in exactly the same way as in the tableData data frame, except that the first row specifies column names. After completion, the files dataFile.cdt and dataFile.gtr will be created, defining the hierarchical clustering. When both tableData and dataFile are specified, dataFile will be used only as a name for creating the .cdt and .gtr files. If this parameter is not provided, results will be saved in Result.cdt and Result.gtr within the current working directory.
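
A file in the format described above (tab-delimited, first row holding column names) could be produced from a data frame with base R, for example as follows. The toy data frame and the temporary file name are hypothetical.

```r
# Hypothetical toy data: two annotation columns followed by two arrays.
toyData <- data.frame(
  GeneID   = c("g1", "g2"),
  GeneName = c("geneA", "geneB"),
  array1   = c(0.5, -1.2),
  array2   = c(1.1, 0.3),
  stringsAsFactors = FALSE
)

# Write a tab-delimited file with a header row and no row names or quotes.
dataFile <- file.path(tempdir(), "toyData.txt")
write.table(toyData, file = dataFile, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = TRUE)

# Read it back to confirm that the header and row count survive the round trip.
readBack <- read.delim(dataFile, stringsAsFactors = FALSE)
stopifnot(identical(names(readBack), names(toyData)),
          nrow(readBack) == nrow(toyData))
```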

M

Dimensionality of the expression vectors to be clustered, i.e. the number of microarrays.

T

Number of genes to be clustered.

nChip

Number of ChIP-chip experiments (zero is the default).

nIter

Number of iterations to be generated by the Gibbs sampler.

nreplicates

Number of experimental replicates.

contextSpecific

This parameter specifies whether the context-specific model is to be used. If set to "y", nContexts and contextLengths also need to be specified.

nContexts

Number of contexts to be used in the context-specific clustering. If nContexts>1 is specified, the context-specific model is assumed.

contextLengths

Vector specifying the lengths of the contexts in context-specific clustering. It must satisfy the equality sum(contextLengths) == M.
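
The equality above, together with the column layout described under tableData, determines which columns belong to each context. The helper contextColumns below is a hypothetical illustration and is not part of the package.

```r
# Hypothetical helper: map each context to its first and last column in
# tableData, given that expression data starts in column 3.
contextColumns <- function(contextLengths, M) {
  # The documented constraint on contextLengths.
  stopifnot(sum(contextLengths) == M)
  lastCol  <- cumsum(contextLengths) + 2   # +2 for the annotation columns
  firstCol <- lastCol - contextLengths + 1
  data.frame(context  = seq_along(contextLengths),
             firstCol = firstCol,
             lastCol  = lastCol)
}

# Three contexts of lengths 3, 2 and 4 over M = 9 arrays:
# context 1 covers columns 3-5, context 2 columns 6-7, context 3 columns 8-11.
contextColumns(c(3, 2, 4), M = 9)
```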

estimate_contexts

If "y", the context-estimation GIMM is run. If "n", the fixed-contexts model or simple GIMM is run.

clusterShape

"v" requests the model with different variances for different clusters. "e" assumes equal variances for all clusters.

burnIn

Number of Gibbs sampler iterations to be discarded as "burn-in".
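
As a small worked example with the default values, and assuming the burn-in iterations are counted as part of nIter (the documentation does not state this explicitly), the number of iterations kept after burn-in would be:

```r
nIter  <- 10000   # default from the usage signature
burnIn <- 5000    # default from the usage signature

# Assumption: burn-in is a prefix of the nIter iterations, so it must be
# non-negative and smaller than nIter for any iterations to remain.
stopifnot(burnIn >= 0, burnIn < nIter)

retained <- nIter - burnIn
retained
```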

elipticalWithin

This parameter indicates the structure of the covariance matrix in the model. The covariance matrix defines the correlation of observations from different experimental conditions. The value is "d" for the experimental design model, "c" for the compound symmetry model, and "n" for the unstructured model.

verbose

If TRUE, all the internal comments of the executables will be displayed on the console.

intFiles

If TRUE, the internal files generated by the executables gimm and posthoc will not be deleted.

clientID

The ID that identifies the specific client to which progress feedback is directed. This should be used only if progress feedback needs to be sent over network sockets.

host

The hostname of the computer to which progress feedback should be sent. This should be used only if progress feedback needs to be sent over network sockets.

port

The port number to which progress feedback should be sent. This should be used only if progress feedback needs to be sent over network sockets.

priorFile

This parameter specifies the file name, with path, of the prior information file. Two columns for GeneID and GeneName are essential. The prior information file should also contain either a vector for categorical prior clustering or a pairwise similarity matrix for all genes. The genes contained in the prior information file are required to be the same as in dataFile or tableData. Once the prior information is correctly set, the prior model will be used to estimate the clustering.

Details

This function is designed to execute the processes related to "gimm" only. It calls the function "processGimmParas()", which writes the parameter file "parameters.prm". A system call is then made to the executable "gimm". The user is expected to provide the path of the directory in which these executables reside. This can be done by changing the object ".GimmPath" in the library. The user must provide the values of "dataFile", "M", and "T"; otherwise errors occur.

To complete all the processing for the Bayesian clustering algorithm, a call to this function should be followed by calls to the functions "runPosthoc" and "finalProcess".

If the tableData object is available but dataFile is missing, the default file name will be "Result". This name will be used to deliver the results of the execution of gimm and posthoc via .cdt and .gtr files.

This function returns a list of parameters and data which can be later used in functions "runPosthoc" and "finalProcess".

Note

For additional information visit http://eh3.uc.edu/gimm

Author(s)

Vinayak Kumar, Mario Medvedovic

See Also

runPosthoc


uc-bd2k/gimmR documentation built on May 3, 2019, 2:15 p.m.