GenePairs: GenePairs
In KhiabanianLab/TuBA: Performs Biclustering On Large Gene Expression Data Sets


View source: R/TuBA.R

This function finds gene-pairs that share more than a specified proportion of samples between their percentile sets.

GenePairs(File, PercSetSize, JcdInd, highORlow, SampleFilter = NULL)

`File`	A character variable. Specifies the name of the .csv or .txt file that contains the normalized counts.
`PercSetSize`	A numeric variable. Specifies the percentage of samples that should be in the percentile sets (strictly greater than 0 and less than 40).
`JcdInd`	A numeric variable. Specifies the minimum Jaccard Index for the overlap between the percentile sets of a given gene-pair.
`highORlow`	A character variable. Specifies whether the percentile sets correspond to the highest expression samples ("h") or the lowest expression samples ("l").
`SampleFilter`	A logical variable. If TRUE, filters out samples that are over-represented in percentile sets. Default is NULL, which is treated as FALSE (no filtering).
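To make the roles of `PercSetSize`, `JcdInd`, and `highORlow` concrete, here is a minimal sketch of a percentile set and a Jaccard index for a single gene-pair. It is not the package's implementation: the helpers `percentile_set` and `jaccard_index`, the toy expression values, and the use of `ceiling` to size the set are all assumptions made for illustration.

```r
# Toy expression values for two hypothetical genes across 10 samples.
expr_a <- c(9, 1, 8, 2, 7, 3, 6, 4, 5, 10)
expr_b <- c(10, 2, 1, 3, 9, 4, 5, 6, 7, 8)

# Hypothetical helper: indices of the samples with the highest ("h") or
# lowest ("l") expression. Rounding the set size up is an assumption.
percentile_set <- function(x, perc_set_size, high_or_low = "h") {
  k <- ceiling(length(x) * perc_set_size / 100)
  ord <- order(x, decreasing = (high_or_low == "h"))
  ord[seq_len(k)]
}

# Jaccard index: size of the intersection divided by size of the union.
jaccard_index <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# A 30% percentile set on 10 samples holds 3 samples per gene.
set_a <- percentile_set(expr_a, perc_set_size = 30, high_or_low = "h")
set_b <- percentile_set(expr_b, perc_set_size = 30, high_or_low = "h")

jac <- jaccard_index(set_a, set_b)
print(jac)
```

In this toy case the two sets share 2 of 4 distinct samples, so a threshold such as `JcdInd = 0.2` would retain the pair.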

A data frame containing the gene-pairs whose Jaccard indices exceed the specified threshold (JcdInd). To save space, genes are identified by their serial numbers in the input gene expression file rather than by gene symbols. The function also writes two .csv files to the working directory: one contains the gene-pairs and their Jaccard indices, and the other contains the binary matrix (genes along rows, samples along columns) in which a 1 indicates that a sample belongs to the percentile set of that gene. Both files are required as inputs for the Biclustering function.
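The shape of these outputs can be sketched on a toy matrix. The code below is an illustration under stated assumptions, not the package's code: the gene and sample names, the fixed set size `k`, and the column names `Gene1`, `Gene2`, and `Jaccard` are hypothetical, and nothing is written to disk.

```r
# Toy expression matrix: genes along rows, samples along columns.
expr <- rbind(
  geneA = c(9, 1, 8, 2, 7, 3, 6, 4, 5, 10),
  geneB = c(10, 2, 1, 3, 9, 4, 5, 6, 7, 8),
  geneC = c(1, 10, 2, 9, 3, 8, 4, 7, 5, 6)
)
colnames(expr) <- paste0("S", 1:10)

# Binary matrix: 1 if the sample is among the k highest for that gene.
k <- 3
binary <- t(apply(expr, 1, function(x) {
  as.integer(rank(-x, ties.method = "first") <= k)
}))
colnames(binary) <- colnames(expr)

# Pairwise Jaccard indices from intersection and union sizes.
inter <- binary %*% t(binary)
sizes <- rowSums(binary)
jac <- inter / (outer(sizes, sizes, "+") - inter)

# Keep each pair once (upper triangle) and report serial numbers, not names.
idx <- which(upper.tri(jac) & jac > 0.2, arr.ind = TRUE)
pairs <- data.frame(
  Gene1 = unname(idx[, "row"]),
  Gene2 = unname(idx[, "col"]),
  Jaccard = jac[idx],
  row.names = NULL
)
print(pairs)
```

Here only the pair made of rows 1 and 2 passes the 0.2 threshold, and it is reported by row position rather than by gene name, mirroring the serial-number convention described above.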

## Not run: 
# For high expression
GenePairs(File = "RPGenes.csv", PercSetSize = 5, JcdInd = 0.2, highORlow = "h")
# For low expression
GenePairs(File = "RPGenes.csv", PercSetSize = 5, JcdInd = 0.2, highORlow = "l")

## End(Not run)