Description Usage Arguments Details Value Note Author(s) References See Also Examples

View source: R/geneSimilarity.R

Calculate the pairwise functional similarities for a list of genes using different strategies.

1 | ```
getGeneSim(genelist1, genelist2=NULL, similarity="funSimMax", similarityTerm="relevance", normalization=TRUE, method="sqrt", avg=(similarity=="OA"), verbose=FALSE)
``` |

`genelist1` |
character vector of primary gene IDs according to organism annotation package (see |

`genelist2` |
optional other character vector of primary gene IDs to compare against |

`similarity` |
method to calculate the functional similarity between gene products |

`similarityTerm` |
method to compute the similarity of GO terms |

`normalization` |
normalize similarities yes/no |

`method` |
"sqrt": normalize sim(x,y) <- sim(x,y)/sqrt(sim(x,x)*sim(y,y)); "Lin": normalize sim(x,y) <- 2*sim(x,y)/(sim(x,x) + sim(y,y)); "Tanimoto": normalize sim(x,y) <- sim(x,y)/(sim(x,x) + sim(y,y) - sim(x,y)). NOTE: normalization does not have any effect, if term similarity is NOT "relevance" and similarity = "funSimMax", "funSimAvg" or similarity = "OA" and avg=TRUE |

`avg` |
standardize the OA kernel by the maximum number of GO terms for both genes |

`verbose` |
print out some information |

The method to calculate the pairwise functional similarity between gene products can either be:

- "max"
the maximum similarity between any two GO terms

- "mean"
the average similarity between any two GO terms

- funSimMax
the average of best matching GO term similarities. Take the maximum of the scores achieved by assignments of GO terms from gene 1 to gene 2 and vice versa. [2]

- funSimAvg
the average of best matching GO term similarities. Take the average of the scores achieved by assignments of GO terms from gene 1 to gene 2 and vice versa. [2]

- "OA"
the optimal assignment (maximally weighted bipartite matching) of GO terms associated to the gene having fewer annotation to the GO terms of the other gene. [1]

- "hausdorff"
Hausdorff distance between two sets: Let X and Y be the two sets of GO terms associated to two genes. Then

*dist(X,Y) = \max\{\sup_{t \in X} \inf_{t' \in Y} d(t,t'), \sup_{t' \in Y} \inf_{t \in X} d(t,t')*[3]. Since GOSim 1.2.8 we translate the Haussdorff distance into a similarity measure by taking*sim(X,Y) = \exp(-dist(X,Y)*.- "dot"
the dot product between feature vectors describing the absence/presence of each GO term. The absence of each GO term is weighted by its information content. Depending on the type of later normalization one can arrive at the cosine similarity (method="sqrt") or at the Tanimoto coefficient (method="Tanimoto").[4]

n x n similarity matrix (n = number of genes)

The result depends on the currently set ontology.

Holger Froehlich

[1] H. Froehlich, N. Speer, C. Spieth, A. Zell, Kernel Based Functional Gene Grouping, Proc. Int. Joint Conf. on Neural Networks (IJCNN), 6886 - 6891, 2006.

[2] A. Schlicker, F. Domingues, J. Rahnenfuehrer, T. Lengauer, A new measure for functional similarity of gene products based on Gene Ontology, BMC Bioinformatics, 7, 302, 2006.

[3] A. del Pozo, F. Pazos, A. Valencia, Defining functional distances over Gene Ontology, BMC Bioinformatics, 9:50, 2008.

[4] M. Mistry, P Pavlidis, Gene Ontology term overlap as a measure of gene functional similarity, BMC Bioinformatics, 9:327, 2008.

`getGeneSimPrototypes`

, `getTermSim`

, `setOntology`

1 | ```
# see evaluateClustering
``` |

```
Loading required package: GO.db
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, sd, var, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colMeans, colSums, colnames, do.call,
duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
lapply, lengths, mapply, match, mget, order, paste, pmax, pmax.int,
pmin, pmin.int, rank, rbind, rowMeans, rowSums, rownames, sapply,
setdiff, sort, table, tapply, union, unique, unsplit, which,
which.max, which.min
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: IRanges
Loading required package: S4Vectors
Attaching package: 'S4Vectors'
The following object is masked from 'package:base':
expand.grid
Loading required package: annotate
Loading required package: XML
```

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.