tfidfdistance: tfidfdistance


Description

Distance measure between genes based on the Gene Ontology.

Usage

tfidfdistance(NCBIs,GeneOntologyPath = GOdataDi('09Originale'), LRNfilename = 'GOAdirekt.lrn',Silent=T)

Arguments

NCBIs

[1:n] NCBI numbers, see ORA documentation

GeneOntologyPath

OPTIONAL, path to the Gene Ontology database, see ORA documentation

LRNfilename

OPTIONAL, filename of the Gene Ontology database, see ORA documentation

Silent

OPTIONAL, if FALSE, the stepwise computations are printed

Details

A distance between the given genes is defined using the inverse document frequency (idf) [Sparck Jones, 1972]. If all n genes are manually annotated in the Gene Ontology, the distance matrix is an n x n matrix; otherwise it is smaller, because only manually annotated genes enter the distance matrix.

Details are given in [Thrun, 2016].
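To illustrate the idea of idf weighting, here is a minimal sketch on a toy gene-by-term matrix. The data, the variable names, and the choice of cosine distance are assumptions for illustration only; the measure actually computed by tfidfdistance is the one described in [Thrun, 2016] and may differ.

```r
# Toy gene-by-term annotation matrix (hypothetical data):
# rows are genes, columns are terms, 1 = gene is annotated to the term.
gene2term <- matrix(
  c(1, 1, 0, 0,
    1, 0, 1, 0,
    1, 0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("term1", "term2", "term3", "term4"))
)

# Inverse document frequency: log(N / n_t), with N genes and
# n_t genes annotated to term t. A term shared by all genes gets weight 0.
n_genes <- nrow(gene2term)
idf <- log(n_genes / colSums(gene2term))

# Weight each annotation by the idf of its term.
weighted <- sweep(gene2term, 2, idf, "*")

# Cosine distance between the weighted gene profiles (an assumption,
# not necessarily the measure used by tfidfdistance).
norms <- sqrt(rowSums(weighted^2))
cosine_sim <- (weighted %*% t(weighted)) / (norms %o% norms)
toy_distance <- 1 - cosine_sim
```

In this toy case term1 is annotated to every gene and therefore contributes nothing, so geneA and geneB, which share only term1, end up at the maximal distance of 1.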

Value

List V with

Distance

[1:m,1:m] distance matrix, only between genes which are manually curated in the Gene Ontology: m <= n

Data

[1:m,1:k] Correct MDS transformation of Distance to Points

Gene2TermP

[1:m,1:l] Genes to GOterms matrix, see ORA documentation

NCBIsFound

[1:m] NCBI numbers found in GeneOntology, these genes are used in the distance matrix

AnnotedInFollowingGOterms

[1:l] GO terms in which the genes were manually annotated
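The Data component is described as an MDS transformation of Distance to points. As a rough sketch of what such a transformation does, the following uses classical MDS via cmdscale from base R on a toy distance matrix. The toy values are hypothetical, and the source does not state which MDS variant or which dimension k tfidfdistance uses.

```r
# Toy symmetric distance matrix for four items (hypothetical values,
# chosen as the corner-to-corner distances of a 3 x 4 rectangle).
toy_distance <- matrix(
  c(0, 3, 4, 5,
    3, 0, 5, 4,
    4, 5, 0, 3,
    5, 4, 3, 0),
  nrow = 4, byrow = TRUE
)

# Classical (metric) MDS from the stats package:
# returns one row of coordinates per item, here in k = 2 dimensions.
points <- cmdscale(toy_distance, k = 2)

# Euclidean distances between the embedded points, for comparison
# with the input distances.
recovered <- as.matrix(dist(points))
```

Because the toy distances are exactly Euclidean in two dimensions, recovered matches toy_distance up to numerical error; for real gene distances the embedding is generally only an approximation.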

Author(s)

Michael Thrun

References

Sparck Jones, K.: A statistical interpretation of term specificity and its application in retrieval, Journal of Documentation, Vol. 28(1), pp. 11-21, 1972.

Thrun, M.C., A System for Projection Based Clustering through Self-Organization and Swarm Intelligence, (Doctoral dissertation), Philipps-Universität Marburg, Germany, 2016.

See Also

ORA package

Examples

# path2=ReDi('ChronificationGenes2016/09Originale')
# name2='SchmerzGene2016Jan535.names'
# NamesV=ReadNAMES(name2,path2)
# NCBI=NamesV$Key
# V=tfidfdistance(NCBI)

Mthrun/Distances documentation built on Feb. 4, 2020, 8:39 p.m.