similarities: similarities
In rscc: R Source Code Similarity Evaluation by Variable/Function Names

`sims` and `similarities` both calculate a similarity coefficient for each pair of source code objects and return a data frame with the coefficients in descending order. A larger coefficient means greater similarity.
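
To make the idea of a pairwise similarity coefficient concrete, here is a minimal base R sketch that computes the Jaccard coefficient for every pair of name sets and returns a data frame sorted in descending order. The helpers `jaccard` and `pairwise_jaccard`, the sample name sets, and the column names are hypothetical; this illustrates the general technique and is not the code rscc uses internally.

```r
# Hypothetical name sets, standing in for the variable/function names
# extracted from three source code objects.
name_sets <- list(
  a.R = c("x", "y", "mean", "sum"),
  b.R = c("x", "y", "mean", "sd"),
  c.R = c("i", "n", "print")
)

# Jaccard coefficient: size of the intersection divided by size of the union.
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Compute the coefficient for each unordered pair, then sort descending.
pairwise_jaccard <- function(sets) {
  pairs <- combn(names(sets), 2)  # one column per pair of names
  res <- data.frame(
    first  = pairs[1, ],
    second = pairs[2, ],
    coeff  = apply(pairs, 2, function(p) jaccard(sets[[p[1]]], sets[[p[2]]])),
    stringsAsFactors = FALSE
  )
  res <- res[order(res$coeff, decreasing = TRUE), ]
  rownames(res) <- NULL
  res
}

result <- pairwise_jaccard(name_sets)
print(result)
```

In this toy data the pair `a.R`/`b.R` shares three of five distinct names and therefore ranks first, while both pairs involving `c.R` share no names.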

sims(...)

similarities(
  docs,
  all = FALSE,
  coeff = c("jaccard", "braun", "dice", "hamann", "kappa", "kulczynski", "ochiai",
    "phi", "russelrao", "matching", "simpson", "sneath", "tanimoto", "yule")
)

`...`	all arguments passed to `sims` are forwarded to `similarities`
`docs`	document object
`all`	logical: should the similarity coefficients be computed based on all source code objects or just the two being compared (default: `FALSE`)
`coeff`	character: coefficient to compute (default: `"jaccard"`); abbreviations can be used
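
The `coeff` and `all` arguments can be illustrated in base R. The sketch below assumes that abbreviated `coeff` values are resolved by partial matching in the manner of `match.arg`, and that `all` decides which set of names forms the universe when joint absences are counted. Both points are assumptions about the behaviour, not a description of rscc's internals, and the function `binary_sim`, its `universe` parameter, and the data are hypothetical.

```r
# Hypothetical sketch: a binary similarity with a `coeff` choice and a
# `universe` of names (all objects vs. just the two being compared).
binary_sim <- function(a, b, universe = union(a, b),
                       coeff = c("jaccard", "dice", "matching")) {
  coeff <- match.arg(coeff)                      # accepts unique abbreviations
  n11 <- length(intersect(a, b))                 # names in both
  n10 <- length(setdiff(a, b))                   # names only in a
  n01 <- length(setdiff(b, a))                   # names only in b
  n00 <- length(setdiff(universe, union(a, b)))  # names in neither
  switch(coeff,
    jaccard  = n11 / (n11 + n10 + n01),
    dice     = 2 * n11 / (2 * n11 + n10 + n01),
    matching = (n11 + n00) / (n11 + n10 + n01 + n00)
  )
}

a     <- c("x", "y", "mean")
b     <- c("x", "y", "sd")
other <- c("i", "n", "print", "cat")  # names from a third object

jac        <- binary_sim(a, b)               # default: first choice, "jaccard"
dice       <- binary_sim(a, b, coeff = "d")  # "d" resolves to "dice"
match_pair <- binary_sim(a, b, coeff = "m")  # universe: the pair only
match_all  <- binary_sim(a, b, universe = union(union(a, b), other),
                         coeff = "m")        # universe: all objects
print(c(jac, dice, match_pair, match_all))
```

Under these assumptions, the choice of universe only changes coefficients that count joint absences (here simple matching); Jaccard and Dice ignore names absent from both objects.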

A data frame with the similarity coefficients for each pair of source code objects, in descending order.

# example files are taken from https://CRAN.R-project.org/package=SimilaR
files <- list.files(system.file("examples", package="rscc"), "\\.R$", full.names=TRUE)
prgs  <- sourcecode(files, basename=TRUE)
docs  <- documents(prgs)
similarities(docs)
# further steps
# m  <- similarities(docs)
# df <- matrix2dataframe(m)
# head(df, n=20)
# browse(prgs, df, n=5)