ref_scores | R Documentation |
A list containing two dataframes. The same_writer dataframe contains similarity scores from same-writer pairs. The diff_writer dataframe contains similarity scores from different-writer pairs. The similarity scores are calculated from the validation dataframe with the following steps:
The absolute and Euclidean distances are calculated between pairs of writer profiles.
random_forest uses the distances between the pair to predict the class of the pair as same writer or different writer.
The proportion of decision trees that predict same writer is used as the similarity score.
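The steps above can be sketched in base R. This is only an illustration, not the package's implementation: the profile values, the tree votes, and the names profile1, profile2, and tree_votes are made up, and it assumes the absolute distance is taken element-wise while the Euclidean distance is a single value per pair.

```r
# Two hypothetical writer profiles (made-up values, not package data).
profile1 <- c(0.10, 0.25, 0.40, 0.25)
profile2 <- c(0.15, 0.20, 0.35, 0.30)

# Assumption: absolute distance is element-wise, Euclidean distance is one value.
abs_dist <- abs(profile1 - profile2)
euc_dist <- sqrt(sum((profile1 - profile2)^2))

# Hypothetical class predictions from 10 decision trees for this pair.
tree_votes <- c("same", "same", "different", "same", "same",
                "same", "different", "same", "same", "same")

# Similarity score: proportion of trees that predict "same".
score <- mean(tree_votes == "same")
```

A score close to 1 means most trees voted same writer, and a score close to 0 means most voted different writer.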
ref_scores
A list with the following components:
same_writer: A dataframe of 1,800 same-writer similarity scores. The columns docname1 and writer1 record the file name and the writer ID of the first handwriting sample. The columns docname2 and writer2 record the file name and writer ID of the second handwriting sample. The match column records the class of each pair of handwriting samples, which is same for every row. The score column holds the similarity score for each pair.
diff_writer: A dataframe of 717,600 different-writer similarity scores. The columns docname1 and writer1 record the file name and the writer ID of the first handwriting sample. The columns docname2 and writer2 record the file name and writer ID of the second handwriting sample. The match column records the class of each pair of handwriting samples, which is different for every row. The score column holds the similarity score for each pair.
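Because both dataframes share the same columns, they can be stacked and summarized by the match column. The sketch below uses a small stand-in list, toy_scores, with made-up file names, writer IDs, and scores. It only mirrors the documented column layout and does not reproduce any values from ref_scores.

```r
# Hypothetical stand-in with the same column layout as ref_scores.
toy_scores <- list(
  same_writer = data.frame(
    docname1 = c("w0001_s01.png", "w0002_s01.png"),
    writer1  = c("w0001", "w0002"),
    docname2 = c("w0001_s02.png", "w0002_s02.png"),
    writer2  = c("w0001", "w0002"),
    match    = "same",
    score    = c(0.90, 0.80)
  ),
  diff_writer = data.frame(
    docname1 = c("w0001_s01.png", "w0001_s01.png"),
    writer1  = c("w0001", "w0001"),
    docname2 = c("w0002_s01.png", "w0003_s01.png"),
    writer2  = c("w0002", "w0003"),
    match    = "different",
    score    = c(0.10, 0.30)
  )
)

# Stack the two dataframes and compare mean scores by class.
all_pairs <- rbind(toy_scores$same_writer, toy_scores$diff_writer)
mean_by_class <- tapply(all_pairs$score, all_pairs$match, mean)
```

The same rbind and tapply calls should apply to ref_scores itself once the package is loaded, assuming the columns are as documented above.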
summary(ref_scores$same_writer)
summary(ref_scores$diff_writer)
plot_scores(ref_scores)