test: A Test Set of Cluster Fill Rates

test    R Documentation

A Test Set of Cluster Fill Rates

Description

Writers from the CSAFE Handwriting Database and the CVL Handwriting Database were randomly assigned to train, validation, and test sets. This dataframe contains the cluster fill rates for the documents in the test set.

Usage

test

Format

A dataframe with 332 rows and 43 variables:

docname

The file name of the handwriting sample.

writer

Writer ID. There are 83 distinct writer IDs. Each writer has four documents in the dataframe.

doc

The name of the handwriting prompt.

total_graphs

The total number of graphs in the document.

cluster1

The proportion of graphs in cluster 1.

cluster2

The proportion of graphs in cluster 2.

cluster3

The proportion of graphs in cluster 3.

cluster4

The proportion of graphs in cluster 4.

cluster5

The proportion of graphs in cluster 5.

cluster6

The proportion of graphs in cluster 6.

cluster7

The proportion of graphs in cluster 7.

cluster8

The proportion of graphs in cluster 8.

cluster9

The proportion of graphs in cluster 9.

cluster10

The proportion of graphs in cluster 10.

cluster11

The proportion of graphs in cluster 11.

cluster12

The proportion of graphs in cluster 12.

cluster13

The proportion of graphs in cluster 13.

cluster14

The proportion of graphs in cluster 14.

cluster15

The proportion of graphs in cluster 15.

cluster16

The proportion of graphs in cluster 16.

cluster17

The proportion of graphs in cluster 17.

cluster18

The proportion of graphs in cluster 18.

cluster19

The proportion of graphs in cluster 19.

cluster20

The proportion of graphs in cluster 20.

cluster21

The proportion of graphs in cluster 21.

cluster22

The proportion of graphs in cluster 22.

cluster23

The proportion of graphs in cluster 23.

cluster24

The proportion of graphs in cluster 24.

cluster25

The proportion of graphs in cluster 25.

cluster26

The proportion of graphs in cluster 26.

cluster27

The proportion of graphs in cluster 27.

cluster28

The proportion of graphs in cluster 28.

cluster29

The proportion of graphs in cluster 29.

cluster30

The proportion of graphs in cluster 30.

cluster31

The proportion of graphs in cluster 31.

cluster32

The proportion of graphs in cluster 32.

cluster33

The proportion of graphs in cluster 33.

cluster34

The proportion of graphs in cluster 34.

cluster35

The proportion of graphs in cluster 35.

cluster36

The proportion of graphs in cluster 36.

cluster37

The proportion of graphs in cluster 37.

cluster38

The proportion of graphs in cluster 38.

cluster39

The proportion of graphs in cluster 39.

cluster40

The proportion of graphs in cluster 40.
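
The relationship between total_graphs and the cluster columns can be illustrated with a small sketch. It uses a hypothetical two-row dataframe in the documented layout, with three cluster columns instead of 40. The values and the names docs, rate_sums, and approx_counts are made up for illustration. It also assumes that every graph belongs to exactly one cluster, so each row's proportions sum to 1.

```r
# Hypothetical rows in the documented layout (3 cluster columns instead of 40).
docs <- data.frame(
  docname = c("doc_a", "doc_b"),
  writer = c("w1", "w2"),
  doc = c("promptA", "promptB"),
  total_graphs = c(200, 50),
  cluster1 = c(0.25, 0.10),
  cluster2 = c(0.50, 0.60),
  cluster3 = c(0.25, 0.30)
)

# Select the cluster columns by name.
cluster_cols <- grep("^cluster[0-9]+$", names(docs), value = TRUE)

# Under the one-cluster-per-graph assumption, each row sums to 1.
rate_sums <- rowSums(docs[, cluster_cols])

# Multiplying each proportion by total_graphs recovers approximate counts.
approx_counts <- round(docs[, cluster_cols] * docs$total_graphs)
```

With the package installed, the same column selection should apply to the test dataframe itself.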

Details

The test dataframe contains cluster fill rates for 332 handwritten documents from the CSAFE Handwriting Database and the CVL Handwriting Database. The documents are from 83 writers. The CSAFE Handwriting Database has nine repetitions of each prompt. Two London Letter prompts and two Wizard of Oz prompts were randomly selected from each writer. The CVL Handwriting Database does not contain multiple repetitions of prompts, so four English-language prompts were randomly selected from each writer.
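
The layout described above (83 writers with four documents each, giving 332 rows) can be checked with a few lines of base R. The sketch below is a minimal example that runs on a hypothetical stand-in dataframe with two writers. The helper check_layout and the toy values are illustrative and are not part of handwriterRF.

```r
# Hypothetical stand-in: 2 writers x 4 documents each.
# With handwriterRF installed, the real data would be handwriterRF::test.
toy <- data.frame(
  docname = paste0("w", rep(1:2, each = 4), "_doc", rep(1:4, times = 2)),
  writer = rep(c("w1", "w2"), each = 4),
  doc = rep(c("promptA", "promptA", "promptB", "promptB"), times = 2)
)

# Illustrative helper: summarize the number of documents and writers.
check_layout <- function(df) {
  list(
    n_docs = nrow(df),
    n_writers = length(unique(df$writer)),
    docs_per_writer = as.vector(table(df$writer))
  )
}

layout <- check_layout(toy)
```

For the test dataframe as documented, the same helper would be expected to report 332 documents, 83 writers, and four documents for every writer.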

The documents were split into graphs with process_batch_dir. The graphs were grouped into clusters with get_clusters_batch. The cluster fill counts were calculated with get_cluster_fill_counts. Finally, get_cluster_fill_rates calculated the cluster fill rates.
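
The last step of this pipeline, turning fill counts into fill rates, can be sketched in base R. This is a minimal illustration and not the package's implementation. The helper fill_rates_from_counts and the toy counts are hypothetical, and the sketch assumes that a fill rate is a cluster's count divided by the document's total number of graphs, as the column descriptions suggest.

```r
# Hypothetical fill counts for two documents and three clusters.
counts <- data.frame(
  docname = c("doc_a", "doc_b"),
  cluster1 = c(5, 0),
  cluster2 = c(3, 6),
  cluster3 = c(2, 4)
)

# Illustrative helper, not the handwriterRF implementation.
fill_rates_from_counts <- function(counts) {
  cluster_cols <- grep("^cluster[0-9]+$", names(counts), value = TRUE)
  cluster_counts <- as.matrix(counts[, cluster_cols, drop = FALSE])
  # Total graphs per document is the sum of its cluster counts.
  total_graphs <- rowSums(cluster_counts)
  # Each row is divided by its own total (one total per row).
  rates <- cluster_counts / total_graphs
  data.frame(
    docname = counts$docname,
    total_graphs = unname(total_graphs),
    rates,
    stringsAsFactors = FALSE
  )
}

rates <- fill_rates_from_counts(counts)
```

A document with no graphs would produce NaN rates in this sketch, so real code would need to handle that case explicitly.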

Source

https://forensicstats.org/handwritingdatabase/, https://cvl.tuwien.ac.at/research/cvl-databases/an-off-line-database-for-writer-retrieval-writer-identification-and-word-spotting/


handwriterRF documentation built on April 4, 2025, 5:38 a.m.