pooled_similarity | R Documentation |
Precomputed kanji distances
pooled_similarity
A tibble containing kanji similarity judgments by three "native or native-like" speakers of Japanese. Each row records one comparison of a pivot kanji with a list of potential distractors. From the distractors, the participant selected the one character they found particularly easy to confuse with the pivot. For the exact methodology, see the original study referenced below.
Datasets from https://lars.yencken.org/datasets, made available under the Creative Commons Attribution 3.0 Unported licence.
Collected as part of Yencken, Lars (2010) Orthographic support for passing the reading hurdle in Japanese. PhD Thesis, University of Melbourne, Melbourne, Australia.
Yencken, Lars, & Baldwin, Timothy (2008). Measuring and predicting orthographic associations: Modelling the similarity of Japanese kanji. In: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pp. 1041-1048.
# Get the pivot kanji for which \u5927 was selected as easily confused.
pooled_similarity[pooled_similarity$selected == "\u5927", ]$pivot
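The same kind of lookup can be tried without the package installed. The following is a minimal sketch on a small hypothetical data frame: the object toy and its four rows are invented for illustration and are not taken from the dataset. Only the column names pivot and selected come from the example above; the real pooled_similarity may contain further columns.

```r
# Hypothetical toy data (NOT from the real dataset), using only the two
# columns that appear in the example above.
toy <- data.frame(
  pivot    = c("\u592a", "\u72ac", "\u592a", "\u6728"),
  selected = c("\u5927", "\u5927", "\u72ac", "\u672c"),
  stringsAsFactors = FALSE
)

# Pivots for which \u5927 was selected from the distractors.
confused_with_dai <- toy$pivot[toy$selected == "\u5927"]

# The reverse lookup: characters selected when \u592a was the pivot.
selected_for_tai <- toy$selected[toy$pivot == "\u592a"]

# Count how often each pivot occurs among the matches, most frequent first.
pivot_counts <- sort(table(confused_with_dai), decreasing = TRUE)
```

With the real data, replacing toy by pooled_similarity should give the corresponding results, assuming the two columns are character vectors as in the example above.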