This function converts transliterated (romanized) Cyrillic text back to the original Cyrillic script.
mdat |
character vector to be back-transliterated to Cyrillic. |
tolanguage |
language the text needs to be converted to ("Russian" by default) |
LAOR |
rules of transliteration from transliterated Cyrillic to original Cyrillic (the rules are listed in the file "transliterationLAOR.csv"). |
OROR |
rules to correct transliterated original Cyrillic (the rules are listed in the file "transliterationOROR.csv"). |
EnglishDetection |
if set to TRUE, the script avoids transliteration of words found in the English vocabulary (file: english.txt). If set to FALSE, only user-defined stop words are used (file: stopwordsfile.csv). |
EnglishLength |
threshold below which words are ignored by EnglishDetection. |
RussianCorrection |
if set to TRUE, the script attempts to match every back-transliterated word with the Russian vocabulary (files: russian.txt and russian_surnames.txt). |
SensitivityThreshold |
used only if RussianCorrection is TRUE. It determines the algorithm's sensitivity to mismatches (values closer to 0 mean higher sensitivity to mismatches). The default is 0.1. |
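The rule-based conversion and the English-word skipping described in the arguments above can be sketched in base R. This is only an illustration under stated assumptions: the tiny `rules` table, the `english_words` vector, and the helper names `back_transliterate_word` and `back_transliterate` are hypothetical and are not part of HooverArchives; the package's actual rules live in "transliterationLAOR.csv" and its internal algorithm may differ.

```r
# Hypothetical rule table (Latin -> Cyrillic), NOT the package's transliterationLAOR.csv.
# Cyrillic letters are written as \u escapes so the code stays ASCII-only.
rules <- data.frame(
  latin    = c("a", "e", "g", "i", "t", "z", "zh", "ia", "shch"),
  cyrillic = c("\u0430", "\u0435", "\u0433", "\u0438", "\u0442",
               "\u0437", "\u0436", "\u044f", "\u0449"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for an English vocabulary or stop-word list.
english_words <- c("the", "weekly")

back_transliterate_word <- function(word, rules) {
  # Apply longer Latin sequences first so that "zh" is not split into "z" + "h".
  rules <- rules[order(-nchar(rules$latin)), ]
  for (i in seq_len(nrow(rules))) {
    word <- gsub(rules$latin[i], rules$cyrillic[i], word, fixed = TRUE)
  }
  word
}

back_transliterate <- function(x, rules, skip = character(0)) {
  vapply(x, function(line) {
    words <- strsplit(line, " ", fixed = TRUE)[[1]]
    # Words found in the skip list are left in Latin script.
    is_english <- tolower(words) %in% skip
    words[!is_english] <- vapply(words[!is_english], back_transliterate_word,
                                 character(1), rules = rules, USE.NAMES = FALSE)
    paste(words, collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

back_transliterate(c("the gazeta", "shchi"), rules, skip = english_words)
# expected on a UTF-8 console: "the газета" "щи"
```

Ordering the rules by length is the key design choice in this sketch: multi-letter sequences such as "shch" or "ia" must be replaced before their single-letter components.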
Returns a character vector containing the input text back-transliterated to Cyrillic.
library(HooverArchives)

# conversion to Russian
dat <- c("Mezhdunarodnaia gazeta. Gl. redaktor: Iu. Zarechkin. Moscow, Russia. Semiweekly. 199?",
         "DEN' UCHITELIA komissiia po obrazovaniiu ob''edineniia Iabloko",
         "III-ii RIM vestnik Rossiiskogo patrioticheskogo dvizheniia. Redaktory: M. Artem'ev, V. Rugich. Moscow, Russia.")
converteddata_ru <- fromLATtoCYR(dat, LAOR=TRUE, OROR=FALSE, EnglishDetection=TRUE)

# conversion to Ukrainian
dat <- read.csv(system.file("Ukraine_microform.csv", package="HooverArchives"),
                sep=",", encoding = "UTF-8", stringsAsFactors = FALSE)
converteddata_uk <- fromLATtoCYR(dat$FIELD.245, tolanguage="Ukrainian")
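To give a feel for what a mismatch threshold such as SensitivityThreshold might control when RussianCorrection is TRUE, here is a minimal base R sketch using `utils::adist` (edit distance). It is an assumption-laden illustration, not the package's implementation: the helper `correct_word`, the two-word `vocab`, and the choice of normalized edit distance as the mismatch measure are all hypothetical.

```r
# Hypothetical helper: replace a word with its closest vocabulary entry, but only
# when the normalized edit distance does not exceed the threshold.
correct_word <- function(word, vocabulary, threshold = 0.1) {
  if (length(vocabulary) == 0) return(word)
  # Edit distance divided by the longer word length gives a value in [0, 1].
  d <- as.vector(utils::adist(word, vocabulary)) /
    pmax(nchar(word), nchar(vocabulary))
  best <- which.min(d)
  if (d[best] <= threshold) vocabulary[best] else word
}

# Hypothetical stand-in for russian.txt: "газета" and "вестник" as \u escapes.
vocab <- c("\u0433\u0430\u0437\u0435\u0442\u0430",
           "\u0432\u0435\u0441\u0442\u043d\u0438\u043a")

# "газэта": one wrong letter out of six, normalized distance 1/6 (about 0.17).
typo <- "\u0433\u0430\u0437\u044d\u0442\u0430"

correct_word(typo, vocab, threshold = 0.1)  # too strict: word is left unchanged
correct_word(typo, vocab, threshold = 0.2)  # looser: corrected to vocab[1]
```

In this sketch a threshold closer to 0 tolerates fewer mismatches, which is consistent with the description of SensitivityThreshold above.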