This function creates cross tables of the source-language words against the target-language words of sentence pairs, to be used either as a gold standard or as the alignment matrix of another software tool. For the gold standard, an expert fills in the created cross table, entering '1' for Sure alignments and '2' for Possible alignments in the cell where a source word and a target word cross. For the alignment results of another tool, the user enters '1' in the cell of each aligned pair of source and target words.
It works with two formats:
In the first format, it constructs a cross table of the source-language words against the target-language words of a given sentence pair. Once the tables have been filled in as described above, sentence by sentence, it builds a list of cross tables and saves that list as "file.align.RData".
In the second format, it creates an Excel file with n sheets. Each sheet contains the cross table of the words of the two languages for one sentence pair. The file is saved as "file.align.xlsx" and is then to be filled in as described above.
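The first format can be pictured with a minimal base R sketch. It is not the package's implementation: the sentence pair, the helper `make_cross_table`, the object name `align_list`, and the choice of source words as rows are all assumptions made for illustration, and the file is written to a temporary directory.

```r
# Hypothetical sentence pair; the words are illustrative only.
source_words <- c("das", "Haus", "ist", "klein")
target_words <- c("the", "house", "is", "small")

# Hypothetical helper: an empty cross table with source words as rows
# and target words as columns (the row/column orientation is an assumption).
make_cross_table <- function(src, trg) {
  matrix(0L, nrow = length(src), ncol = length(trg),
         dimnames = list(src, trg))
}

tab <- make_cross_table(source_words, target_words)

# Fill the table by hand: 1 marks a Sure alignment, 2 a Possible one.
tab["das", "the"]     <- 1L
tab["Haus", "house"]  <- 1L
tab["ist", "is"]      <- 1L
tab["klein", "small"] <- 1L
tab["das", "house"]   <- 2L

# Collect one table per sentence pair in a list and save it as an RData file.
align_list <- list(tab)
out_file <- file.path(tempdir(), "file.align.RData")
save(align_list, file = out_file)
```

Indexing cells by word, as done here, only works when no word is repeated within a sentence; with repeated words, numeric row and column positions would be needed instead.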
Further arguments to be passed to
a character string including two options. For
the output file name.
An RData object saved as "file.align.RData" or an Excel file saved as "file.align.xlsx".
If you do not have the non-ASCII problem, you can set
If you assign
'excel', two notes need to be taken into consideration. First, in order to use the created Excel file with the evaluation function, do not forget to use the excel2rdata function to convert the Excel file into the required R format. Second, there is occasionally a problem with the 'openxlsx' package, which is used in the function; on Windows it might be solved by 'installr::install.rtools()'.
Neda Daneshgar and Majid Sarmad.
Holmqvist M., Ahrenberg L. (2011), "A Gold Standard for English-Swedish Word Alignment.", NODALIDA 2011 Conference Proceedings, 106 - 113.
Och F., Ney H. (2003), "A Systematic Comparison Of Various Statistical Alignment Models.", 2003 Association for Computational Linguistics, J03-1002, 29(1).
## Not run:
cross.table('http://www.um.ac.ir/~sarmad/word.a/euro.bg',
            'http://www.um.ac.ir/~sarmad/word.a/euro.en',
            n = 10, encode.sorc = 'UTF-8')

cross.table('http://www.um.ac.ir/~sarmad/word.a/euro.bg',
            'http://www.um.ac.ir/~sarmad/word.a/euro.en',
            n = 5, encode.sorc = 'UTF-8', out.format = 'excel')
## End(Not run)