Creates an Excel file from the sentence pairs of two languages to help the user construct a gold standard, in which each cell can be set to 1 for a Sure alignment or 2 for a Possible alignment.
tst.set_sorc |
the name of source language file in test set. |
tst.set_trgt |
the name of target language file in test set. |
method |
it takes one of two values. If "gold", it creates a separate Excel file for the test set whose sheets are to be filled with 1 or 2 for Sure or Possible alignments. If "aligns", it creates a separate Excel file for the test set whose sheets are to be filled with 3 to mark an alignment. |
out1 |
the name of the excel file for gold standard. |
out2 |
the name of the excel file for alignment. |
nrec |
the number of sentences to be read. If -1, all sentences are considered. |
minlen |
the minimum length of sentences. |
maxlen |
the maximum length of sentences. |
ul_s |
logical. If TRUE, it will convert the first character of the source language's sentences. When the source language is written right-to-left, it can be FALSE. |
ul_t |
logical. If TRUE, it will convert the first character of the target language's sentences. When the target language is written right-to-left, it can be FALSE. |
The first step in evaluating word alignment quality is creating a gold standard. This function creates an Excel file with nrec sheets, one per sentence pair of a test set consisting of source and target language sentences. Each sheet has the words of the target sentence as its rows and the words of the source sentence as its columns. To create a gold standard, the cells are filled with Sure/Possible alignments (Sure = 1, Possible = 2).
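As a rough illustration of the sheet layout described above (not the package's own code), the following base R sketch builds one such grid for a made-up sentence pair. The sentences, the object names, and the cells that are filled in are all assumptions chosen for the example.

```r
# Hypothetical sentence pair (not taken from any real test set)
src <- c("the", "house", "is", "small")   # source sentence words -> columns
trg <- c("das", "Haus", "ist", "klein")   # target sentence words -> rows

# Empty grid: one row per target word, one column per source word
sheet <- matrix(NA_integer_, nrow = length(trg), ncol = length(src),
                dimnames = list(trg, src))

# An annotator would then mark Sure links with 1 and Possible links with 2
sheet["das", "the"]     <- 1L
sheet["Haus", "house"]  <- 1L
sheet["ist", "is"]      <- 1L
sheet["klein", "small"] <- 2L  # marked Possible purely for illustration

print(sheet)
```

Cells left empty (NA here) stand for word pairs that are not aligned.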
Sometimes the user computes word alignments with other software, or with another method, and wants to evaluate them with this package. For that case, this function creates a separate Excel file named by "out2" (by default "align.xlsx"), whose cells can be filled with the number 3 to mark alignments.
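To make the roles of the values 1, 2 and 3 concrete, here is a minimal base R sketch that compares a hypothetical alignment grid (cells marked 3) with a hypothetical gold grid (cells marked 1 or 2), using the precision, recall and alignment error rate definitions of Och and Ney (2003). All data and object names are invented for the example, and this is not necessarily how the package itself performs the evaluation.

```r
# Hypothetical sentence pair (illustrative data only)
words_src <- c("the", "house", "is", "small")
words_trg <- c("das", "Haus", "ist", "klein")

# Gold grid: 1 = Sure, 2 = Possible, 0 = no link
gold <- matrix(0L, 4, 4, dimnames = list(words_trg, words_src))
gold[cbind(1:4, 1:4)] <- c(1L, 1L, 1L, 2L)

# Alignment grid from some other tool: 3 = proposed link, 0 = no link
align <- matrix(0L, 4, 4, dimnames = list(words_trg, words_src))
align[cbind(c(1, 2, 4), c(1, 2, 4))] <- 3L

A <- align == 3L               # proposed links
S <- gold == 1L                # Sure links
P <- gold == 1L | gold == 2L   # Possible links (Sure links count as Possible)

precision <- sum(A & P) / sum(A)
recall    <- sum(A & S) / sum(S)
aer       <- 1 - (sum(A & S) + sum(A & P)) / (sum(A) + sum(S))

print(c(precision = precision, recall = recall, AER = aer))
```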
One or two Excel files, written to "out1" and/or "out2".
If you do not have non-ASCII problems, you can use the fix.gold function instead.
Neda Daneshgar and Majid Sarmad.
Holmqvist M., Ahrenberg L. (2011), "A Gold Standard for English-Swedish Word Alignment.", NODALIDA 2011 Conference Proceedings, 106-113.
Och F., Ney H. (2003), "A Systematic Comparison Of Various Statistical Alignment Models.", 2003 Association for Computational Linguistics, J03-1002, 29(1).
fix.gold