Constructs a cross-tabulation matrix of source-language words versus target-language words for a given sentence pair. The matrix is then filled in either by an expert (Sure = 1, Possible = 2) or on the basis of an external word alignment tool (3).
tst.set_sorc
the name of the source-language file in the test set.
tst.set_trgt
the name of the target-language file in the test set.
nrec
the number of sentences to be read. If -1, all sentences are considered.
method
one of two values. If "gold", the function shows the message for a gold standard (i.e. "Now, press 'Enter' to continue and edit the matrix to enter Sure/Possible alignments (Sure=1,Possible=2)"). If "aligns", it shows the message for another alignment (i.e. "Now, press 'Enter' to continue and edit the matrix to enter '3' for alignments").
minlen
the minimum sentence length.
maxlen
the maximum sentence length.
ul_s
logical. If TRUE, the first character of each source-language sentence is converted. It can be FALSE when the source language is written right to left.
ul_t
logical. If TRUE, the first character of each target-language sentence is converted. It can be FALSE when the target language is written right to left.
num
an integer. The index of the sentence pair whose cross-tabulation matrix is to be constructed.
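To make the roles of nrec, minlen, maxlen and num concrete, the following is a minimal sketch in base R. The helper `select_pair` is hypothetical and is not the package's code. It assumes that sentence length is counted in whitespace-separated words, that the length limits apply to both languages, and that num indexes the pairs that remain after filtering. The default values shown are placeholders, not the package's defaults.

```r
# Hypothetical helper (not part of the package): illustrates how
# nrec, minlen, maxlen and num could select one sentence pair.
select_pair <- function(src, trg, nrec = -1, minlen = 1, maxlen = 40, num = 1) {
  stopifnot(length(src) == length(trg))
  # nrec = -1 means "use all sentences"; otherwise keep the first nrec pairs
  if (nrec != -1) {
    keep_n <- seq_len(min(nrec, length(src)))
    src <- src[keep_n]
    trg <- trg[keep_n]
  }
  # Split each sentence into words on whitespace (assumed tokenisation)
  src_tok <- strsplit(trimws(src), "\\s+")
  trg_tok <- strsplit(trimws(trg), "\\s+")
  # Keep pairs whose word counts lie in [minlen, maxlen] on both sides
  ok <- lengths(src_tok) >= minlen & lengths(src_tok) <= maxlen &
        lengths(trg_tok) >= minlen & lengths(trg_tok) <= maxlen
  src_tok <- src_tok[ok]
  trg_tok <- trg_tok[ok]
  # num picks one of the remaining pairs (assumed to count after filtering)
  stopifnot(num >= 1, num <= length(src_tok))
  list(source = src_tok[[num]], target = trg_tok[[num]])
}

# Made-up sentences for illustration only
src <- c("hello", "the house is small", "I like tea")
trg <- c("hallo", "das Haus ist klein", "ich mag Tee")
pair <- select_pair(src, trg, nrec = -1, minlen = 2, maxlen = 10, num = 2)
print(pair)
```

Here the one-word pair is dropped by minlen = 2, so num = 2 refers to the second of the two remaining pairs.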
To evaluate word alignment results against a gold standard, an expert fills the matrix constructed by this function with code 1 for Sure alignments or code 2 for Possible alignments. To evaluate against an alignment produced by an external word alignment tool, or by another method, the expert fills the matrix with code 3 instead.
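The coding scheme described above can be sketched in base R as follows. This is only an illustration of the matrix layout and the 1/2 codes, not the package's implementation. The sentence pair is made up, and the choice of rows for source words, columns for target words and 0 for "no alignment" is an assumption.

```r
# Made-up sentence pair for illustration (not from the package's test data)
source_words <- c("the", "house", "is", "small")
target_words <- c("das", "Haus", "ist", "klein")

# Empty cross-tabulation: rows = source words, columns = target words,
# 0 = no alignment (assumed convention)
align_mat <- matrix(0L,
                    nrow = length(source_words),
                    ncol = length(target_words),
                    dimnames = list(source_words, target_words))

# Codes an expert might enter: 1 = Sure, 2 = Possible
align_mat["the", "das"]     <- 1L
align_mat["house", "Haus"]  <- 1L
align_mat["is", "ist"]      <- 1L
align_mat["small", "klein"] <- 2L

# Recover the Sure links as (source, target) word pairs
sure_idx <- which(align_mat == 1L, arr.ind = TRUE)
sure_links <- data.frame(source = rownames(align_mat)[sure_idx[, "row"]],
                         target = colnames(align_mat)[sure_idx[, "col"]])
print(align_mat)
print(sure_links)
```

In an interactive session, base R's `edit()` can open such a matrix in a data editor so that the codes are typed in by hand. Whether the package does this internally is an assumption.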
If non-ASCII characters cause problems, use the consExcel function instead.
Neda Daneshgar and Majid Sarmad.
Holmqvist M., Ahrenberg L. (2011), "A Gold Standard for English-Swedish Word Alignment", NODALIDA 2011 Conference Proceedings, 106-113.
Och F., Ney H. (2003), "A Systematic Comparison of Various Statistical Alignment Models", 2003 Association for Computational Linguistics, J03-1002, 29(1).
consExcel
## Not run:
fix.gold('http://www.um.ac.ir/~sarmad/word.a/source1.txt',
         'http://www.um.ac.ir/~sarmad/word.a/target1.txt',
         nrec = 5, num = 3)
## End(Not run)