Automatic Acquisition of Class-based Rules for Word Alignment
Sur Jin Ker, Jason J. S. Chang
International Workshop, Vol. 1995, pp. 173-183 (11 pages)
UCI I410-ECN-0102-2015-700-001901868

In this paper, we describe an algorithm for aligning words with their translations in a bilingual corpus. Existing algorithms require enormous amounts of bilingual data to train statistical word-to-word translation models. With a word-based approach, frequent words with consistent translations can be aligned at a high precision rate. However, less frequent words and words with diverse translations usually lack statistically significant evidence for confident alignment, which results in incomplete or incorrect alignments. Our algorithm attempts to handle this problem using a hierarchical class-based approximation of translation probabilities. The translation probabilities are estimated using class-based models at three levels of specificity. We found that the algorithm can provide translation probabilities for more word pairs at the cost of a slightly lower degree of precision, even when a small corpus is used in training. We achieved an application rate of 81.8% and a precision rate of 93.3%. The algorithm also offers the advantage of producing word-sense disambiguation information.
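
The idea of estimating translation probabilities from class-based models at several levels of specificity can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three-character class codes (where a longer prefix stands for a more specific class), the `min_count` threshold, the back-off order, and the toy data. The sketch only shows how a word pair never seen in training can still receive a probability from a coarser class level.

```python
from collections import Counter

# Assumption: each word carries a hierarchical class code; a longer prefix
# of the code denotes a more specific class. Three levels, most specific first.
LEVELS = (3, 2, 1)


def train(aligned_pairs, src_class, tgt_class):
    """Count class-pair co-occurrences at each level of specificity."""
    pair_counts = {n: Counter() for n in LEVELS}
    src_counts = {n: Counter() for n in LEVELS}
    for s, t in aligned_pairs:
        for n in LEVELS:
            sc, tc = src_class[s][:n], tgt_class[t][:n]
            pair_counts[n][(sc, tc)] += 1
            src_counts[n][sc] += 1
    return pair_counts, src_counts


def translation_prob(s, t, model, src_class, tgt_class, min_count=2):
    """Return (probability, level), backing off to coarser classes.

    A level is used only if the source class was seen at least
    `min_count` times and the class pair was seen at least once.
    """
    pair_counts, src_counts = model
    if s not in src_class or t not in tgt_class:
        return 0.0, None
    for n in LEVELS:
        sc, tc = src_class[s][:n], tgt_class[t][:n]
        if src_counts[n][sc] >= min_count and pair_counts[n][(sc, tc)] > 0:
            return pair_counts[n][(sc, tc)] / src_counts[n][sc], n
    return 0.0, None


# Toy data (hypothetical words and class codes).
src_class = {"bank": "FIN", "loan": "FIN", "credit": "FIM", "river": "GEO"}
tgt_class = {"T_bank": "MON", "T_loan": "MON", "T_credit": "MOX", "T_river": "WAT"}
aligned = [("bank", "T_bank"), ("loan", "T_loan"), ("river", "T_river")]

model = train(aligned, src_class, tgt_class)
seen = translation_prob("bank", "T_bank", model, src_class, tgt_class)
unseen = translation_prob("credit", "T_credit", model, src_class, tgt_class)
sparse = translation_prob("river", "T_river", model, src_class, tgt_class)
```

In this toy run, the pair ("credit", "T_credit") never occurs in training, yet it receives an estimate from the two-character class level, while ("river", "T_river") gets none because its class evidence stays below the threshold at every level. This mirrors the trade-off stated in the abstract: more word pairs are covered, at some cost in precision.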

[Source: Naver Academic Information]