Automatic Acquisition of Class-based Rules for Word Alignment

Jason J. S. Chang; Sur Jin Ker

Korean Society for Language and Information (한국언어정보학회), International Workshop: Automatic Acquisition of Class-based Rules for Word Alignment

Sur Jin Ker, Jason J. S. Chang

Korean Society for Language and Information, January 1995

International Workshop, vol. 1995, pp. 173-183 (11 pages)

UCI I410-ECN-0102-2015-700-001901868


Abstract

In this paper, we describe an algorithm for aligning words with their translations in a bilingual corpus. Existing algorithms require enormous amounts of bilingual data to train statistical word-to-word translation models. With a word-based approach, frequent words with consistent translations can be aligned at a high precision rate. However, less frequent words, or words with diverse translations, usually lack statistically significant evidence for confident alignment, and incomplete or incorrect alignments result. Our algorithm attempts to handle this problem using a hierarchical class-based approximation of translation probabilities. The translation probabilities are estimated using class-based models at three levels of specificity. We found that the algorithm can provide translation probabilities for more word pairs at the cost of a slightly lower degree of precision, even when a small corpus is used in training. We have achieved an application rate of 81.8% and a precision rate of 93.3%. The algorithm also offers the advantage of producing word-sense disambiguation information.
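The idea of falling back from specific to more general evidence can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not name the three levels, so the levels used here (word to word, word to class, class to class), the `min_count` threshold, the function names, and the toy class labels are all assumptions made purely for illustration.

```python
from collections import Counter

UNK = "UNK"  # hypothetical label for words that have no class assigned


def build_tables(aligned_pairs, src_class, tgt_class):
    """Count co-occurrences at three assumed levels of specificity."""
    levels = [
        (lambda s: s, lambda t: t),                          # word -> word
        (lambda s: s, lambda t: tgt_class.get(t, UNK)),      # word -> class
        (lambda s: src_class.get(s, UNK),
         lambda t: tgt_class.get(t, UNK)),                   # class -> class
    ]
    tables = []
    for map_src, map_tgt in levels:
        joint, marginal = Counter(), Counter()
        for s, t in aligned_pairs:
            joint[(map_src(s), map_tgt(t))] += 1
            marginal[map_src(s)] += 1
        tables.append((map_src, map_tgt, joint, marginal))
    return tables


def backoff_prob(s, t, tables, min_count=2):
    """Return (level, relative frequency) from the most specific level
    whose pair count reaches min_count; (None, 0.0) if no level does.
    min_count is assumed to be at least 1."""
    for level, (map_src, map_tgt, joint, marginal) in enumerate(tables, start=1):
        key_s, key_t = map_src(s), map_tgt(t)
        if joint[(key_s, key_t)] >= min_count:
            return level, joint[(key_s, key_t)] / marginal[key_s]
    return None, 0.0


# Toy data (invented): romanized target words and made-up class labels.
pairs = [("bank", "yinhang"), ("bank", "yinhang"), ("bank", "hean"),
         ("shore", "hean"), ("coast", "haian")]
src_class = {"bank": "GEO", "shore": "GEO", "coast": "GEO"}
tgt_class = {"yinhang": "FINANCE", "hean": "LAND", "haian": "LAND"}
tables = build_tables(pairs, src_class, tgt_class)
```

In this toy setup, a frequent pair such as ("bank", "yinhang") is scored at the word level, while a pair never observed together, such as ("shore", "haian"), only receives a score at the class level. That mirrors the trade-off described in the abstract: more word pairs get an estimate, at the price of a coarser one.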


[Source: Naver Academic Information]