DUSTer: A Method for Unraveling Cross-Language Divergences for Statistical Word-Level Alignment
Title | DUSTer: A Method for Unraveling Cross-Language Divergences for Statistical Word-Level Alignment |
Publication Type | Book Chapters |
Year of Publication | 2002 |
Authors | Dorr BJ, Pearl L, Hwa R, Habash N |
Editor | Richardson S |
Book Title | Machine Translation: From Research to Real Users |
Series Title | Lecture Notes in Computer Science |
Volume | 2499 |
Pagination | 31 - 43 |
Publisher | Springer Berlin / Heidelberg |
ISBN Number | 978-3-540-44282-0 |
Abstract | The frequent occurrence of divergences (structural differences between languages) presents a great challenge for statistical word-level alignment. In this paper, we introduce DUSTer, a method for systematically identifying common divergence types and transforming an English sentence structure to bear a closer resemblance to that of another language. Our ultimate goal is to enable more accurate alignment and projection of dependency trees in another language without requiring any training on dependency-tree data in that language. We present an empirical analysis comparing the complexities of performing word-level alignments with and without divergence handling. Our results suggest that our approach facilitates word-level alignment, particularly for sentence pairs containing divergences. |
URL | http://dx.doi.org/10.1007/3-540-45820-4_4 |