An Implementation of a Multilingual Regular Expression Segmentor for Ordinary and Morphologically Rich Lexical Tokens

Paul Wu Hrng Jyh; Kevin Cheong

Korean Society for Language and Information, January 1995

International Workshop, Vol. 1995, pp. 267-271 (5 pages)

UCI I410-ECN-0102-2015-700-001894695

Abstract

Lexical pattern matching and text extraction are essential components of many Natural Language Processing applications. Following the language hierarchy first conceived by Chomsky, it is commonly accepted that simple phrasal patterns should be categorised under the class of Regular Language (RL). There are three operations in RL (Union, Concatenation and Kleene Closure), which are applied to a finite lexicon. The machinery that recognises RL is the Finite State Machine (FSM). This paper postulates that the degree to which a class of patterns exercises the RL operators is directly proportional to the morphological richness of the lexical tokens.
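
The three RL operations named in the abstract can be illustrated with a minimal sketch. This is not the authors' segmentor: the lexicon entries (`STEMS`, `SUFFIXES`) and the helper names (`union`, `concat`, `star`, `segment`) are hypothetical, chosen only for illustration. Python's `re` module is a backtracking matcher rather than a strict FSM, but it accepts the same language for the pure regular pattern built here.

```python
import re

# Hypothetical finite lexicon; these entries are illustrative, not from the paper.
STEMS = ["walk", "talk", "play"]
SUFFIXES = ["ed", "ing", "s"]


def union(items):
    """Union: match any one of the alternatives."""
    return "(?:" + "|".join(re.escape(item) for item in items) + ")"


def concat(*parts):
    """Concatenation: match the sub-patterns one after another."""
    return "".join(parts)


def star(pattern):
    """Kleene closure: match zero or more repetitions of the sub-pattern."""
    return "(?:" + pattern + ")*"


# Assumed token shape: one stem followed by zero or more suffixes.
TOKEN = re.compile(concat(union(STEMS), star(union(SUFFIXES))))


def segment(text):
    """Return every non-overlapping lexicon token found in text, left to right."""
    return [match.group(0) for match in TOKEN.finditer(text)]
```

In this sketch, a token with no suffix uses only Union, while a suffixed token such as "walked" also uses Concatenation and Kleene Closure, which is one way to read the abstract's claim that morphologically richer tokens exercise more of the RL operators.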
