Phraseological Analysis of Learner Corpus Based on Language Model

( Sanghoun Song )

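- Publisher : Korean Society for Language and Information (한국언어정보학회)

- Year of publication : 2018

- Journal : Language and Information (언어와 정보), Vol. 22, No. 1

- Pages : pp.124-153 ( 30 pages in total )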


Abstract
The present study addresses how closely English expressions produced by native speakers of Korean resemble the common expressions used by native speakers of English. To this end, this article provides a quantitative study of the Yonsei English Learner Corpus using a skill set derived from computational linguistics. The focus of the current work is on a language model of English texts written by Korean university students. Here, a language model refers to a collection of N-grams with logarithmic probabilities described in the ARPA format, and this model serves to discriminate native-like sentences from awkward sentences. The present study compares a language model acquired from an L2 corpus to language models acquired from two L1 English corpora, namely English Gigaword and Europarl. The study uses Genia Sentence Splitter to separate the sentences and SRILM to create the language models in a computationally tractable way. The analysis has two parts. The first part is a deep analysis of N-grams, which consists of two subtasks. First, the N-grams are tallied and evaluated using common metrics of computational linguistics. Second, as an evaluation of the language model, the perplexity of each language model is measured and compared to a reference point drawn from five test data sources. The second part is a linear regression analysis that detects patterns of overused and underused expressions in English texts written by Korean speakers.
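
To make the perplexity comparison in the abstract concrete, the following is a minimal sketch and not the paper's pipeline. The study builds its models with SRILM; this toy example instead assumes a bigram model with add-one smoothing, and the function names (`train_bigram`, `log10_prob`, `perplexity`) and the tiny training sentences are hypothetical. It stores base-10 log probabilities, as the ARPA format does, and shows that a sentence resembling the training data receives a lower perplexity than a scrambled one.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # sentence boundary markers


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = [BOS] + tokens + [EOS]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def log10_prob(prev, word, unigrams, bigrams, vocab_size):
    """Add-one smoothed log10 P(word | prev) (an assumption; SRILM offers other smoothing)."""
    return math.log10((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))


def perplexity(sentences, unigrams, bigrams):
    """Perplexity = 10 ** (-average log10 probability per predicted token)."""
    # Predictable types: every seen type except <s>, plus one slot for unseen words.
    vocab_size = len(set(unigrams) - {BOS}) + 1
    total_logprob, n_events = 0.0, 0
    for tokens in sentences:
        padded = [BOS] + tokens + [EOS]
        for prev, word in zip(padded, padded[1:]):
            total_logprob += log10_prob(prev, word, unigrams, bigrams, vocab_size)
            n_events += 1
    return 10 ** (-total_logprob / n_events)


# Hypothetical toy training data standing in for a reference corpus.
train = [["i", "like", "tea"], ["i", "like", "coffee"], ["you", "like", "tea"]]
uni, bi = train_bigram(train)

native_like = perplexity([["i", "like", "tea"]], uni, bi)
awkward = perplexity([["tea", "like", "i"]], uni, bi)
print(native_like < awkward)
```

In the same spirit, a model trained on an L1 corpus can be scored against learner sentences, and lower perplexity indicates that the sentences are closer to what the model has seen.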

Paper information
  • - Subject : Language and Literature > Linguistics
  • - UCI(KEPA) : I410-ECN-0102-2018-700-004278582
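Journal information
  • - Subject : Language and Literature > Linguistics
  • - Type : Academic journal
  • - Frequency : Semiannual
  • - Domestic indexing : KCI-indexed
  • - International indexing : -
  • - ISSN : 1226-7430
  • - Coverage : 1997–2022
  • - Number of articles : 328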