The purpose of the present study is to assess second language (L2) spoken English using automated scoring techniques. Automated scoring aims to classify a large set of learners' oral performance data into a small number of discrete oral proficiency levels. In automated scoring, objectively measurable features such as the frequencies of lexical and grammatical items generally serve as explanatory variables for predicting oral proficiency levels, which in turn serve as the criterion variable. In this study, we chose the NICT JLE Corpus, a corpus of 1,281 Japanese EFL learners' speech productions coded into nine oral proficiency levels (Izumi, Uchimoto, & Isahara, 2004). The nine oral proficiency levels were used as the criterion variable, and the linguistic features analyzed in Biber (1988) as explanatory variables. We employed random forests (Breiman, 2001), a powerful method for text classification and feature extraction, to predict oral proficiency. Using the out-of-bag error estimate of random forests, 60.11% of the productions were correctly classified. Compared with the baseline accuracy of the simplest possible algorithm, always choosing the most frequent level (37.63%), our random forests model improved prediction by 22.48 percentage points. The Pearson product-moment correlation coefficient with human scoring was 0.85. The predictors that most clearly discriminated the oral proficiency levels were, in order of strength, tokens, types, and the frequency of nouns.
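The classification setup described above can be sketched as follows. This is a minimal illustration only, assuming scikit-learn's `RandomForestClassifier`: the data here are synthetic placeholders, not the NICT JLE Corpus, and the feature set is not Biber's (1988) inventory. It shows the two quantities the abstract compares: a majority-class baseline accuracy and the out-of-bag (OOB) accuracy estimate of a random forest.

```python
# Hedged sketch of the scoring pipeline: random forest classification with
# an out-of-bag accuracy estimate, compared against a majority-class
# baseline. All data below are synthetic stand-ins for the learner corpus.
from collections import Counter

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in: 300 "learners", 5 numeric features (think tokens,
# types, noun frequency, ...), mapped onto 9 proficiency levels.
n_samples, n_levels = 300, 9
X = rng.normal(size=(n_samples, 5))
# Make the level loosely depend on the first feature so there is signal.
y = np.clip((X[:, 0] * 2 + 4).astype(int), 0, n_levels - 1)

# Baseline: always predict the most frequent level.
_, majority_count = Counter(y).most_common(1)[0]
baseline_acc = majority_count / n_samples

# Random forest with bootstrap sampling; oob_score=True yields an accuracy
# estimate on the samples each tree did not see, without a held-out set.
forest = RandomForestClassifier(n_estimators=200, oob_score=True,
                                random_state=0)
forest.fit(X, y)

print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"OOB accuracy:      {forest.oob_score_:.3f}")
```

The forest's `feature_importances_` attribute is what would support the abstract's ranking of predictors (tokens, types, noun frequency) by discriminative strength.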
- Subject: Language and Literature > Linguistics
- Publisher: Pan-Pacific Association of Applied Linguistics
- Publication: Journal of Pan-Pacific Association of Applied Linguistics (Journal of PAAL), Vol. 20, No. 1