Exploring Learner Language Through Corpora: Comparing and Interpreting Corpus Frequency Information
Language Learning / Language and Learning
Published online on March 15, 2017
Abstract
This article contributes to the debate about the appropriate use of corpus data in language learning research. It focuses on frequencies of linguistic features in language use and their comparison across corpora. The majority of corpus‐based second language acquisition (SLA) studies employ a comparative design in which either one or more second language (L2) corpora are compared to a first language (L1) production corpus, or two or more L2 corpora are compared to each other. This article critically examines some of the central tenets of the comparative method related to interspeaker variation in L1 and L2 use, the representativeness and comparability of corpus data, the interpretation of differences found between corpora, and the appropriate use of statistics. Using and discussing a set of five L1 spoken English corpora and three L2 English corpora (two spoken and one written), we approach these areas empirically, exploring different sources of variation and the methodological options that corpus‐based SLA studies offer.