Language Education

A Corpus-based Analysis of Collocations in Korean Middle and High School English Textbooks

Young Shin Kim1, Sun-Young Oh2
1Hansan Middle School
2Seoul National University
Corresponding Author: Professor, Department of English Language Education, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Korea. E-mail:

ⓒ Copyright 2020 Language Education Institute, Seoul National University. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Oct 30, 2020; Revised: Dec 21, 2020; Accepted: Dec 30, 2020

Published Online: Dec 31, 2020


This study analyzed collocations in Korean middle and high school English textbooks based on the 2015 revised national curriculum. To better represent the existing lexical syllabus, all 1,718 nouns from the curriculum wordlist were selected as node words and paired with collocates that were statistically verified against a billion-word reference corpus. The analysis revealed that collocation density was higher in the textbooks, with readers encountering one collocation per 16–17 words. However, collocations in the textbooks showed insufficient repetition and a narrower range of association strength. Fewer repetitions and a limited collocational repertoire led to a weaker correlation between the two variables. This suggests that Korean learners may not benefit enough from frequent encounters to consolidate their lexical knowledge or to distinguish different levels of collocational strength. These findings call for considering the level of repetition and the association strength of collocations when developing the English curriculum and materials.

Keywords: collocation; density; repetition; association strength; English textbooks


