Factors influencing the difficulty of listening tasks
Of the four main language skills, listening has long been the least researched and least understood (Bae & Bachman, 1998; Vandergrift, 2007; Brunfaut, 2016). Dunkel (1991) points out that, just as listening input plays a crucial role in the development of first language (L1) ability, it also plays an essential role in the development of second or foreign language (L2) ability, “particularly at the beginning stages of language development” (p. 345). Today, developing listening comprehension is a standard component of second and foreign language curricula and testing systems. Some even believe speaking exercises should not be introduced until the learner has been exposed to sufficient listening input (Feyten, 1991).
Research has identified numerous factors that influence the difficulty of listening tasks and items. These can be divided into features of the input (the listening materials used), of the task, and of the test environment. Other factors, such as anxiety, motivation, concentration, and intelligence, are intrinsic to the listener. Since this paper aims to identify applications of L2 listening research to test development, such listener-internal factors will not be discussed here.
Factors related to the input (listening materials)
Linguistic complexity - Linguistic complexity has been found to make a text more difficult to understand (Dunkel, 1991; Brunfaut, 2016). It encompasses a range of factors, including “phonological, lexical, syntactical and discourse features” (Brunfaut, 2016, p. 102). Phonological features of the listening input include features of connected speech such as assimilation, elision, intrusion and weak forms. These changes make it harder for L2 listeners to recognize words and word boundaries (Buck, 2001); indeed, many native speakers are unaware of making them. Lexical complexity can be controlled by increasing or decreasing the number of words containing more than two syllables and by avoiding academic or professional jargon and other uncommon words. The linguistic complexity of the key information (the part of the input integral to answering a question) and of the language surrounding it has a greater impact on task difficulty than that of the passage as a whole (Buck, 1991; Brindley & Slatyer, 2002). One specific facet of syntactic complexity identified by Freedle and Kostin (1996) is the use of negative sentences: negatives in key sections of the listening passage as well as in the item stem have been found to increase the difficulty of an item.
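As a rough, hypothetical sketch of how the lexical side of complexity could be operationalized when drafting or vetting materials (the vowel-run syllable heuristic and function names below are illustrative assumptions, not a measure used in the studies cited), one might compute the share of words longer than two syllables:

```python
import re

def syllable_count(word):
    # Crude heuristic: count runs of vowel letters as syllables.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def long_word_ratio(text):
    # Proportion of words with more than two syllables.
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    return sum(syllable_count(w) > 2 for w in words) / len(words)
```

A passage with a higher ratio would, on this rough measure, be lexically more complex; real vetting would of course also weigh jargon, word frequency and the syntax surrounding the key information.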
Lexical overlap - A lack of direct lexical overlap has been found to contribute to listening task difficulty (Jensen et al., as quoted in Brunfaut, 2016). This means that in tasks and items targeting higher proficiency levels, the language used should paraphrase the content of the listening script. Freedle and Kostin (1996) come to a more detailed conclusion: items targeting gist are harder if there is overlap between the first 20 words of the listening passage and the distractors, but easier if overlap exists between the first 20 words and the key. Similarly, for items targeting specific information, lexical overlap between the listening passage and the key was found to make the item easier, while overlap with the distractors made it harder (Freedle & Kostin, 1996). Overlap between the item stem and the listening passage significantly decreases difficulty (Brindley & Slatyer, 2002).
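The overlap features above can be pictured as simple set intersections. The sketch below is only an illustrative approximation under an assumed tokenization (lowercased word types); Freedle and Kostin's actual coding scheme was more elaborate:

```python
import re

def word_types(text):
    # Lowercased set of word types in a text.
    return set(re.findall(r"[a-z']+", text.lower()))

def opening_overlap(passage, option, window=20):
    # Word types shared between the first `window` words of the
    # passage and an answer option (key or distractor).
    first_words = re.findall(r"[a-z']+", passage.lower())[:window]
    return len(set(first_words) & word_types(option))
```

On this measure, a gist item would be predicted harder when the overlap count is high for a distractor and easier when it is high for the key.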
Text type - Overall, academic topics tend to be more difficult for listeners (Freedle & Kostin, 1996). There is also some evidence suggesting that dialogues between two speakers are easier to understand than monologues (Papageorgiou, Stevens, & Goodwin, 2012). Brindley and Slatyer (2002) point to the ‘orality’ of the listening passage as a factor in difficulty. Orality refers to the degree to which the listening passage resembles real-life speech as opposed to written text read aloud. Input that resembles real-life speech is easier to understand (Shohamy & Inbar, 1991).
Authenticity - While many agree it is important to use authentic listening materials, graded materials are often advised for lower-level students because of the difficulty of finding authentic materials on topics suitable for beginning learners (Vandergrift, 2007). Others have advocated introducing authentic material from the start, arguing that the shock language students experience when exposed to authentic material after working only with scripted and graded materials is harmful to their learning process (Field, 2008). This suggests scripted materials do not prepare students for real-life listening, which should be the aim of listening comprehension practice.
Rate of speech - Opinions are divided on whether rate of speech (speed) is a significant factor in the overall difficulty of listening materials. Carroll (as quoted in Dunkel, 1991) contends that high rates of speech make an item more difficult only when the listening materials used are poorly organized or conceptually difficult. Sheils (as quoted in Dunkel, 1991) and Brindley and Slatyer (2002) explicitly list rate of speech as a factor negatively affecting L2 comprehension. Vandergrift (2007), however, holds that “slowing down the rate of speech is not necessarily helpful for comprehension purposes” (p. 200).
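When comparing recordings on this dimension, rate of speech is usually expressed in words per minute. A minimal sketch, assuming a plain transcript and a known audio duration (the function name is an illustrative choice):

```python
def words_per_minute(transcript, duration_seconds):
    # Speech rate: words in the transcript per minute of audio.
    return len(transcript.split()) * 60 / duration_seconds
```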
Length - Length of the listening input has been claimed to negatively influence comprehension (Dunkel, 1991). Freedle and Kostin (1996), however, concluded that neither the total length of the passage nor sentence length significantly contributed to item difficulty. They did find that difficulty increased with the length of text preceding the key information targeted by the item. Additionally, the number of words in the item stem and distractors was found to increase difficulty in listening tests.
Factors related to the test situation
Number of times listening - The established view in the literature is that repeated listening substantially improves comprehension (Dunkel, 1991; Brindley & Slatyer, 2002; Brunfaut, 2016). Hence, some tests, such as the lower-level Cambridge exams, allow a second listening.
Instructions - Written instructions carry with them the risk of construct-irrelevance: test takers who misunderstand the instructions may perform worse on the task, even though reading is not meant to be tested on a listening exam. Nonetheless, written instructions serve a crucial purpose: Long (1989) cites the presence of textual schemata as improving comprehension. Lower-level listeners are less able to activate relevant schemata by recognizing the discourse of the input material and instead need the context pointed out to them more explicitly. This is achieved by including clear instructions with listening items, specifying the speaker, the topic and the situation. Rather than making the item too easy, this makes it more authentic: listeners in real life would normally be aware of these things, as spoken input is rarely completely devoid of context. Instructions should be worded using lexis and syntax of a level lower than that targeted by the listening task they introduce.
Time – Setting a time limit for answering a constructed-response task is a practical necessity. It can, however, lead to situations where test takers who have perfectly understood the listening passage run into trouble because they think for too long or write too much (Buck, 1991). This problem can be mitigated by giving clear instructions on how much test takers are expected to write and how much time they have to do so.
Factors related to the task
Finally, the task type used has a large influence on difficulty. Task-related factors can be divided into different categories:
Task type – Different item types target different cognitive processes (Brindley & Slatyer, 2002). Multiple-choice questions (MCQs) have been found by some studies to be the easiest for test takers (Berne, as cited in Brindley & Slatyer, 2002). This likely has to do with test takers merely being required to recognize the correct answer, and with cram schools training students in specific test-taking strategies based on identifying distractors. MCQs have the advantage of being easy to mark objectively by machine and are therefore cheap (Taylor & Geranpayeh, 2011). They also reduce the risk of construct-irrelevant variance as they do not place undue demands on the listener’s speaking or writing skills. For written-response questions, the length of the required response is positively correlated with difficulty (Brindley & Slatyer, 2002). A longer response thus makes for a more difficult task but has several disadvantages: besides the time-related concerns mentioned earlier, more construct-irrelevance is involved, as the quality of the test taker’s response depends not only on their listening ability but also on their writing skills. Moreover, lengthy responses are very difficult to mark, as no two responses will be identical; this makes marking more subjective and thus compromises inter-rater reliability (Buck, 1991).
Information targeted – Another factor widely stated to decrease difficulty is the explicitness of the input targeted by the listening task or item (Buck, 1991; Dunkel, 1991; Freedle & Kostin, 1996). The more inference is required of the listener, the harder the text will be. This means items targeting facts and concrete details are easier than ones targeting gist or the speaker’s attitude; the latter require understanding beyond the phonemic, lexical and syntactic levels, which Field (2008) describes as ’lower-level processing’. Furthermore, items targeting information at the end of a listening passage are normally easier (Freedle & Kostin, 1996), as this information is the most likely to stick in the listener’s memory.
Controlling the difficulty of listening tasks is a complex undertaking. While the information listed above is understandably a lot to take in, the first step is to determine the aim of the test and which listening tasks would be relevant at which level. Next, consider the level of the students and which tasks they can realistically be expected to handle. Finally, in adapting the difficulty to the desired level, consider the factors related to the listening input, the test situation and the nature of the task listed in this article.
Bae, J., & Bachman, L. F. (1998). A latent variable approach to listening and reading: testing factorial invariance across two groups of children in the Korean/English Two-Way Immersion Program. Language Testing, 15, 380-414.
Brindley, G., & Slatyer, H. (2002). Exploring task difficulty in ESL listening assessment. Language Testing, 19, 369-394.
Brunfaut, T. (2016). Assessing listening. In Tsagari, D. & Banerjee, J. (Eds.), Handbook of second language assessment (pp. 97-112). Berlin: De Gruyter Mouton.
Buck, G. (1991). The testing of listening comprehension: an introspective study. Language Testing, 8, 67-91.
Buck, G. (2001). Assessing listening. Cambridge: Cambridge University Press.
Dunkel, P. (1991). Listening in the native and second/foreign language: toward an integration of research and practice. TESOL Quarterly, 25, 431-457.
Feyten, C. M. (1991). The power of listening ability: an overlooked dimension in language acquisition. The Modern Language Journal, 75, 173-180.
Field, J. (2008). Listening in the language classroom. Cambridge: Cambridge University Press.
Freedle, R., & Kostin, I. (1996). The prediction of TOEFL listening comprehension item difficulty for minitalk passages: implication for construct validity. ETS Research Report Series, 56. Princeton, NJ: Educational Testing Service.
Long, D. (1989). Second language listening comprehension: a schema-theoretic perspective. Modern Language Journal, 73, 32-40.
Papageorgiou, S., Stevens, R., & Goodwin, S. (2012). The relative difficulty of dialogic and monologic input in a second language listening comprehension test. Language Assessment Quarterly, 9, 375-397.
Shohamy, E., & Inbar, O. (1991). Validation of listening comprehension tests: the effect of text and question type. Language Testing, 8, 23-40.
Taylor, L., & Geranpayeh, A. (2011). Assessing listening for academic purposes: defining and operationalising the test construct. Journal of English for Academic Purposes, 10, 89-101.
Vandergrift, L. (2007). Recent developments in second and foreign language listening comprehension research. Language Teaching, 40, 191-210.