Monday, May 2, 2016

Commentary: Visuals and Listening Comprehension

In a recent issue of Language Testing, Batty (2015) explores the long-standing question of video-based vs. strictly audio material to test listening comprehension. Is a test where the speakers can be seen on a screen easier than a test where only the voices are heard? As always, I find the literature review almost more interesting and insightful than the “present study” itself. For one, it saves me the time of researching and reviewing the literature. I also find it interesting to see the author’s take on all of these findings.

In this particular article, after he discourses on the difficulties of defining and measuring the construct of listening, he turns his focus to the role of visual aids and nonverbal communication. He notes that apparently, “70% of the meaning in a social situation is encoded in the visual…channel” (p. 5). Like other researchers I have read, he argues, “Listening rarely occurs ‘in the dark’. For sighted people, in virtually all situations and circumstances, listeners can see those speaking to them” (p. 5). Subsequently, he does admit that telephone conversations and podcasts are exceptions.

This has led me to reflect on the various media that I listen to in my own life, as well as the degree to which I use nonverbal communication to ascertain meaning. I am not sure that I agree with the conclusion that nearly all listening happens face-to-face. In my personal life, I spend a lot of time talking on the phone to family and friends who are far away. Indeed, technology has influenced my communication style with the added formats of online chatting, email, and text. But honestly, one of my favorite apps is Voxer, which allows me to send short audio messages (like texts). Thus, a lot of my day-to-day communication with family happens through listening to these short clips of my nieces and nephews telling me what they are playing, etc. (I have found this to be a more effective way to keep in touch than Apple’s FaceTime or Skype, where I watch dizzying videos of the room twirling around me as my three-year-old niece dances and temporarily forgets that we are even having a conversation.) In addition to the audio-only phone conversations, I also spend a lot of time listening to news radio, audiobooks, lectures, and podcasts in my car and as I exercise. And, going back to my college days, I remember the experience of countless lectures in huge auditoriums (where I could barely see the professor, let alone watch his body language) where I sat eagerly scribbling notes. As I watch the students in my ESL listening class, they are doing the same thing. Talking head or no, they have their noses in the paper, eagerly trying to answer the questions. Or, even when there are no questions, they are not necessarily focused on the screen.

I suspect that the role of nonverbal communication depends on the type of listening task. I believe that as ESL instructors, we might assume that a video is better than audio only. Academic listening textbooks, such as the Contemporary Topics series (Beglar & Murray, 2009), often contain video recordings of the lectures. However, because the lectures are more based in the liberal arts (e.g. themes of communication, social trends, etc.), there aren’t really any valuable visuals to help listeners understand the material. Really, it is just a talking head. Regardless of whether or not the speaker’s facial expressions and hand gestures truly help comprehension, listeners are not necessarily motivated to pay attention visually. However, for topics that require visual aids (for native and non-native speakers alike), obviously a video would be more effective. Take the lecture videos available on the Khan Academy website. Khan Academy hosts a multitude of resources on academic topics: mathematics, chemistry, accounting, etc. These don’t have a talking head at all, but instead employ video-recorded screen capture techniques. This enables the viewer to see the lecturer writing on a chalkboard, drawing diagrams, and showing pictures. These types of visuals would be more likely to enhance listening comprehension.

Returning to the article at hand, Batty goes on to discuss many studies that have been conducted looking specifically at visuals in an assessment context. Naturally, he explains the limitations of all of these studies (e.g. different questions were used, only classical test theory was used), but it seems that several of the studies he reports concluded that there was no significant difference in performance on audio-only and video-mediated tests. One that showed the opposite was a study involving a French test with three versions: strictly audio, video, and completely silent (only the video) (Baltova, 1994). Surprisingly, the examinees performed similarly on the video and the silent tests. This leads me to wonder if having the visual cues and nonverbal communication enhances the true listening comprehension or if it helps the listener to infer what they didn’t really hear or understand. Perhaps it doesn’t matter in real world listening. But if the object is to isolate the construct of listening for testing purposes, this might be a consideration. Batty confirms my conclusion by saying the following:
Anyone who has traveled in a foreign country without knowing the local language knows that a great deal of information can be passed with simple hand gestures and a few words, but a test of foreign language listening comprehension is typically concerned with mastery of the language itself, not that of pan-cultural, ad-hoc, gesture-based communication. (p. 17)

I completely agree with Batty here. As a student of Russian as a second language, I remember trying to communicate on the streets of St. Petersburg, and probably 70% of the meaning I was able to take away from what a stranger on the street was saying to me was from his gestures (e.g. in essence telling me, “What are you doing on the streets in -40 degree weather without your face covered with a scarf?!”). However, as a learner attains more proficiency, we hope that more of the meaning (the literal meaning, at least) comes from the language itself and less from this type of body language.

After Batty reviews this literature, he presents his study. The novel thing about it is the use of many-facet Rasch analysis, which, as we know, is all the rage now in testing research. Still, Batty’s study concludes that there is no real difference in performance on the audio-only and video tests, regardless of proficiency level. He says, “Overall, it seems likely that the divide in the research over whether video has a facilitative or no effect can be more easily explained by differences in test design than anything inherent to the format of delivery” (p. 17).

After digesting this article and others like it, I am left wondering if it is really possible to truly assess listening comprehension. There are so many variables involved (nonverbal gestures or visual aids that may convey the message, reading skills required to differentiate between multiple-choice options, etc.) that it seems nearly impossible to really isolate the construct of listening ability. As both an instructor and the Testing Coordinator, I spend a lot of time looking at test scores. What is the difference between a student who got 86% correct and a student who got 80% correct? Why did a particular student get a question wrong? Was it because of proficiency, or was it that student’s interaction with the test item? Is there really such a thing as a perfect test question? I feel that the tests we give in English programs can give us only a ballpark estimate of student ability, which perhaps is sufficient in identifying low versus high performers. However, because these are real human beings, even sophisticated measures such as many-facet Rasch analysis can only tell us so much.


Batty, A. (2015). A comparison of video- and audio-mediated listening tests with many-facet Rasch modeling and differential distractor functioning. Language Testing, 32(1), 3–20.

Beglar, D., & Murray, N. (2009). Contemporary Topics 3: Academic Listening and Note-Taking Skills (3rd ed.). Pearson Longman.