In a recent issue of Language Testing, Batty (2015) explores the
long-standing question of video-based
vs. strictly audio material to test listening comprehension. Is a test where
the speakers can be seen on a screen easier than a test where only the voices
are heard? As always, I find the literature review almost more interesting and
insightful than the “present study” itself. For one, it saves me the time
of researching and reviewing the literature. I also find it interesting to see
the author’s take on all of these findings.
In this particular article, after he discourses on the
difficulties of defining and measuring the construct of listening, he turns his
focus to the role of visual aids and nonverbal communication. He notes that
apparently, “70% of the meaning in a social situation is encoded in the
visual…channel” (p. 5). Like other researchers I have read, he argues,
“Listening rarely occurs ‘in the dark’. For sighted people, in virtually all
situations and circumstances, listeners can see those speaking to them” (p. 5).
Subsequently, he does admit that telephone conversations and podcasts are
exceptions.
This has led me to reflect on the various media that I
listen to in my own life, as well as the degree to which I use nonverbal
communication to ascertain meaning. I am not sure that I agree with the
conclusion that nearly all listening happens face-to-face. In my personal life,
I spend a lot of time talking on the phone to family and friends who are far
away. Indeed, technology has influenced my communication style with the added
formats of online chatting, email, and text. But honestly, one of my favorite
apps is Voxer, which allows me to send short audio messages (like texts). Thus,
a lot of my day-to-day communication with family happens through listening to
these short clips of my nieces and nephews telling me what they are playing,
etc. (I have found this to be a more effective way to keep in touch than
Apple’s FaceTime or Skype, where I watch dizzying videos of the room twirling
around me as my three-year-old niece dances and temporarily forgets that we are
even having a conversation.) In addition to the audio-only phone conversations,
I also spend a lot of time listening to news radio, audiobooks, lectures, and
podcasts in my car and as I exercise. And, going back to my college days, I
remember sitting through countless lectures in huge auditoriums (where I
could barely see the professor, let alone watch his body language), eagerly
scribbling notes. When I watch the students in my ESL listening class, I see
them doing the same thing. Talking head or no, they have their noses in the
paper, eagerly trying to answer the questions. Or, even when there are no
questions, they are not necessarily focused on the screen.
I suspect that the role of nonverbal communication depends
on the type of listening task. I believe that as ESL instructors, we might
assume that a video is better than audio only. Academic listening textbooks,
such as the Contemporary Topics series (Beglar & Murray, 2009), often contain
video recordings of the lectures. However, because the lectures are grounded
mostly in the liberal arts (e.g., themes of communication and social trends),
there aren’t really any
valuable visuals to help listeners understand the material. Really, it is just
a talking head. Regardless of whether or not the speaker’s facial expressions
and hand gestures truly help comprehension, listeners are not necessarily
motivated to pay attention visually. However, for topics that require visual
aids (for native and non-native speakers alike), obviously a video would be
more effective. Take the lecture videos available on the Khan Academy website. Khan
Academy hosts a multitude of resources on academic topics: mathematics,
chemistry, accounting, etc. These don’t have a talking head at all, but instead
employ video-recorded screen capture techniques. This enables the viewer to see
the lecturer writing on a chalkboard, drawing diagrams, and showing pictures.
These types of visuals would be more likely to enhance listening comprehension.
Returning to the article at hand, Batty goes on to discuss
many studies that have been conducted looking specifically at visuals in an
assessment context. Naturally, he explains the limitations of all of these
studies (e.g., different questions were used, or only classical test theory was applied),
but it seems that several of the studies he reports concluded that there was no
significant difference in performance on audio-only and video-mediated tests.
One that showed the opposite was a study involving a French test with three
versions: strictly audio, video, and completely silent (only the video)
(Baltova, 1994). Surprisingly, the examinees performed similarly on the video
and the silent tests. This leads me to wonder if having the visual cues and
nonverbal communication enhances the true listening comprehension or if it
helps the listener to infer what they didn’t really hear or understand. Perhaps
it doesn’t matter in real world listening. But if the object is to isolate the
construct of listening for testing purposes, this might be a consideration.
Batty confirms my conclusion by saying the following:
Anyone who has traveled in a
foreign country without knowing the local language knows that a great deal of
information can be passed with simple hand gestures and a few words, but a test
of foreign language listening comprehension is typically concerned with mastery
of the language itself, not that of pan-cultural, ad-hoc, gesture-based
communication. (p. 17)
I completely agree with Batty here. As a student of Russian
as a second language, I remember trying to communicate on the streets of St.
Petersburg, where probably 70% of the meaning I took away from what a stranger
was saying to me came from his gestures (e.g., in essence telling me, “What
are you doing on the streets in -40 degree weather without
your face covered with a scarf?!”). However, as learners attain more proficiency,
we hope that more of the meaning (the literal meaning, at least) comes from the
language itself and less from this type of body language.
After Batty reviews this literature, he presents his study.
The novel thing about it is the use of many-facet Rasch analysis, which, as
we know, is all the rage now in testing research. Still, Batty’s study
concludes that there is no real difference in performance on the audio-only and
video tests, regardless of proficiency level. He says, “Overall, it seems
likely that the divide in the research over whether video has a facilitative or
no effect can be more easily explained by differences in test design than
anything inherent to the format of delivery” (p. 17).
After digesting this article and others like it, I am left
wondering if it is really possible to truly assess listening comprehension.
So many variables are involved (nonverbal gestures or visual aids that may
convey the message, reading skills required to differentiate between
multiple-choice options, etc.) that it seems nearly impossible to really
isolate the construct of listening ability. As both an instructor and the
Testing Coordinator, I spend a lot of time looking at test scores. What is the
difference between a student who got 86% correct and a student who got 80%
correct? Why did a particular student get a question wrong? Was it because of
proficiency, or was it that student’s interaction with the test item? Is there
really such a thing as a perfect test question? I feel that the tests we give
in English programs can give us only a ballpark estimate of student ability,
which perhaps is sufficient in identifying low versus high performers. However,
because these are real human beings, even sophisticated measures such as the
many-facet Rasch analysis can only tell us so much.
References
Batty, A. (2015). A comparison of video- and audio-mediated
listening tests with many-facet Rasch modeling and differential distractor
functioning. Language Testing, 32(1), 3–20.
Beglar, D., & Murray, N. (2009). Contemporary Topics 3:
Academic Listening and Note-Taking Skills (3rd ed.). Pearson Longman.