Turn on, Tune in, and Drop out (by Rhonda Chung)

This week’s blog post includes a linked audio file. Just click on the link below if you would like to hear the post read aloud. Scroll down to read the text.

With the invention of the telescope, science eliminated distance, quipped Melquíades in Gabriel García Márquez’s One Hundred Years of Solitude. Melquíades, a supernatural figure, continued: ‘In a short time, man will be able to see what is happening in any place in the world without leaving his own house.’

That time is here.

But true travel only happens on the inside. At least that’s what cognitive psychologists would have you believe.

It doesn’t matter where your feet have touched…

…they will tell you:

Reality is just a figment of your overactive, exemplar-building imagination.

This is because the eyes don’t see and the ears don’t hear: they merely conduct. Perceptual organs serve only to usher stimuli into the body’s main computing device, the brain—an interconnected web of gray matter ceaselessly crunching the numbers to form categories of colour and sound based on frequency in the input.

These folks must be real fun at parties.

From prototype to exemplar theory, cognitive psychology has got the mind all mapped out—if it ain’t on a chart, does it really exist?

Rather than strip away the wide variety found in our input in order to store canonical representations, as prototype theory explains, Gibson (1966) wondered if our cognition wasn’t more about attunement to the richness found in the environment: what if there was a need to notice differences?

Gibson’s theory of Direct Realism for visual perception had a profound effect on Best’s (1995) Perceptual Assimilation Model, an oft-cited speech perception learning model in SLA, which focuses on novice learners; specifically, their ability to attune to the “constellation” of individual sounds that form the target language, and to incorporate these sounds into their existing linguistic repertoire.

Best took aim at the very notion of phonetic category, specifically the idea of the infamous “L1 filter,” because, she argued, it was paradoxical. If we could only perceive the world through the filter of our first language, we could never hope to learn anything new. Speech perception would be nothing more than an auditory hall of mirrors. Instead, she countered, perception was about attuning to the acoustic environment.

But the thing is… speech perception is not entirely acoustic. The McGurk effect taught us that. Speech perception is multi-modal, with emotional states (Costanzo et al., 1969) and movements of the face (Hardison, 2003), eyes (Hattori, 1987), and lips (Hardison, 1999; Massaro et al., 1993) all relaying crucial information to the perceiver. Yet so much of speech perception within SLA research continues to examine language below the sentence level (Thomson, 2018), wondering how isolated phones are perceived, and seemingly ignoring the fact that, yes, language is the sum of its phonetic parts, but we experience it in syntactic waves. Our attunement ebbs and flows with the environment; it is a total sensorial experience.

Sensory-based attention to the environment is a key concept of Direct Realism. As animals, part of our biological endowment is to perceive information, using all our perceptual abilities, directly from our (social) environment; without this mechanism, we would (socially) perish. Such perceptual mechanisms are shaped by our ancestral lineages, which pass (social) information down to us, and by our own (social) experiences with the natural world. (I’m just a sociophonologist being transparent about how my bias reads this!)

Critics of Direct Realism claim that it’s mostly about nature and not enough about nurture. And while Gibson did start calling it an Ecological theory of perception, he envisioned a symbiotic relationship with nature, one that focused on “affordances,” that is, what the environment could provide the perceiver and what the perceiver could return to their environment:

The natural environment offers many ways of life, and different animals have different ways of life. […] In architecture a niche is a place that is suitable for a piece of statuary, a place into which the object fits. […] In ecology a niche is a setting of environmental features that are suitable for an animal, into which it fits metaphorically. The niche implies a kind of animal, and the animal implies a kind of niche. Note the complementarity of the two. But note also that the environment as a whole, with its unlimited possibilities, existed prior to animals. The physical, chemical, meteorological, and geological conditions of the surface of the earth and the pre-existence of plant life are what make animal life possible. They had to be invariant for animals to evolve. […] (p. 69)

This re-communion with the natural world, which precedes us, is an affront to the commonly held and persistent belief that the mind is computational in nature, simply extracting information from its environment for individual use. But what does all this have to do with language, you’re wondering? Glad you’re still with me!

First, there is no such thing as “speaking a language.” That’s an umbrella term for the wide array of dialectal varieties any one language contains (Chambers & Trudgill, 2004). Try as armies and navies might to claim there is one stable national dialect (usually a standard), most varieties exhibit characteristics found locally within regions and internationally (Caballero et al., 2009), showing little concern for nation-state borders. “Language” leans towards pan-linguistic ideas of what is essentially highly specific dialectal speech.

Second, our multi-dialectal knowledge is land-based knowledge, meaning it comes directly from our experiences on the land. The way we speak reflects our regional (diatopic) and social (diastratic) experiences, including our ethnic identities, and is specific to the time period (diachronic) when our conversations took place (Flydal, 1951). Even the learning of a standard dialect is rooted in the place where it was learned: educational institutions.

And if exemplar theory is right that each instance of dialectal learning leaves a memory trace (Goldinger, 1996), then our linguistic repertoire can no longer be imagined as isolated categories; instead, it is a “chorus of voices” (Tarone, 2007, p. 842), representing the sum of our listening experiences with friends, families, and colleagues. This transforms speech production from being just the intersection of our physiology (e.g., vocal tract size) with our socialization patterns (Labov, 2006), into a realm where our speech re-enacts and reanimates all the conversations that we’ve entertained over our lifetime. Since our physiology is primarily determined by our ancestral lineages, just as Gibson theorized earlier, our interactions become the embodiment of the lands that we’ve known and the people who have populated them. Gibson is coaxing us out of the lab, and into direct contact with the “real” world.

Because perception is so closely linked with the environment in Ecological Theory, it asks humans to consider what the natural world can teach us about our cognition. In the same way that dialects transcend nation-state borders, so too do plants and animals traverse the man-made boundaries imposed on land. Ecological botanists, like Kimmerer, encourage us to examine how the overlooked world of plant life can teach us about human connection:

When we think about what mosses are, one of the ways to characterize them is by what they don’t have in comparison to all the plants that are around us. They don’t have roots. They don’t have flowers. They don’t have the xylem and phloem – that vascular tissue that allows water to be moved within the plant. They don’t have any of that. And yet they’re able to occupy virtually every habitat on the planet and endure all different kinds of environments. They’re super simple, but in their simplicity is the key to their success. [They don’t root, they] cling. They have these little threadlike structures called rhizoids which allow them to attach, but they’re not absorptive the way roots are. They don’t have the capacity to take up water and nutrients. They’re really just points of attachment.

Those fleeting moments of human connection, where our syntax undulates through the air and attaches our minds together, are moments of perceptual attunement with one another.

If we put it all together, cognition is an act of pure sensorial engagement, implicating all of our perceptual and emotional processes, including our relationship to our environment. Gibson’s work gives cognitive psychology permission to move out of the computational psycholinguistic realm that overly dominates the SLA field (Firth & Wagner, 1997).

For a supernatural being, able to transcend time and space, Melquíades is easily hoodwinked by shiny objects: too much time looking through that telescopic lens can render us myopic, if we’re not careful. It was never about the telescope or how close or distant we felt from one another. The lens was just a tool.

What it’s always been about is our ability for attunement:

To turn our perceptors on,

tune in to each other’s wavelength,

and drop out into each other’s microcosmic worlds.


Best, C. T. (1995). A direct realist view of cross-language speech perception. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 171–204). Baltimore: York Press.

Caballero, M., Moreno, A., & Nogueiras, A. (2009). Multidialectal Spanish acoustic modeling for speech recognition. Speech Communication, 51, 217–229.

Chambers, J. K., & Trudgill, P. (2004). Dialectology (2nd ed.). Cambridge, UK: Cambridge University Press.

Costanzo, F. S., Markel, N. N., & Costanzo, P. R. (1969). Voice quality profile and perceived emotion. Journal of Counseling Psychology, 16(3), 267–270. https://doi.org/10.1037/h0027355

Firth, A., & Wagner, J. (1997). On discourse, communication, and (some) fundamental concepts in SLA research. Modern Language Journal, 81, 285–300.

Flydal, L. (1951). Remarques sur certains rapports entre le style et l’état de langue. Norsk Tidsskrift for Sprogvidenskab, 16, 241–258.

Gibson, J. J. (1966). The senses considered as perceptual systems. Prospect Heights: Waveland Press, Inc.

Goldinger, S. D. (1996). Words and voices: Episodic traces in spoken word identification and recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1166–1183.

Hardison, D. M. (1999). Bimodal speech perception by native and nonnative speakers of English: Factors influencing the McGurk effect. Language Learning, 49(Suppl. 1), 213–283.

Hardison, D. M. (2003). Acquisition of second-language speech: Effects of visual cues, context, and talker variability. Applied Psycholinguistics, 24(4), 495–522.

Hattori, T. (1987). A study of nonverbal intercultural communication between Japanese and Americans: Focusing on the use of the eyes. Japan Association of Language Teachers, 8, 109–118.

Labov, W. (2006). A sociolinguistic perspective on sociophonetic research. Journal of Phonetics, 34, 500–515.

Massaro, D. W., Cohen, M. M., & Gesi, A. T. (1993). Long-term training, transfer, and retention in learning to lipread. Perception & Psychophysics, 53, 549–562.

Tarone, E. (2007). Sociolinguistic approaches to second language acquisition research–1997–2007. The Modern Language Journal, 91(S1), 837–848.

Thomson, R. I. (2018). High variability [pronunciation] training (HVPT): A proven technique about which every language teacher and learner ought to know. Journal of Second Language Pronunciation, 4(2), 207–230.
