Perceptions involving a number of senses arise seemingly effortlessly out of our cognitive and neural architecture. We can have a percept of a tree that involves information about the tree’s colour and smell, or we experience both the taste and the texture of a delicious chocolate cake. The ability to integrate sensory information from separate sense modalities into a unified percept is key in our construction of reality. What is less commonly experienced among human adults is information from a single sensory modality giving rise to percepts that involve more than that single sensory modality. This is called synesthesia, a term we are all likely familiar with, and it refers to the experience of multi-modal percepts that are structured out of uni-modal information (Grossenbacher & Lovelace, 2001).
Typically, our senses are organised in such a way that sensory receptors are stimulated by a specific type of information (light for vision, pressure for touch, etc.) which is passed from the periphery of the body to the brain via these receptors’ axons. The specificity of these receptors, as well as the connections they make to distinct regions of the cortex, permits incoming information to be modality-specific (Kandel, Schwartz, & Jessell, 2000). Yet in synesthesia, information in one sensory modality seems to be non-specific. Below I will explore two prominent neurocognitive models of synesthesia, with particular attention paid to sound-colour synesthesia. But first, I will introduce the case of IW, a sound-visual synesthete with whom I recently spoke about his experiences. With the aid of examples from IW’s experiences as well as some empirical research, I will then try to highlight the main ideas of current neurocognitive models of synesthesia and evaluate a proposed third model.
IW is a music student and amateur musician who experiences sound-colour and sound-location synesthesia. His experience of sound likely differs from many of our experiences. As he puts it, “it’s as if there is a running blank canvas in my mind’s eye, and it gets filled by sound with abstract images.” For IW, nearly any sound he can think of has a corresponding visual: a creak in the wall makes a point on the canvas, and a train going by makes a line across it. This sound-visual experience is automatic, stable, and has been present for as long as he can remember. When IW is playing music, he pays particular attention to the images that co-occur with the sounds he is creating. For example, he uses the image of a two-note interval to check that he has perceived the interval correctly, attending simultaneously to the sound and to the spacing between the two images it evokes. Or, when trying to listen to a single note in a chord, he feels that he really hears it when he can also see its image more clearly than the images evoked by the other notes. Overall, IW reports that he usually needs both the sound and the visual to feel confident about what he is hearing: if he thinks he hears a certain note but cannot see its typical form, then there is room for him to doubt what he is hearing. How have researchers attempted to explain perceptual experiences like IW’s?
Feedback Disinhibition versus Excess Synapses
Two prominent models for explaining the occurrence of synesthesia are a disinhibited-feedback model and an excess-synapse model. These models are often presented in an either-or fashion, but they might not be mutually exclusive after all. The disinhibited-feedback model begins from the observation that the cortex is structured according to a hierarchy of parallel pathways and networks (Grossenbacher & Lovelace, 2001). These parallel pathways and networks are composed, in part, of sensory pathways that terminate in particular regions of the cortex. In the case of sound, auditory information passes through a series of subcortical nuclei before arriving at the primary auditory cortex (A1) in the temporal lobe. Similarly, visual information is relayed from the retina through the thalamus (a group of subcortical nuclei) to the primary visual cortex (V1) in the occipital lobe. These pathways work in parallel, with information remaining largely separate until it arrives at areas of the cortex known as association cortex. Association regions integrate information from different sensory pathways (modalities) as well as from memory, and they have a critical role in creating a unified percept for us (Kandel, Schwartz, & Jessell, 2000). While association regions receive input from multiple modalities, certain association regions have been identified as having feedback connections to single-modality regions (Grossenbacher & Lovelace, 2001). The main idea of the disinhibited-feedback model is this: association regions in the cortex are the sites where information from the sensory pathways converges. When input from a single sensory pathway reaches an association region, an inhibitory signal is sent from that region to the other sensory pathways. But if these inhibitory connections are altered such that the feedback is disinhibited, then a top-down activation of other sensory pathways becomes possible (Grossenbacher & Lovelace, 2001).
So, when a typical listener hears sound, association regions may inhibit other sensory pathways so that sound is the only coherent percept. But, in the case of a sound-visual synesthete, connections from association regions to unimodal visual regions might fail to be inhibited. This could give rise to simultaneous visual and auditory percepts.
While this model seems quite convoluted, the authors maintain that it is a biologically plausible model as it does not require additional or non-typical connecting structures between sensory pathways. Instead, this model involves a mere alteration in the feedback from association regions to a unimodal sensory area. In contrast to this idea, a model of excess synapses views synesthesia as an altered developmental process whereby excess synapses connecting unimodal sensory regions are not pruned across time (Hubbard & Ramachandran, 2005). Under this model, sensory modalities are said to be “cross-activated” at a lower level of the processing hierarchy than in the disinhibition of feedback model (Hubbard & Ramachandran), which means fairly extensive structural differences between the brains of synesthetes and non-synesthetes should exist. This is the type of atypical structural addition that the disinhibition of feedback model tries to avoid.
These two models have been developed largely on the basis of studies of grapheme-colour synesthetes. However, a study by Zamm, Schlaug, Eagleman, and Loui (2013) looked at white matter pathway density between auditory and visual processing regions in sound-colour synesthetes. They found evidence of right-lateralised increased white matter density between auditory and visual association regions in synesthetes relative to controls. Moreover, pathway density was correlated with behavioural scores of synesthesia. The problem of interpreting the results through one model or the other was made explicit by these authors: altered connectivity between association regions provides some support for the disinhibited-feedback model, while increased connectivity between sensory systems provides some support for the excess-synapse model (Zamm et al., 2013). Other studies have been conducted with sound-colour synesthetes and sound-taste synesthetes (Beeli, Esslen, & Jäncke, 2005), but neither model has attracted significantly more empirical support than the other.
Given this stalemate, a model of synesthesia based on memory has been proposed by a number of researchers. Hupé and Dojat (2015) provide a critical review of studies aiming to support the two models discussed above, ultimately positing that an equally plausible model of synesthesia is that associations across sensory modalities are made more explicitly in childhood. This process, they argue, involves the stabilisation of multi-sensory associations over time. A model that posits early memory formation as a basis for synesthesia, rather than a more perceptually driven neural account, can still accommodate the synesthete’s experience of having always had synesthetic percepts. However, I think it falls short in explaining synesthesia in a number of ways. First, this account needs to explain how or why certain individuals make these cross-modal associations at all. It seems likely that this will come down to some degree of reliance on neural predispositions, since synesthesia is thought to have a genetic component (Hubbard & Ramachandran, 2005) and one must show some level of consistency in the etiology of synesthesia. As far as I know, common extra-neural factors have not been established across synesthetes. Second, this model requires a highly robust form of memory that does not seem subject to degradation or alteration: synesthetic experiences are both stable and robust over time. As IW puts it, he cannot always immediately tell you what colour a sound is, but he can definitely always tell you what colour it is not. There seems to be a degree of certainty and absoluteness to the synesthetic experience that, one could argue, does not entirely fit with our notion of memory as a flexible, mutable process. One way to test this claim would be to try to extinguish a synesthete’s cross-modality pairings. Third, synesthetes do not typically report memories of creating cross-modality associations. For IW, the visual experience of sound is just how it has always been.
Even if we grant that the associations occur early in development and therefore remain inaccessible as episodic memories, is it not likely that some association would be remembered by some synesthete at some point? Finally, the memory-based model does not seem able to account for novel cross-modal pairings. IW recounts that as he has begun training his musical abilities more seriously in adulthood, his visual experiences have changed as a result of his sharpened auditory discrimination. As his ability to discriminate notes accurately and to listen attentively to rhythms and intervals has improved, the corresponding visuals have adapted alongside it, without any conscious effort on his part to alter them. This claim, too, seems testable, by presenting synesthetes with novel sensory stimuli and observing their cross-modality reactions.
Overall, the two prominent models of synesthesia ask an important question: does synesthesia arise from special neural structures, or does it arise from smaller changes to typical neural structures? Cross-modal mapping between the senses at a less automatic level is a common ability among humans. Indeed, IW often downplays the “specialness” of synesthesia by noting that among musicians, visual and tactile concepts like “round” are often used to describe how someone would like a note to sound. The visual/tactile concept as applied to the auditory domain is easily understood by all. But I think the experience of synesthesia, as in the case of IW, is worthy of particular attention. For as much as I cannot imagine what this type of consistent dual-modality perception would be like, IW says he cannot imagine what sound is like without visuals. Exploring synesthesia helps us draw points and lines in the parameter space of what is possible in perception and consciousness, and it reminds us how much private realities can diverge.
Beeli, G., Esslen, M., & Jäncke, L. (2005). Synaesthesia: when coloured sounds taste sweet. Nature, 434(7029), 38. http://doi.org/10.1038/434038a
Grossenbacher, P. G., & Lovelace, C. T. (2001). Mechanisms of synesthesia: Cognitive and physiological constraints. Trends in Cognitive Sciences, 5(1), 36–41. http://doi.org/10.1016/S1364-6613(00)01571-0
Hubbard, E. M., & Ramachandran, V. S. (2005). Neurocognitive mechanisms of synesthesia. Neuron, 48(3), 509–520. http://doi.org/10.1016/j.neuron.2005.10.012
Hupé, J.-M., & Dojat, M. (2015). A critical review of the neuroimaging literature on synesthesia. Frontiers in Human Neuroscience, 9(103), 1–37.
Kandel, E. R., Schwartz, J. H., & Jessell, T. M. (2000). Principles of neural science (4th ed.). USA: McGraw-Hill Companies.
Zamm, A., Schlaug, G., Eagleman, D. M., & Loui, P. (2013). Pathways to seeing music: Enhanced structural connectivity in colored-music synesthesia. NeuroImage, 74, 359–366.