Synesthesia and Music

Perceptions involving a number of senses arise seemingly effortlessly out of our cognitive and neural architecture. We can have a percept of a tree that involves information about the tree’s colour and smell, or we experience both the taste and the texture of a delicious chocolate cake. The ability to integrate sensory information from separate sense modalities into a unified percept is key in our construction of reality. What is less commonly experienced among human adults is information from a single sensory modality giving rise to percepts that involve more than that single sensory modality. This is called synesthesia, a term we are all likely familiar with, and it refers to the experience of multi-modal percepts that are structured out of uni-modal information (Grossenbacher & Lovelace, 2001).

Typically, our senses are organised in such a way that sensory receptors are stimulated by a specific type of information (light for vision, pressure for touch, etc.) which is passed from the periphery of the body to the brain via these receptors’ axons. The specificity of these receptors as well as the connections they make to distinct regions of the cortex permit incoming information to be modality-specific (Kandel, Schwartz, & Jessell, 2000). Yet, in synesthesia, information in one sensory modality seems to be non-specific. Below I will explore two prominent neurocognitive models of synesthesia with particular attention paid to sound-colour synesthesia. But first, I will introduce the case of IW, a sound-visual synesthete with whom I recently spoke about his experiences. With the aid of examples from IW’s experiences as well as some empirical research, I will then try to highlight the main ideas of current neurocognitive models of synesthesia and evaluate a proposed third model.

IW is a music student and amateur musician who experiences sound-colour and sound-location synesthesia. His experience of sound likely differs from many of our experiences. As he says, “it’s as if there is a running blank canvas in my mind’s eye, and it gets filled by sound with abstract images.” For IW, nearly any sound he can think of has a corresponding visual: a creak in the wall makes a point on the canvas, or the train going by makes a line across the canvas. This sound-visual experience is automatic, stable, and has been present for as long as he can remember. When IW is playing music, he pays particular attention to the images that co-occur with the sounds he is creating. He uses, for example, the image of a 2-note interval to reinforce the accuracy or correctness of its perception by simultaneously attending to the sound and the spacing between the two images evoked by the sound. Or, when trying to listen to a single note in a chord, he feels that he really hears it when he can also see its image more clearly than the images evoked by the other notes. Overall, IW reports that he usually needs both the sound and the visual to feel confident about what he is hearing: if he thinks he hears a certain note but cannot see its typical form, then there is room for him to doubt what he is hearing. How have researchers attempted to explain perceptual experiences like IW’s?

Feedback Disinhibition versus Excess Synapses

Two prominent models for explaining the occurrence of synesthesia are a disinhibited-feedback model and an excess synapse model. These models are often presented in an either-or fashion, but might not be mutually exclusive after all. The disinhibited-feedback model begins from the observation that the cortex is structured according to a hierarchy of parallel pathways and networks (Grossenbacher & Lovelace, 2001). These parallel pathways and networks are comprised, in part, of sensory pathways that terminate in particular regions of the cortex. In the case of sound, auditory information passes through a series of subcortical nuclei before arriving at the primary auditory cortex (A1) in the temporal lobe. Similarly, visual information is relayed from the retina through the thalamus (subcortical nuclei) to the primary visual cortex (V1) in the occipital lobe. These pathways work in a parallel fashion with information remaining largely separate until arriving at areas of the cortex known as association cortex. Association cortex regions are areas of information integration from different sense pathways (modalities) as well as memory, and they have a critical role in creating a unified percept for us (Kandel, Schwartz, & Jessell, 2000). While association regions receive input from multiple modalities, certain association regions have been identified as having feedback connections to single modality regions (Grossenbacher & Lovelace). The main idea of the disinhibited feedback model is thus: association regions in the cortex represent spatial regions of convergence of sensory pathway information. When input from a single sensory pathway reaches association areas, an inhibitory signal will be sent from the association region to the other sensory pathways. But, if these inhibitory connections are altered such that a disinhibitory feedback signal is relayed, then a top-down activation of other sensory pathways is possible (Grossenbacher & Lovelace). 
So, when a typical listener hears sound, association regions may inhibit other sensory pathways so that sound is the only coherent percept. But, in the case of a sound-visual synesthete, connections from association regions to unimodal visual regions might fail to be inhibited. This could give rise to simultaneous visual and auditory percepts.

While this model seems quite convoluted, the authors maintain that it is a biologically plausible model as it does not require additional or non-typical connecting structures between sensory pathways. Instead, this model involves a mere alteration in the feedback from association regions to a unimodal sensory area. In contrast to this idea, a model of excess synapses views synesthesia as an altered developmental process whereby excess synapses connecting unimodal sensory regions are not pruned across time (Hubbard & Ramachandran, 2005). Under this model, sensory modalities are said to be “cross-activated” at a lower level of the processing hierarchy than in the disinhibition of feedback model (Hubbard & Ramachandran), which means fairly extensive structural differences between the brains of synesthetes and non-synesthetes should exist. This is the type of atypical structural addition that the disinhibition of feedback model tries to avoid.

These two models have been generated based largely on the study of grapheme-colour synesthetes. However, a study from Zamm, Schlaug, Eagleman, and Loui (2013) looked at white matter pathway density between auditory and visual processing regions in sound-colour synesthetes. They found evidence of right-lateralised increased white matter density between auditory and visual association regions in synesthetes relative to controls. Moreover, pathway density was correlated with behavioural scores of synesthesia. The problem of interpreting the results through one model or another was made explicit by these authors: altered connectivity between association regions provides some support for the disinhibited feedback model while increased connectivity between sensory systems provides some support for the excess synapse model (Zamm et al., 2013). Other studies have been conducted with sound-colour synesthetes and sound-taste synesthetes (Beeli et al., 2005), but neither model has seen significantly more empirical evidence in its favour over the other.

To this end, a model of synesthesia based on memory has been proposed by a number of researchers. Hupé and Dojat (2015) provide a critical review of studies aiming to support the two above-discussed models, ultimately positing that an equally plausible model of synesthesia is that associations across sensory modalities are made more explicitly in childhood. This process, they argue, involves the stabilisation of multi-sensory associations over time. A model that posits early memory formation as a basis for synesthesia rather than a more perceptually-driven neural based account can still account for the synesthete’s experience of having always had synesthetic percepts. However, I think it falls short in explaining synesthesia in a number of ways. First, this account needs to explain how or why certain individuals make these cross-modal associations at all. It seems likely this will come down to some degree of reliance on neural predispositions since synesthesia is thought to have a genetic component (Hubbard & Ramachandran, 2005) and one must show some level of consistency in the etiology of synesthesia. As far as I know, common extra-neural factors have not been established across synesthetes. Second, this model suggests a highly robust form of memory that does not seem subject to degradation or alteration: synesthetic experiences are both stable and robust over time. As IW puts it, he cannot always immediately tell you what colour a sound is, but he can definitely always tell you what colour it is not. There seems to be a degree of certainty and absoluteness to the synesthetic experience that, one could argue, does not entirely fit with our notion of memory as a flexible, mutable process. One way to study this claim would be to try to extinguish a synesthete’s cross-modality pairings. Third, synesthetes do not typically report memories of creating cross-modality associations. For IW, the visual experience of sound is just how it has always been.
Even if we grant that the associations occur early in development and therefore remain inaccessible as episodic memories, is it not likely that some association would be remembered by some synesthete at some point? Finally, the memory-based model cannot seem to account for novel cross-modal pairings. IW recounts that as he has begun training his musical abilities more seriously in adulthood, his visual experiences have changed as a result of his sharpened auditory discrimination. As his ability to discriminate notes more accurately or listen more attentively to rhythms and intervals has improved, the corresponding visuals have adapted alongside it, without any conscious effort on his part to alter them. Similarly, this claim seems testable by subjecting synesthetes to novel sensory stimuli and observing their cross-modality reactions.

Overall, the two prominent models of synesthesia ask an important question: does synesthesia arise from special neural structures or does it arise from smaller changes to typical neural structures? Cross-modal mapping between the senses at a less-automatic level is a common ability among humans. Indeed, IW often downplays the “specialness” of synesthesia by noting that among musicians, visual and tactile concepts like “round” are often used to describe how someone would like a note to sound. The visual/tactile concept as applied to the auditory domain is easily understood by all. But I think the experience of synesthesia, like in the case of IW, is worthy of particular attention. For as much as I cannot imagine what this type of consistent dual-modality perception would be like, IW says he cannot imagine what sound is like without visuals. Exploring synesthesia helps us draw points and lines in the parameter space of what is possible in perception and consciousness, and it reminds us how much private realities can diverge.


Beeli, G., Esslen, M., & Jäncke, L. (2005). Synaesthesia: when coloured sounds taste sweet. Nature, 434(7029), 38.

Grossenbacher, P. G., & Lovelace, C. T. (2001). Mechanisms of synesthesia: Cognitive and physiological constraints. Trends in Cognitive Sciences, 5(1), 36–41.

Hubbard, E. M., & Ramachandran, V. S. (2005). Neurocognitive mechanisms of synesthesia. Neuron, 48(3), 509–520.

Hupé, J.-M., & Dojat, M. (2015). A critical review of the neuroimaging literature on synesthesia. Frontiers in Human Neuroscience, 9(103), 1-37.

Kandel, E., Schwartz, J.H., & Jessell, T.M. (2000). Principles of Neural Science, 4th ed. USA: McGraw-Hill Companies.

Zamm, A., Schlaug, G., Eagleman, D. M., & Loui, P. (2013). Pathways to seeing music: enhanced structural connectivity in colored-music synesthesia. NeuroImage, 74, 359–366.



  1. I am very interested to understand synesthesia and I found this post very interesting and understandable. Will follow further blogs with enthusiasm!


  2. This was an interesting read! I liked how you had an example as well as summaries of the models.

    I also find the memory model intuitively unsatisfactory, but as much as it feels wrong, I was thinking that memory could be involved in the other models. For example, perhaps the extra synapses are present because of some association that was learned during the explosion of synaptogenesis in the first few years of life. You mentioned how the memory model has trouble explaining how synesthetic perceptions can change or become more fine-tuned – could it be that early learning is involved in creating the initial connections, and the more fine-grained associations can arise later because the basic connections are there? Does this make sense from a neuroscience perspective?

    On the other hand, I’ve heard that synesthesia runs in families, and this would suggest that experience cannot fully explain why it arises. I found a study suggesting that both genetic and environmental factors can be involved – they found that identical twins were more likely than fraternal twins to both have synesthesia, but that identical twins were not 100% concordant (Bosley & Eagleman, 2015).

    It’s also interesting to see that there might be a shared genetic component between synesthesia and absolute pitch (Gregersen et al., 2013). Just out of curiosity, does IW have absolute pitch?

    Sorry for the long comment… it was just too interesting 🙂

    Bosley, H. G., & Eagleman, D. M. (2015). Synesthesia in twins: Incomplete concordance in monozygotes suggests extragenic factors. Behavioural brain research, 286, 93-96.

    Gregersen, P. K., Kowalsky, E., Lee, A., Baron-Cohen, S., Fisher, S. E., Asher, J. E., … & Li, W. (2013). Absolute pitch exhibits phenotypic and genetic overlap with synesthesia. Human molecular genetics, 22(10), 2097-2104.


    1. Thanks for the thoughts!

      As far as I know, IW does not have absolute pitch…and AP and synesthesia do not run in the family, as far as anyone else has divulged.

      With your question about cross-modal associations, I think the idea of the excess synapse model posits something pretty close to what you’re saying. It holds that during early development, sensory modalities are interconnected in such a way that the “perceptual norm”, if you will, is synesthesia. Then, as development continues, instead of being pruned to give rise to uni-modal channels, some of these inter-modality connections stick around. So I take it that by default, two or more sensory modalities just are sensitive to a single type of sensory input whether the specific percepts have been associated in the past or not. Through this model, two sensory channels are functionally one and no appeal to associative learning is required.

      More specifically to your question though, I’ll answer with a series of musings, confusions, and questions that I now have about this (gah!). It’s not apparent to me why a previously-made, specific sound-colour pairing would support an automatic and specific sound-colour pairing in a novel sound situation. What would be the reasoning for this sort of generalisation from one specific pairing to a new one? The perceptually-based model seems cleaner. I also wonder how associative learning can explain the cross-modality pairings in early development unless people were exposed to specific combinations of stimuli repeatedly. Isn’t it likely that a number of sounds would be combined with a number of colours over a certain period? If there is no predictability or stability to the pairings, why would some stable pairings emerge? Without these repeated pairings, it’s not clear how a stable cross-modality base could be formed at all to ground the synesthetic experience. The perceptually-grounded models strike me as less cumbersome than a memory model or a model that requires associative memory as its basis, but I’m struggling to think through all this clearly, so maybe you or someone else have more detailed insights to add!


  3. Great post Shannon, I enjoyed reading it. It got me thinking about synaesthesia and perfect pitch.

    I had heard that there is a theory that considers perfect pitch to be a form of synaesthesia, but when I did a little research into it, it seems that they are different things, although they may be related. While people with absolute pitch seem to experience note chroma in a qualitatively similar way to how most people experience colour (Sacks, 2007), they don’t experience the linking of sound with colour, as synaesthetes do (Loui, Zamm, and Schlaug, 2012).

    In an fMRI study comparing brain activity when listening to music in people with AP and people with tone-colour synaesthesia, Loui, Zamm and Schlaug (2012) showed that there may be some similarities in neural mechanisms between the two groups, as well as some differences. They concluded that both groups seemed to possess the same kind of increased sensory activation when listening to music, but in the AP subjects this was reflected in an increase in auditory processing, while the synaesthesia subjects showed an increase in visual processing. In addition, similarities in the phenotypes for AP and synaesthesia have been pointed out (Gregersen et al., 2013), suggesting a possible genetic link between the two conditions.

    So, tone-colour synaesthesia is not the same as absolute pitch, but as they seem to share some neural mechanisms and possibly share a genetic basis, they do seem to be related. I’m not sure if AP can be considered a type of synaesthesia because it doesn’t seem to be two sense modalities crossing over, but more of a heightened auditory perception. Anyone have any thoughts on this?


    1. References:

      Sacks, O. Musicophilia: Tales of Music and the Brain. 2007. London: Picador.

      Loui, P., Zamm, A., & Schlaug, G. (2012). Absolute pitch and synesthesia: two sides of the same coin? Shared and distinct neural substrates of music listening. In ICMPC: Proceedings/edited by Catherine Stevens…[et al.]. International Conference on Music Perception and Cognition (p. 618). NIH Public Access.

      Gregersen, P. K., Kowalsky, E., Lee, A., Baron-Cohen, S., Fisher, S. E., Asher, J. E., … & Li, W. (2013). Absolute pitch exhibits phenotypic and genetic overlap with synesthesia. Human molecular genetics, 22(10), 2097-2104.


    2. Interesting! I’d never thought about a link between synesthesia and AP before. Thinking of AP as heightened auditory perception seems intuitive. AP is disproportionately common in visually-impaired populations, with estimates of up to 50% of congenitally-blind people possessing AP (as reported in Sacks (2007)…). It sounds like Sacks goes on to suggest that this has led to thoughts of a “disinhibition” on the auditory system, since our typically-dominant sense is vision, which leads to enhanced auditory abilities. Maybe some comparison can be made to the disinhibition feedback model of synesthesia here though, whereby auditory regions are “freed”, to speak loosely, of visual system dominance and are consequently able to develop to a finer resolution. This could be like the disinhibition of unimodal sensory pathways in synesthesia. In both cases, disinhibition from single-sensory region/pathway control permits a unique, non-typical perceptual experience.

      I think the reproduction ability in AP is quite fascinating. Even in our dominant sense, it seems like most of us don’t possess a strong ability to accurately reproduce visuals through drawing. Maybe we are better at accurately visualising something we’ve just seen, but that seems hard to evaluate. The auditory-motor link in AP is impressive, and I wonder if the number of people who can very accurately (without much training) reproduce visuals is similar to the number of people who can accurately reproduce pitch like in AP. What would this sort of thing say about audiomotor connections versus visuomotor connections?

      Sacks, O. Musicophilia: Tales of Music and the Brain. 2007. London: Picador.
