- 1. What is mental imagery?
- 2. Mental imagery in perception
- 3. Mental imagery in cognition
- 4. Mental imagery in action
- 5. Mental imagery in art
1. What is mental imagery?
Close your eyes and visualize an apple. Many readers will have a quasi-perceptual experience that may be a bit similar to actually seeing an apple. For those who do, this experience is an example of mental imagery – in fact, it is the kind of example philosophers use to introduce the concept.
It is not clear whether introducing the term ‘mental imagery’ by example is particularly helpful, for at least two reasons. First, there are well-demonstrated interpersonal variations in mental imagery (see Section 1.2), so much so that some people report no experience whatsoever when closing their eyes and visualizing an apple. Second, it is unclear how an example like visualizing an apple could be generalized in a way that would give us a coherent concept. It does not seem like mental imagery is an ordinary language term – it was introduced at the end of the 19th century (see Section 1.1 below) as a technical term in psychology and no language other than English has a term that would mean mental imagery (as distinct from ‘imagination’ or ‘mental picture’). But if ‘mental imagery’ is indeed a technical term, then it is supposed to be used in a way that maximizes theoretical usefulness. In this case, theoretical usefulness means that we should use ‘mental imagery’ in a way that would help us to explain how the mind works.
This encyclopedia entry will not attempt to give an ordinary language analysis of the term ‘mental imagery’, partly because it is far from clear that ‘mental imagery’ is part of ordinary language. Instead, the focus will be on the theoretically useful concept of mental imagery as it is used for explaining various mental phenomena in psychology, neuroscience and philosophy.
1.1 Mental imagery in the empirical sciences
The concept of mental imagery was first consistently used in the then very new discipline of empirical psychology at the end of the 19th century. At that time, psychologists like Francis Galton, Wilhelm Wundt or Edward Titchener (Galton 1880, Wundt 1912, Titchener 1909) thought of mental imagery as a mental phenomenon characterized by its phenomenology – a quasi-perceptual episode with a certain specific phenomenal feel. This stance led to serious suspicion, and often the outright rejection, of this concept in the following decades when behaviorism dominated the psychological discourse (Kulpe 1895, Ryle 1949, Dennett 1969). It was not until the 1970s that mental imagery was again considered to be a respectable concept to study in the empirical sciences of the mind.
Just as perception can be characterized in a variety of ways, the same goes for mental imagery. One way of characterizing perception is in terms of its phenomenology: perception would be a mental process that is characterized by a certain specific phenomenology. The problem is that phenomenology is not publicly observable and, as a result, it is not a good starting point for scientific study. The same considerations apply to mental imagery. But we can also characterize perception functionally or neuroanatomically and these ways of thinking about perception would be publicly observable and, as a result, would be a good starting point for the scientific study of perception. And this is exactly how perceptual psychology and the neuroscience of perception proceed. Again, the same considerations apply to mental imagery.
As a result, in recent decades psychologists and neuroscientists, rather than relying on introspection and phenomenology, have characterized mental imagery in functional and neuroscientific terms. Here is a typical characterization from a review article that summarizes the state of the art concerning mental imagery in psychology, psychiatry and neuroscience, published in the flagship journal Trends in Cognitive Sciences: “We use the term ‘mental imagery’ to refer to representations […] of sensory information without a direct external stimulus” (Pearson et al. 2015, p. 590, see also Dijkstra et al. 2019). In short, according to the psychological definition, mental imagery is perceptual representation not triggered directly by sensory input (or representation-involving perceptual processing not triggered directly by sensory input – these two phrases will be used interchangeably in what follows).
The concept of directness may need some further clarification (and the same goes for the concept of “appropriate immediate sensory input” (Kosslyn et al., 1995, p. 1335, see also Shepard and Metzler 1971) that has also been used to specify what mental imagery lacks). The perceptual processing (in the early cortices) is triggered directly by sensory input if it is triggered without the mediation of some other (perceptual or extra-perceptual) processes. If the perceptual processing is triggered by something non-perceptual (as in the case of closing our eyes and visualizing), it is not triggered by sensory input at all, whether directly or indirectly (see Section 1.3). If the perceptual processing in the visual sense modality is triggered by sensory input in the auditory sense modality (as in the case of involuntary visual imagery of your face when I hear your voice with my eyes closed), the visual processing is triggered indirectly – with the mediation of auditory processing (see Section 2.2). A direct trigger here would be visual input, but there is no visual input in this case. And if the visual processing at the center of the visual field is triggered by input in the periphery of the visual field (say, because the center of the visual field is occluded by an empty white piece of paper), then the visual processing at the center of the visual field is, again, triggered indirectly, that is, in a way mediated by the visual processes in the periphery (see Section 2.1). A direct trigger would be sensory input at the center of the visual field, but there is no such sensory input in this case. According to the psychological definition of mental imagery, all three of these different examples of perceptual processing count as mental imagery as the perceptual processing is not triggered directly by the sensory input.
Contemporary philosophical thinking about mental imagery comes close to this way of defining mental imagery (see Nanay forthcoming for a summary). Gregory Currie, for example, says that “episodes of mental imagery are occasions on which the visual system is driven off-line, disconnected from its normal sensory inputs” (Currie 1995, p. 26, see also Kind 2001, Richardson 1969, Currie and Ravenscroft 2002, but see also Section 1.2 below and see Fazekas et al. 2021, for more on the concept of offline perception).
This way of thinking about mental imagery allows us to examine mental imagery empirically and in a publicly observable manner. To put it very simply, if someone’s eyes are closed, so she receives no visual input and her early sensory cortices are nonetheless representing an equilateral triangle at the middle of the visual field (something that can be established fairly easily given the retinotopy of vision by means of fMRI), this is an instance of mental imagery.
This psychological conception of mental imagery is neutral about some seemingly important features of mental imagery. If mental imagery is perceptual representation not triggered directly by sensory input, then mental imagery may or may not be voluntarily triggered (see more on this distinction in Section 1.3 below). Further, it may or may not be conscious – even if you experience nothing, as long as there is a triangle in your primary visual cortex, but no triangle on your retina, you have mental imagery (see more on unconscious mental imagery in Section 1.2 below).
The psychological definition characterizes mental imagery negatively: it is perceptual representation not triggered directly by sensory input. This leaves open the question of what it is triggered by. Often it is triggered by higher level cognitive processes – this is the case when you count to three and visualize an apple. But it can also be triggered laterally by different sense modalities (see Section 2.2 below). And it can also be triggered by sensory input but in an indirect manner (see Section 2.1 below).
1.2 Mental imagery vs. images
The term ‘mental imagery’ is often used interchangeably with the term ‘mental image’. This is misleading in more than one way. First of all, mental imagery is not necessarily visual. Just as perception can be visual, auditory, olfactory, tactile, gustatory, etc., the same goes for mental imagery (see, e.g., Young 2020). Auditory mental imagery, for example, plays a crucial role in listening to music (see Section 5.2 below). But it is not an ‘image’ in any meaningful sense of the term.
Second, and even more importantly, not everyone conjures up vivid and distinct images when they have mental imagery (see Kind 2017 for a summary of the vividness of imagery). There are people who, when they close their eyes and visualize an apple, see no ‘images’ in their mind’s eye. They are referred to as aphantasics, a label that just means that they report no conscious mental imagery (Zeman et al. 2010). Aphantasia can have many causes, some having to do with voluntary control, some with the phenomenology of early cortical representations. But at least some aphantasics seem to have mental imagery that they are not aware of: they have unconscious mental imagery (Koenig-Robert and Pearson 2021, Nanay 2021c, see also Phillips 2014, Church 2008, Emmanouil and Ro 2014, Brogaard and Gatzia 2017 on unconscious mental imagery).
The very idea of unconscious mental imagery may raise some philosophical eyebrows and some philosophers indeed build consciousness into their definition of mental imagery (Richardson 1969, pp. 2-3, Kung 2010, p. 622). But if mental imagery is perceptual representation that is not directly triggered, then the bar for unconscious mental imagery should not be higher than the bar for perception per se, that is, for perceptual representation that is directly triggered. And we have plenty of evidence that perception is often unconscious: subjects with blindsight are not conscious of what they are staring at, but what they see systematically influences their behavior. And the same goes for healthy subjects when they look at very briefly presented or masked stimuli (see, e.g., Kentridge et al. 1999, Kouider & Dehaene 2007 as two representative examples of the vast literature on unconscious perception). If perception can be unconscious, then so can mental imagery.
Aphantasia comes on a spectrum. Researchers put together the so-called ‘vividness of visual imagery questionnaire’, which indicates how vivid one’s (visual) mental imagery is. Aphantasics score very low on this scale. People with very vivid mental imagery (often referred to as hyperphantasics) score very high. But most people are somewhere in between. This interpersonal variability in the vividness of mental imagery should make us even more wary of using introspective criteria for characterizing mental imagery, as this would give different results in different people on different parts of the aphantasia-hyperphantasia spectrum.
1.3 Mental imagery vs. imagination
Mental imagery is not imagination (Langland-Hassan 2015, 2020, Arcangeli 2020). Imagination is (typically) a voluntary act. Mental imagery is not. Mental imagery can be, and is very often, involuntary. When we have flashbacks to an unpleasant scene, this is mental imagery, but not imagination in any sense of the term (see also Gregory 2010, 2014, Wiltsher 2016 on the differences between imagination and mental imagery). It is involuntary mental imagery. The same goes for earworms: annoying tunes that go through our head in spite of the fact that we really don’t want them to. Again, this is not auditory imagination, but it is auditory mental imagery.
In spite of these differences, given that the term ‘mental imagery’ was not systematically used until the end of the 19th century, throughout the history of philosophy people used the term ‘imagination’ to refer to what we now would describe as mental imagery. Thomas Hobbes, for example, talked about “retaining an image of the seen thing”, which comes very close to at least a subcategory of the current use of mental imagery in psychology and neuroscience, but he referred to this mental phenomenon as imagination (Hobbes 1651, Chapter 2). More generally, both the British empiricists and the German idealists used the term ‘imagination’ at least sometimes in the sense that would be captured by the concept of mental imagery nowadays (see Yolton 1996 for a summary). If we want to understand the evolution of philosophical thinking about mental imagery, we would need to go through all the historical texts about imagination and separate out references to voluntary acts (imagination proper) from references to mental imagery. This is not something that can be done in this encyclopedia entry.
The relation between mental imagery and imagination is important for another reason: we have seen that we can have mental imagery without imagination (see the flashback and the earworm examples). But how about the other way round? Can we have imagination without mental imagery? In other words, does imagination necessarily involve the exercise of mental imagery (Kind 2001, Van Leeuwen 2016, Langland-Hassan 2020)? This debate has been further complicated by the standard distinction between sensory and propositional imagination (roughly, imagining seeing x versus imagining that x is F), and the role imagery plays in these two forms of imagination – roughly, the difference between them is that the former, but not the latter is necessarily accompanied by (or constituted by) mental imagery. Without taking sides or venturing into the literature on the distinction between sensory and propositional imagination, it needs to be pointed out that many of the arguments on either side appeal to introspection (Byrne 2007, Chalmers 2002). If we allow for unconscious mental imagery, then these arguments would not lead to any kind of conclusive resolution. The only way in which we can assess whether imagination necessarily involves mental imagery is by empirical means.
1.4 The content of mental imagery
Mental imagery is a form of representation. But what does it represent and how does it do so? If mental imagery is a perceptual or at least quasi-perceptual representation, it seems that it represents the way perceptual states represent. Perceptual states attribute properties to the perceived scene. Mental imagery attributes properties to the imagined scene (or imagined properties to the actual scene). Just what such ‘imagined’ attributed properties could be and how to think of the ‘imagined scene’ are highly controversial questions (see, e.g., Kulvicki 2014, Langland-Hassan 2015). What seems to be less controversial is that both forms of property attribution are underwritten by early cortical processes, and both are sensitive to the allocation of attention (Shea 2018, Dijkstra et al. 2019).
This similarity between perception and mental imagery in terms of content plays an important role in thinking about the phenomenology of these states. One old question concerning the relation between (conscious) perception and (conscious) mental imagery is about the phenomenal similarity between the two (Hume 1739, 1.1.1). Mental imagery can feel similar to perception, so much so that under experimental conditions, it is easy to confuse the two (Perky 1910, see also Hopkins 2012 for a contrasting view). Assuming that the phenomenal character of a state depends in some ways on its content (an assumption that doesn’t need to be as strong as that of intentionalism), we can explain this with reference to the similarity of the content of perception and the content of mental imagery.
Not just the similarities, but the differences between mental imagery and perception also need to be explained. And the difference between perceptual content and the content of mental imagery also plays an important role in the debate about a phenomenologically salient and historically influential difference between the vividness of perception and the vividness of mental imagery. The historically influential view, championed most memorably by the British empiricists, is that mental imagery is paler and less vivid than perception. Even if we set aside hyperphantasics, who report very similar vividness for mental imagery and perception, this distinction does not seem to hold across the board. The properties that constitute the content of mental imagery can be very determinate indeed – and most of the properties that constitute perceptual content are not particularly determinate (see Dennett 1996 for a classic argument). Nonetheless, determinacy plays a role in yet another major difference between perceptual content and the content of mental imagery.
When you look at a landscape and shift your attention from the tree on the left to the mountain range on the right, this implies a change in the determinacy of the perceptually attributed properties: the properties attributed to the tree will be less determinate than before and the properties attributed to the mountain range will be more determinate than before (Yeshurun and Carrasco 1998). Let’s focus on the change in determinacy in the latter case: the extra determinacy of the perceptually attributed properties comes from the sensory input: perceptual attention increases determinacy by means of extracting more information from the sensory input. In the case of mental imagery, in contrast, there is no sensory input to exploit, so when you close your eyes and imagine the same landscape with the tree on the left and the mountain range on the right and you shift your attention from the former to the latter, then the increased determinacy of the properties attributed to the mountain range can’t come from the sensory input. It must come from a top-down source – your beliefs or expectations or memories about mountain ranges (Nanay 2015).
1.5 The format of mental imagery
The format of a representation is different from its content. Two representations can have the same content but different formats. The usual starting point of talking about representational format is the difference between the way pictures and sentences represent. Pictures represent imagistically or iconically and sentences represent non-imagistically or propositionally. They may represent the same thing: say, a red apple on a green table. But they represent this red apple on a green table differently (for example, to just mention one often-emphasized difference, very few parts of the sentence “there is a red apple on a green table” represent part of what the sentence itself represents, whereas many parts of the picture of the red apple on a green table represent part of what the whole picture represents) – the format of the representation is different.
So the question is: does mental imagery represent the way pictures do or the way sentences do? This was the central question of the so-called ‘Imagery Debate’ of the 1980s (in the imagistic corner: Kosslyn 1980, in the propositional corner: Pylyshyn 1981, see Tye 1991 for a good summary). It was this debate that made philosophers take the concept of mental imagery seriously again, after a long period of behaviorist-inspired skepticism about anything imagery-related.
The Imagery Debate is historically significant for yet another reason: it helped us to appreciate how interpersonal variations in mental imagery can have a major impact on one’s philosophical/theoretical positions. An important and fairly large study conducted at a time when the Imagery Debate was on its way out shows this very clearly. It mapped how philosophers’ and psychologists’ intuitions about the format of mental imagery vary as a result of the vividness of their mental imagery. The results showed that the vividness of imagery has a significant impact on theoretical commitments in this debate (Reisberg 2003). Researchers with less vivid mental imagery were more likely to take the propositional side and those with more vivid mental imagery tended to come down on the iconic side.
As the dependence on the vividness of one’s mental imagery shows, it is far from clear that the Imagery Debate is a substantive debate, and many psychologists and neuroscientists (including some of the original participants of this debate) explicitly declared this debate dead (see esp. Pearson and Kosslyn 2015). There are many ways of characterizing the distinction between imagistic and propositional formats, some more controversial than others. Appeals to holism or the ‘picture principle’ have been more on the controversial side (Kulvicki 2014). Describing iconic format as “representation of magnitudes, by magnitudes” (Peacocke 2019, p. 52) is on the less controversial side. And at least according to this less controversial criterion it seems clear that mental imagery has iconic format.
Perceptual representations represent magnitudes by means of magnitudes. In the case of vision, for example, they represent magnitudes like illumination, contours, color and they do so by means of magnitudes in the early sensory cortices. The early visual cortices are retinotopic (see Grill-Spector and Malach 2004 for a summary and Talavage et al. 2004 on equivalent claims regarding the non-visual sense modalities). If you are looking at a triangle, there is a roughly triangle-pattern activation of direction-sensitive neurons in your primary visual cortex. This is iconic format par excellence. And if you visualize a triangle, there is also a roughly triangle-pattern activation of direction-sensitive neurons in your primary visual cortex (Kosslyn et al. 2006). Again, iconic format par excellence, at least according to the ‘representation of magnitudes, by magnitudes’ criterion.
2. Mental imagery in perception
The role of mental imagery in perception has been an important theme in the history of philosophy. We have seen the debate about the phenomenal similarities and differences between mental imagery and perception in Section 1.4. But there is an even more important question about the relation between mental imagery and perception, namely, about whether and in what sense perception depends on mental imagery. This has been a dominant theme in the history of philosophy and Immanuel Kant was probably the most explicit proponent of a fairly strong constitutive dependence claim. Kant famously claimed that imagination is “a necessary ingredient of perception itself” (Critique of Pure Reason, A120, fn. A) and this claim has become quite influential not just in philosophy (Strawson 1974, p. 54, Sellars 1978), but in the history of ideas in general. Eugène Delacroix, for example, wrote in his diary on September 1, 1859 that “Even when we look at nature, our imagination constructs the picture” (see also Briscoe 2018 and Van Leeuwen 2011 for examples of perception/mental imagery hybrids).
As we have seen in Section 1.3, in Kant’s time, imagination and mental imagery were not systematically kept apart and a charitable interpretation of Kant’s claim would be that what is a necessary ingredient of perception itself is not voluntary imagination (as we don’t voluntarily imagine each time we perceive), but rather mental imagery (see Strawson 1974 and Gregory 2017 for discussion of just how charitable such interpretation would be). So the charitable interpretation of the Kantian Thesis is that mental imagery is a necessary ingredient of perception itself.
This is a constitutive claim: perception doesn’t merely depend on mental imagery causally, it depends on mental imagery constitutively. This, like all constitutive claims, is a fairly strong one. A much more modest, also historically influential, pre-Kantian view, dominant among, for example, the British empiricists, would be that perception does not depend on mental imagery at all, or if it does, it depends on it merely causally. While there has never been an explicit debate between these two positions, recent empirical research helps us to assess the respective merits of these two ways of thinking about the relation between perception and mental imagery.
2.1 Amodal completion
Amodal completion in the visual sense modality is the representation of occluded parts of perceived objects. When we see a cat behind a picket fence, we complete those parts of the cat amodally that are hidden behind the planks. But amodal completion is not just a visual phenomenon: in the auditory sense modality, we amodally complete, for example, beeped out parts of a soundtrack and in the tactile sense modality, we amodally complete the entire shape of the wine glass we hold although we only touch it with the tips of our fingers (see also Young and Nanay forthcoming on olfactory amodal completion). Amodal completion is the representation of those parts of perceived objects that we get no sensory stimulation from (Michotte et al. 1964, Nanay 2018b).
Amodal completion is perceptual representation, as a vast number of neuroscientific studies show that it happens very early in the sensory cortices; in the visual case, it happens in the primary visual cortex (Lee and Nguyen 2001, Ban et al. 2013, Pan et al. 2012, see also Briscoe 2011). And it is not directly triggered by sensory input as the amodally completed shape is not directly triggered by the retinal input – the retinal input that would correspond to the amodally completed contour is empty: there is no contour on the retina there. In the case of the cat behind the picket fence, the shape of the occluded tail is represented in the primary visual cortex, but there is no corresponding shape on the retina that could have directly triggered this shape representation: the only thing on the part of the retina that would correspond to the shape of the tail is just the monochrome white of the picket fence. Amodal completion is, in this sense, perceptual representation that is not directly triggered by sensory input (a view widely shared among empirical researchers, see van Lier and Ekroll 2020 for a summary).
Is amodal completion a form of mental imagery then? Not everyone thinks so. One could argue that amodal completion is not a perceptual phenomenon at all, but rather a cognitive one: we see the unoccluded parts and then form beliefs about the occluded ones (see Briscoe 2011 for one version of this claim). There are two sorts of reasons to worry about this proposal. First, there are phenomenological worries: it just doesn’t feel as if we merely had beliefs about occluded parts of perceived objects (see, e.g., Noe 2004). Second, there are empirical problems. In particular, this way of thinking about amodal completion does not (and, arguably, could not) explain why the occluded contours show up in early cortical regions of perceptual processing and do so very quickly after stimulus presentation (Sekuler and Palmer 1992, Rauschenberger and Yantis 2001).
And amodal completion is partly constitutive of perception per se. The vast majority of our perceptual states involve amodal completion. Take the visual sense modality: when you look around, you see objects further away from you partly occluded by objects closer to you. So your perceptual system amodally completes these occluded bits of the objects further away from you. But amodal completion is also involved in the representation of the unoccluded objects – you don’t get direct sensory input from the back side of these objects, nonetheless you represent them perceptually – which means you amodally complete the back side of all three-dimensional objects (Bakin et al. 2000, Ekroll et al. 2016). In short, amodal completion is partly constitutive of perception per se. And if amodal completion is indeed a form of mental imagery, then we have reason to think that mental imagery is partly constitutive of perception, just as the charitable interpretation of Kant suggests.
2.2 Multimodal mental imagery
We have seen that the negative definition of mental imagery as perceptual representation that is not directly triggered by sensory input allows for lateral triggering of this perceptual representation. This amounts to perceptual representation in one sense modality, say vision, triggered by sensory input in another sense modality, say, audition. As this is not a direct trigger (your eyes could be closed, so nothing triggers your visual representation directly), this is a form of mental imagery. And it is what is known as multimodal mental imagery (Spence and Deroy 2013, Lacey and Lawson 2013, Nanay 2018a).
One everyday example of multimodal mental imagery is watching the TV muted: your auditory representation is not triggered directly by auditory input (as there is no auditory input), but by visual input (the image on TV). If the person speaking on TV is someone famous whom you have often heard before, you may even have the phenomenal experience of ‘hearing’ this person’s voice ‘in your mind’s ear’. But even if you don’t have this phenomenal experience, your early cortical auditory processes work differently depending on what famous person you see on your muted TV (Pekkola et al. 2005, Hertrich et al. 2011).
Most of the things around us are multisensory objects and events, which just means that we can get information about them by means of more than one sense modality. Yet for most of them we do not actually get information by means of all the possible sense modalities. The norm, then, is that we have multimodal mental imagery of most objects and events around us (even if this imagery is unconscious, we have plenty of evidence that these are instances of unconscious mental imagery, rather than no representation at all, see, e.g., Vetter et al. 2014). This is another important example of why and how perception may depend constitutively on mental imagery, in this case, multimodal mental imagery.
2.3 Unusual forms of mental imagery in perception
Visually impaired people often report having visual mental imagery. And we know that with the exception of cortical blindness, the visual cortices of blind people remain more or less intact. Hence, (non-cortically) blind people can and do have visual mental imagery that is triggered by sensory input in another sense modality, for example, audition or tactile perception (Arditi et al. 1988). In short, blind people can and do have multimodal mental imagery.
The multimodal mental imagery of the visually impaired plays an important role in various ways in which they can navigate their environment. Cane use and brail reading rely on the subject’s visual mental imagery triggered by tactile input as does echolocation, a more and more widespread means by which blind people can learn to gather information about the spatial layout of their environment (by making clicking sounds and use the echo of these sounds as the source of spatial information). It is now known that echolocation relies on processing in the early visual cortices: it is visual mental imagery that is triggered auditorily (Thaler et al. 2011). Finally, sensory substitution devices also create visual mental imagery. These devices consist of a video camera mounted on the head of the blind subject that provides a continuous stream of tactile or auditory input (transferred from the visual input the camera registers – for example gentle needle pokes on the subject’s skin in a pattern that corresponds, in real time, with the image the camera records). This tactile input then leads to processing in the early visual cortices of these blind subject (which then gives rise to an experience that the subjects characterize as visual). In short, what is referred to as sensory substitution assisted perception is in fact another example of multimodal mental imagery (Renier et al. 2005, see Nanay 2017a for a summary).
Another ‘unusual’ form of mental imagery in general and multimodal mental imagery in particular is in synesthesia. Synesthetes report strong visual experiences of a specific color in response to auditory or tactile (or a wide variety of other non-color) experiences. It has been widely debated just what kind of experience synesthetic experience is. Is it a form of perceptual experience (Matthen 2017, Cohen 2017)? Or is it some kind of higher level, cognitive/linguistic experience (Simner 2007)? The problem is that synesthesia doesn’t really seem to fit squarely in any of these categories.
The connection between synesthesia and mental imagery has long been acknowledged: Synesthetes across the board have more vivid mental imagery than non-synesthetes (Barnett and Newell 2008, Price 2009, Amsel et al. 2017). And this difference is modality specific – so lexical-gustatory synesthesia subjects have more vivid gustatory mental imagery (but not necessarily more vivid mental imagery in, say, the auditory sense modality) (Spiller et al. 2015). Further, synesthesia is very rare among aphantasia subjects (who report no or hardly any mental imagery) and relatively frequent among hyperphantasia subjects (who report very vivid mental imagery) (Zeman et al. 2020). While there is significant variability between the experiences synesthetes report (see Dixon et al. 2004) and some, but not all, of these experiences are reported to be very similar to mental imagery, all instances of synesthesia will count as mental imagery understood as perceptual representation formed in response to early cortical processing that is not triggered directly by sensory input (Nanay 2021a). This gives a unified account of synesthesia and can also explain non-standard (but rigorously demonstrated) cases of synesthetic experiences triggered not by sensory input but by imagining sensory input (Spiller and Jansari 2008).
2.4 Pain mental imagery
Perhaps the most useful application of multimodal mental imagery involves pain management. More specifically, one of the most efficient ways of alleviating (chronic) pain is by means of the use of mental imagery in other sense modalities (Fardo et al. 2015, MacIver et al. 2008 and Volz et al. 2015). This presents something of a conundrum: why does mental imagery in, say, the visual sense modality help us with pain?
Pain perception, in textbook cases, starts with the stimulation of pain receptors, known as nociceptors, and this input is then processed in the primary and secondary somatosensory cortices. But sometimes pain processing in the primary and secondary somatosensory cortices is not directly triggered by nociceptors. This would be the equivalent of mental imagery in the context of pain perception – something we could call pain imagery.
The question is then about the relation between pain perception and pain imagery: between processing in the primary and secondary somatosensory cortices that is directly triggered by nociceptors and processing that is not. And the claim that would be structurally similar to the Kantian claim we considered in Section 2.1 and Section 2.2 is that just as visual mental imagery is a crucial ingredient of vision and multimodal mental imagery is a crucial ingredient of perception, pain mental imagery (representation formed as a result of perceptual processing in the primary and secondary somatosensory cortices that is not triggered directly by nociception) is a crucial ingredient of pain perception (in fact, it may even be partly constitutive of it, see Nanay 2017b).
In some instances of pain perception, mental imagery plays an even more central role: for example, phantom limb pain (pain some subjects feel in amputated limbs) consists of cortical pain processing (in S1/S2) that is not triggered by nociceptors (Ramachandran et al. 1995) for the simple reason that the relevant nociceptors are missing (they have been cut off with the rest of the limb). Further, the thermal grill illusion (where applying warmth to the index and the ring finger and cold to the middle finger triggers strong pain sensation in the middle finger) is also an instance of sensory pain processing that is not triggered by nociception (Defrin et al. 2002). In both cases, as the nociception is missing, there is no pain perception, but only pain imagery.
There may be reasons to generalize the importance of pain mental imagery in pain perception. One important feature of pain perception is that it is extremely dependent on expectations (when you expect painful sensation, a non-painful stimulus can lead to pain sensation, see Sawamoto et al. 2000; Keltner et al. 2006; Ploghaus et al. 1999). If we consider at least some forms of expectations to be (future-oriented) temporal mental imagery (see Section 5.2 for more on expectations and mental imagery), then these results are easily explained.
3. Mental imagery in cognition
Mental imagery is a perceptual phenomenon, but it has important uses in post-perceptual processing and in cognition more generally. Mental imagery is involved in a wide variety of cognitive phenomena and it is deeply intertwined with emotions, memory and even language (see also the rich literature on the role of imagery in inner speech, e.g., Langland-Hassan and Vicente 2018).
3.1 Mental imagery and memory
The concept of mental imagery has played an important role in the philosophy of memory for at least two reasons. First, imagery training improves memory (in fact, findings along these lines sparked the revival of research into mental imagery in the 1960s, see Yates 1966, Luria 1960). Second, and more importantly, a fundamental distinction in the philosophy of memory is drawn between episodic and semantic memory (Tulving 1972). To put it very simply, episodic memory is remembering an experience and semantic memory is remembering a fact. And one way of cashing out this difference is in terms of mental imagery: mental imagery is a necessary feature of episodic memory, but not of semantic memory.
The connection between episodic memory and mental imagery has been supported by a wide variety of empirical findings (see Laeng et al. 2014 for a summary). The loss of the capacity to form mental imagery results in the loss (or loss of scope) of episodic memory (Byrne et al. 2007, see also Berryhill et al. 2007’s overview). An even more important set of findings is that relevant sensory cortical areas are reactivated when we recall an experience (Wheeler et al. 2000, see also Gelbard-Sagiv et al. 2008).
These results show that episodic memory involves the exercise of mental imagery, but it is an open debate whether there is more to episodic memory than mental imagery. Some have argued that episodic memory has some extra ingredients besides mental imagery, for example, some sort of causal chain to the past observed event (e.g., Bernecker 2010). In contrast, some other philosophers of memory claim that episodic memory is really nothing but mental imagery (Michaelian 2016, De Brigard 2014, Hopkins 2018). The claim is that there is no real difference between future-directed mental time travel (that is, imagining the future) and past-directed mental time travel (episodic memory). Whether we go with the stronger or the weaker claim about the importance of mental imagery in memory, understanding memory seems to presuppose understanding mental imagery.
3.2 Mental imagery and emotion
Try to imagine, as vividly as you can, being attacked by a rabid dog, foaming at the mouth, snapping at your feet right there under your desk. The resulting mental imagery is an important form of mental imagery and also an important form of emotional state. More generally, imagery dramatically affects emotions – it seems for instance difficult to make sense of what goes on in the mind of a fearful or angry person without appealing to imagery. On the other hand, the impact of emotions on imagery is equally significant – the imagery that occupies our minds is very often under the control of our dominant emotion, which sometimes alters its fabric and our capacity to control it. In other words, there is a two-way interaction between emotions and mental imagery (see Holmes and Matthews 2010 for a summary).
Recent findings support this picture of the close connection between mental imagery and emotions. For example, imagining an emotionally charged event or person at an emotionally neutral place confers emotional charge to the place (see Benoit et al. 2019). It has been known for a while that seeing a negatively valenced event (say, a fight between two friends of yours) at a neutral place (say, the corridor in front of your office) makes this formerly neutral place inherit the negative valence of the event. So, in the future, when you see the corridor of your office, it triggers slight (or not so slight) negative emotions. The crucial finding is that the same process also takes place even if you merely imagine a negatively valenced event at a neutral place. In short, negatively valenced mental imagery confers valence on various components of the imagined scene, which then remain emotionally valenced.
The degree to which imagery and affective states are intertwined is further emphasized by the mood congruency effect (Blaney 1986, Matt et al. 1992, Gaddy and Ingram 2014). The most famous example of the mood congruency effect is mood congruent memory (Loeffler et al. 2013) – we are more likely to recall scary memories when we are scared, for example. But mood congruency also works in the case of mental imagery: your general mood makes it more likely that you form mental imagery that is congruent with your mood, and less likely that you form mental imagery that is not. We also encode emotionally salient stimuli in a more detailed manner, which makes it possible to form more vivid mental imagery (Yonelinas & Ritchey 2015, Hamann 2001, LaBar & Cabeza 2006, Phelps 2004).
3.3 Mental imagery and language
Throughout the history of philosophy, imagistic mental representations have been routinely contrasted with abstract, linguistic representations (see Yolton 1996 for a summary). The assumption here is that there is a sharp contrast between two different kinds of mental representations: imagistic ones, like mental imagery, and abstract, linguistic ones. On this assumption, when we talk about the importance of mental imagery in human cognition, the reach of mental imagery is limited, as there is an extra layer of mental representations, abstract, linguistic ones, which have nothing to do with mental imagery. This, in fact, may be one of the reasons why the obsessive emphasis on language in the middle of the 20th century sidelined the philosophical study of mental imagery. Either way, the overall picture then is that there is imagistic cognition and there is linguistic cognition and the two have nothing to do with each other. There have been important debates about where to draw the line between these two domains of the mind: almost all imagistic cognition (a broadly Humean picture) vs almost all linguistic cognition (a broadly Wittgensteinian picture).
Empirical findings work against a common presupposition in this debate. We now know that language processing is not completely detachable from imagistic cognition. One important set of findings comes from the ‘dual coding theory’ (Paivio 1971, 1986), according to which linguistic representations themselves are partly constituted (or at least necessarily accompanied) by mental imagery. This explains why concrete words (that are accompanied by more determinate mental imagery) are easier to recall than abstract words (that are accompanied by less determinate and in some cases very indeterminate mental imagery).
While Paivio’s dual coding theory posited the importance of mental imagery in linguistic processing to explain the behavioral differences between the recall of concrete and abstract words, we also know a lot about the ways in which linguistic labels change (and speed up) perceptual processes as well as a fair amount about the time scale of this influence. The crucial finding, both from EEG and from eye tracking studies, is that linguistic labels influence shape recognition in less than 100 milliseconds (Boutonnet and Lupyan 2015, de Groot et al. 2016, Noorman et al. 2018). This means that linguistic and imagistic representations interact at an extremely early stage of perceptual processing – by any account in early cortical processing (see Thorpe et al. 1996 and Lamme and Roelfsema 2000 for the temporal unfolding of visual processing in unimodal cases). All this indicates that imagistic and linguistic cognition are far from being independent from one another – they are deeply intertwined even at the earliest levels of perceptual processing.
3.4 Mental imagery and knowledge
Perception sometimes justifies our beliefs. If you see that it is raining outside, this may justify your belief that it is raining outside. And much of what we know is based on perception. But how about mental imagery? Can mental imagery justify our beliefs? There are two related, but independent questions here. The first one is about whether mental imagery could ever be a source of knowledge or even new information. And the second one is about reliability: if perception is colored by mental imagery, should this give us a more complex picture of perceptual justification?
The first of these questions is about whether mental imagery itself (that is not in conjunction with perception) can lead to knowledge or even to new information. Jean-Paul Sartre, for example, famously claimed that “nothing can be learned from an image that is not already known.” (Sartre 1948, 12) Since on his view “it is impossible to find in the image anything more than what was put into it,” we can conclude that “the image teaches nothing.” (Sartre 1948, 146-7). Sartre was not always making a clear distinction between imagination and mental imagery, so it is not clear whether it is imagination or mental imagery that teaches nothing. Contemporary philosophers tend to raise this issue about imagination (Langland-Hassan 2016, 2020, see also Kind and Kung 2016), but the question from our point of view is whether it is true of mental imagery. And here some examples seem to suggest that even if imagination teaches nothing, mental imagery can and does. When you want to wrap a chocolate box in wrapping paper, you look at it and form (often involuntarily, not by counting to three and voluntarily imagining) visual imagery of the wrapping paper needed and you may find your estimation of the size of the paper unexpected or surprising. Maybe it’s larger than you had assumed. Or smaller. Your estimation of the size of the paper needed can be very different before and after forming the mental imagery of the paper covering the chocolate box (and this can, of course, be still different from the size of the paper actually needed, see Gauker forthcoming for more examples of this kind). If we can generalize from this example (see Levin 2006 for discussion), then mental imagery can lead to new information even if imagination cannot.
The second question is about the reliability of perceptual justification. If (see Section 2 above) perception per se is a hybrid between sensory stimulation-driven perception and mental imagery, what does this mean for the concept of perceptual justification? Even if sensory stimulation-driven perception can justify beliefs, if mental imagery does not, then the hybrid state of the two, that is, perception per se, may not be as epistemically innocent as it has been thought (MacPherson 2012). Mental imagery is defined precisely by the lack of a direct causal link with the sensory input. In any kind of broadly externalist account of justification, this raises worries about the epistemic work that mental imagery can do (as the reliability of mental imagery is supposed to depend on the directness of the causal link between mental imagery and what the mental imagery is about). This does not mean that mental imagery does no epistemic work, as the lack of a direct causal link would be compatible with the mental imagery nonetheless carrying information about the external world reliably – and, arguably, this is exactly what happens in the case of amodal completion (Helton and Nanay 2019). But if we take the importance of mental imagery in perception seriously, we need to examine the reliability of these non-direct causal links of mental imagery.
4. Mental imagery in action
Mental imagery plays an important role in action. It is involved not only in action execution, but also in desires. Further, it can explain many of the biases in our behavior as well.
4.1 Mental imagery vs. motor imagery
We need to keep the concept of mental imagery apart from motor imagery. Motor imagery plays a crucial role in action planning and action execution, but motor imagery is not mental imagery. But how exactly this distinction is to be drawn is subject to debate.
Motor imagery has been traditionally understood as the feeling of imagining doing something. It is sometimes taken to be necessarily conscious, not just by philosophers (Currie & Ravenscroft 1997), but sometimes even by psychologists (Jeannerod 1997; see also Brozzo 2017: esp. 243-244 for an overview). And as imagining tends to be a voluntary act, motor imagery is also often taken to be voluntary. So the paradigmatic example here is closing your eyes and imagining reaching for an apple.
There are debates, however, about what this traditional, phenomenological way of zeroing in on motor imagery as the feeling of imagining doing something entails. As is acknowledged by all involved in this debate, not all imaginative episodes of doing something would count as motor imagery: you somehow need to imagine doing something from a first person, and not a third person, perspective. Marc Jeannerod, one of the most important psychologists working on both motor imagery and mental imagery, made a distinction (following the practice in sport psychology) between internal (first person) and external (third person) imagery, and only the former would count as motor imagery (the latter would be sensory imagery of me doing something, see Jeannerod 1994, p. 189).
Given that motor imagery, just like mental imagery, can be conscious or unconscious (see, for example, Osuagwu & Vuckovic 2014) and it can also be voluntary or involuntary, there has been a tendency to move away from phenomenological characterization. A more inclusive way of understanding motor imagery is supported by the methodological advice by Jeannerod, who writes: “Motor imagery would be related to motor physiology in the same way visual imagery is related to visual physiology” (Jeannerod 1994, p. 189). And here a better understanding of mental imagery can help us with defining motor imagery.
Mental imagery is the representation that results from perceptual processing that is not triggered directly by sensory input. So we get mental imagery when the first stop of perceptual processing happens without direct sensory input. Motor imagery is to the output what mental imagery is to the input. So we get motor imagery when the last stop in action processing happens without directly triggering motor output. In other words, motor imagery is the representation that results from processing in the motor system (in the motor and premotor cortices) that does not trigger motor output directly.
Another open question about the relation between motor imagery and mental imagery is about whether the former necessarily involves the latter. When we think of conscious examples of motor imagery, it seems that imagining touching the camera of my laptop involves some form of sensory mental imagery (maybe visual imagery of my finger touching the camera, or, maybe, more minimally, proprioceptive mental imagery of my finger being at a different location from where it is now). And empirical studies also show that motor imagery necessarily entails representing the sensory consequences of the imagined action (Kilteni et al. 2018).
4.2 Pragmatic mental imagery
Not only motor imagery, but mental imagery also plays an important role in action execution (see Van Leeuwen 2011). Some of our actions (in fact, most of them) are perceptually guided actions: our perceptual states trigger and guide our action. When we pick up a coffee cup to drink from it, this is a perceptually guided action: our perceptual state represents the spatial location of the cup, which then guides our reaching movement (and does so in real time: if the perceptual state changes, our reaching movement changes immediately without our noticing any of these changes, see e.g., Paulignan et al. 1991).
If, after looking at the coffee cup, you close your eyes and pick up the cup with your eyes closed, your action is not perceptually guided as you do not perceive the coffee cup anymore. It is, in this case, guided by your visual mental imagery. You look at the cup, close your eyes, and form mental imagery of the cup’s whereabouts (as well as of its other properties that are necessary for performing this action, like its weight and size), and your reaching action is guided by this ‘pragmatic mental imagery’ (Nanay 2013).
In the first case, the pragmatic mental imagery was formed on the basis of your perceptual state: you looked at the cup and then you closed your eyes, but it is this visual information that the mental imagery is based on. But pragmatic mental imagery is more than just some kind of echo of sensory input. Suppose that you are in your bedroom and it is pitch dark. You want to switch on the light, but you can’t see the switch. You are nonetheless in a position to switch it on given your memory of the room’s layout and the location of the light switch in it. In this case, your pragmatic mental imagery is formed on the basis of your memory. But pragmatic mental imagery can be triggered by completely non-perceptual means as well, for example, if I blindfold you and then explain to you in great detail where exactly the coffee cup is in front of you, how far exactly to the left and how far exactly ahead, and so on. Your pragmatic mental imagery can still guide your action, but it does so without any (visual) input. In our everyday life many of our actions, especially our routine actions, like flossing, are in fact guided by pragmatic mental imagery.
4.3 Mental imagery and desire
Desires are among the prime examples of propositional attitudes. So one would be tempted to think that desires have nothing to do with mental imagery: they are propositional and not imagistic representations. Nonetheless, one of the leading psychological theories of desire, the elaborated intrusion theory, takes mental imagery to be constitutive of desires (Kavanagh et al. 2005, May et al. 2014).
According to the elaborated intrusion theory, forming a desire is a two-step process. First, a mental state that represents the desirable state of affairs intrudes on our mind. This often happens unconsciously, and it is often not clear what triggers this intruding mental state. The second step is that this representation is elaborated with the help of mental imagery. Without this second, elaborating step, which necessarily involves mental imagery, we would not have a desire.
But one does not need to endorse the elaborated intrusion theory of desire to see the close link between desires and mental imagery. Strong occurrent desire is invariably accompanied by vivid mental imagery (Kavanagh et al. 2009). Further, stronger desires (for example, to smoke) are accompanied by more vivid mental imagery (of smoking-related scenes) (Tiffany and Drobes 1990, see also Tiffany and Hakenewerth 1991). Similarly, desire for consuming alcohol can be induced by imagining entering one’s favorite bar, ordering, holding and tasting a cold, refreshing glass of one’s favorite beer. In fact, this guided imaginative episode triggers a stronger desire than actually seeing a glass of beer (Litt and Cooney 1999). More generally, the vividness of mental imagery is correlated with the strength of one’s desire for this thing across a range of desirable substances and activities (May et al. 2008, Harvey et al. 2005, Statham et al. 2011).
Further, mental imagery of neutral scenes, for example, a rose garden, reduces desire for a cigarette in people who are trying to give up smoking (May et al. 2010). Olfactory mental imagery of unrelated odors has the same effect (Versland and Rosenberg 2007). Desire for eating chocolate can also be reduced by mental imagery of neutral scenes (Kemps and Tiggemann 2007, Harvey et al. 2005) and also by engaging involuntary mental imagery (by, for example, modelling clay out of sight (Kemps et al. 2004)). Some of these results show that mental imagery influences desires. Others show that mental imagery is a downstream consequence of desires. In short, if we manipulate mental imagery, the desire changes and if we manipulate desires, the mental imagery changes.
While the elaborated intrusion theory of desire is explicit about the role of mental imagery in desires, other influential empirically plausible accounts of desire (like the reward-based learning account (Schroeder 2004) or the attentional account (Scanlon 1998)) are also consistent with the importance of mental imagery in desires.
4.4 Mental imagery and biased behavior
Some of our behavior is biased: it goes against our reported beliefs. And often we are not fully aware of these biases. Some of these biases are about racial or gender groups. A big question not just in philosophy and psychology, but in the daily running of our society is where these biases come from and what we can do about them. There is some evidence that at least some of these biases have a lot to do with mental imagery.
First of all, a number of empirical studies show that the vividness of mental imagery biases our behavior. If you are deciding between two positive scenarios, the one that brings up the more vivid mental imagery tends to win out. And if you are deciding between two negative scenarios, the one that brings up the less vivid mental imagery tends to win out (Austin and Vancouver 1996, Trope and Liberman 2003, see also the rich literature on construal level theory and also on the effects of the vividness of mental imagery on future discounting, see Parthasarathi et al. 2017, Mok et al. 2020). Here is an example: If a smoker is deciding between smoking a cigarette and not smoking one, the smoking option brings up very vivid and detailed (and emotionally charged) mental imagery. Meanwhile, the non-smoking option doesn’t bring up any mental imagery at all, or if it does it is not at all detailed and not at all vivid (of just sitting there, not smoking). This is why smoking tends to win out, and also why it is often difficult to stop procrastinating activities like playing video games or checking our social media feed: continuing what we have been doing is represented much more vividly than stopping.
Mental imagery can also explain some famous examples of racial bias (Nanay 2021b, see also Sullivan-Bissett 2019, who describes implicit racial bias as unconscious imagination, not imagery). Subjects are more likely to misperceive a phone as a gun if a black person holds it than if a white person does so (Payne 2001). The perceptual state that represents a black person holding a phone gives rise to the mental imagery that represents a black person holding a gun. This mental imagery does not have to be conscious – and when white people rate black people as more dangerous, it is possible that the mental imagery that grounds these judgments is not conscious. The same is true of the biased behavior of standing further away from some people than others in the elevator. The importance of mental imagery in implicit bias is also supported by the fact that one of the most efficient ways of counteracting implicit bias is based on modifying the subject’s mental imagery and the efficiency of these procedures correlates with the vividness of the subjects’ mental imagery (see Lai et al. 2013, Blair et al. 2001, Blair 2002, see also Peck et al. 2013 for further relevant findings).
5. Mental imagery in art
The importance of mental imagery can be traced beyond the confines of philosophy of mind. More specifically, mental imagery plays an important role in our engagement with and appreciation of artworks, which makes mental imagery a crucial concept in aesthetics (see also Lopes 2003). While mental imagery may also play a crucial role in artistic creation, as many artists and composers like to emphasize, the focus here will be on the importance of mental imagery in engaging with artworks.
5.1 Mental imagery in the visual arts
A somewhat obvious way in which mental imagery plays a role in our engagement with visual arts follows from the simple fact that most pictorial art does not normally encompass the entire visual field. So those parts of the depicted scene that fall outside the frame could be, and very often are, represented by means of mental imagery. One famous example would be Edgar Degas, who liked to place the protagonists of his paintings in a way that only parts of them are inside the frame. The rest we need to complete by means of mental imagery. In some extreme cases (e.g., Dancers climbing the stairs, 1886-1890, Musée d’Orsay), we only see someone’s arm or the top of their head and we need to complete those parts of their body that are outside the frame by means of mental imagery. Another example is Buster Keaton, who also uses the viewer’s mental imagery of the off-screen space in his films, but normally for comical effects. One example is the first shot of his short film Cops (1922), where we see the protagonist in close up behind bars and looking depressed. The second shot reveals that he is behind an iron gate talking to a girl who does not love him back (see Burch 1973, pp. 17-31 for more examples of this kind).
But mental imagery is also often used within the picture frame. In the 1950 American film Harvey, the character played by Jimmy Stewart is an alcoholic and he hallucinates a six foot three and a half inch tall rabbit (or pooka). We don’t see anyone, but the Jimmy Stewart character clearly does. And, crucially, all the scenes with the imaginary rabbit are framed as if there really were a rabbit in them. So when we see the Jimmy Stewart character in an armchair having a conversation with Harvey, this shot is framed as if there really were a six foot tall creature next to him. This framing is aesthetically relevant and its choice clearly relies on the viewer’s mental imagery. In this example, we have a fairly good idea what we’re supposed to form mental imagery of – the Jimmy Stewart character gives a fairly accurate description of Harvey’s alleged appearance. But there are examples where we are in a much less fortunate epistemic situation. One classic example is Buñuel’s Belle de Jour, where the Chinese businessman shows a little box to the Catherine Deneuve character, who is clearly fascinated by what is inside. She sees it, he sees it, but we, the viewers, don’t. There is a humming voice coming from the box, but we never see what is inside. We have very indeterminate (crossmodally triggered) visual mental imagery of what could possibly be in the box – whatever is in the box is left intentionally indeterminate. The French film director Robert Bresson often uses mental imagery this way, so much so that he even takes this use of mental imagery to be the mark of a ‘good’ director (or, as he would put it, of a cinematographer, not merely of a director): “Don’t show all sides of the object. A margin of indefiniteness” (Bresson 1975/1977, p. 52).
Multimodal mental imagery became a hallmark of 1960s European modernist art films. In some of his films, Jean-Luc Godard used sound primarily as a prompt for triggering visual mental imagery (see Levinson 2014’s sensitive analysis of the use of sound in Masculin/Feminin (1966) from this point of view). And both Bresson and Michelangelo Antonioni used sound this way for much of their career, and they were also very explicit about this way of using sound in their theoretical writings and interviews. As Bresson said, “The eye solicited alone makes the ear impatient, the ear solicited alone makes the eye impatient. Use these impatiences” (Bresson 1975/1977, p. 28) and “A locomotive’s whistle imprints on us a whole railroad station” (Bresson 1975/1977, p. 39). And here is Antonioni giving a textbook definition of multimodal mental imagery: “When we hear something, we form images in our head automatically in order to visualize what we hear” (Antonioni 1982, p. 6). Both Bresson and Antonioni use multimodal mental imagery that is indeterminate and that is also very much emotionally laden. As a counterbalance to this high-brow overkill, it needs to be emphasized that multimodal mental imagery can also be used in a very different manner and still be aesthetically relevant. As Ridley Scott repeatedly emphasizes in his interviews about his Alien trilogy, the Alien is shown relatively rarely because having mental imagery of it is much scarier than seeing it. This general credo has been used in suspense for a long time (from Hitchcock films to Jaws). Finally, the recurring joke on Friends about the ugly naked guy who lives across the street (but whom we never see) clearly utilizes multimodal mental imagery.
5.2 Mental imagery in music
Mental imagery also plays a crucial role in our appreciation of music, primarily as a result of the importance of musical expectations, which are a form of auditory mental imagery. Expectations play a crucial role in our engagement with music (and art in general). When we are listening to a song, even when we hear it for the first time, we have some expectations of how it will continue. And when it is a tune we are familiar with, this expectation can be quite strong (and easy to study experimentally). When we hear Ta-Ta-Ta at the beginning of the first movement of Beethoven’s Fifth Symphony in C minor, Op. 67 (1808), we will strongly anticipate the closing Taaaam of the Ta-Ta-Ta-Taaaam. Many of our expectations are fairly indeterminate: when we are listening to a musical piece we have never heard before, we will still have some expectations of how a tune will continue, but we don’t know what exactly will happen. We can rule out that the violin glissando will continue with the sounds of a beeping alarm clock (unless it’s a really avant-garde piece…), but we can’t predict with great certainty how exactly it will continue. Our expectations are malleable and dynamic: they change as we listen to the piece.
Expectations are mental states that are about how the musical piece will unfold. So they are future-directed mental states. But this leaves open just what kind of mental states they are – how they are structured, how they represent this upcoming future event and so on (see Judge and Nanay 2021 for an overview of the options and the history of this question). At least some forms of expectations in fact count as mental imagery. And musical expectations (of the kind involved in examples like the Ta-Ta-Ta-Taaaam) count as auditory temporal mental imagery: they are auditory representations that result from perceptual processes that are not directly triggered by the auditory input. The listener forms mental imagery of the fourth note (‘Taaaam’) on the basis of the experience of the first three (‘Ta-Ta-Ta’) (there is a lot of empirical evidence that this is in fact what happens – see Yokosawa et al. 2013, Kraemer et al. 2005, Zatorre and Halpern 2005, Herholz et al. 2012, Leaver et al. 2009). This mental imagery may or may not be conscious. But if the actual ‘Taaaam’ diverges from the way our mental imagery represents it (if it is delayed, or altered in pitch or timbre, for example), we notice this divergence and experience it as salient as a result of the mismatch between the experience and the mental imagery that preceded it.
The Ta-Ta-Ta-Taaaam example is a bit simplified, so here is a real-life and very evocative case study, an installation by the British artist Katie Paterson. The installation is an empty room with a grand piano in it, which plays automatically. It plays a truncated version of Beethoven’s Moonlight Sonata. The title of the installation is ‘Earth-Moon-Earth (Moonlight Sonata Reflected From The Surface of The Moon)’ (2007). Earth-Moon-Earth is a form of transmission (between two locations on Earth), where messages in Morse code are beamed up to the Moon and reflected back to Earth. While this is an efficient way of communicating between two far-away (Earth-based) locations, some information is inevitably lost (mainly because some of the signal does not get reflected back but is absorbed in the Moon’s craters). In ‘Earth-Moon-Earth (Moonlight Sonata Reflected From The Surface of The Moon)’ (2007) the piano plays the notes that did get through the Earth-Moon-Earth transmission system, which is most of the notes, but some notes are skipped. When you listen to the music the piano plays in this installation, if you know the piece, your auditory mental imagery is constantly active, filling in the gaps where the notes are skipped.
5.3 Mental imagery in literature
Reading a novel tends to lead to mental imagery in a variety of sense modalities. This triggering of mental imagery is typically involuntary: you do not need to count to three and voluntarily conjure up the mental imagery of the protagonist’s face; instead, you have involuntary mental imagery episodes somewhat reminiscent of flashbacks. While this kind of mental imagery is often visual (when you have imagery of the protagonist’s face or the layout of the room where they are), it can also be auditory (of the protagonist’s tone of voice, for example), olfactory or even gustatory (see Starr 2013 for a wide-ranging analysis with an emphasis on multimodal mental imagery and Stokes 2019 for the role such mental imagery plays in reading fictional works). Further, the more vivid the reader’s mental imagery is, the more likely it is that information from the novel is imported into the reader’s beliefs about the real world (Green and Brock 2000).
At the end of the first book of In Search of Lost Time, Marcel Proust gives a brief but very sophisticated account of how words trigger mental imagery, which is also indicative of the way Proust himself manipulates the reader’s mental imagery (Proust 1913/1928). He makes a distinction between names and words and argues that names trigger more specific or more determinate mental imagery than words. In both cases, the name or word leads to mental imagery, but then, in turn, this mental imagery influences or colors the name or word when we next encounter it. So throughout the unfolding of the novel, names/words and the mental imagery they occasion evolve in parallel, influencing each other.
Other writers also actively manipulate the reader’s mental imagery. George Orwell points out the importance of mental imagery in understanding metaphors when he says in Politics and the English Language that “The sole aim of a metaphor is to call up a visual image”. We might add to this that this imagery is often not visual; it can be auditory, olfactory, etc. And here is a final example from the third part of Roberto Bolaño’s novel 2666 (‘The Part about Fate’). This part of the book introduces a New York-based journalist, Oscar Fate. After about 80 pages of description of Fate’s life in New York City, it is revealed that he is in fact African-American. This comes after very explicit nudges to form mental imagery of him as Caucasian, confronting the readers with their implicit racial bias (see also Section 4.4 above).
5.4 Mental imagery in conceptual art
While discussions of mental imagery crop up in most fields of aesthetics and art history (including by some of the most influential art historians, like George Kubler, see Kubler 1987), the role of mental imagery is probably the most salient if we turn to conceptual art. Many conceptual artworks actively try to engage our mental imagery in an unexpected manner. Here are two illustrative (and famous) examples, but the point can be generalized.
Marcel Duchamp’s L.H.O.O.Q. Rasée (1965) is a picture that is perceptually indistinguishable from a faithful reproduction of Leonardo’s Mona Lisa. But Duchamp earlier made another picture (L.H.O.O.Q.) where he drew a moustache and beard on the picture of Mona Lisa. Duchamp’s L.H.O.O.Q. Rasée (‘rasée’ means ‘shaven’) is a reference to this earlier picture and we, presumably, see it differently from the way we see Leonardo’s original: the missing moustache and beard are part of our experience, whereas they are not when we look at Leonardo’s original. And it is difficult to see how we can describe our experience of L.H.O.O.Q. Rasée without some reference to the mental imagery of the missing beard and moustache. What is interesting in this example is that the mental imagery of the beard and moustache is influenced in a top-down manner not just by our prior knowledge (about how the world is) but also by our prior art historical knowledge.
The second example is Robert Rauschenberg’s Erased de Kooning Drawing (1953), which is just what it says it is: all we see is an empty sheet of paper (with hardly visible traces of the erased drawing on it). Again, it is difficult to look at this artwork without trying to discern what drawing might have been there before Rauschenberg erased it. And this involves trying to conjure up mental imagery of the original drawing. These are two classic examples, but there are more. All of Ai Weiwei’s works, for example, rely heavily on our mental imagery.
Not all works of conceptual art evoke mental imagery this way. One exception would be Robert Barry’s All the things I know, which is nothing but the following sentence written on the gallery wall in simple block letters: “All the things I know but of which I am not at the moment thinking – 1:36 PM; June 15, 1969”. It would be difficult to argue that this work has much interest in engaging the viewer’s mental imagery. But it is not easy to find an example of a conceptual artwork where mental imagery plays no role. So in the vast majority of conceptual artworks, mental imagery is a necessary feature of appreciating the artwork.