For the past two decades, my colleagues and I have studied the neural foundation for conceptual representations of common objects, actions, and their properties. This work has been guided by a framework that I have previously referred to as the “sensory–motor model” (Martin, 1998; Martin, Ungerleider, & Haxby, 2000), and that I will refer to here by the acronym GRAPES (standing for “grounding representations in action, perception, and emotion systems”). This framework is a variant of the sensory/functional model outlined by Warrington, Shallice, and colleagues in the mid-1980s (Warrington & McCarthy, 1987; Warrington & Shallice, 1984) that has dominated neuropsychological (e.g., Damasio, Tranel, Grabowski, Adolphs, & Damasio, 2004; Humphreys & Forde, 2001), cognitive (e.g., Cree & McRae, 2003), and computational (e.g., McClelland & Rogers, 2003; Plaut, 2002) models of concept representation (see also Allport, 1985).

In this article, I will describe the GRAPES model and discuss its implications for understanding how conceptual knowledge is organized in the human brain.

Preliminary concerns

For the present purposes, an object concept refers to the information an individual possesses that defines a basic-level object category (roughly equivalent to the name typically assigned to an object category, such as “dog,” “hammer,” or “apple”; see Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976, for details). Concepts play a central role in cognition because they eliminate the need to rediscover or relearn an object’s properties with each encounter (Murphy, 2002). Identifying an object or a word as a “hammer” allows us to infer that this is an object that is typically made of a hard substance, grasped in one hand, used to pound nails—that it is, in fact, a tool—and so forth. It takes only brief reflection (or a glance at a dictionary) to realize that object information is not limited to perception-, action-, or emotion-related properties. We know, for example, that “dogs” like to play fetch, carpenters use “hammers,” and “apples” grow on trees. In fact, most of the information that we possess about objects is this type of associative or encyclopedic knowledge. This knowledge is typically expressed verbally, is unlimited (there is no intrinsic limit on how much information we can acquire), and is often idiosyncratic (e.g., some people know things about “dogs” that others do not). In contrast, another level of representation, often referred to as semantic or conceptual “primitives,” is accessed automatically, constrained in number, and universal to everyone who possesses the concept. Conceptual primitives are object-associated properties that underpin our ability to quickly and efficiently identify objects at the basic-category level (e.g., as a “dog,” “hammer,” or “apple”), regardless of the modality of presentation (visual, auditory, tactile, or internally generated) or the stimulus format (verbal, nonverbal). Conceptual primitives provide a scaffolding or foundation that supports both the explicit retrieval of object-associated information (e.g., enabling us to answer “orange” when asked, What color are carrots?) and access to our large stores of associative/encyclopedic object knowledge (e.g., allowing us to answer “rabbit” when asked, What animal likes to eat carrots?). For a more detailed discussion of this and related issues, see Martin (1998, 2007, 2009).

The GRAPES model

The central claim is that information about the salient properties of an object—such as what it looks like, how it moves, how it is used, as well as our affective response to it—is stored in our perception, action, and emotion systems. I use the terminology “perception, action, and emotion systems,” rather than “sensory–motor regions,” deliberately, to guard against an unfortunate and unintended consequence of the sensory–motor terminology: it has given some the mistaken impressions that concepts could be housed in primary sensory and motor cortices (e.g., V1, S1, A1, M1) and that an object concept could be stored in a single brain region (e.g., a “tool” region). Nothing could be further from the truth. As is described below, my position is, and has always been, that the regions where we store information about specific object-associated properties are located within (i.e., overlap with) perceptual and action systems, specifically excluding primary sensory–motor regions (Martin, Haxby, Lalonde, Wiggs, & Ungerleider, 1995; however, the role of primary sensory and motor regions in conceptual-processing tasks is an important issue that will be addressed below).

It is also assumed that this object property information is acquired and continually updated through innately specified learning mechanisms (for a discussion, see Caramazza & Shelton, 1998; Carey, 2009). These mechanisms allow for the acquisition and storage of object-associated properties—form, color, motion, and the like. Although the architecture and circuitry of the brain dictate where these learning mechanisms are located, they are not necessarily tied to a single modality of input (i.e., they are property-specific, not modality-specific). For example, a mechanism specialized for learning about object shape or form will typically work upon visual input because that is the modality through which object form information is commonly acquired. As a result, this mechanism will be located in the ventral occipitotemporal visual object-processing stream. However, as has been convincingly demonstrated by studies of typically developing (Amedi, Malach, Hendler, Peled, & Zohary, 2001; Amedi, Jacobson, Hendler, Malach, & Zohary, 2002) as well as congenitally blind (Amedi et al., 2007; Pietrini et al., 2004) individuals, this mechanism can work upon tactile input as well. Thus, information about the physical shape or form of objects will be stored in the same place (ventral occipitotemporal cortex) in both normally sighted and blind individuals, regardless of the modality through which that information was acquired (see also Mahon et al., 2009; Noppeney, Friston, & Price, 2003). This information can likewise be accessed through multiple modalities (e.g., information about how dogs look is accessed automatically when we hear a bark, or when we read or hear the word “dog”; Tranel et al., 2003a).

There are two major consequences of this formulation. Firstly, from a cognitive standpoint, it provides a potential solution to the grounding problem: How do mental representations become connected to the things they refer to in the world (Harnad, 1990)? Within GRAPES and related frameworks, representations are grounded by virtue of being situated within (i.e., partially overlapping with) the neural system that supports perceiving and interacting with our external and internal environments.

Secondly, from a neurobiological standpoint, it provides a strong, testable—and easily falsifiable—claim about the spatial organization of object information in the brain. Not only is object property information distributed across different locations, but these locations are also highly predictable on the basis of our knowledge of the spatial organization of the perceptual, action, and affective processing systems. Conceptual information is not spread across the cortex in a seemingly random, arbitrary fashion (Huth, Nishimoto, Vu, & Gallant, 2012), but rather follows a systematic plan.

The representation of object-associated properties: The case of color

According to the GRAPES model, object property information is stored within specific processing streams, but downstream from primary sensory, and upstream from motor, cortices. The overwhelming majority of functional brain-imaging studies support this claim (Kiefer & Pulvermüller, 2012; Martin, 2009; Thompson-Schill, 2003). Here I will concentrate on a single property, color, to illustrate the main findings and points.

Early brain-imaging studies showed that retrieving the name of a color typically associated with an object (e.g., “yellow” in response to the word “pencil”), relative to retrieving a word denoting an object-associated action (e.g., “write” in response to the word “pencil”), elicited activity in a region of the fusiform gyrus in ventral temporal cortex anterior to the region in occipital cortex associated with perceiving colors (Martin et al., 1995; and see Chao & Martin, 1999, and Wiggs, Weisberg, & Martin, 1999, for similar findings). Converging evidence to support this claim has come from studies of color imagery generation in control subjects (Howard et al., 1998) and in color–word synesthetes in response to heard words (Paulesu et al., 1995).

Importantly, these findings were also consistent with clinical studies documenting a double dissociation between achromatopsia—acquired color blindness concurrent with a preserved ability to generate color imagery (commonly associated with lesions of the lingual gyrus in the occipital lobe; e.g., Shuren, Brott, Schefft, & Houston, 1996)—and color agnosia—impaired knowledge of object-associated colors concurrent with normal color vision (commonly associated with lesions of posterior ventral temporal cortex, although these lesions can also include occipital cortex; e.g., Miceli et al., 2001; Stasenko, Garcea, Dombovy, & Mahon, 2014).

We interpreted our findings as supporting a grounded-cognition view based on the fact that the region active when retrieving color information was anatomically close to the region previously identified as underpinning color perception, whereas retrieving object-associated action words yielded activity in lateral temporal areas close to the site known to support motion perception (see Martin et al., 1995, for details). However, these data could just as easily be construed as being consistent with “amodal” frameworks that maintain that conceptual information is autonomous or separate from sensory processing (e.g., Wilson & Foglia, 2011). The grounded-cognition position maintains that the neural substrates for conceptual, perceptual, and sensory processing are all part of a single, anatomically broad system supporting both perceiving and knowing about object-associated information. Thus, evidence in support of grounded cognition would require showing functional overlap between, for example, the neural systems supporting sensory/perceptual and conceptual processing of color.

Although early attempts to demonstrate such a link failed (Chao & Martin, 1999), subsequent investigations have yielded strong, converging evidence to support that claim. Beauchamp, Haxby, Jennings, and DeYoe (1999) showed that when color-selective cortex was mapped by having subjects passively view colored versus grayscale stimuli, as had typically been done in previous studies (e.g., Chao & Martin, 1999; McKeefry & Zeki, 1997; Zeki et al., 1991), neural activity was restricted to the occipital cortex. However, when color-selective cortex was mapped using a more demanding task requiring subjects to make subtle judgments about differences in hue (modeled after the classic Farnsworth–Munsell 100-Hue Test used in the clinical evaluation of color vision), activity extended downstream from occipital cortex to the fusiform gyrus in ventral posterior temporal cortex (Beauchamp et al., 1999). We replicated this finding and further observed that this downstream region of the fusiform gyrus was also active when subjects retrieved information about object-associated color using a verbal, property-verification task (Simmons et al., 2007). These data provided the best of both worlds: continued support for the double dissociation between color agnosia and achromatopsia (because the color perceptual task, but not the color conceptual task, activated occipital cortex), now coupled with evidence consistent with grounded cognition (because both tasks activated the same downstream region of the fusiform gyrus) (Simmons et al., 2007; see Martin, 2009, and Stasenko et al., 2014, for discussions).
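For readers who want the mapping logic in concrete form, the following minimal sketch (Python, simulated data) illustrates a two-condition contrast of the kind used to localize color-selective cortex; the design, dimensions, and variable names are hypothetical, and the cited studies' actual pipelines include additional steps (hemodynamic modeling, nuisance regression, corrected statistics).

```python
import numpy as np

# A minimal, illustrative two-condition contrast (simulated data): estimate a
# per-voxel effect of "colored > grayscale" blocks with ordinary least squares.
# A real analysis would also convolve the regressor with a hemodynamic response
# function, add drift/motion regressors, and compute corrected t-statistics.

rng = np.random.default_rng(0)
n_vols, n_voxels = 200, 500                     # hypothetical scan dimensions

# Boxcar regressor: 10-volume colored blocks alternating with grayscale blocks.
task = np.tile(np.repeat([1.0, 0.0], 10), 10)   # 1 = colored, 0 = grayscale
X = np.column_stack([task, np.ones(n_vols)])    # design matrix: task + intercept

Y = rng.standard_normal((n_vols, n_voxels))     # simulated BOLD time series

B, *_ = np.linalg.lstsq(X, Y, rcond=None)       # per-voxel betas, shape (2, n_voxels)
effect = B[0]                                   # "colored > grayscale" effect map

# Voxels surviving a threshold would be labeled color-selective.
print("candidate color-selective voxels:", int((effect > 0.2).sum()))
```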

Additional supporting evidence has come from a completely different source—electrophysiological recording and stimulation of the human cortex. Recording from posterior brain regions prior to neurosurgery, Murphey, Yoshor, and Beauchamp (2008) identified a site in the fusiform gyrus that not only was color-responsive, but was preferentially tuned to viewing a particular blue-purple color. Moreover, when electrical stimulation was applied to that site, the patient reported vivid, blue-purple color imagery (see Murphey et al., 2008, for details). The location of this region corresponded closely to the region active in previous imaging studies of color information retrieval, and, as is illustrated in Fig. 1, corresponded remarkably well to the region active during both perceiving and retrieving color information in the Simmons et al. (2007) study.

Fig. 1

Regions of ventral occipitotemporal cortex responsive to perceiving and knowing about color. (A) Ventral view of the right hemisphere of a single patient. The red dot shows the location of the electrode that responded most strongly to blue-purple color and that produced blue-purple visual imagery when stimulated (reprinted with permission; see Murphey et al., 2008, for details). (B) Ventral view of the left hemisphere from the group study on perceiving and knowing about color (Simmons et al., 2007). Regions active when distinguishing subtle differences in hue are shown in yellow. The black circle indicates the approximate location of the lingual gyrus region active when passively viewing colors. The region responding to both perceiving and retrieving information about color is shown in red. Note the close correspondence between that region and the location of the electrode in panel A

Thus, in support of the grounded-cognition framework, these data indicate that the processing system supporting color perception includes both lower-level regions that mediate the conscious perception—or more appropriately, the “sensation” of color—and higher-order regions that mediate both perceiving and storing color information. Moreover, as will be discussed below, these posterior and anterior regions are in a dynamic, interactive state to support contextual, task-dependent demands.

The effect of context 1: Conceptual task demands influence responses in primary sensory (color) cortex

The Simmons et al. (2007) study using the modified version of the Farnsworth–Munsell 100-Hue Test demonstrated that increasing perceptual processing demands resulted in activity that extended downstream from low-level into higher-order color-processing regions. Thompson-Schill and colleagues have provided evidence that the reverse effect also holds (Hsu, Frankland, & Thompson-Schill, 2012); that is, increasing conceptual-processing demands can produce activity that feeds back upstream into early, primary-processing areas in order to solve the task at hand. These investigators also used the modified Farnsworth–Munsell 100-Hue Test to map color-responsive cortex. However, in contrast to the property-verification task used by Simmons and colleagues, which required a “yes/no” response to such probes as “eggplant–purple” (Simmons et al., 2007), the study by Hsu et al. (2012) used a conceptual-processing task requiring subjects to make subtle distinctions in hue, thereby more closely matching the demands of the color perception task (e.g., which object is “lighter”: lemon or basketball? see Hsu et al., 2012, for details). Under these conditions, both the color perception and color knowledge tasks yielded overlapping activity in a region of the lingual gyrus in occipital cortex associated with the sensory processing of color. Moreover, this effect seems to be tied to the similarity in the demands of the perceptual and conceptual tasks, since previous work by these investigators had shown that simply making the conceptual task more attention-demanding increased activity in the fusiform, but not the lingual, gyrus (Hsu, Kraemer, Oliver, Schlichting, & Thompson-Schill, 2011). These findings suggest that, in order to meet specific task demands, higher-level regions in the fusiform gyrus that store information about object-associated color can reactivate early, lower-level areas in occipital cortex that underpin the sensory processing of color (and see Amsel, Urbach, & Kutas, 2014, for more evidence of the tight linkage between low-level perceptual and high-level conceptual processes in the domain of color).

As will be discussed next, low-level sensory regions can also show effects of conceptual processing when the modulating influence arises from the demands of our internal, rather than the external, environment.

The effect of context 2: The body’s homeostatic state influences responses in primary sensory (gustatory) cortex to pictures of appetizing food

A number of functional brain-imaging studies have shown that identifying pictures of appetizing foods activates a site located in the anterior portion of the insula (as well as other brain areas, such as orbitofrontal cortex; e.g., Killgore et al., 2003; Simmons, Martin, & Barsalou, 2005; see van der Laan, de Ridder, Viergever, & Smeets, 2011, for a review). Because the human gustatory system sends inputs to the insula, we interpreted this activity as reflecting inferences about taste generated automatically when viewing food pictures (Simmons et al., 2005). We have now obtained direct evidence in support of this proposal by mapping neural activity associated with a pleasant taste (apple juice, relative to a neutral liquid solution) and inferred taste (images of appetizing foods, relative to nonfood pictures) (Simmons et al., 2013). Juice delivery elicited activity in primary gustatory cortex, located in the mid-dorsal region of the insula (Small, 2010), as well as in the more anterior region of the insula identified in the previous studies of appetizing-food picture identification (representing inferred taste). Viewing pictures of appetizing foods yielded activity in the anterior, but not mid-dorsal, insula. Thus, these results followed the same pattern as our study on perceiving and knowing about color (Simmons et al., 2007): whereas gustatory processing activated both primary (mid-dorsal insula) and more anterior insula sites, higher-order representations associated with viewing pictures of food were limited to the more anterior region of insular cortex.

However, a unique feature of our study was that, because it was part of a larger investigation of dietary habits, we were able to acquire data on our subjects’ metabolic states immediately prior to the scanning session. Analyses of those data revealed that the amount of glucose circulating in peripheral blood was negatively correlated with the neural response to food pictures in the mid-dorsal, primary gustatory region of the insula—the lower the glucose level, the stronger the insula response. This unexpected finding indicated that bodily input could modulate the brain’s response to visual images of one category of objects (appetizing foods) but not others (nonfood objects; see Simmons et al., 2013, for details). When the body’s energy resources are low (as indexed by low glucose levels), pictures of appetizing foods become more likely to activate primary gustatory cortex, perhaps as a signal to act (i.e., to obtain food; more on this later). Moreover, this modulatory effect of glucose on the neural response to food pictures occurred in primary gustatory cortex—an area, like primary color-responsive cortex in the occipital lobe, assumed not to be involved in processing higher-order information (Fig. 2).

Fig. 2

Regions of insular cortex responsive to perceived and inferred taste: Sagittal view of the left hemisphere showing regions in the insular cortex responsive to a pleasant taste (green) and to viewing pictures of appetizing foods (blue). The histogram shows activation levels for food and nonfood objects in the anterior insula region responsive to taste (red area). The graph shows the level of each subject’s response in primary gustatory cortex (mid-dorsal insula, green) as a function of peripheral blood glucose level. The correlation between glucose and the mid-dorsal insula response to food pictures was significant (r = –.51) and significantly stronger than the corresponding correlation for nonfood objects (r = –.04; see Simmons et al., 2013, for details)
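The core of the glucose analysis is an across-subjects correlation, sketched minimally below (Python, simulated values; the subject count, units, and ROI responses are hypothetical, not the data of Simmons et al., 2013).

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative sketch with simulated values: correlate each subject's peripheral
# blood glucose level with that subject's mean response to food pictures in an
# a priori insula region. All numbers and variable names here are hypothetical.

rng = np.random.default_rng(1)
n_subjects = 30
glucose = rng.normal(90.0, 12.0, n_subjects)            # hypothetical mg/dL values

# Hypothetical per-subject ROI responses (e.g., beta weights) to each category;
# the food response is built to decline as glucose rises, mimicking the pattern.
food_resp = -0.03 * glucose + rng.normal(0.0, 0.4, n_subjects)
nonfood_resp = rng.normal(0.0, 0.4, n_subjects)

r_food, p_food = pearsonr(glucose, food_resp)           # expect a negative r
r_nonfood, p_nonfood = pearsonr(glucose, nonfood_resp)  # expect r near zero
print(f"food:    r = {r_food:+.2f} (p = {p_food:.3f})")
print(f"nonfood: r = {r_nonfood:+.2f} (p = {p_nonfood:.3f})")
```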

Overall, these findings suggest a dynamic, interactive relationship between lower-level sensory and higher-order conceptual processing components of perceptual processing streams. Activity elicited in higher-order processing areas (fusiform gyrus for color, anterior insula for taste) may reflect the retrieval of properties associated with stable conceptual representations (invariant representations needed for understanding and communicating). In contrast, feedback from these regions to primary, low-level sensory processing areas may reflect contextual effects as a function of specific task requirements (as in the case of color) or bodily states (as in the case of taste). Neural activity elicited during conceptual processing is determined by both the content of the information retrieved and the demands of our external and internal environments.

What does overlapping activity mean?

The goal of these studies was to determine whether the neural activity selectively associated with retrieving object property information overlapped with the activity identified (independently localized) by a sensory or motor task. This approach has been used successfully multiple times. Some recent examples include showing that reading about motion activates motion-processing regions (Deen & McCarthy, 2010; Saygin, McCullough, Alac, & Emmorey, 2010), viewing pictures of graspable objects activates somatosensory cortex (Smith & Goodale, 2015), and viewing pictures of sound-implying objects (musical instruments, animals) activates auditory cortex (using an anatomical rather than a functional localizer; Meyer et al., 2010). The implication of these findings is that sensory/perceptual and conceptual processes are tightly linked. Demonstrating that retrieving information about color partially overlaps with regions active when perceiving color licenses conclusions about where object property information is stored in the brain: this information is stored right in the processing system that was active when the information was acquired and updated. The alternative would be, for example, that we learn about the association between a particular object and its color in one place and then ship that information off to a different location for storage. The neuroimaging data provide clear evidence against that scenario.
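The overlap logic itself is simple, as the following minimal sketch illustrates (Python, simulated statistical maps; the grid size and threshold are hypothetical, and real conjunction analyses involve registration to a common space and corrected thresholds).

```python
import numpy as np

# Illustrative sketch (simulated maps) of the overlap, or conjunction, logic:
# voxels that exceed threshold in BOTH an independent perceptual localizer and
# a conceptual-retrieval contrast count as functionally overlapping. Grid size,
# threshold, and map values are all hypothetical.

rng = np.random.default_rng(2)
shape = (40, 48, 40)                          # hypothetical voxel grid

z_perceive = rng.standard_normal(shape)       # stand-in for a localizer z-map
z_retrieve = rng.standard_normal(shape)       # stand-in for a retrieval z-map

z_thresh = 2.3                                # example voxelwise threshold
overlap = (z_perceive > z_thresh) & (z_retrieve > z_thresh)

print("perceive voxels:", int((z_perceive > z_thresh).sum()))
print("retrieve voxels:", int((z_retrieve > z_thresh).sum()))
print("overlap voxels: ", int(overlap.sum()))
```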

The fact that overlapping activity is associated with perceptual and conceptual task performance does not mean, however, that the representations underpinning these processes—or their neural substrates—are identical. In fact, although functional brain-imaging data cannot address this issue (see Footnote 1), it is highly likely that the representations are substantially different. Perceiving, imagining, and knowing, after all, are very different things, and so must be their neural instantiations. Even at the level of the neural column, bottom-up and top-down inputs show distinct patterns of laminar connectivity (e.g., Felleman & Van Essen, 1991; Foxworthy, Clemo, & Meredith, 2013) and rely on different oscillatory frequencies (Buffalo, Fries, Landman, Buschman, & Desimone, 2011; van Kerkoerle et al., 2014). Nevertheless, the fact that perceptual and conceptual representations differ leaves open the possibility that their formats are the same.

Content and format

There seems to be strong, if not unanimous, agreement about the content and relative location in the brain of perception- and action-related object property information. Hotly debated, however, are the functional significance of this information (see below) and, most especially, its format. Is conceptual information stored in a highly abstract, “amodal,” language-like propositional format? Or is it stored in a depictive, iconic, picture-like format? The chief claim of many advocates of embodied and/or grounded cognition is that object and action concepts are represented exclusively in depictive, modality-specific formats (e.g., Barsalou, 1999; Glenberg & Gallese, 2012; Zwaan & Taylor, 2006; and see Carey, 2009, for a discussion from a nonembodied perspective of why the representational format of all of “core cognition” is likely to be iconic). Others have argued forcefully that the representations are abstract, amodal, and disembodied (although necessarily interactive with sensory–motor information; see, e.g., the “grounding by interaction” hypothesis proposed by Mahon & Caramazza, 2008).

The importance of the distinction between the content and the format of mental representations was raised by Caramazza and colleagues (Caramazza, Hillis, Rapp, & Romani, 1990) in their argument against the “multiple, modality-specific semantics hypothesis” advocated by Shallice (Shallice, 1988; and see Shallice, 1993, for his reply, and Mahon, 2015, for more on the format argument). Before that, the issue of format was, and it continues to be, the central focus of the lengthy debate over whether the format of mental imagery is propositional (e.g., Pylyshyn, 2003) or depictive (e.g., Kosslyn, Thompson, & Ganis, 2006).

The problem, however, is that we do not know how to determine the format of a representation (if we did, we would not still be debating the issue). And knowing where in the brain information is stored, and/or what regions are active when that information is retrieved, offers no help at all. Even in the earliest, lowest-level regions of the visual-processing stream, the format could be depictive on the way up and propositional on the way back down. What we do know is that at the biological level of description, mental representations are in the format of the neural code. No one knows what that is, no one knows how it maps onto the cognitive descriptions of representational formats (i.e., amodal, propositional, depictive, iconic, and the like), and no one knows whether those descriptions are even appropriate for such a mapping. What is missing from this debate are agreed-upon procedures for determining the format of a representation. Until such procedures exist, the format question will remain moot: it has no practical significance.

Object property information is integrated within category-specific neural circuits: The case of “tools”

Functional brain imaging has provided a major advance in our thinking about how the brain responds to the environment by showing that viewing objects triggers a cascade of activity in multiple brain regions that, in turn, represent properties associated with that category of objects. Viewing faces, for example, elicits activity that extends beyond the fusiform face area to regions associated with perceiving biological motion (the posterior region of the superior temporal sulcus) and affect (the amygdala), even when the face images are static and posed with neutral expressions (Haxby, Hoffman, & Gobbini, 2000). Similarly, viewing images of common tools (objects with a strong link between how they are manipulated and their function; Mahon et al., 2007) elicits activity that extends beyond the ventral object-processing stream to include left hemisphere regions associated with object motion (posterior middle temporal gyrus) and manipulation (intraparietal sulcus, ventral premotor cortex) (Beauchamp, Lee, Haxby, & Martin, 2002, 2003; Chao & Martin, 2000; Grafton, Fadiga, Arbib, & Rizzolatti, 1997; Kellenbach, Brett, & Patterson, 2003; Mahon et al., 2007; Mahon, Kumar, & Almeida, 2013; and see Chouinard & Goodale, 2010, for a review).

Thus, specific object categories are associated with unique networks or circuits composed of brain regions that code for different object properties. There are several important points to note about these circuits. Firstly, they reflect some, but certainly not all, of the properties associated with a particular category. Tools and animals, for example, have distinctive sounds (hammers bang, lions roar), yet the auditory system is not automatically engaged when viewing or naming tools or animals. Certain properties are more salient than others for representing a category of objects—a result that agrees well with behavioral data (e.g., Cree & McRae, 2003). Secondly, the regions constituting a circuit do not come online in piecemeal fashion as they are required to perform a specific task, but rather seem to respond in an automatic, all-or-none fashion, as if they were part of the intrinsic, functional neural architecture of the brain. Indeed, studies of spontaneous, slowly fluctuating neural activity recorded when subjects are not engaged in performing a task (i.e., task-independent or resting-state functional imaging) strongly support this possibility. These studies have shown that during the so-called resting state, there is strong covariation among the neural signals spontaneously generated from each of the regions active when viewing and identifying certain object categories, including faces (O’Neil, Hutchison, McLean, & Köhler, 2014; Turk-Browne, Norman-Haignere, & McCarthy, 2010), scenes (Baldassano, Beck, & Fei-Fei, 2013; Stevens, Buckner, & Schacter, 2010), and tools (Hutchison, Culham, Everling, Flanagan, & Gallivan, 2014; Simmons & Martin, 2012; Stevens, Tessler, Peng, & Martin, 2015; see Fig. 3). Certain object categories are associated with activity in a specific network of brain regions, and these regions are in constant communication, over and above the current task requirements.

Fig. 3

Intrinsic circuitry for perceiving and knowing about “tools.” (A) Task-dependent activations: Sagittal view of the left hemisphere showing regions in posterior middle temporal gyrus, posterior parietal cortex, and premotor cortex that are more active when viewing tools than when viewing animals (blue regions, N = 34) (Stevens et al., 2015). (B) Task-independent data: Covariation of slowly fluctuating neural activity recorded at “rest” in a single subject (blue regions). Seeds were in the medial region of the left fusiform gyrus and in the right lateral fusiform gyrus (not shown), identified by the comparison of tools versus animals, respectively (independent localizer). Resting-state time series in the colored regions were significantly more correlated with fluctuations in the left medial fusiform gyrus than with those in the right lateral fusiform gyrus (for details, see Stevens et al., 2015). (C) Covariation of slowly fluctuating neural activity recorded at “rest” in a group study (blue regions, N = 25). Seeds were in the left posterior middle temporal gyrus and the right posterior superior temporal sulcus, identified by independent localizer scans (see Simmons & Martin, 2012, for details)
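The seed-based analysis summarized in Fig. 3 can be sketched minimally as follows (Python, simulated data; the ROI indices and array sizes are hypothetical, and published analyses add preprocessing such as filtering and nuisance regression).

```python
import numpy as np

# Illustrative sketch (simulated data) of seed-based resting-state connectivity:
# correlate the mean time series of a seed ROI (e.g., a tool-preferring fusiform
# region defined by an independent localizer) with every voxel's spontaneous
# activity. ROI indices and array sizes are hypothetical.

rng = np.random.default_rng(3)
n_vols, n_voxels = 300, 1000
data = rng.standard_normal((n_vols, n_voxels))   # "resting-state" time series

seed_idx = np.arange(10)                         # hypothetical seed-ROI voxels
seed_ts = data[:, seed_idx].mean(axis=1)         # mean seed time series

# Pearson r between the seed and every voxel: z-score, then average the products.
z_data = (data - data.mean(axis=0)) / data.std(axis=0)
z_seed = (seed_ts - seed_ts.mean()) / seed_ts.std()
conn_map = z_data.T @ z_seed / n_vols            # one r value per voxel

# Voxels with high r covary with the seed at rest, as in Fig. 3B and 3C.
print("strongest coupling, r =", float(conn_map.max()))
```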

Although the function of this slowly fluctuating, spontaneous activity remains largely unknown, one possibility is that it allows information about different properties to be shared across regions of the network. If so, then each region may act as a convergence zone (Damasio, 1989; Simmons & Barsalou, 2003) or “hub,” representing its primary property and, to a lesser extent, the properties of one or more of the other regions in the circuit—depending, perhaps, on its spatial relation to the other regions in the circuit (Power, Schlaggar, Lessov-Schlaggar, & Petersen, 2013). The more centrally located a region, the more hub-like its function. This seems to be the case for tools, for which a lesion of the most centrally located component of its circuitry, the posterior region of the left middle temporal gyrus, produces a category-specific knowledge deficit for tools and their associated actions (Brambati et al., 2006; Campanella, D’Agostini, Skrap, & Shallice, 2010; Mahon et al., 2007; Tranel, Damasio, & Damasio, 1997; Tranel, Manzel, Asp, & Kemmerer, 2008; see Footnote 2).

According to this view, information about a property is not strictly localized to a single region (as is suggested by the overlap approach), but rather is a manifestation of local computations performed in that region as well as a property of the circuit as a whole (cf. Behrmann & Plaut, 2013). Moreover, regions vary in their global connectivity, or “hubness” (i.e., the extent to which a region is interconnected with other brain regions; see Buckner et al., 2009; Cole, Pathak, & Schneider, 2010; Gotts et al., 2012; and Power et al., 2013, for approaches and data on the brain’s hub structure).
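As a minimal illustration of one common way to quantify hubness, the sketch below computes weighted degree centrality from a simulated region-by-region correlation matrix (Python; the region count and edge threshold are hypothetical, and the cited studies use a variety of related graph measures).

```python
import numpy as np

# Illustrative sketch (simulated data): "hubness" operationalized as weighted
# degree centrality, i.e., the sum of a region's suprathreshold functional
# correlations with all other regions. Region count and threshold are hypothetical.

rng = np.random.default_rng(4)
ts = rng.standard_normal((300, 50))       # time series for 50 hypothetical regions

corr = np.corrcoef(ts, rowvar=False)      # 50 x 50 region-by-region correlations
np.fill_diagonal(corr, 0.0)               # drop self-connections
corr[corr < 0.1] = 0.0                    # example threshold on weak/negative edges

hubness = corr.sum(axis=1)                # weighted degree for each region
print("most hub-like regions:", np.argsort(hubness)[::-1][:5])
```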

An advantage of this view is that it provides a framework for understanding how a lesion to a particular region or node of a circuit can sometimes produce a deficit in retrieving one type of category-related information but not others, whereas other lesions seem to produce a true category-specific disorder characterized by a failure to retrieve all types of information about a particular category (Capitani, Laiacona, Mahon, & Caramazza, 2003). For example, in the domain of tools, some apraxic patients with damage to left posterior parietal cortex can no longer demonstrate an object’s use, but can still name it, whereas other patients with damage to left middle temporal gyrus seem to have more general losses of knowledge about tools and their actions (e.g., Tranel, Kemmerer, Adolphs, Damasio, & Damasio, 2003b), presumably as a result of disrupted connectivity or functional diaschisis (He et al., 2007; Price, Warburton, Moore, Frackowiak, & Friston, 2001; see Carrera & Tononi, 2014, for a recent review).

Once we accept that different forms of knowledge about a single object category (e.g., tools) can be dissociated, we are left with an additional puzzle. The neuropsychological evidence clearly shows that damage to left posterior parietal cortex can result in an inability to correctly use an object, without affecting the ability to visually recognize and name that object (Johnson-Frey, 2004; Negri et al., 2007; Rothi, Ochipa, & Heilman, 1991). If so, then why is parietal cortex active when subjects simply view and/or name tools? What is the functional role of that activity? One possibility is that this parietal activity does not reflect any function at all: it is simply due to activity that automatically propagates from other parts of the circuit necessary to perform the task at hand. Naming tools requires activity in temporal cortex; regions in posterior parietal cortex may thus become active merely as a by-product of temporal–parietal lobe connectivity, and that activity might have no functional significance. Although this account is logically possible, I do not think it is a serious contender. It takes a lot of metabolic energy to run a brain, and I doubt that systems have evolved to waste it (Raichle, 2006). Neural activity is never epiphenomenal; it always reflects some function, even though that function may not be readily apparent.

I think that there are two non-mutually-exclusive purposes behind activity in the dorsal processing stream when naming tools. One possibility is that this activation is, in fact, part of the “full” representation of the concept of a tool (Mahon & Caramazza, 2008). Under that view, perception- and action-related properties are both constitutive, essential components of the full concept of a particular tool. Removal of one of these components—for example, action-related information—as a consequence of brain injury or disease would result in an impoverished concept. The concept of that tool would nevertheless remain grounded, but now by perceptual systems alone (for a different interpretation, see Mahon & Caramazza, 2008).

Another possibility is that parietal activity reflects the spread of activation to processes that typically occur following object identification. For example, I have previously argued that the hippocampus is active when we name objects not because it is necessary to name them (it is not), but rather because it is necessary to be able to recall having named them (Martin, 1999). In a similar fashion, and consistent with the well-established role of the dorsal stream in action representation (Goodale & Milner, 1992), parietal as well as premotor activity associated with viewing tools might reflect a prediction of, or prime for, future action (Martin, 2009; Simmons & Martin, 2012). Experience has taught us that seeing some objects is followed by an action. Activating the dorsal stream when viewing a tool may be a prime to use it. Activating the insula when viewing an appetizing food may be a prime to eat it—a phenomenon of which the advertising industry has long been aware.

Concluding comment

The GRAPES model provides a framework for understanding how information about object-associated properties is organized in the brain. A central advance of this model over previous formulations is a deeper recognition and understanding of the role played by the brain’s large-scale, intrinsic circuitry in providing dynamic links between regions representing the salient properties associated with specific object categories.

Many of these properties are situated within the ventral and dorsal processing streams that play a fundamental role in object and action representation. An ever-increasing body of data from monkey neuroanatomy and neurophysiology, and from human neuroimaging, is providing a more detailed understanding of this circuitry. One major implication of these findings is that the notion of serial, hierarchically organized processing streams is no longer tenable. Instead, these large-scale systems are best characterized as discrete, yet highly interactive, circuits, which, in turn, are composed of multiple, recurrent feedforward and feedback loops (see Kravitz, Saleem, Baker, & Mishkin, 2011; Kravitz, Saleem, Baker, Ungerleider, & Mishkin, 2013, for a detailed overview and compelling synthesis of these findings). This type of architecture is assumed to characterize the category-specific circuits discussed here and to underpin the dynamic interaction between higher-order conceptual, perceptual, and lower-order sensory regions in the service of specific task and bodily demands.

The emphasis on grounded circuitry may also inform our understanding of how abstract concepts are organized. Imaging studies of social and social–emotional concepts (such as “brave,” “honor,” “generous,” “impolite,” and “convince”) have consistently implicated the most anterior extent of the superior temporal gyrus/sulcus (STG/STS; Simmons, Reddish, Bellgowan, & Martin, 2010; Wilson-Mendenhall, Simmons, Martin, & Barsalou, 2013; Zahn et al., 2007; for reviews, see Olson, McCoy, Klobusicky, & Ross, 2013; Simmons & Martin, 2009; Wong & Gallate, 2012), as well as medial—especially ventromedial—prefrontal cortex (Mitchell, Heatherton, & Macrae, 2002; Roy, Shohamy, & Wager, 2012; Wilson-Mendenhall et al., 2013). One might think that any conceptual information represented in these very anterior brain regions would be disconnected from, rather than grounded in, action and perception systems. Yet the circuitry connecting these regions with other areas of the brain suggests otherwise.

Tract-tracing studies of the macaque brain (Saleem, Kondo, & Price, 2008) and task-based (Burnett & Blakemore, 2009) and resting-state (Gotts et al., 2012; Simmons & Martin, 2012; Simmons et al., 2010) functional connectivity studies of the human brain have revealed strong connectivity between these anterior temporal and prefrontal regions. For example, anterior STG/STS, but not anterior ventral temporal cortex, is strongly connected to medial prefrontal cortex (Saleem et al., 2008). In addition, human functional-imaging studies have shown that both of these regions are part of a broader circuit implicated in multiple aspects of social functioning in typically developing individuals (for reviews, see Adolphs, 2009; Frith & Frith, 2007) and in social dysfunction in autistic subjects (e.g., Ameis & Catani, 2015; Gotts et al., 2012; Libero et al., 2014; Uddin et al., 2011; Wallace et al., 2010).

So, how are these social and social–emotional concepts grounded? They are grounded by virtue of being situated within circuitry that includes regions for perceiving and representing biological form (the lateral region of the fusiform gyrus) and biological motion (posterior STS) and for recognizing emotion (the amygdala) (Burnett & Blakemore, 2009; Gotts et al., 2012; Simmons & Martin, 2012; Simmons et al., 2010). Clearly, much work remains to be done in uncovering the roles of the anterior temporal and frontal cortices in representing our social world. Nevertheless, these data provide an example of how even abstract concepts may be grounded in our action, perception, and emotion systems.