ABSTRACT
There is growing research interest in the neural mechanisms underlying the recognition of material categories and properties. This research field, however, is relatively recent and limited compared to investigations of the neural mechanisms underlying object and scene category recognition. Motion is particularly important for the perception of non-rigid materials, but the neural basis of non-rigid material motion remains unexplored. Using fMRI, we investigated whether brain regions respond differentially to material motion versus other types of motion. Stimuli were dynamic dot animations that induce vivid percepts of various materials in motion, e.g. flapping cloth, liquid waves, wobbling jelly. Control stimuli were scrambled motion and rigid three-dimensional rotating dots. We used a block design and the general linear model to contrast conditions (whole-brain analyses). Results showed that isolating material motion properties with dynamic dots (in contrast with other kinds of motion) activates a network of regions in both ventral and dorsal visual pathways, including areas normally associated with the processing of surface properties and shape, and extending to somatosensory and premotor cortices. We suggest that such a widespread preference for material motion is due to strong associations between stimulus properties: when you see the dots move in a specific pattern, you not only see material motion; you also see a flexible, non-rigid shape, identify the object as a cloth flapping in the wind, get a clear sense of its weight under gravity, and feel as though you could reach out and touch it. These results are an important first step in mapping out the cortical architecture and dynamics of material-related motion processing.
INTRODUCTION
Recognizing and estimating the material qualities of objects is an essential part of our visual experience. Perceiving material qualities quickly and correctly is critical for guiding decisions or actions, whether we are deciding what fruit to eat, whether the blanket is soft enough, or how we should grip a porcelain cup. Despite its importance, it is still not well understood how the brain accomplishes the task of recognising materials and estimating their properties. Most neuroscientific studies about material perception have focused on the cortical areas involved in the visual processing of material properties (see Komatsu & Goda, 2018 for a review). This focus on visual processing is in part because visual neuroscience is a long-standing field that provides a great array of established methods and tools (Werner & Chalupa, 2013) that can be applied to novel questions like how the brain accomplishes material perception. As a consequence, nearly all neuroimaging and neurophysiological studies investigating material perception have used static images as stimuli (Komatsu & Goda, 2018). This choice is somewhat justified because, although material perception is a multisensory experience, many material qualities can be conveyed through images alone.
Material properties that can be directly conveyed through visual information are so-called optical properties. These properties, such as a surface’s micro- and meso-structure, or its reflectance, transmission and refractive properties, give a material its characteristic visual appearance (e.g. glossy, plastic-y, metallic, etc.). The neuroimaging literature has heavily focused on how the brain discriminates between different optical appearances (for a review see Komatsu & Goda, 2018). Non-optical properties, however, can also be conveyed through images: we can experience the ‘feel’ of soft silk just by looking at an image of it, owing to previously formed associations between the different senses when interacting with materials; over time, specific visual (auditory, proprioceptive, olfactory) information becomes associated with specific tactile information and vice versa. It is possible that via this indirect route (Fleming, 2014) mechanical and tactile material qualities like softness, viscosity, and roughness can be conveyed through optical properties of surfaces and the 3D structure of visual objects (e.g. Baumgartner et al. 2013; Schmidt et al. 2017; Xiao et al. 2016; Ho et al. 2006; Giesel & Zaidi, 2013; van Assen & Fleming 2016; Fleming et al. 2013).
But not all material properties can be faithfully conveyed through static images. In fact, it has been shown that image motion provides information about material qualities over and above the information available from 2D images (Doerschner et al. 2011; Schmid & Doerschner 2018; Paulun et al. 2017). For example, most mechanical material properties become much more vividly apparent with motion: watching a rubber band stretch, a jelly jitter, and hair bend elicits strong haptic percepts of elasticity, wobbliness, and softness, respectively. It is therefore quite surprising that only very few studies have investigated the neural mechanisms involved in material perception using movies as stimuli (but see Okazawa et al. 2012; Kam et al. 2015; Sun et al. 2016a), and no neuroimaging studies exist that examine the cortical processing of mechanical properties of non-rigid materials. One possible reason for this is that dynamic stimuli are complex and it is difficult to find adequate control conditions for such stimuli. Recent improvements in computing power and an increased ability to simulate such complex materials with computer graphics now put us in a position to tackle this problem.
In this study we developed a new class of stimuli for investigating the neural correlates of material perception, which utilizes the fact that mechanical and tactile qualities of materials can be convincingly conveyed through image motion alone (Schmid & Doerschner, 2018; Bi et al., 2019). Previously these have been called point light stimuli by Schmid & Doerschner (2018), analogous to ‘point light walkers’ that have been used extensively in biological motion research (Johansson, 1973). Similar stimuli have also been called ‘dynamic dot stimuli’ (Bi et al., 2019). Here, we name our new movie database ‘dynamic dot materials’, where specific non-rigid material types are depicted solely through the motion of black dots on a gray background. Research in biological motion (e.g. Servos et al., 2012) and structure from motion (SfM, e.g. Orban et al. 1999, Preuskens et al., 2004) suggests that the brain can be very sensitive to certain structured motion. We wanted to investigate whether cortical areas exist that show a preference for non-rigid material motion over other types of motion.
We envisaged that dynamic dot materials do not only depict mechanical material properties convincingly, but they also have several advantages over the still images used in previous work. They:
isolate dynamic properties from optical cues;
capture non-optical aspects of material qualities, in particular mechanical properties;
provide more stimulus control compared to ‘full cue’ videos - they can be motion-scrambled and the behavior of trajectories and speed of individual dots can be manipulated;
allow us to investigate whether areas previously associated with materials (e.g. CoS; Cant & Goodale 2007; Cant & Xu, 2012, 2015, 2016; Gallivan et al. 2014; Eck et al., 2016; Kitada et al. 2014) are indeed specialized for processing visual (optical) properties or whether they represent material properties more generally;
allow us to investigate whether areas associated with coherent motion preference show a preference for specific types of coherent motion (non-rigid vs. rigid).
Using fMRI, we investigated whether brain regions respond differentially to these novel dynamic dot materials versus other types of motions and motion scrambled control stimuli. Anticipating our results, we find widespread preferential activation for non-rigid material motion. This suggests that dynamic dot materials are very suitable for mapping the cortical network involved in the perception of material qualities.
MATERIALS & METHODS
Participants
Ten volunteers participated in the experiment (age range: 21-42 years, mean age: 27.3 years, 2 male, 1 left-handed). All participants had normal or corrected-to-normal vision and had no known neurological disorders. Participants gave their written consent prior to the MR imaging session. Protocols and procedures were approved by the local ethics committee of Justus Liebig University Giessen and were in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki).
Stimuli
Dynamic dot materials
Stimuli were non-rigid materials, generated with Blender (version 2.7) and Matlab (release 2012a; MathWorks, Natick, MA) and presented using Matlab and the Psychophysics Toolbox (Brainard, 1997). In general, each dynamic dot material was an animation that lasted 2 s (48 frames) and that depicted the deformation of a specific material under force. Object deformations were simulated using the Particle System physics engines in Blender, with either the fluid dynamics or the Molecular add-on (for technical specifications we refer the reader to Schmid & Doerschner, 2018). In order to have a variety of non-rigid stimuli we created several materials with different mechanical properties, including cloth, liquids, breaking materials of high, medium and low elasticity, as well as non-breaking elastic materials. In brief, the simulations were created as follows. For the cloth-like materials (cloth & cloth_rot), a sheet of linked particles was attached at the top of the scene and blown by a wind force field. The second cloth animation was obtained by simply rotating the original simulation by 90 degrees. There were two animations showing liquids. One animation consisted of a liquid, simulated with fluid particles that had been dropped into a small round container, causing it to ripple. In the second liquid animation, fluid particles had been dropped into a large square container and stirred by an invisible rod, causing larger waves. One elastic cube was made of linked unbreakable particles attached to the ground and poked with an invisible rod (pokeWobble). The same animation was also rendered from a different camera angle (pokeWobble_rot). Another type of elastic cube made of linked, unbreakable particles was attached to invisible solid walls that moved horizontally apart, causing it to wobble (stretchBounce).
There were three types of elastic cube that consisted of linked, breakable particles, and in all of these animations particles were attached to invisible solid walls that moved apart, causing them to rip and wobble to various degrees: in stretchWobble the material wobbled like a hard jelly. StretchHighWobble was the same as stretchWobble but with an even larger wobbling motion, and stretchDough_rot showed a low-elasticity material that ripped apart softly with no wobble motion. This last movie was also rotated by 90 degrees to provide some variation to the other three “stretch” movies. This yielded in total 10 animations, showing non-rigid materials under various forces and in different scenes.
Upon creation of these different material types we exported the 3D coordinates of each particle at each frame. Using Matlab, we calculated the 2D projection for each particle from a specific viewing angle. For each material only 200 random particles were selected for display, as this medium density yielded the best perceptual impressions in previous work (Schmid & Doerschner, 2018), with the remaining particles set to invisible. The particles were sampled from throughout the volume of the substance, with two exceptions: for cloth and liquids, particles were sampled from the surface of the substance, because sampling from the volume in these cases did not convey the material qualities convincingly.
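The export, projection and subsampling steps described above can be illustrated with a minimal Python sketch. It is not the Matlab code used in the study: the function names, the orthographic projection, the rotation about the vertical axis and the fixed random seed are all assumptions made for illustration, since the text only states that a 2D projection from a specific viewing angle was computed and that 200 random particles were displayed.

```python
import math
import random


def project_orthographic(points_3d, azimuth_deg):
    """Rotate (x, y, z) points about the vertical (y) axis, then drop depth.

    Orthographic projection is an assumption; the text does not specify
    the projection model.
    """
    a = math.radians(azimuth_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [(cos_a * x + sin_a * z, y) for x, y, z in points_3d]


def sample_visible(n_particles, n_visible=200, seed=0):
    """Pick the indices of the particles that stay visible in every frame."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_particles), n_visible))


def make_dot_frames(frames_3d, azimuth_deg=0.0, n_visible=200, seed=0):
    """frames_3d: list of frames, each a list of (x, y, z), one per particle."""
    visible = sample_visible(len(frames_3d[0]), n_visible, seed)
    return [
        project_orthographic([frame[i] for i in visible], azimuth_deg)
        for frame in frames_3d
    ]
```

The same particle indices are used in every frame, so each displayed dot follows one simulated particle throughout the 48-frame animation.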
Scrambled control stimuli
In order to create control stimuli that were matched in motion energy (measured as average velocity magnitude, see Supplementary Figure 1), we shuffled the position of each particle on the first frame. All dots had the same velocity magnitude as in the material motion; however, the direction of each trajectory was rotated by a random amount (uniformly distributed between 0 and 2π) every 2nd frame. Without this rotation the scrambled stimuli would still look like non-rigid materials. The trajectory of a given dot was forced to remain inside the spatial range of the dynamic dot materials (in the 1st frame): whenever a dot would have moved outside this boundary, its trajectory was rotated to a different direction. As a consequence, acceleration was not matched between material and control stimuli (see Supplementary Figure 2). Example scrambled control stimuli are shown in Figure 1C.
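A minimal Python sketch of this scrambling procedure is given below. It is one possible reading of the description, not the original code. Four choices are assumptions: the name `scramble`, shuffling start positions as a permutation among dots, redrawing the direction until the dot lands inside the first-frame bounding box, and the `max_tries` fallback.

```python
import math
import random


def scramble(frames, seed=0, max_tries=100):
    """Return motion-scrambled frames that keep each dot's per-frame speed.

    frames: list of frames, each a list of (x, y) dot positions.
    """
    rng = random.Random(seed)
    n_dots = len(frames[0])
    xs = [x for x, _ in frames[0]]
    ys = [y for _, y in frames[0]]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)

    def inside(x, y):
        return x_min <= x <= x_max and y_min <= y <= y_max

    # Shuffle which dot starts where (assumed meaning of "shuffled").
    start = list(frames[0])
    rng.shuffle(start)
    out = [start]
    offsets = [0.0] * n_dots
    for t in range(1, len(frames)):
        if t % 2 == 1:  # new random rotation for every dot on every 2nd frame
            offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_dots)]
        frame = []
        for i in range(n_dots):
            dx = frames[t][i][0] - frames[t - 1][i][0]
            dy = frames[t][i][1] - frames[t - 1][i][1]
            speed = math.hypot(dx, dy)  # velocity magnitude is preserved
            angle = math.atan2(dy, dx) + offsets[i]
            px, py = out[-1][i]
            nx = px + speed * math.cos(angle)
            ny = py + speed * math.sin(angle)
            tries = 0
            while not inside(nx, ny) and tries < max_tries:
                # Dot would leave the boundary: rotate to another direction.
                angle = rng.uniform(0.0, 2.0 * math.pi)
                nx = px + speed * math.cos(angle)
                ny = py + speed * math.sin(angle)
                tries += 1
            if not inside(nx, ny):
                nx, ny = px, py  # fallback (assumption): dot stays in place
            frame.append((nx, ny))
        out.append(frame)
    return out
```

Because only the direction of each displacement changes, the velocity magnitude per dot and frame is preserved, whereas the frame-to-frame change in velocity (acceleration) is not.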
Figure 1. During the functional scans participants were presented with alternating blocks of dynamic dot stimuli, motion-scrambled control stimuli and structure from motion (SfM) control stimuli. On the right side of panel A we show selected frames of 4 of the 10 possible dynamic dot material animations that were shown in random order in a block. On the left side of the same panel the timing of presentation during a dynamic dot material block is shown. Panel B depicts an example run, and panel C shows selected frames of 1 of the 10 possible random motion control animations (left) and the SfM control stimuli (right). The timing during a block of these control conditions was identical to that of dynamic dot material blocks.
Structure from Motion control stimuli
In order to generate these control stimuli we selected one frame of each material motion movie and then rotated the camera back and forth around the center of the scene. This gave the object the appearance of rotating in depth around the horizontal (or vertical) axis. SfM control stimuli are shown in Figure 1C.
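The following Python sketch illustrates how a rigid back-and-forth rotation of a single frozen frame could be produced. The sinusoidal angle profile, the 30 degree amplitude, the use of the dot centroid as the center of the scene, the restriction to the vertical axis and the orthographic projection are all assumptions; the text only states that the camera was rotated back and forth around the center of the scene.

```python
import math


def rotate_about_vertical(points, center, angle_rad):
    """Rigidly rotate (x, y, z) points about a vertical axis through center."""
    cx, _, cz = center
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    rotated = []
    for x, y, z in points:
        dx, dz = x - cx, z - cz
        rotated.append((cx + c * dx + s * dz, y, cz - s * dx + c * dz))
    return rotated


def sfm_frames(points, n_frames=48, max_angle_deg=30.0):
    """Return 2D frames of one frozen 3D dot cloud rocking back and forth."""
    n = len(points)
    center = tuple(sum(p[k] for p in points) / n for k in range(3))
    frames = []
    for t in range(n_frames):
        # Hypothetical angle profile: one sinusoidal cycle per animation.
        angle = math.radians(max_angle_deg) * math.sin(2.0 * math.pi * t / n_frames)
        rotated = rotate_about_vertical(points, center, angle)
        frames.append([(x, y) for x, y, _ in rotated])  # drop depth
    return frames
```

Since every dot undergoes the same rotation, all pairwise 3D distances stay constant, which is what makes the resulting motion rigid.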
For each stimulus, coordinates, movies, and individual movie frames can be found in the database provided with the Supplementary Material (https://www.dropbox.com/sh/nakqpn022lpaptn/AACQkbsgFuVvUIZlRXnoWXrya?dl=0).
Stimulus display
Visual stimuli were presented on an MR-safe LCD screen placed near the rear end of the scanner bore (Cambridge Research Systems Ltd, Rochester, UK; resolution: 1920 by 1080; refresh rate 120 Hz). Participants viewed the screen through an angled mirror attached to the head-coil while lying supine inside the scanner bore. Total eye-to-screen optical distance was 140 cm, and the screen subtended a visual angle of 28 degrees horizontally at this distance. Stimuli were presented at the center of the screen and approximately subtended 15.75 by 15.75 degrees visual angle.
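As a worked check of the display geometry, the standard visual-angle relation (size = 2 × distance × tan(angle / 2)) can be applied to the values given above. The sketch below is illustrative only; the function names are hypothetical, and the pixel estimate assumes that the 1920 horizontal pixels span the full 28 degrees.

```python
import math


def size_from_angle(angle_deg, distance_cm):
    """Physical extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def angle_from_size(size_cm, distance_cm):
    """Visual angle (degrees) subtended by size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


DISTANCE_CM = 140.0  # eye-to-screen optical distance from the text

screen_width_cm = size_from_angle(28.0, DISTANCE_CM)     # roughly 70 cm
stimulus_width_cm = size_from_angle(15.75, DISTANCE_CM)  # roughly 39 cm
# Assumption: the 1920 horizontal pixels cover the whole screen width.
stimulus_width_px = stimulus_width_cm / screen_width_cm * 1920
```

Under these assumptions the stimulus covers a little more than half of the horizontal screen extent.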
Fixation task
To ensure fixation during the scanning session, help maintain vigilance and wakefulness, and limit attentional effects, we asked participants to perform a demanding fixation task throughout the functional runs. In this task, participants were required to report a brief reduction in the size of the fixation cross via a button press. The cross shrank by a small amount every 3 seconds with a random time jitter (up to +/− 1.5 seconds), so that the time of each shrinking event was unpredictable.
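A jittered event schedule of this kind could be generated as in the sketch below. Applying the jitter around a regular 3-second grid, drawing it from a uniform distribution, and using the 286-second run duration as the default are assumptions; the function name is hypothetical.

```python
import random


def shrink_onsets(run_s=286.0, period_s=3.0, jitter_s=1.5, seed=0):
    """Onset times (s) of fixation-cross shrink events within one run."""
    rng = random.Random(seed)
    onsets = []
    k = 1
    # Keep only events whose latest possible onset still falls inside the run.
    while k * period_s + jitter_s <= run_s:
        onsets.append(k * period_s + rng.uniform(-jitter_s, jitter_s))
        k += 1
    return onsets
```

With these settings consecutive events are separated by between 0 and 6 seconds, so the next event cannot be anticipated from the previous one.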
MR Image Acquisition
Magnetic resonance images were collected on a 3 Tesla MRI scanner (Magnetom Prisma, Siemens AG, Erlangen, Germany) equipped with a 64-channel head coil at the BION imaging center of JLU Giessen. MR sessions contained a structural run and 4-8 functional runs. Structural images were acquired using a T1-weighted 3D anatomical sequence (sagittal MP-RAGE; spatial resolution: 1 mm isotropic; number of slices: 176). Functional images were acquired while participants viewed the visual stimuli, using a T2*-weighted gradient-recalled echo-planar imaging (EPI) sequence (TR: 2000 ms; TE: 30 ms; spatial resolution: 3 × 3 × 3 mm; number of slices: 36; slice orientation: parallel to the calcarine sulcus). Each participant took part in one scanning session that lasted about an hour and a half. During functional runs participant responses were collected using an MR-safe button box.
Experimental Design
During the functional runs, different types of motion stimuli (material, scrambled control, SfM control) were presented in alternating blocks. The order of blocks was randomized and counterbalanced. Each block lasted 22 seconds and contained ten short animation clips (2 s each) separated by a 200 ms interstimulus interval (ISI). Figure 1B depicts the experimental protocol. There were 3 repeats of each stimulus type (material, scrambled control, SfM control) per run and 4-8 runs in a session, and thus 12-24 repeats of each stimulus type per participant. One run lasted 286 seconds.
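The timing figures above can be checked with a few lines of integer arithmetic. The sketch assumes one ISI per clip, which reproduces the stated 22-second block length. The text does not itemise how the run time outside the stimulus blocks was used, so that remainder is only reported, not interpreted.

```python
# All durations in milliseconds to keep the arithmetic exact.
CLIP_MS = 2000
ISI_MS = 200
CLIPS_PER_BLOCK = 10
N_CONDITIONS = 3          # material, scrambled control, SfM control
REPEATS_PER_RUN = 3
RUN_MS = 286_000
TR_MS = 2000
RUNS_PER_SESSION = (4, 8)

# Assumption: every clip is followed by one ISI.
block_ms = CLIPS_PER_BLOCK * (CLIP_MS + ISI_MS)
stimulus_ms = N_CONDITIONS * REPEATS_PER_RUN * block_ms
unaccounted_ms = RUN_MS - stimulus_ms      # not itemised in the text
volumes_per_run = RUN_MS // TR_MS
repeats_per_participant = tuple(REPEATS_PER_RUN * r for r in RUNS_PER_SESSION)

print(block_ms, stimulus_ms, unaccounted_ms, volumes_per_run,
      repeats_per_participant)
```

The nine stimulus blocks account for 198 of the 286 seconds per run, and each run corresponds to 143 functional volumes at a TR of 2000 ms.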
MR Data Analysis
All MR image preprocessing and further analyses were performed using BrainVoyager QX, except an initial inhomogeneity correction step on T1-weighted images, which was conducted using Freesurfer4 software (http://surfer.nmr.mgh.harvard.edu/). After the initial inhomogeneity correction with Freesurfer4, anatomical images were imported into BrainVoyager for further preprocessing. Preprocessing for the anatomical images included the following steps: another inhomogeneity correction using BrainVoyager (for 8 out of 10 participants this led to a better white-gray matter segmentation), alignment of the images to the AC-PC plane and conversion to Talairach space, and white-gray matter segmentation. After these steps a 3D cortical mesh was created for each subject and the resulting individual meshes were morphed and aligned using cortical surface information (sulci and gyri). Finally, an average mesh was created and inflated. Preprocessing steps on functional images included slice acquisition time correction, motion correction, linear trend removal and temporal high-pass filtering. The resulting functional images were coregistered with the anatomical images per participant. Functional data were spatially transformed and projected onto the inflated 3D average mesh for further analyses. These analyses included fixed-effects surface-based group analyses using the general linear model (GLM). Specifically, we performed several separate GLM analyses (material vs. scrambled; material vs. 3D; 3D vs. scrambled) and a conjunction analysis of (material>scrambled) AND (material>3D). Active voxels were identified at the qFDR < 0.05 level (FDR corrected for multiple comparisons).
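To make the thresholding and conjunction logic concrete, the sketch below applies the Benjamini-Hochberg FDR procedure to two lists of p-values and then takes their logical AND. This is a generic illustration, not BrainVoyager's implementation. The function names and the toy p-values are hypothetical, and defining the conjunction as the intersection of two separately thresholded maps is an assumption.

```python
def fdr_threshold(p_values, q=0.05):
    """Benjamini-Hochberg: largest sorted p(k) with p(k) <= q * k / m.

    Returns None when no p-value survives.
    """
    m = len(p_values)
    threshold = None
    for k, p in enumerate(sorted(p_values), start=1):
        if p <= q * k / m:
            threshold = p
    return threshold


def significant(p_values, q=0.05):
    """Boolean mask of entries that survive FDR correction at level q."""
    thr = fdr_threshold(p_values, q)
    return [thr is not None and p <= thr for p in p_values]


def conjunction(mask_a, mask_b):
    """Entries that are significant in both contrasts."""
    return [a and b for a, b in zip(mask_a, mask_b)]


# Toy p-values per vertex for two directional contrasts (made-up numbers).
p_material_gt_scrambled = [0.001, 0.004, 0.03, 0.5]
p_material_gt_3d = [0.2, 0.001, 0.002, 0.6]

both = conjunction(significant(p_material_gt_scrambled),
                   significant(p_material_gt_3d))
print(both)  # [False, True, True, False]
```

In this toy example only the second and third entries survive both contrasts, which corresponds to the (material>scrambled) AND (material>3D) logic described above.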
RESULTS
Figure 2 shows the results of the GLM analyses. The activation maps on the left and on the right show that, overall, there are many cortical areas that have a larger BOLD response to dynamic dot materials (yellow) than to scrambled motion stimuli. Moreover, cortical areas with a material “preference” appear to constitute a superset of those that respond to 3D motion (Figure 2, left panel, showing areas, in orange, that respond more to 3D rigid motion than to scrambled stimuli). The overlap in activation for these two contrasts (material vs. scrambled and 3D vs. scrambled) is expected because 3D rigid motion could simply be a special type of material motion, and thus activity in response to this type of stimulus should be contained within the general material motion network. The fact that rigid motion is a subclass of all possible material motions might also explain why cortical responses to dynamic dot materials are almost always stronger than those to 3D motion (Figure 2, right panel, orange color maps): seeing just one type of material motion is likely to cause a relatively weaker cortical response than the rich set of motions that occur in the dynamic dot condition. We also found these patterns of activation at the individual participant level (Supplementary Figure 3). The average accuracy in the fixation task was 83% across observers, suggesting that participants followed our instructions and fixated at the center of the screen during presentation of the stimuli.
Figure 2. The left panel shows the results of the GLM contrasting average BOLD responses to dynamic dot materials with those to scrambled motion stimuli (yellow-blue), overlaid together with results of a GLM analysis that contrasts average BOLD responses to 3D rigid motion with those to scrambled stimuli (orange-black). The resulting activity maps suggest that cortical areas that respond strongly to material motion are a superset of those that respond to 3D rigid motion. In the right panel we plot the results of the same dynamic dot materials vs. scrambled motion contrast (yellow-blue) together with a GLM contrast of dynamic dot materials versus 3D rigid motion (orange-black). Here, we see that responses to dynamic dot materials were almost always stronger than those to 3D rigid motion stimuli. See text for further details. Overall we see that lower visual areas tended to respond more strongly to the scrambled motion stimuli. This is consistent with the literature (e.g. Murray et al. 2002).
A subsequent conjunction analysis confirmed that there is a widespread network of brain areas whose BOLD response is stronger for dynamic dot stimuli compared to other types of motion (scrambled and 3D rigid motion), including object-, texture-, and motion-selective regions, as well as somatosensory/multisensory and premotor areas (Figure 3). We will discuss these results next.
Figure 3. Cortical areas that respond more strongly (hot colors) and more weakly (cool colors) to material motion compared to both 3D motion and scrambled motion. A large network of areas is more strongly active under the material motion condition.
DISCUSSION
Dynamic dot materials are a novel class of stimuli that convey non-optical properties of material qualities purely on the basis of 2D image motion patterns. Here, we introduced a database of various types of non-rigid materials, though other materials (e.g. rigid, breakable substances) can also be rendered convincingly by means of this technique (e.g. see Schmid & Doerschner, 2018). In their creation these stimuli are conceptually closely related to “point light walkers” (Johansson, 1973), where the motion of small light sources affixed to different parts of limbs of biological species can elicit a very vivid impression of animacy. Similarly, we “attached” small dots to an otherwise invisible substance and recorded the motion of these dots while the material reacted to a force. How individual materials change their shape in response to a force strongly depends on their mechanical properties, and it is this idiosyncratic change of shape information that appears to convey mechanical qualities of a material. It would be very interesting to pinpoint exactly the motion characteristics that elicit a particular material quality (Schmid & Doerschner, 2018; also see Bi et al. 2019), just as has been done in the field of biological motion, where researchers have tried to understand what it is that makes point light walkers look “biological” (Treue, 2013). In fact, all biological motion is also non-rigid motion, and thus our stimuli could be suitable for teasing apart the contributions of these two factors to neural activity observed while watching biological motion stimuli. Our stimuli could help to discover the perceptual boundary between non-rigid animate and inanimate objects, and to investigate corresponding neural maps and mechanisms (Long, Yu, & Konkle, 2018; Grill-Spector & Weiner, 2014). Dynamic dot materials also allow us to investigate whether areas previously associated with materials (e.g. CoS; Cant & Goodale 2007; Cant & Xu, 2012, 2015, 2016; Gallivan et al. 2014; Eck et al., 2016; Kitada et al. 2014) are indeed specialized for processing visual (optical) properties or whether they represent material properties more generally (our results so far suggest the latter, but a multivariate design would allow one to test this better). The fact that the location, direction and speed of individual dots in our stimuli can be manipulated renders investigations of research questions like these more feasible.
Using these novel stimuli in an fMRI experiment we found robust and widespread increased activation across the human brain in response to dynamic dot materials when compared to activation in response to other types of motion stimuli (Figure 3). Regions preferring dynamic dot materials included several areas in occipito-temporal and occipito-parietal cortices, secondary somatosensory cortex, and premotor regions. Given the properties of our stimuli, a widespread cortical preference is perhaps expected: our stimuli are objects (LOC e.g. Grill-Spector & Malach, 2004), they are non-rigidly moving structures (e.g., hMT+, MST, PPC e.g. see review by Erlikhman et al. 2018; STS e.g. Saygin et al), they elicit a distinct tactile experience (e.g. Schmid & Doerschner, 2018; Bi et al., 2019), and such tactile experiences of material qualities are often associated with certain optical material qualities (CoS e.g. Arnott et al., 2008; Podrebarac et al., 2014; Sun et al., 2016a and see Komatsu & Goda, 2018 for a review). The widespread activation in response to quite sparse stimuli also suggests that material perception must be an inherently distributed process (see Schmid & Doerschner 2019). Consistent with this idea, there is a growing body of literature suggesting that object category representations are grounded in distributed networks (e.g. Martin, 2016).
Discovering a network of preferentially more active areas during dynamic dot material viewing does not mean that all of these areas must be involved in the recognition and differentiation of materials: to find such finer-tuned responses would require further studies, and our stimuli provide a convenient way to investigate this as discussed next.
Why study responses to visually presented materials?
Materials are inherently multidimensional in that they have multiple stimulus properties in multiple perceptual dimensions, and they are inherently multimodal in that their properties can be inferred through multiple modalities. For example, a velvet cloth looks soft (optical properties), moves (visual mechanical motion) and folds (visual 3D shape) in a way that suggests that it is soft, but it also feels soft to the touch (tactile). This multidimensionality and multimodality make materials ideal candidates for developing experimental designs that can help to understand the computational architecture of the cortical representations involved in recognition. As an example, if a cortical region represents material/object category A based on visual property X but not the same category based on visual property Y, then this suggests that this region encodes the visual property X but not the category. Conversely, if this cortical region has a shared representational structure, i.e. category A is encoded through both visual properties X and Y, then it likely encodes the concept. Thus, by investigating neural responses to stimuli like the dynamic dot materials proposed here, we may be able to tease apart the direct encoding of visual properties from the indirect activation of associated properties (Schmid & Doerschner, 2019).
CONCLUSION
Dynamic dot materials are a novel class of stimuli that convey non-optical properties of material qualities purely on the basis of 2D image motion patterns. They can be used for mapping the cortical network involved in the perception of material qualities. From a broader perspective, owing to their inherently multidimensional and multimodal nature, materials are a unique type of stimulus that can help neuroimaging research to advance our understanding of the computational architecture of the cortical representations involved in recognition.
SUPPLEMENTARY
Supplementary Figure 1. Motion energy (velocity magnitude) for each condition. Y axis units are based on dot positions being normalised between 0 and 1. A. Velocity magnitude averaged across particles for each frame of each movie. B. Velocity magnitude averaged across particles and frames for each movie. C. Velocity magnitude averaged across particles, frames, and materials for each condition.
Supplementary Figure 2. Motion energy (acceleration magnitude) for each condition. Y axis units are based on dot positions being normalised between 0 and 1. A. Acceleration magnitude averaged across particles for each frame of each movie. B. Acceleration magnitude averaged across particles and frames for each movie. C. Acceleration magnitude averaged across particles, frames, and materials for each condition.
Supplementary Figure 3. GLM results from two participants. Compare to Figure 2 in the main text.
ACKNOWLEDGEMENTS
A.S. and K.D. are supported by a Sofja Kovalevskaja Award endowed by the German Federal Ministry of Education, awarded to K.D. H.B. and K.D. are also supported by a Marie Sklodowska-Curie Action – Innovative Training Network (MSCA-ITN/ETN) Grant, DyViTo: Dynamics in Vision and Touch – the look and feel of stuff. H.B. was also supported by a TUBITAK (The Scientific and Technological Research Council of Turkey) 1001 Research Grant (217K163).