Body shape as a visual feature: evidence from spatially-global attentional modulation in human visual cortex

Feature-based attention modulates visual processing beyond the focus of spatial attention. Previous work has reported such spatially-global effects for low-level features such as color and orientation, as well as for faces. Here, using fMRI, we provide evidence for spatially-global attentional modulation for human bodies. Participants were cued to search for one of six object categories in two vertically-aligned images. Two additional, horizontally-aligned, images were simultaneously presented but were never task-relevant across three experimental sessions. Analyses time-locked to the objects presented in these task-irrelevant images revealed that responses evoked by body silhouettes were modulated by the participants’ top-down attentional set, becoming more body-selective when participants searched for bodies in the task-relevant images. These effects were observed both in univariate analyses of the body-selective cortex and in multivariate analyses of the object-selective visual cortex. Additional analyses showed that this modulation reflected response gain rather than a bias induced by the cues, and that it reflected enhancement of body responses rather than suppression of non-body responses. These findings provide evidence for a spatially-global attention mechanism for body shapes, supporting the rapid and parallel detection of conspecifics in our environment.


The capacity limits of the human visual system require selecting visual input for further processing and conscious access (Carrasco, 2011; Chun et al., 2011). One way to do this is to select specific locations of the visual field through spatial attention and eye movements.

However, when searching for task-relevant objects in our environment, the location of these objects is typically not yet known. In this case, selection may operate at the level of visual features, using a selection mechanism termed feature-based attention (Maunsell and Treue, 2006). To be an effective selection mechanism, feature-based attention would need to operate in parallel across the whole or part of the visual field, in order to then guide spatial attention to the location of the target object (Wolfe, 1994). While this could be a plausible mechanism of attentional selection, it raises a core question: what are the features of feature-based attention?

At a neural level, it has been proposed that feature-based attention may be restricted to features to which sensory neurons are systematically tuned (Maunsell and Treue, 2006).

In the present study, we tested whether global attentional modulation can similarly be observed for the shape of the human body, a category of high social and biological significance that is selectively represented in high-level visual cortex (Downing et al., 2001; Downing, 2005). Behavioral studies have shown that bodies, like faces, gain preferential access to awareness (Stein et al., 2012) and automatically attract attention (Downing et al., 2004; Ro et

In the main experiment, on each trial, the display contained two boxes at the horizontal locations and two at the vertical locations (Fig. 1). The vertical boxes had a white bounding frame, signifying their relevance. Each of the four boxes contained a noise image that combined the average power spectrum of the objects from the six categories with random phases. Objects were mixed with these noise images. On each trial, an exemplar from one of the six categories could be presented in one of the two vertical boxes (1/7 probability each) or no object would be presented (1/7 probability). Simultaneously, an exemplar from one of the six categories could be presented in both the horizontal boxes (1/7 probability each) or no object would be presented (1/7 probability). Each block consisted of 49 trials to fill the co-occurrence matrix of the horizontal and vertical object conditions, such that the conditions presented in the horizontal and vertical boxes were orthogonal to each other.
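The fully crossed block structure described above can be sketched in a few lines of Python. This is an illustration only, not the authors' stimulus code: the condition labels are placeholders, the shuffling of trial order is an assumption, and the choice of which vertical box holds the object is not modeled.

```python
import random
from collections import Counter
from itertools import product

# Seven conditions per location: six object categories plus "no object".
# The labels are placeholders, not the category names used in the study.
CONDITIONS = [f"category_{i}" for i in range(1, 7)] + ["no_object"]

def make_block(seed=None):
    """Return 49 trials covering every vertical x horizontal pairing once."""
    rng = random.Random(seed)
    trials = [
        {"vertical": v, "horizontal": h}
        for v, h in product(CONDITIONS, repeat=2)
    ]
    rng.shuffle(trials)  # assumption: trial order is randomized within a block
    return trials

block = make_block(seed=0)
vertical_counts = Counter(t["vertical"] for t in block)
horizontal_counts = Counter(t["horizontal"] for t in block)
print(len(block), vertical_counts["no_object"], horizontal_counts["category_1"])
```

Because every vertical condition is paired exactly once with every horizontal condition, each condition occurs on 7 of the 49 trials at each location, and the two factors are orthogonal by construction.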

In each block of the main experiment, participants would either search for one of the six categories in the vertical boxes or would detect a thickening of the frames of the bounding boxes in the vertical location (for trial layout, see Fig. 1). Participants pressed the response button when the cued object category was shown in one of the vertical locations, which occurred on 7/49 trials. Participants had to respond within 1.2 s. The brief presentation duration (67 ms) required participants to maintain fixation to be able to detect the target in one of the two vertical locations. Participants were instructed that they could ignore the objects presented at

In addition to the main experiment, participants completed a "baseline" experiment. This

Each participant attended three experimental sessions. The first behavioral session required each participant to be exposed to the entire set of objects followed by the completion

The functional data were analyzed using MATLAB (2017a) and SPM12. During

In the univariate analysis, the regression weights (betas) from the GLM were compared between conditions after averaging across the voxels of a region of interest (ROI). In the multivariate analysis, the patterns of betas from the GLM across the voxels of an ROI were compared between conditions using Kendall's tau correlation coefficient (τ) as a metric for similarity. Before comparing the betas between the main and baseline experiments, the data were mean-centered: the mean across all main experiment condition betas was subtracted from those condition betas (separately for each voxel), and the mean across all baseline experiment condition betas was subtracted from those condition betas.
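As a rough illustration of the two steps just described, the sketch below mean-centers a set of condition betas per voxel and then compares two voxel patterns with Kendall's τ. It is written in plain Python rather than the MATLAB/SPM12 pipeline, the function names are hypothetical, and the tau-a variant (no tie correction) is an assumption, since the text does not state which variant was used.

```python
from itertools import combinations

def mean_center(betas):
    """Subtract, for each voxel, the mean beta across all conditions.

    betas: list of conditions, each a list of per-voxel beta values.
    """
    n_conditions = len(betas)
    n_voxels = len(betas[0])
    voxel_means = [
        sum(condition[v] for condition in betas) / n_conditions
        for v in range(n_voxels)
    ]
    return [
        [condition[v] - voxel_means[v] for v in range(n_voxels)]
        for condition in betas
    ]

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of voxel pairs."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        sign_term = (x[i] - x[j]) * (y[i] - y[j])
        if sign_term > 0:
            score += 1  # pair ordered the same way in both patterns
        elif sign_term < 0:
            score -= 1  # pair ordered in opposite ways
    return score / (n * (n - 1) / 2)

# Toy data: two conditions, three voxels (values are made up).
main_betas = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
centered = mean_center(main_betas)
tau = kendall_tau_a(centered[0], centered[1])
```

In an actual analysis, the centering would be applied separately to the main-experiment and baseline-experiment betas before patterns from the two experiments are correlated.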

All ROIs were defined across both hemispheres (except FBA, which was limited to the right hemisphere). In the multivariate analysis, we focused on two ROIs, the lateral-occipital cortex (LOC) and the early visual cortex (EVC). The LOC ROI was defined using a group-constrained subject-specific method (Fedorenko et al., 2010). The group-level ROI was defined by first

In the univariate analysis we focused on two body-selective ROIs, the extrastriate body

The ROIs were defined using the method described above for LOC. The group-level ROI was

In the multivariate analyses, we correlated multivoxel activity patterns evoked by the task-irrelevant objects in the main experiment with multivoxel activity patterns evoked by the clearly visible objects in the baseline experiment, using Kendall rank-ordered correlation. We expect to find stronger correlations between corresponding object categories (e.g., between bodies in the main experiment and bodies in the baseline experiment), than between non-corresponding categories (e.g., between bodies in the main experiment and beds in the baseline experiment).

As such, the difference between corresponding and non-corresponding category correlations is

In the main experiment, participants detected the presence of object silhouettes belonging to one of six categories (Fig. 1D), in different blocks. Throughout the experiment, only the vertically-aligned locations were relevant for the detection task (Fig. 1A). Each block started with a category cue (e.g., "Car") indicating the target category for that block (Fig. 1A), followed by 49 object detection trials. In 42 trials (6/7th), one of the two task-relevant locations contained a briefly-presented object (67 ms) within phase-scrambled noise (Fig. 1B), with each category presented equally often (7 trials each). In the remaining 7 trials (1/7th) no object was presented.

Crucially, in 6/7th of the trials, two objects were simultaneously presented in the horizontally-aligned locations (Fig. 1A). These objects were briefly presented (67 ms),

The difference between body and non-body stimuli within each block is a measure of body selectivity.
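A minimal sketch of this measure, assuming ROI-averaged betas are available per category and that non-body categories are averaged before subtraction (the averaging step is an assumption, not stated in the text). The function name, category labels, and beta values are made up for illustration.

```python
from statistics import mean

def body_selectivity(roi_betas, body_key="body"):
    """Body beta minus the mean beta of all non-body categories."""
    non_body = [
        beta for category, beta in roi_betas.items() if category != body_key
    ]
    return roi_betas[body_key] - mean(non_body)

# Made-up ROI-averaged betas for task-irrelevant objects in two block types.
body_search_block = {"body": 1.2, "category_a": 0.4, "category_b": 0.6}
other_search_block = {"body": 0.9, "category_a": 0.4, "category_b": 0.6}

# A positive value means responses were more body-selective during body search.
attention_effect = (
    body_selectivity(body_search_block) - body_selectivity(other_search_block)
)
```

Comparing the measure across block types in this way mirrors the logic of asking whether responses become more body-selective when participants search for bodies.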

The attention effect for bodies in LOC could reflect enhanced proximity to bodies for the bodies presented in body detection blocks, but may also (or additionally) reflect reduced proximity to bodies (suppression) for the other categories presented in body detection blocks.

To test for body-selective enhancement, we compared the proximity (to bodies in the baseline

However, attentional modulation was significant even in the non-selective ROI (t(21) = 2.1, p = 0.047, d = 0.46). These results suggest that the attentional modulation in LOC was partly but not exclusively driven by body-selective voxels.

Attentional modulation for non-body categories in LOC

Using the multivariate analysis framework outlined above for bodies, we can similarly test for

Selective proximity is the proximity difference between the corresponding and non-corresponding categories (e.g., the difference between the two left-most data points in Fig. 3A).

As an intuition for what this new measure represents, note that in the case of bodies, selective proximity is analogous to the body selectivity measure in the univariate analysis.
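The definition of selective proximity can be made concrete with a short sketch. It assumes a square matrix in which entry [i][j] holds the proximity of main-experiment category i to baseline category j, and it averages over the non-corresponding categories; both the matrix layout and the averaging are assumptions for illustration, and the numbers are invented.

```python
from statistics import mean

def selective_proximity(proximity):
    """Corresponding minus mean non-corresponding proximity, per category.

    proximity[i][j]: proximity of main-experiment category i to the
    baseline pattern of category j (assumed layout).
    """
    n = len(proximity)
    result = []
    for target in range(n):
        corresponding = proximity[target][target]
        non_corresponding = mean(
            [proximity[other][target] for other in range(n) if other != target]
        )
        result.append(corresponding - non_corresponding)
    return result

# Invented 3 x 3 proximity matrix for three categories.
toy_proximity = [
    [0.5, 0.1, 0.2],
    [0.1, 0.4, 0.0],
    [0.3, 0.1, 0.6],
]
selectivity_per_category = selective_proximity(toy_proximity)
```

Computing the measure for every column in this way applies the same logic to bodies and to non-body categories alike.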

Attentional modulation in EVC

The

The attentional effects observed here for body silhouettes are unlikely to reflect attention to low-level features such as orientation or color, for several reasons. First, we included a relatively large number of object categories in the experiment to ensure that participants could not detect objects based on low-level features, as these were shared with other categories (e.g., bottles were vertical, similar to bodies). Second, we presented object silhouettes instead of photographs to avoid possible low-level differences between categories in texture or color.

Taking everything together, the evidence suggests that features that are diagnostic of bodies meet many of the previously proposed criteria for basic features: showing spatially-global attentional modulation (Maunsell and Treue, 2006), being processed "early, automatically, and in parallel across the visual field" (Treisman and Gelade, 1980), and being represented selectively in the visual system (Treisman, 2006). Indeed, Treisman (2006) proposed that the

To conclude, the current results provide evidence for spatially-global attentional modulation for human bodies in high-level visual cortex, linking this modulation to body-selective representations in univariate and multivariate analyses. Combining these results with previous behavioral and neuroimaging studies, we propose that bodies may be processed as basic features, supporting the rapid and parallel detection of conspecifics in our environment even outside the focus of spatial attention.