Research shows that object-location binding errors can occur in visual working memory (VWM), indicating a failure to store bound representations rather than mere forgetting (Bays et al., 2009; Pertzov et al., 2012). Here we investigated how categorical similarity between real-world objects influences the probability of object-location binding errors. Our observers memorized three objects (image set: Konkle et al., 2010) presented for 3 seconds at positions around an invisible circumference. After a 1-second delay they had to (1) place one of those objects on the circumference at its original position (localization task), or (2) recognize an old object when it was paired with a new object (recognition task). On each trial, the three encoded objects could be drawn from the same category or from different categories, providing two levels of categorical similarity. For the localization task, we used the mixture model (Zhang & Luck, 2008) with a swap component (Bays et al., 2009) to estimate the probabilities of correct and swapped object-location conjunctions, as well as the precision of localization and the guess rate (the probability that locations are forgotten). We found that categorical similarity had no effect on localization precision or guess rate. However, observers made more swaps when the encoded objects had been drawn from the same category. Importantly, there were no correlations between the probabilities of these binding errors and the probabilities of false recognition in the recognition task, which suggests that the binding errors cannot be explained solely by poor memory for the objects themselves. Rather, remembering objects and binding them to locations appear to be partially distinct processes. We suggest that categorical similarity impairs the ability to store objects attached to their locations in VWM.
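The swap mixture model referenced above can be made concrete. The sketch below is a minimal illustration rather than the study's actual fitting code: it computes the likelihood of a single localization response as a mixture of a von Mises component centered on the target (correct report), von Mises components centered on the non-targets (swaps), and a uniform guess component. All function names and parameter values are ours, chosen for illustration.

```python
import math

def _i0(x, terms=30):
    # Modified Bessel function of the first kind, order 0 (series expansion).
    return sum((x / 2) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))

def _vonmises_pdf(x, kappa):
    # Von Mises density on the circle, centered at 0 with concentration kappa.
    return math.exp(kappa * math.cos(x)) / (2 * math.pi * _i0(kappa))

def swap_model_pdf(resp, target, nontargets, kappa, p_t, p_s, p_g):
    """Density of a localization response (radians) under the swap mixture:
    p_t = correct target report, p_s = swap to a non-target, p_g = uniform guess.
    The three mixture weights are assumed to sum to 1."""
    swap = sum(_vonmises_pdf(resp - nt, kappa) for nt in nontargets) / len(nontargets)
    return (p_t * _vonmises_pdf(resp - target, kappa)
            + p_s * swap
            + p_g / (2 * math.pi))
```

In practice the weights and the concentration parameter would be estimated by maximizing the summed log-likelihood of this density over all trials; an elevated swap weight for same-category displays is the pattern reported above.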
Increased distractor heterogeneity complicates visual search, but only when the distractors are highly dissimilar from one another. However, if the gap between those dissimilar distractors in feature space is filled with numerous intermediate feature values, the salience of a target singleton paradoxically improves despite the increased distractor heterogeneity. To explain this paradox, we suggest that the distractor heterogeneity effect is mediated by "segmentability". This predicts different heterogeneity effects on singleton search depending on the smoothness of the transition between neighboring features.
Ensemble summary statistics represent multiple objects at a high level of abstraction—that is, without representing individual features and while ignoring spatial organization. This makes them especially useful for the rapid visual categorization of multiple objects of different types that are intermixed in space. Rapid categorization implies our ability to judge at one brief glance whether all visible objects represent different types or just variants of one type. The framework presented here states that processes resembling statistical tests can underlie that categorization. At an early stage (primary categorization), when independent ensemble properties are distributed along a single sensory dimension, the shape of that distribution is tested to establish whether all features can be represented by a single peak or by multiple peaks. Once primary categories are separated, the visual system either reiterates the shape test to recognize subcategories (in-depth processing) or implements mean-comparison tests to match several primary categories along a new dimension. Rapid categorization is not free from processing limitations; the role of selective attention in categorization is discussed in light of these limitations.
The visual system can represent multiple objects in the compressed form of ensemble summary statistics (such as object numerosity, mean, and feature variance/range). Yet the relationships between the different types of visual statistics remain relatively unclear. Here, we tested whether two summaries (mean and numerosity, or mean and range) are calculated independently of each other and in parallel. Our participants performed dual tasks requiring a report of two summaries on each trial, and single tasks requiring a report of only one summary. We estimated trial-by-trial correlations between the precision of reports, as well as correlations across observers. Both analyses showed an absence of correlations between the different types of ensemble statistics, suggesting their independence. We also found no decrement (except one related to the order of report, explained by memory retrieval) in performance on dual compared to single tasks, which suggests that two statistics of one ensemble can be processed in parallel.
An uninformative exogenous cue speeds target detection if the cue and target appear in the same location separated by a brief temporal interval. This finding is usually ascribed to the orienting of spatial attention to the cued location. Here we examine the role of perceptual merging of the two trial events in speeded target detection. That is, the cue and target may be perceived as a single event when they appear in the same location. If so, cueing effects could reflect, in part, the binding of the perceived target onset to the earlier cue onset. We observed the traditional facilitation of cued over uncued targets and asked the same observers to judge target onset time by noting the time on a clock when the target appeared. Observers consistently judged the onset time of the target as earlier than it actually appeared, with cued targets judged as earlier than uncued targets. When the event order was reversed so that the target preceded the cue, perceived onset was accurate in both cued and uncued locations. This pattern of results suggests that perceptual merging does occur in exogenous cueing. A modified attention account is discussed that proposes reentrant processing, evident through perceptual merging, as the underlying mechanism of reflexive orienting of attention.
When storing multiple objects in visual working memory, observers sometimes misattribute perceived features to incorrect locations or objects. These "swaps" are usually explained by a failure to store object representations in a bound form. Swap errors have been demonstrated mostly with simple objects whose features (color, orientation, shape) are easy to encode independently. Here, we tested whether similar swaps can occur with real-world objects, where the connections between features are meaningful. In Experiment 1, observers were simultaneously shown four items from two object categories (two exemplars per category). Within a category, the exemplars could be presented in either the same state (e.g., two open boxes) or different states (one open and one closed box). After a delay, two exemplars drawn from one category were shown in both possible states, and participants had to recognize which exemplar went with which state. In a control task, they had to recognize two old vs. two new exemplars. Participants showed good memory for exemplars when no binding was required. However, when the tested objects had been shown in different states, participants were less accurate. Good memory for state information and for exemplar information on their own, together with a significant memory decrement for exemplar-state combinations, suggests that binding was difficult for observers and that "swap" errors occurred even for real-world objects. In Experiment 2, we used the same tasks, but on half of the trials the locations of the exemplars were swapped at test. We found that participants ascribed incorrect states to exemplars more frequently when the locations were swapped. We conclude that the internal features of real-world objects are not perfectly bound in VWM and can be attached to locations independently. Overall, we provide evidence that even real-world objects are not stored as entirely bound representations in working memory.
Observers are good at rapidly estimating the average size of multiple objects (Ariely, 2001; Chong & Treisman, 2003). We tested whether the average is calculated over the "raw" (proximal) stimulus size (where only visual angle matters) or relies on the distal size of an object (which requires taking distance information into account). Our participants performed a size-averaging task, adjusting the size of a probe circle. Using a stereoscope, we changed the apparent distance of ensemble members from the observer. In Experiment 1, all ensemble members were shifted by the same disparity angle in both eyes, so that they appeared at different distances but always within one plane. The probe was always in the same plane (zero disparity). We found that presenting ensembles in apparently remote planes made observers overestimate their mean size relative to what would be expected from simple visual-angle averaging. In Experiment 2, ensemble members were presented in different planes so that (1) visual angle decreased with apparent distance, making the apparent sizes of individual members more similar, (2) visual angle increased with apparent distance, increasing this apparent dissimilarity, or (3) all members were presented in the zero-disparity plane. We found that the mean error in probe adjustment in condition (1) was significantly smaller than in the other conditions. This finding is in line with previous studies showing that similarity between ensemble members in one plane reduces the error. As the items in condition (1) could look more similar than in the other conditions only because of the distance cues, we conclude that observers took these cues into account. Our main theoretical conclusion is that the visual system appears to work with bound objects rather than their separate features when representing global properties such as average size.
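The proximal/distal distinction above reduces to simple geometry: an object of physical size S at distance D subtends a visual angle of 2·atan(S / 2D). A minimal sketch (the function names are ours, not taken from the study):

```python
import math

def distal_size(visual_angle_deg, distance):
    """Physical size of an object subtending a given visual angle at a given distance."""
    return 2 * distance * math.tan(math.radians(visual_angle_deg) / 2)

def visual_angle(size, distance):
    """Visual angle (degrees) subtended by an object of a given physical size."""
    return math.degrees(2 * math.atan(size / (2 * distance)))
```

For a fixed visual angle, the inferred distal size grows roughly linearly with apparent distance; this is why shifting an ensemble to an apparently farther plane should inflate its perceived mean size if averaging operates on distal rather than proximal sizes.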
Meeting abstract presented at VSS 2016
Visual search for multiple targets can produce errors called subsequent search misses (SSM): a decrease in accuracy at detecting a second target after a first target has been found (e.g., Adamo, Cain, & Mitroff, 2013). One possible explanation is perceptual set: after the first target is found, the observer becomes biased toward perceptually similar targets and is therefore more likely to find perceptually similar targets and less likely to find targets that are perceptually dissimilar. Our experiment investigated the role of perceptual similarity in SSM errors. The search array on each trial consisted of 20 stimuli (ellipses and crosses, black and white, small and big, oriented horizontally and vertically) and could contain one, two, or no targets. When there were two targets, they could share two, three, or four features (in the last case they were identical). The features of the target stimuli were indicated at the beginning of each trial. The participants' task was to find all target stimuli or report their absence. We compared accuracy across the conditions with two targets sharing two, three, or four features and with a single target. When two targets were present, a correct answer required finding both. A repeated-measures ANOVA revealed a main effect of the shared-features factor, F(1, 19) = 15.71, p < .001. Pairwise comparisons (with Holm-Bonferroni adjustment) revealed significant differences between all conditions except between the single-target and two-identical-targets conditions. SSM errors were found in all conditions except the fully identical condition, and the size of the SSM effect decreased as similarity between the targets increased. The results indicate a role of perceptual similarity and have implications for the perceptual set account.
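The Holm-Bonferroni adjustment used above is a step-down procedure: p-values are sorted in ascending order, the i-th smallest (0-indexed) is compared against α/(m − i), and testing stops at the first non-significant comparison. A minimal sketch, not the authors' analysis code:

```python
def holm_bonferroni(pvals, alpha=0.05):
    """Holm's step-down correction. Returns, for each p-value in its
    original position, whether the corresponding test is rejected."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, ascending p
    reject = [False] * m
    for rank, idx in enumerate(order):
        if pvals[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: once a test fails, all larger p-values fail too
    return reject
```

Compared with a plain Bonferroni correction (which divides α by m for every test), the step-down scheme is uniformly more powerful while still controlling the family-wise error rate.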
Effects of display heterogeneity on visual search efficiency are well documented (Duncan & Humphreys, 1989). Even when observers search for a clearly distinguishable feature singleton, its attentional salience decreases with the heterogeneity of the distractors (e.g., Santhi & Reeves, 2004). It is presumed that the visual system preattentively separates heterogeneous features into homogeneous subsets and attends to each subset serially to find a singleton. The issue we addressed in our study was as follows: How does the visual system process heterogeneous sets that cannot be clearly segmented? Theoretically, it should conjoin all heterogeneous items under the same subset representation, and the singleton would therefore become more salient despite the large heterogeneity. In our visual search task, observers searched for an odd-sized target (either small or large) among 13, 25, or 37 differently sized items. There were two homogeneous conditions: all distractors were of (1) medium size or (2) the opposite size (e.g., large distractors with small targets and vice versa). In addition, two heterogeneous conditions were tested. In one of them, (3) distractors were of both the medium and the opposite sizes (the difference between the medium and each opposite size was clearly distinguishable). Finally, in condition (4), four transitional sizes filled the gap between the medium and opposite distractors, yielding six heterogeneous sizes. We found a near-parallel pattern of search performance in all target-present conditions. The fastest detection was, predictably, found for homogeneous displays with opposite sizes. The slowest detection was found for displays with two clearly distinct distractor sizes. Intermediate efficiency was found for both the medium homogeneous sets and the heterogeneous sets with transitional sizes; RTs were substantially the same in these two conditions.
This suggests that the visual system fails to separate such transitional sets into subsets and instead treats them as a unitary perceptual entity opposed to the singleton (despite their large heterogeneity and wide range of differences).
The word superiority effect (Cattell, 1886) has been discussed in psychology for more than a century. However, the question remains whether automatic word processing is possible without spatial segregation of words. Our previous studies of letter search in large letter arrays containing words without spatial segregation revealed no difference in performance or eye movements when observers searched for letters always embedded in words, letters never embedded in words, or letters in arrays containing no words (Falikman, 2014; Falikman & Yazykov, 2015). Yet both the percentage of participants who noticed words during letter search and their subjective reports of whether words made search easier or harder differed significantly between target letters within words and target letters outside words. In the current study, we used the Process Dissociation Procedure (Jacoby, 1991) to investigate whether words are processed implicitly while observers search for letters. Two groups of participants, 40 subjects each, performed a 1-minute search for 24 target letters (either Ts, always within words, or Hs, always outside words) in the same letter array of 10 pseudorandom letter strings, 60 letters each, containing 24 Russian mid-frequency nouns. Afterwards, they filled in two identical word-stem completion forms, each containing the same 48 word beginnings (24 of them for words included in the array). First, the participants were instructed to use words that could have appeared in the search array ("inclusion test"), and then to avoid using such words ("exclusion test"). Comparison of conscious and unconscious processing probabilities revealed no difference between them (with the former not exceeding 0.09 and the latter not exceeding 0.11), no difference between the two conditions, and no interaction between the factors.
This allows us to conclude that, despite the subjective reports, words embedded in random letter strings are mostly processed neither explicitly nor implicitly during letter search, and that automatic unitization requires spatial segregation.
Four experiments were performed to examine the hypothesis that abstract, nonspatial, statistical representations of object numerosity can be used for attentional guidance in a feature search task. Participants searched for an odd-colored target among distractors of one, two, or three other colors. An enduring advantage of large over small sets (i.e., negative slopes of search functions) was found, and this advantage grew with the number of colored subsets among the distractors. The results of Experiments 1 and 2 showed that the negative slopes cannot be ascribed to spatial grouping between distractors but can be partially explained by the spatial density of the visual sets. Hence, it appears that observers relied on the numerosity of the subsets to guide attention. Experiments 3a and 3b probed the processes within and between the color subsets of distractors more precisely. We found that the visual system collects numerosity statistics that can be used for guidance within each subset independently. However, each subset representation has to be serially selected by attention. As attention shifts from one subset to another, the "statistical power" effects from every single subset accumulate to produce a more pronounced negative slope.
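The "slope of the search function" referred to above is simply the least-squares slope of response time regressed on set size; a negative value means detection speeds up as the display grows. A minimal sketch with made-up RT values (illustrative only, not data from these experiments):

```python
def search_slope(set_sizes, rts):
    """Least-squares slope of mean RT (ms) against set size (ms per item).
    A negative slope indicates more efficient search in larger displays."""
    n = len(set_sizes)
    mx = sum(set_sizes) / n
    my = sum(rts) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(set_sizes, rts))
    var = sum((x - mx) ** 2 for x in set_sizes)
    return cov / var

# Example: hypothetical mean RTs that decrease with set size
slope = search_slope([8, 16, 32], [600, 580, 540])  # negative slope, ms/item
```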