Infoscience

doctoral thesis

Crowding and the Architecture of the Visual System

Doerig, Adrien Christophe  
2020

Classically, vision is seen as a cascade of local, feedforward computations. This framework has been tremendously successful, inspiring a wide range of ground-breaking findings in neuroscience and computer vision. Recently, feedforward Convolutional Neural Networks (ffCNNs), a kind of deep neural network inspired by this classic framework, have revolutionized computer vision and been adopted as tools in neuroscience. However, despite these successes, there is much more to vision. First, there are flagrant architectural differences between biological systems and the classic framework. For example, recurrence is abundant in the brain but absent from the classic framework and ffCNNs. Although there is widespread agreement about the importance of these recurrent connections, their computational role is still poorly understood. Second, these architectural differences lead to behavioural differences too, highlighted by psychophysical evidence. Relatedly, ffCNNs are extremely vulnerable to small changes to their inputs and do not generalize well beyond the dataset used to train them. Human vision, in contrast, is much more robust. New insights are needed to face up to these challenges.

In this thesis, I use visual crowding and related psychophysical effects as probes into visual processes that go beyond the classic framework. In crowding, perception of a target deteriorates in clutter. I focus on global aspects of crowding, in which perception of a small target is strongly modulated by the global configuration of elements across the visual field. I show that models based on the classic framework, including ffCNNs, cannot explain these effects for principled reasons and identify recurrent grouping and segmentation as a key missing ingredient. Then, I show that capsule networks, a recent kind of deep learning architecture combining the power of ffCNNs with recurrent grouping and segmentation, naturally explain these effects. I provide psychophysical evidence that humans indeed use a similar recurrent grouping and segmentation strategy in global crowding effects.

In crowding, visual elements interfere across space. To study how elements interfere over time, I use the Sequential Metacontrast psychophysical paradigm, in which perception of visual elements depends on elements presented hundreds of milliseconds later. I psychophysically characterize the temporal structure of this interference and propose a simple computational model. My results support the idea that perception is a discrete process. I lay out theoretical implications of these findings.

Together, the results presented here provide stepping-stones towards a fuller understanding of the visual system by suggesting architectural changes needed for more human-like neural computations.

Type
doctoral thesis
DOI
10.5075/epfl-thesis-7582
Author(s)
Doerig, Adrien Christophe  
Advisors
Herzog, Michael  
Jury

Prof. Wulfram Gerstner (chair); Prof. Michael Herzog (thesis director); Prof. Carmen Sandi, Prof. Thomas Serre, Prof. Felix Wichmann (examiners)

Date Issued

2020

Publisher

EPFL

Publisher place

Lausanne

Public defense date

2020-02-07

Thesis number

7582

Total pages

282

Subjects

Vision • Computational models • Neural networks • Crowding • Discrete perception

EPFL units
LPSY  
Faculty
SV  
School
BMI  
Doctoral School
EDNE  
Available on Infoscience
February 4, 2020
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/165129
Contact: infoscience@epfl.ch


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved