A blog about problems in the field of psychology and attempts to fix them.

Wednesday, November 22, 2017

Modularity and the study of visual perception - Marr and Gibson

Gibson’s 1966 book The Senses Considered as Perceptual Systems recently turned 50. Two issues of the journal Ecological Psychology commemorated that event. This is the second in a series of posts reviewing those contributions.

Vision research was impacted tremendously by the short career of David Marr. Marr was tremendously impacted by James J. Gibson, though mostly by Gibson's earlier work on optic flow, and not by his later works that birthed Ecological Psychology. Marr was incredibly influential in the move towards thinking of vision (and neuroscience in general) as "modular", while most of Gibson's work would lead one away from modular thinking. It is this tension that motivates Sedgwick and Gillam's article "A Non-Modular Approach to Visual Space Perception."

The article starts with a brief history of the argument for modularity in vision, which is well laid out. In summary: Most approaches to visual perception start by assuming there are a very small number of "visual primitives" (generally à la Descartes). This leaves a very limited set of ways by which an organism can mentally construct a surrounding world (and those ways are agreed upon via deductive reasoning, rather than empirical investigation). Those lists of cues then become something of a self-fulfilling prophecy, as vision researchers and computer scientists demonstrate that such cues can be used by computers and/or research participants with some degree of accuracy. As this work matured and developed, the idea that there were several, quite-independent methods for getting at depth remained a foundational assumption.
As these various models of depth or distance processing were developed, the question naturally arose as to how they were related to each other. The answer suggested by Marr (1976) is that each process is to be regarded as a separate module, having only very limited communication with the other modules in the system. Marr borrowed this concept directly from computer science, arguing that the design principles applicable to computer programming are also applicable to the biological processes of vision, which he viewed as computations. Marr stated his principle of modular design as follows: “Any large computation should be split up and implemented as a collection of small sub-parts that are as nearly independent of one another as the overall task allows” (Marr, 1976, p. 485).
Marr's influence on mainstream thinking about vision science is hard to overestimate. I remember a graduate seminar in which we went through Marr's posthumous 1982 book, coincidentally while I was in the middle of becoming enamored with Ecological Psychology. Despite the strong influence that Gibson's early work had on Marr, the contrast between the systems they proposed is stark. As Sedgwick and Gillam lay out deftly, Marr does not marshal empirical evidence to support his assertions regarding the modular nature of visual processing in the brain; he asserts it primarily on the basis of historical momentum and the reader's intuitive ability to conceive of the different "computational processes" independently of each other. (The Psychologist's Fallacy strikes again!) To the extent that Marr spells out more detailed arguments in favor of the modular approach, the article deals with them quickly and convincingly.

The challenges that face a modular approach are significant. Sedgwick and Gillam go into some depth to support the existence of two serious dilemmas.
First, those sources of information that are most frequently designated as modules do not, on closer examination, have the individual coherence that the modular approach requires. Second, some important sources of information for spatial layout tend to be left out of discussions of modularity because they do not lend themselves to the modular approach.
But once the weaknesses of the modular approach are exposed, what is the alternative?!? I will admit that I have had colleagues who don't even seem to be able to fathom that there could be an alternative. For that, Sedgwick and Gillam start with Gibson's approach to vision, but they do not stop with him. Gibson's crucial contributions include his appreciation for the complexity of optic arrays (patterns of light "out there" in the environment, whether viewed by an organism or not) and his understanding of the importance of the active perceiver (whereas most theorists from Descartes to Marr to the present focus on a passive perceiver).

This is followed by an impressive summary of research findings 1) supporting a surface-focused approach to distance perception (rather than an abstracted distance-from-observer approach), 2) explaining the outcomes of experiments that vary local and non-local aspects of a visual figure (such that simplistic modular-process explanations fail), 3) looking at ways in which higher-order variables of shear and compression seem better candidates for the basis of perceiving slanted surfaces than more traditional suggestions, which are often confounded with those variables, and 4) demonstrating ways in which supposedly independent cues for depth are dramatically influenced by a context that allows the viewer to make relative (rather than absolute) judgements. There is also a very nice discussion of how Ecological Optics (focusing on the structure of light external to the organism) offers an improved way to think about the relationship between "binocular" and "monocular" information for depth, which has implications for the types of structures visual neuroscientists might expect to be able to find.

At times, the intuitive pull of modular thinking might lead readers to think "that thing they just presented, maybe there is a module for that." Sedgwick and Gillam do a decent job of explaining the difficulty of such an approach: a "module" for a much higher-order process, which interacts continuously with other higher-order processes, is not a module at all (not an isolated process serving a simple-and-discrete function). That is, it all supports the suggestion that modular thinking about depth perception has kept researchers from conducting empirical investigations that would uncover a wealth of non-modular factors that affect vision.

Overall: A very good and informative read. A bit challenging towards the end, but worth the effort, especially if one is interested in getting a feel for what "non-modular" approaches to vision research might look like. In the spirit of anti-modularity, I would say that there is not a most valuable part of the paper; rather, its core value is in the overarching narrative regarding the seductiveness of the modularity idea, specifically how vision scientists have turned "modularity" into a self-reinforcing idea, rather than holding it as a tentative hypothesis, worthy of testing in its own right.


Sedgwick, H. A., & Gillam, B. (2017). A Non-Modular Approach to Visual Space Perception. Ecological Psychology, 29, 72-94.
