Critical AI

SNEAK PREVIEW: REVIEW: JAMES E. DOBSON’S “THE BIRTH OF COMPUTER VISION”

Nicolas Malevé and Katrina Sluis

[Critical AI 2.2 is a special issue, co-edited by Lauren M.E. Goodlad and Matthew Stone, collecting interdisciplinary essays and think pieces on a wide range of topics involving Large Language Models. Below, a sneak preview from the issue: Nicolas Malevé and Katrina Sluis’s thought-provoking “Review: James E. Dobson’s The Birth of Computer Vision.”]

The need to automate human sight and outsource it to machines has become a pressing problem in contemporary society, with the dramatic scaling up of image production by the military, government, sciences, medicine, and the wider public. While the development of computer vision has been the object of intensive research in computer science for several decades, it is increasingly scrutinized in disciplines such as law, media and cultural studies, visual arts, and digital humanities, to name a few. However, for all the attention it garners, and its critical role in reconfiguring the relations between seeing and knowing, little has been written about its history. For this reason alone, a book that attempts to trace the “birth” of computer vision should attract our interest.

The Birth of Computer Vision by James E. Dobson offers insights into a discipline in formation, covering an arc from the 1950s to the 1970s, a time when many of today’s computer vision concepts and methods were conceived. Given this timeline, the book sits in dialogue with a constellation of other texts engaged with cybernetics in the post-war period, such as The Closed World by Paul N. Edwards (1996) or The Ontology of the Enemy by Peter Galison (1994). However, in these texts computer vision rarely takes center stage. With The Birth of Computer Vision, Dobson seeks to excavate the “pre-existing knowledge and perceptions that are imagined, framed, experienced and described by humans” as a precondition for computer vision, illuminating the entanglement of human and machine ways of seeing. In order to resist positioning algorithms as ahistorical abstractions, Dobson’s genealogy opposes the computer industry’s discourse of ever-expanding technological progress, in which each development is positioned as a moment of rupture with the past. This account is significantly enriched by Dobson’s programming expertise, which enables him to critically appraise the technical implementation, contexts, and uses of computer vision algorithms in practice. Through a series of case studies, Dobson guides readers through the iterative steps taken to automate perception, and the photographic material mobilized as training data. In marshalling this material, the book goes beyond a historical account of computer vision’s origins: it weaves together social history, computer science, and visual theory to illuminate a complex ontology of the digital image, underscoring the relevance of early computer vision for today’s computational visuality.

The Birth of Computer Vision is organized into chapters that illuminate phases of research in which various paradigms and concepts were concretized. It begins with an analysis of Frank Rosenblatt’s Perceptron, which used nascent machine learning techniques to address machine perception in 1957 at the Cornell Aeronautical Laboratory. Here, Dobson shows the crucial role of cybernetics in offering a common frame of reference to integrate perspectives from biology, computing, and psychology. Chronicling the rise and fall of the Perceptron, he identifies the template for the “hype cycle of machine learning and artificial intelligence technology” we recognize today. Crucially, Dobson positions the Perceptron as an early experiment in building a “digital brain model,” tracing the accretion of prior norms and historical positions into the models that underpin what is now termed Artificial Intelligence (from ChatGPT to DALL-E 3).

If the nascent postwar field of computer vision was highly multidisciplinary and epistemologically in flux, its funding model and applications were not: Dobson details precisely how military funding oriented computer vision at its inception. Chapters 2 and 3 explore how a dramatic increase in the production of aerial imagery during the Vietnam War triggered a series of paradigm shifts, from “automatic photointerpretation” to “image understanding” to what would ultimately become “computer vision” in the 1980s. This involved abandoning a computationally inefficient model of perception that treated the photograph as a monolithic whole, in favor of segmenting the image into subsections that could be paired with human text annotations. The consequences of this epistemological pivot from “image” to “scene understanding” for researchers were significant: rather than learning from the implicit regularities in the training samples, these older models encoded more directly their (dominant) view of the objects they sought to locate, e.g., what patterns constitute an airplane, a threat, a territory. Dobson shows how an understanding of the topological features of aerial photographs of Vietnam would ultimately inform the “mapping” of human faces in the development of facial recognition algorithms, such as Viola-Jones or Eigenfaces.

In a final case study centered on “Shakey the Robot,” a semiautonomous device capable of navigating hostile territory, funded by US Department of Defense agencies from 1966 to 1972, the trajectory of machine vision shifts from disembodied vertical sight (under the logic of aerial photography) to an embodied and mobile perspective. Dobson shows how the movements of the robot required an agile understanding of spatial coordinates that sparked new developments in line detection algorithms, alongside new debates about the limits and future of artificial intelligence. Dobson uses Shakey’s development to illuminate the extraordinary dynamism of concepts and methods in computer vision, through an account of the transit of the Hough Transform algorithm from the field of physics to the needs of the computer vision lab.

One of the book’s major accomplishments, as we have noted, is to reconstruct the historical context of computer vision’s formative years, and the influence of military funding on the nascent technology. Dobson shows the efforts of the scientists who benefited from military funding to renegotiate the scope of their work to make room for fundamental research in an increasingly legally constrained context that forced them, following the 1969 Mansfield Amendment, to produce research with a direct relationship to military operations. He also documents the growing resistance to the militarization of science during the Vietnam War, and how, under pressure, the computer vision labs directly involved in military research left the campuses of Stanford and Cornell, echoing more recent efforts by tech workers to contest the weaponization of artificial intelligence, e.g., the 2018 protests by Google employees against the company’s involvement in Project Maven. But as Dobson remarks, we should refrain from hasty conclusions, as this victory was rather ambiguous: the labs continued their research, this time under direct supervision from the military.

A further contribution of The Birth of Computer Vision is its careful mapping of the shifting ontology of the image in computer vision, which is likely to become an essential reference for researchers of computational visual culture. Dobson illuminates the contours of an ontology entangled with military concerns and made of irresolvable tensions regarding photography’s contested relationship to the “real.” The initial hope of the group of researchers who came to define computer vision was quickly frustrated; as Dobson rightly asserts, “The source objects of computer vision are not the world but rather a representation of it.” Dobson highlights how this tension increased when the field embraced abstract features extracted from images as second-order intermediaries of indexical ambivalence. Dobson also emphasizes how computer vision’s ontology of the image is indebted to aerial photography. Embedded in the military perspective, computer vision follows an iterative process of learning from the battlefield, in which vision is a largely two-dimensional problem produced by the collapse of space in top-down aerial surveillance, but subject to troubling aspects of mediation, such as labelling and classification. Moving back and forth from the quest for indexicality and ground truth to a proliferation of abstractions, computer vision elicits a recursive relation to the real where “reality increasingly becomes augmented by the products of computer vision, which themselves are increasingly models of that now augmented reality.”

Overall, The Birth of Computer Vision is a rich study that clearly contributes to an understanding of the historical context of computer vision’s early formation. Dobson envisions his research on computer vision as a contribution to the field of critical algorithm studies, which endeavors to problematize the social implications of computational systems. He provides a useful vocabulary and framing, a larger reflection on the ontology of the image in computational culture, an important critical inquiry into the dynamics of power between the military and science, and a nuanced understanding of the Cold War-era production of knowledge. Dobson also enters into the intricacies of teaching machines how to see, exploring the core epistemic problem of computer vision within the multi-layered contexts in which it is posed. Less exemplary is the coda, in which Dobson seems more eager to ground his work in established concepts (for example, Harun Farocki’s [2004] “operational image”) than to fully expand on his highly promising theoretical elaboration.

The Birth of Computer Vision will be relevant to researchers in many disciplines, from computer science to media and cultural studies, and science and technology studies. Thanks to the clarity of Dobson’s style, the book has the potential to speak to anyone interested in the history of the computational tools they use—and are subject to—every day. 
