A few years ago, Adobe Photoshop and Illustrator were the new computer-based graphic design tools. My nephew, 10 or 12 at the time, learned to use them well enough to illustrate what new dormer windows on the roof of his parents’ house would look like. His mother responded by wondering out loud: if her son could do that, what did it spell for the future of his uncle’s profession? A concern one would have only if one thought of an architect as a drafter, meaning the tool, not the agent, of design.
Another graphic tool, this one claiming to be “generative artificial intelligence” (GAI), has emerged over the last year or so. You type in a prompt such as “realistic purple unicorn digitally rendered” and the program (simplistically put) draws on billions of images it has gathered from the internet, “searching” for patterns of correlation among words and images; it then diffuses, or disassembles, images into bits and re-assembles those bits into a “new” image. There are now three image-generating applications competing for world dominance: DALL-E 2, Midjourney, and Stable Diffusion. Stable Diffusion is, in Silicon Valley terminology, a “unicorn,” meaning a start-up worth $1 billion.
As architects we already use tools such as Google Images and ArchDaily for research and presentations, to find images with which we might illustrate a design intent or idea before having completed a design, because, well, a picture is worth a thousand words. Without having practiced much yet with image GAI, it is probably safe to say that it may have additional utility for architects, but for what exactly it’s too early to tell. Probably presentations.
And yet, despite the achievement’s modesty, we already witness the usual fanfare about how yet another new digital technology will change everything we have ever known about everything everywhere since the beginning of time. We endure breathless proclamations that image GAI will democratize, displace, or even replace the “creative class,” that it spells the end of artists and art.
There are reasons to feel both skepticism and déjà vu all over again. We have heard this once before, when in the Industrial Revolution of the 19th century we automated manufacturing and mass-produced stuff, images and objects alike, and in response the art world fabricated a self-defeating crisis over what constituted the authenticity of Art, or Craft, or how we would ever build Architecture legitimate for Our Time. It was an unnecessary disruption, one from which we are still recovering, and from which we will recover.
Artists will somehow benefit from image GAI, especially when used in combination with other tools. But in the 21st century, as in the 19th, it’s a stretch to claim that it will replace artists. It’s not even clear that it will be a good tool. In our practice we know that digital tools, as helpful as they can be on certain prescribed tasks associated with design, can also have numbing effects on design. In the face of the cognitive load required to run them, they subjugate the mind to the complexities of the tools at the expense of design. They impose passivity and diminish our psychological agency, rendering us uncritical and inert, all too ready to accept the generic nature of their output (“Revit made me do it”).
As a graphic tool this one is a little less cumbersome than others. You type in words (or rather the right words) and out comes an image (albeit a somewhat arbitrary one). But words and images are slippery. As voices as disparate as Aby Warburg and Erwin Panofsky, the patriarchs of 20th-century art history; I.A. Richards, the 20th-century philosopher of rhetoric; and John McWhorter, the 21st-century linguist (not to mention our own life experiences and common sense), have consistently reminded us: images and words mean different things to different people in different contexts at different times. Context matters; it’s constantly in flux; past is never prologue.
A machine that gathers and rehashes relationships of words and images from the past found on the internet can only generate statistical averages of pre-existing relationships and meanings. It doesn’t “learn from,” “understand,” or even “imitate” anything from which it could profoundly generate new meaning, or anything that, let’s be real, we haven’t already seen. It instead generates hodgepodge: an unthoughtful, indiscriminate mash-up, maybe momentarily novel, funny, pretty, heart-warming, cute, sad, or scary, but ultimately, with repetition, boring.
Jaron Lanier, a Silicon Valley pioneer and its perennial skeptic, has referred to image GAI as “image synthesizing,” a term suspiciously like the title of architectural theorist Christopher Alexander’s 1964 reductionist manifesto NOTES ON THE SYNTHESIS OF FORM. In it Alexander posits that we could (and should) gather formal prototypes (shapes and functions) from the history of architecture and, by recognizing patterns in that history, generate through their systematic synthesis “authentic” architectures independent of what he called the architectural “priesthood” (meaning egotistical modernists), thus eliminating the agency of the architect (the algorithm replacing the practitioner as the high priest).
It is easy to see how claims for generative AI, and especially for the graphic and drafting tools that emerge from it, could be mistaken for promising that same outcome, even as we, as architects (and human beings), know that it will never come to pass.