the world wide web was the last great technology of the 20th century. for many of us, those of us who grew up on the computer, and online, it's not just a tool we use, but an essential part of us. part of our identity, and part of our mind
I consider myself chiefly a digital citizen, of digital culture, before any national or regional culture comes into the picture. the internet, or more correctly the web, is where I meet friends and partners, learn most of what I know, find employment and community, and live much of my life
I'm plenty outgoing when I travel, but my most enriching experiences are always meeting people I know from online. we have a common cant and make common cause. we're dyed in the same tank. a lifestyle like the old aristocrats or freemasons, traveling merchants, university men. every port of call is a new and strange place, but there's always a bit of home, where you can slip the mask you show the world, and talk for real, with people who are like you in a way that no one not part of the group can ever understand
before there was the web, there were the men who imagined things like it, and chief among them were vannevar bush and joseph licklider. the other week I wrote a bit about arpa and xerox parc, and the role licklider's vision of computing played in both
also, like virtually everyone in our part of twitter, I've been playing with chatgpt
these two facts made me want to reread the seminal articles that inspired the people who built the net and the web, namely bush's "as we may think" and licklider's "man-computer symbiosis." written in 1945 and 1960 respectively, both of them were extraordinarily visionary in what they imagined computing could someday look like. but what struck me most was the tech they imagined that still hasn't been made
in one way, both articles can be read as presaging the net, the web, multi-user computers and concurrent operating systems, wikipedia. they inspired the men who built these technologies we now live with. but they also imagine something richer: a fluid connection between man and machine, with a transparent enough interface to function together as one entity. not a human using a tool, but a single cybernetic organism
I've been mostly on the ai-skeptical side for a while, but it's impossible for anyone paying attention to not feel the ground shift in the past couple years, as llms have blown through many "hard limits" on what we thought neural nets could do. a lot of possibilities open up with general systems that can communicate with humans, each other, and themselves, in natural language. I hope that, more than just systems for text- and code-generation, and distinct from agentic independent intelligences, we can use llms as an interface between us and the digital, to augment our awareness with the vast sea of information that we presently access through a screen darkly. and I think the vision for this springs, almost fully formed, directly from bush and licklider
in "as we may think," bush imagines a number of information storage and retrieval devices that could exist in the future, with the aim of increasing the efficiency of working scientists and researchers. today, the article is mostly remembered for his memex concept, which invented the hyperlink, and directly inspired xanadu and the world wide web. but memex was really the keystone, rather than the whole, of bush's vision: a means of pulling together various implements and information sources into a fluidly accessible exocortex. memex was supposed to be a means of enhancing perception and transmitting thought-lines, not just storing and recalling text
bush's descriptions of his devices are colorful, especially since it's 1945, so he has to assemble functionality out of analog parts that we take for granted from digital ones. he imagines that microfilm will do what the disk drive ended up doing: "Encyclopedia Britannica could be reduced to the volume of a matchbox, [...] would cost a nickel, and [could] be mailed anywhere for a cent." likewise, he hypothesizes a form of "dry photography," perhaps utilizing electron beams and photosensitive paper, that essentially fills the role of the digital camera
with these raw components, bush imagines several devices for data capture and replay. the voder, for speech synthesis; the vocoder, for speech recognition; and the cyclops camera, to record sight. and, with these, he builds something that still only exists in science fiction: the cyborg
One can now picture a future investigator in his laboratory. His hands are free, and he is not anchored. As he moves about and observes, he photographs and comments. Time is automatically recorded to tie the two records together. If he goes into the field, he may be connected by radio to his recorder. As he ponders over his notes in the evening, he again talks his comments into the record. His typed record, as well as his photographs, may both be in miniature, so that he projects them for examination.
it's worth noting that, while the article is nearly 80 years old, the tech for doing this well has only really gotten good in the past decade. speech recognition has ever suffered from a long tail problem, but openai's whisper, of 2022 vintage, approaches human-level english transcription. the best-ever image segmentation model, facebook's sam, just came out, like, last week, and new breakthroughs in image recognition seem sure to follow. always-on personal video recording is more of a cultural issue than a technical one now: google glass was disastrous, and snapchat glasses were a non-event, but the tech is overdue for another at-bat at breaking through the weirdness filter
now that llms show so much promise, there is strong incentive for people to record more of their lives, in expectation of finetuning models that can accurately represent themselves later on. I think in the meantime, there could be strong applications in recording yourself work, and using an llm to sort through the mass of data you produce to find things you said, did, or thought
the irony of productivity apps that force you to catalog your output yourself is that doing this extra legwork runs counter to being actually productive. I would love to dump all my thoughts and everything I read in one place, as an unorganized mass of data, and have an ai system that can query the history. locate quotes in the past based on vague descriptions, assemble summaries of texts or work periods, find connections between different contexts in which I made similar observations
this, at its core, is what memex is supposed to be. "memory expansion," a fundamentally transhuman device. the web and wikipedia are the most notable descendants of the memex idea, but really it's more like notion or roam. the problem is, all of these, including bush's concept, force the user to organize things themselves. ideally, the machine would do it for us
bush envisions memex as a desk, containing microfilm, a projector, and a scanner bed. it would store every book and article a man ever read or wished to consult, along with his "notes, photographs, memoranda, all sorts of things." he describes a keyboard and a system of switches and levers meant to make moving through the corpus as swift and natural as possible. texts could be looked up by name or author, but the core of the system is "associative indexing": in essence, the hyperlink
bush knew the brain functioned on associations, rather than indexes or categories. he imagines that a machine built on the same principle could be used as naturally as thinking itself. and he imagines people building, consulting, and sharing their own "trails" with each other, to make an argument or contribute to the commons, such that trail-building itself might become a dominant form of human discourse:
Wholly new forms of encyclopedias will appear, ready-made with a mesh of associative trails running through them, ready to be dropped into the memex and there amplified. The lawyer has at his touch the associated opinions and decisions of his whole experience, and of the experience of friends and authorities. The patent attorney has on call the millions of issued patents, with familiar trails to every point of his client's interest. The physician, puzzled by a patient's reactions, strikes the trail established in studying an earlier similar case, and runs rapidly through analogous case histories, with side references to the classics for the pertinent anatomy and histology. The chemist, struggling with the synthesis of an organic compound, has all the chemical literature before him in his laboratory, with trails following the analogies of compounds, and side trails to their physical and chemical behavior.
The historian, with a vast chronological account of a people, parallels it with a skip trail which stops only at the salient items, and can follow at any time contemporary trails which lead him all over civilization at a particular epoch. There is a new profession of trail blazers, those who find delight in the task of establishing useful trails through the enormous mass of the common record. The inheritance from the master becomes, not only his additions to the world's record, but for his disciples the entire scaffolding by which they were erected.
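the trail idea maps fairly directly onto a small data structure. what follows is only a minimal sketch, not anything bush specified: the class name `Memex`, its methods, and the sample ids are all hypothetical, and it assumes a trail is nothing more than an ordered list of document ids that can be followed, cross-referenced, and handed to someone else.

```python
from dataclasses import dataclass, field


@dataclass
class Memex:
    """Toy model of associative indexing. Every name here is hypothetical."""
    documents: dict = field(default_factory=dict)  # doc id -> text
    trails: dict = field(default_factory=dict)     # trail name -> ordered doc ids

    def add_document(self, doc_id, text):
        self.documents[doc_id] = text

    def link(self, trail, doc_id):
        """Append a document to a named trail, creating the trail if needed."""
        if doc_id not in self.documents:
            raise KeyError(f"unknown document: {doc_id}")
        self.trails.setdefault(trail, []).append(doc_id)

    def follow(self, trail):
        """Return the texts along a trail, in the order they were linked."""
        return [self.documents[d] for d in self.trails[trail]]

    def trails_through(self, doc_id):
        """Names of every trail passing through a document (the side trails)."""
        return sorted(name for name, ids in self.trails.items() if doc_id in ids)

    def share(self, trail, other):
        """Copy a trail, plus the documents it needs, into another memex."""
        for d in self.trails[trail]:
            other.documents.setdefault(d, self.documents[d])
        other.trails[trail] = list(self.trails[trail])
```

the point of the sketch is that the same document can sit on many trails at once, which is the difference between association and a filing hierarchy. note that it still leaves all the linking to the user.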
bush's final piece of infrastructure is the thinking machine. in his time, computers existed that could perform arithmetic and solve differential equations. bush takes this one step further, to computers that could perform mathematical proofs, and logical thought itself, freeing men to focus on the creativity beyond the ken of mere machines. how ironic that we now have models that can emulate creativity!
ultimately, the problem bush was trying to solve was scientific specialization and the deluge of data that made it necessary. from his vantage, as the de facto leader of american science, the biggest problem facing scientists and scientific progress was inability to work across disciplines or keep up with the state of the art. too much work was being produced, so men had to narrow their focus, to the point that everyone was in their own little world
thus, his aim becomes clear: a series of devices for recording perception, a machine for externalizing certain classes of thought, and a workstation to store all memories and make them easily sortable and accessible. a project to expand the capabilities of the human mind
when I first read "as we may think," I thought bush came teasingly close to the idea of the cyborg. he describes a number of external devices that aid in knowledge work; it isn't that much of a stretch to imagine them woven directly into the brain itself
but I, along with most people, had read the abridged life magazine version of his article. in the unabridged version in atlantic monthly, he makes this very leap himself:
All our steps in creating or absorbing material of the record proceed through one of the senses—the tactile when we touch keys, the oral when we speak or listen, the visual when we read. Is it not possible that some day the path may be established more directly?
We know that when the eye sees, all the consequent information is transmitted to the brain by means of electrical vibrations in the channel of the optic nerve. This is an exact analogy with the electrical vibrations which occur in the cable of a television set: they convey the picture from the photocells which see it to the radio transmitter from which it is broadcast. We know further that if we can approach that cable with the proper instruments, we do not need to touch it; we can pick up those vibrations by electrical induction and thus discover and reproduce the scene which is being transmitted, just as a telephone wire may be tapped for its message.
The impulses which flow in the arm nerves of a typist convey to her fingers the translated information which reaches her eye or ear, in order that the fingers may be caused to strike the proper keys. Might not these currents be intercepted, either in the original form in which information is conveyed to the brain, or in the marvelously metamorphosed form in which they then proceed to the hand? By bone conduction we already introduce sounds into the nerve channels of the deaf in order that they may hear. Is it not possible that we may learn to introduce them without the present cumbersomeness of first transforming electrical vibrations to mechanical ones, which the human mechanism promptly transforms back to the electrical form? With a couple of electrodes on the skull the encephalograph now produces pen-and-ink traces which bear some relation to the electrical phenomena going on in the brain itself. True, the record is unintelligible, except as it points out certain gross misfunctioning of the cerebral mechanism; but who would now place bounds on where such a thing may lead?
In the outside world, all forms of intelligence, whether of sound or sight, have been reduced to the form of varying currents in an electric circuit in order that they may be transmitted. Inside the human frame exactly the same sort of process occurs. Must we always transform to mechanical movements in order to proceed from one electrical phenomenon to another?
memex isn't really about the web or the digital encyclopedia at all, and what it's inspired so far is only a tiny fraction of what it's truly meant to be. memex is about the potential of mechanically augmenting human intelligence. of becoming one with the machine
but if there is one text that prefigures the cyborg before the term is coined, it's licklider's "man-computer symbiosis." similarly to "as we may think," "man-computer symbiosis" inspired much of the digital world we inhabit, but it imagines a future far grander and more connected than even the one we live in now. both men shot for the moon and ended up among the stars. but, I hope, we can still make it to the moon
while bush's full vision may have been lost in adaptation, licklider's is front and center:
In the anticipated symbiotic partnership, men will set the goals, formulate the hypotheses, determine the criteria, and perform the evaluations. Computing machines will do the routinizable work that must be done to prepare the way for insights and decisions in technical and scientific thinking. Preliminary analyses indicate that the symbiotic partnership will perform intellectual operations much more effectively than man alone can perform them.
The hope is that in not too many years, human brains and computing machines will be coupled together very tightly, and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today.
licklider's vision of the computer as a cybernetic extension of man is just as fully realized as bush's, perhaps even more so, because he envisions it happening within his lifetime. he was writing in 1960, during the first great ai summer. six years before the alpac report concluded machine translation was a dead end, and nine before marvin minsky's book perceptrons killed neural net research for over a decade. as a result, his timelines were, shall we say, rosy. but the vision he has of the potential of ai is quite beautiful (emphasis mine):
A multidisciplinary study group, examining future research and development problems of the Air Force, estimated that it would be 1980 before developments in artificial intelligence make it possible for machines alone to do much thinking or problem-solving of military significance. That would leave, say, five years to develop man-computer symbiosis and 15 years to use it. The 15 may be 10 or 500, but those years should be intellectually the most creative and exciting in the history of mankind.
the major difference between bush and licklider is that lick imagines the computers of the future as true thinking entities. he defines human tool use, including everything computers were then capable of, as "mechanically extended man." this is, he says, not what he means by symbiosis
automation is not what he has in mind either. automation, he says, is the act of giving over control of some process to some machine, and then using humans to patch over its deficiencies. a dual to tool use, what might be called human use; he labels it "humanly extended machines"
true symbiosis is two species working in concert. he admits that, someday, machines may surpass man in all regards, and then we may be mere observers, or worse, in their world. but there will be an intermediate period where we both exist with our own strengths and weaknesses, needing to rely on each other, forming partnerships worth more than the sum of their members. that is the meaning of symbiosis
I'm not sure how likely I would rate near-future systems that can act as true agents. but the core idea of weaving our minds together with theirs is, I think, a highly fruitful direction to take, and is also to me the most plausible solution to alignment. I don't think we can coordinate safety with every government and organization that might build ai systems across the world, and I don't think extreme measures to halt development and restrict civilian access to compute are worth the crippling cost to civilization. the solution, if agi truly is on the horizon, is to grow up with them, make them part of us, and think together as one complex being. and, if it isn't, the value of true human-in-the-loop thinking machines will still be vast indeed
licklider is shockingly ahead of his time in how he thinks about ai's role in the work process. for bush, a thinking machine is rigid and formal. something to utilize for calculation and logic, to free up more time for "real" thought. but lick sees ai as a companion at all stages of the process, from ideation to research to conclusion
licklider bemoans his contemporary programming paradigm, where code must anticipate all possible failure cases and handle them explicitly. the ai equivalent, not yet invented then, would be the expert systems that showed such great promise in the lisp machine era. he instead hopes for the continuum, the flexibility of modern neural nets, and he wants to bring it "into the formulative parts of technical problems" itself. he imagines "an intuitively guided trial-and-error procedure in which the computer cooperate[s], turning up flaws in the reasoning or revealing unexpected turns in the solution." not only collaborating on the answer, but working to define the question
I initially wanted to read "man-computer symbiosis" because I was thinking about a collaborative workflow with chatgpt. I'd really like to talk to it while working or studying, experiment with bringing it into my thought-loop. the autocomplete use case is probably useful, and I want to try github copilot eventually, but the thing I really want is an assistant, a librarian, and a rubber duck that talks back. I thought licklider would have interesting suggestions for how to approach this; in fact, he describes my hypothetical workflow almost exactly
licklider, as an experiment, tracked how he spent his time while he was working, and found very little of it was actually spent thinking. most of his efforts were spent on "activities that were essentially clerical or mechanical: searching, calculating, plotting, transforming, determining the logical or dynamic consequences of a set of assumptions or hypotheses, preparing the way for a decision or insight." he also laments that which problems he would tackle to begin with were "determined to an embarrassing extent by considerations of clerical feasibility, not intellectual capability"
the first role he imagines for the machine is performing these clerical tasks for him—"information-retrieval and data-processing"—a helper that could handle the rote work that consumed most of his days. but what's shocking to me is how close he gets to the chat interface. in his time, computers could barely do some of these tasks, and many they could not do at all. there wasn't a suitable display for plotting, there was no means of storing even a small library's worth of text. xerox parc tackled much of this work, of display and interface, chasing the dream he had
but what licklider has in mind is not just the ability to perform these tasks at all. he imagines a machine that can respond to fuzzy or uncertain requests, tentative suggestions, and vague queries. he imagines, in short, the chat:
Men will set the goals and supply the motivations, of course, at least in the early years. They will formulate hypotheses. They will ask questions. They will think of mechanisms, procedures, and models. They will remember that such-and-such a person did some possibly relevant work on a topic of interest back in 1947, or at any rate shortly after World War II, and they will have an idea in what journals it might have been published. In general, they will make approximate and fallible, but leading, contributions, and they will define criteria and serve as evaluators, judging the contributions of the equipment and guiding the general line of thought.
The information-processing equipment, for its part, will convert hypotheses into testable models and then test the models against data (which the human operator may designate roughly and identify as relevant when the computer presents them for his approval). The equipment will answer questions. It will simulate the mechanisms and models, carry out the procedures, and display the results to the operator. It will transform data, plot graphs ("cutting the cake" in whatever way the human operator specifies, or in several alternative ways if the human operator is not sure what he wants). The equipment will interpolate, extrapolate, and transform. It will convert static equations or logical statements into dynamic models so the human operator can examine their behavior. In general, it will carry out the routinizable, clerical operations that fill the intervals between decisions.
in the second half of his paper, licklider lists off all the work yet to be done. much of it was solved by his direct successors, or other figures in the field. he says that computers are too fast for one man to make proper use of, and proposes timesharing operating systems and a userspace/kernelspace distinction, to allow them to be fully utilized. he goes on to suggest networking computers together—a prequel tease for his work on arpanet—so they may share resources more broadly
he notes that storage is so expensive that computers must be able to interface with analog substitutes like books and tape if they're ever to store nontrivial amounts of information. he complains that volatile memory is likewise expensive, and too slow compared to compute, creating a massive bottleneck on serious work. the integrated circuit and moore's law have largely solved these problems
he also describes problems that have not yet been solved. but they are problems that may well be handled by contemporary ai
language is the biggest concern. licklider notes that the language barrier between humans and machines is a deep conceptual divide:
[Computer language] specifies precisely the individual steps to take and the sequence in which to take them. [Human languages] present or imply something about incentive or motivation, and they supply a criterion by which the human executor of the instructions will know when he has accomplished his task. In short, instructions directed to computers specify courses; instructions directed to human beings specify goals.
and what he wants is for the computer to be able to take direction more in the manner of the human. like bush, who suggests a restricted technical vocabulary may be needed for the vocoder, licklider believes the solution is to meet in the middle. he cites algol in particular—the language that pioneered the use of context-free grammar in designing and describing syntax—as a promising development. but more work, he believes, is needed. he offers two suggestions
one is quite strange to me, and sounds like a compositional semantics for programming: "real-time concatenation of preprogrammed segments and closed subroutines," little snippets of code called by name. he imagines "computer programs that can be connected together like the words and phrases of speech to do whatever computation or control is required at the moment." I'm not sure how practical it would be, nor am I sure that anything like this really exists. picturing it gives me a quirky and whimsical feeling. something like playing apl on a piano
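one possible reading of "preprogrammed segments" called by name, offered only as a toy and not as a claim about what licklider actually had in mind: a registry of small routines, and a "phrase" that chains them in the order the words appear. everything named here (`SEGMENTS`, `segment`, `run`, and the three sample words) is made up for illustration.

```python
# hypothetical registry mapping a spoken-style word to a preprogrammed segment
SEGMENTS = {}


def segment(name):
    """Decorator that files a function under a word it can be called by."""
    def register(fn):
        SEGMENTS[name] = fn
        return fn
    return register


@segment("evens")
def keep_evens(values):
    return [v for v in values if v % 2 == 0]


@segment("square")
def square_each(values):
    return [v * v for v in values]


@segment("total")
def total(values):
    return sum(values)


def run(phrase, data):
    """Chain segments in the order their names appear in the phrase."""
    for word in phrase.split():
        data = SEGMENTS[word](data)
    return data
```

here `run("evens square total", [1, 2, 3, 4])` keeps `[2, 4]`, squares them to `[4, 16]`, and returns 20. the phrase is still a course rather than a goal, in licklider's terms, so this only gets at the concatenation half of the idea.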
but the other should be quite familiar to us: "problem-solving, hill-climbing, self-organizing programs." "hill-climbing" is a technical term for adjusting one value in a vector and testing whether the result moves closer to a solution. if you were to adjust all the values at once, stepping in the direction the slope points rather than by trial and error, you would have one of our best modern friends: gradient descent
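to make the comparison concrete, here is a small sketch of both procedures on the same toy problem. the loss function, its target point (3, -1), and the step sizes are all assumptions picked for illustration, and since the code minimizes a loss, the "hill" is being climbed downward.

```python
def loss(v):
    """Squared distance from a hypothetical target point (3, -1)."""
    return (v[0] - 3.0) ** 2 + (v[1] + 1.0) ** 2


def grad(v):
    """Analytic gradient of loss: the slope along each coordinate."""
    return [2.0 * (v[0] - 3.0), 2.0 * (v[1] + 1.0)]


def hill_climb(v, step=0.1, rounds=200):
    """Nudge one value at a time; keep a nudge only if the loss drops."""
    v = list(v)
    for _ in range(rounds):
        for i in range(len(v)):
            for delta in (step, -step):
                trial = list(v)
                trial[i] += delta
                if loss(trial) < loss(v):
                    v = trial
                    break
    return v


def gradient_descent(v, rate=0.1, rounds=200):
    """Move every value at once, each in proportion to its slope."""
    v = list(v)
    for _ in range(rounds):
        g = grad(v)
        v = [x - rate * gx for x, gx in zip(v, g)]
    return v
```

starting both from `[0.0, 0.0]`, the hill-climber can only land within one step size of the target, because it tests fixed-size nudges, while gradient descent keeps shrinking its moves as the slope flattens. the hill-climber never needs a derivative, though, which is why it was thinkable in 1960.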
the final obstacle that licklider sees in the way of symbiosis is the same as bush's, namely, the physical peripherals used to interact with computers. in licklider's day, edge devices were little more than teletypes and oscilloscopes. better than bush, who pined for machines that could "take instructions and data from a roomful of girls armed with simple keyboard punches [and] deliver sheets of computed results every few minutes." but even so, lick asserts that mere "electric typewriters" could never have the "flexibility and convenience of the pencil and doodle pad or the chalk and blackboard used by men in technical discussion"
and his suggestions end up following in the same path as his predecessor: a desk, for display and control. handwriting recognition, including the ability to make schematic sketches a computer can interpret as code. computer-posted wall displays, a shared whiteboard that interfaces with each man's terminal. and speech synthesis and recognition, to talk to the machine
both bush and licklider had breathtaking visions of the computer, not just as a tool for data storage and information processing, but a fluid extension of the self. both men imagine systems with vastly more natural interfaces for command and data retrieval than exist today. and both propose that, as the interface becomes faster and the feedback loop tighter, we will come to feel them as part of our mind
to condense down their visions into a few bullets, I would say both of them approach, in their own way, these three key ideas:
the computer should be able to handle calculation, data retrieval, plotting, and other computer-strong tasks on your behalf. it should behave less like a tool and more like an assistant. you express what you want it to do, and it figures out how to do it
the computer should provide a lot of help in finding information, to the point of understanding your intent and responding to vague requests and half-formed thoughts. you express what kind of thing you might want, and it works with you on the specifics
you should be able to talk to the computer and get responses, like a real dialogue. this is the central mechanism that accomplishes the first two points
but neither man truly imagined a system that could read, hear, interpret, and speak natural english, with perfect clarity, almost at the speed of thought. that's what makes the present moment unique: the possibility of the universal interface
I'm not an ai person. merely a humble writer and programmer, watching how these systems develop, trying to think of ways to use them in my own work. I don't have any strong conclusions about where we might go. all I know is that these possibilities are open to us, to fulfill the dreams of those who came before
This was a treat to read! After reading this and your other work I've been thinking about GPT as a horse for the mind (similar to Steve Jobs PC as bicycle for the mind). It's less precisely controllable but very versatile, with its own "goals". I think this dovetails nicely with Licklider's symbiosis vs mechanical extension idea. What would be the analogy for a "horse culture" in the GPT world? Cowboys and their steeds, ranging across cyberspace, masters of their own destiny.
Anyway sorry for the ramble. Thanks for writing cool stuff!
Vannevar slay