{"slug": "history-of-t", "title": "History of T", "summary": "Olin Shivers recounts the history of T, a Lisp implementation developed at Yale in the early 1980s by Jonathan Rees, Dan Weld, and himself, which set a standard for clean design. The project emerged from a small group of students exploring functional programming and Scheme, challenging the prevailing belief that lexical scoping was inefficient. T was built during the transition to 32-bit machines, addressing the limitations of earlier Lisp systems on the PDP-10.", "body_md": "*(T was one of the best Lisp implementations, and set a standard for clean design\nthat few newer dialects have been able to meet. Here Olin Shivers recounts T's\nhistory.)*\nAround 1981-1982, the Yale CS dept., which had a strong AI group led by Roger\nSchank, hired undergraduate\n[Jonathan Rees](http://www.mumble.net/jar/)\nto implement a new Lisp for their\nresearch programming. Jonathan, I and Dan Weld (now a prof. at U Washington)\nwere the three people at Yale that had discovered the early Sussman/Steele\n\"lambda\" papers, including Guy's seminal master's thesis on Rabbit, the first\nScheme compiler. Dan was a college senior; Jonathan & I were juniors. Alan\nPerlis, the soul of the department, had just discovered functional\nprogramming, and was running a graduate seminar covering early FP languages\nsuch as Hope & Miranda & Scheme. The three of us managed to sleaze our way\ninto this grad class, where we met each other.\nSome context: Common Lisp did not exist (the effort was just getting\nunderway). MIT Scheme did not exist. Scheme was a couple of AI Lab tech\nreports and a master's thesis. We're talking the tiniest seed crystal\nimaginable, here. 
There was immense experience in the Lisp community on\noptimising compiled implementations of dynamically-scoped languages — this,\nto such an extent, that it was a widely held opinion at the time that \"lexical\nscope is interesting, *theoretically*, but it's inefficient to implement;\ndynamic scope is the fast choice.\" I'm not kidding. To name two examples, I\nheard this, on different occasions, from Richard Stallman (designer &\nimplementor of Emacs Lisp) and Richard Fateman (prof. at Berkeley, and the\nprincipal force behind Franz Lisp, undoubtedly the most important Lisp\nimplementation built in the early Vax era — important because it was\ndelivered and it worked). I asked RMS when he was implementing Emacs Lisp why\nit was dynamically scoped and his exact reply was that lexical scope was too\ninefficient. So my point here is that even to people who were experts in the\narea of Lisp implementation, in 1982 (and for years afterward, actually),\nScheme was a radical, not-at-all-accepted notion. And *outside* the Lisp/AI\ncommunity... well, languages with GC were definitely not acceptable. (Contrast\nwith the perl & Java era in which we live. It is no exaggeration, thanks to\nperl, to say in 2001 that *billions* of dollars of services have been rolled\nout to the world on top of GC'd languages.)\nJonathan had spent the previous year on leave from Yale working at MIT. The\nimportant thing that was happening was that 32-bit machines were coming out,\nwith 32-bit address spaces — *big* address spaces. A lot of the existing\nlanguage technology in the AI community had been developed for the PDP-11\n(16-bit machine) and, more importantly, the workhorse PDP-10 and -20. I loved\nthe \"ten,\" may I add. It had an instruction set that fit onto a single page of\nlarge type, and was just cool. That ISA was a hacker's dream; you could play\nall kinds of fun games with it. 
For example, there was a famous hack that\nprovided a means of (1) removing a cons cell from a freelist, (2) updating the\nfreelist, and (3) branching if the freelist was exhausted to the GC... in *one\ninstruction*. The PDP-10 was a 36-bit machine, with an 18-bit word-addressed\naddress space. Note what this means: a cons cell fit into a single word. There\nare many who claim that the -10 was the world's first Lisp machine. I agree\nwith them.\nThere were two extremely good, mature, highly optimised Lisp implementations\nfor the -10, one \"East Coast\" (Maclisp, from MIT) and one \"West coast\"\n(Interlisp, from Stanford & Xerox PARC). You could also program the -10 in\na beautiful, roughly-C-level language from CMU, called Bliss. I see C and I\nremember Bliss, and I could weep.\nThe problem was the limited, 18-bit address spaces of the -10's. Programmers\nwere blowing them out. When DEC shipped the Vax & Motorola 68000s began to\nshow up in Sun & Apollo workstations, people realised that the 32-bit address\nspace of these architectures was a discontinuous shift in technology, and that\nlanguage implementation on these machines was going to be similarly\ndiscontinuous. For example, with a really big address space, you wanted to\nfundamentally change your GC technology and data representations.\nBerkeley was a big player making the Vax happen in universities, by getting\nthe ARPA contract to build Berkeley Unix for the Vax (which effort\nsubsequently spun off into Sun, courtesy Bill Joy). Part of this effort was a\nLisp for the Vax, Franz Lisp, built under Fateman's guidance. Franz was a\ndesign more in the vein of Maclisp than Interlisp, enough so as to allow the\nporting of Macsyma (Fateman's interest) to the Vax. Franz also showed\nfundamental influences from a little-known Lisp done at Harvard.\nMIT responded to the Vax by kicking off the NIL project. 
NIL stood for \"New\nImplementation of Lisp.\" Jonathan was part of this project during his year\naway from Yale. It was a really, really good effort, but in the end, was\ncrippled by premature optimisation — it was very large, very aggressive, very\ncomplex. Example: they were allocating people to write carefully hand-tuned\nassembly code for the bignum package before the general compiler was written.\nThe NIL project was carried out by top people (err... I recall Jonl White &\nGeorge Carrette being principals). But it never got delivered. It was finished\nyears later than projected, by which time it was mostly irrelevant. (This has\nhappened to me. It's a bitter, bitter experience. I fashionably decried\npremature optimisation in college without really understanding it until I once\ncommitted an act of premature opt so horrific that I can now tell when it is\ngoing to rain by the twinges I get in the residual scar tissue. Now I\nunderstand premature optimisation.) The genesis & eventual failure of this\nkind of project is always clearly visible (in hindsight) in the shibboleths of\nthe early discussions. One key tip-off phrase is always something of the form,\n\"We'll throw out all the old cruft, start over fresh, and just Do Things\nRight.\" (This, unfortunately, is not a useful observation, because that\nstrategy sometimes does pay off, hugely. It's just very risky.)\nJonathan worked on NIL for a year, then came back to Yale for his senior year,\nwhere he was hired by the CS dept. to implement a new Lisp. He made a\n*radical* decision: he was going to do an optimising, native-code *Scheme*\nsystem. He chose to name it T. This was a great name for a couple of\nreasons. It was short & simple, of course. It fit in with Yale CS culture, as\nthere was a history of programs developed there that had single-letter names:\ne, c, z & u. 
(These were a locally-grown family of sophisticated screen\neditors that were roughly comparable to, but quite different from, Emacs.)\nFinally, if you're a Lisp hacker, then you know that NIL is the Lisp false\nconstant... and T is the canonical true constant. So \"T is not NIL.\"\nLet me repeat here what a radical decision it was to go and build a Scheme.\nThe *only* Scheme implementation that had *ever* been built at this point was\nthe research prototype Steele had done for his Master's. *All* serious Lisps in\nproduction use at that time were dynamically scoped. No one who hadn't\ncarefully read the Rabbit thesis believed lexical scope would fly; even the\nfew people who *had* read it were taking a bit of a leap of faith that this\nwas going to work in serious production use — the difference between theory &\npractice is, uh, larger in practice than in theory.\n(For example, the other big MIT implementation effort, Zetalisp for the Lisp\nMachine, kept dynamic scoping, but allowed the compiler to sort of break the\nsemantics, and then, in response to Scheme, threw lexical closures into the\nmix as a fairly kludged-up special form.)\n(The Europeans working on early systems like ML in Edinburgh probably find all\nthis American early-80's thrashing & confusion over scoping discipline and\nimplementation strategy incredibly clueless. Sorry 'bout that.)\nBesides Roger Schank, the other person who made the resource commitment to\nhire Jonathan to develop T was John O'Donnell, who later went on to be a\nprincipal at Multiflow, the company that commercialised VLIW\narchitecture/compiler technology that Josh Fisher spun out of Yale in 1983. I\nsuspect Alan Perlis probably had a hand in the decision, as well, though I\ndon't really know. 
Committing funds to allow Jonathan to set out to (try to)\nbuild a production-quality Scheme implementation was pretty brave, up there\nwith Jonathan's decision to try.\nJonathan brought back to Yale from the NIL project a raft of really excellent\nimplementation technology — primarily the fundamental data representations\nthat were carefully honed for the new-generation machine architectures, using\ntag bits in the low bits of the datum. E.g., if you made the fixnum tag \"000\",\nthen you could add & subtract fixnums in a single instruction, with no\ntag-hacking overhead; you could multiply with a single pre-shift and divide\nwith a single post-shift. This was a big improvement over Maclisp's required\nboxing of fixnums, and the supporting cruft that made that all work (in my\nopinion). Also, since the Vax was *byte* addressable, you could strip off the\ntype tag of a cons-cell datum simply by adjusting the constant offset in the\naddressing mode. I.e., cons cells were represented by the double-word-aligned\naddress of the two-word chunk of memory where the cell's car & cdr fields\nlived. Double-word-aligning the memory block means the low three bits of its\naddress are always zero, that is, not needed. So the three low bits of the\naddress were used for the type tags. So suppose we use \"010\" (decimal 2) for\nthe type tag. You could take the cdr of the pair in register r7 with a single\ninstruction: load r8, r7[4-2] where the \"4\" gets you 1 word (4 bytes) into the\npair (the cdr field) and the \"-2\" corrects for the type tag. I.e., you could\nstrip off the type tag with *zero* run-time penalty. Nice! The representations\nfor closures and stack frames were also very clever.\nJonathan had been burned by the NIL project's failure to complete, so he was\nvery careful about avoiding premature optimisation. So he blasted out a quick\n& dirty prototype implementation just to get something up and running. 
(I\nthink he wrote this in Maclisp, and as I recall, he called it \"cheapy.\") After\nthat, all development of future implementations was done in T — T 2 was\nimplemented with T 1.\nT 2 was the first really good implementation, with all of the tricks I've\ndescribed above. It ran on Vaxes & 68000's, which had also just come out.\nIt was solid enough to be a serious system that had real clients who depended\non it.\nAbout this time, roughly, Sussman's group was starting the development path\nthat eventually led to MIT Scheme, and the (intertwined) pedagogical path that\nled to Sussman & Abelson's book, *Structure and Interpretation of Computer\nPrograms*. The Lisp Machine effort had also spun out into Symbolics & LMI,\ncausing Maclisp to spawn Zetalisp & Flavors, which in turn had a lot of\ninfluence on Common Lisp and Common Lisp's object system, CLOS. But I'm\ndigressing. Back to Yale.\nT also used a pretty cool GC. Maclisp on the -10 had used a mark&sweep GC (one\nversion of which famously \"ran in the register set,\" though that is another\nstory), encoding type information using a \"BIBOP\" scheme — all objects\nwere boxed, and segregated by type into pages. Hence the high bits of the\nobject's address could be used to index into a page table to tell you what\nkind of thing lived in that page. This was well tuned for tight-memory systems\nlike the -10. With large address spaces, though, you wanted to use stop&copy,\nbecause with stop&copy you only pay to copy the live data; you don't pay a\ncost proportional to the amount of garbage. This is well suited to the big\nheaps you can allocate on a 32-bit machine. Stop&copy collectors almost\nuniversally implement the Cheney algorithm, which does a breadth-first\nsearch of the heap. But BFS is not so great for memory locality — it scatters\ntopologically close data structures all over the heap as it copies. 
Not good.\nT used a lesser-known (but quite simple — the research paper describing it is\nabout 2 pages long) algorithm due to Clark that implements *depth*-first\ntraversal. (Just as the Cheney algorithm cleverly uses the existing heap data\nstructures to provide the BFS search queue, Clark's uses the heap to provide\nthe search stack.) Depth-first search means that if you GC a linked list, the\nGC zips down the spine of the list before turning its attention to the\nelements of the list, so those \"spine\" cells wind up laid out sequentially in\nmemory. Your list turns into a vector! (sorta) This *rocks* for locality.\nHowever,\n- T dropped this algorithm in the late 80's for the classic BFS algorithm.\nDavid Kranz (who will appear in this tale shortly) told me at that point\nthat he'd made the switch because the copy phase of the BFS algorithm had\nslightly faster constant factors.\n- That standard religion I just gave you about \"stop&copy only pays\nfor the good stuff, but in mark&sweep you have to pay for the garbage,\nas well\"? It's not true. We all believed it for decades. But Norman\nRamsey at Harvard has cleverly shown that you can implement mark&sweep\nwith *exactly* the same asymptotic costs as stop&copy. This is good\nnews especially for tight-memory systems with homogeneous heap data.\nNorman's observation is really obvious and simple; hardly an impressive\nresult when you see it. Except, uh, that it eluded *everyone else* for\n*decades.* And not because people didn't care; GC has received a lot of\nattention from researchers. There's a lesson there.\nI've never seen a depth-first collector anywhere but T. By the way, the\nT garbage collector was written in T. This is also a slightly amazing feat.\nIt was achieved by virtue of the fact that T was native-code compiled, and\nthe garbage collector was written by the compiler authors. 
They knew *exactly*\nhow the compiler would handle their source, so they could carefully code the\ncollector so that it would not need to heap allocate while running.\n(That's not as simple as it sounds. It's not as easy as simply writing your\ncode and never calling malloc() or invoking a \"new\" method. It's tied up in\nthe treatment of lambda. Good Scheme compilers use a range of implementations\nfor the lambdas in the program, depending upon what they can determine about\nthe lambdas at compile time — how they're used, to where they are passed, the\nrelationship between the uses and the definition points, etc. Some lambdas\njust evaporate into nothing. Some lambdas turn into control-flow join points\nwith associated register/variable bindings. Some lambdas turn into stack\nframes. But some lambdas cause heap allocation to produce general closures. So\nyou have to understand how the compiler is going to handle every lambda you\nwrite. And the fundamental skeleton of a Scheme program is built on lambda.)\nAnother implementation feat of T's was that it allowed interrupts between\n*any* two instructions of user code. This placed a pretty intense burden on\nthe compiler, enough so that, of all the Scheme implementations of which I'm\naware, T is *unique* in this respect. To understand why this is hard in the\npresence of garbage collection, you can read a paper I wrote on the subject ten\nyears later, \"Atomic heap transactions and fine-grain interrupts,\" found at\nhttp://www.cc.gatech.edu/~shivers/citations.html#heap\n(You don't have to be a heavy-duty lambda-calculus wizard to read this\npaper; it's written to be comprehensible to general hackers.) T also allowed\nyou to write interrupt (Unix signal) handlers in T, which was pretty pleasant.\nThere was more to T than implementation technology; there was also a lot of\nbeautiful language design happening. 
Jonathan seized the opportunity to make a\ncomplete break with backwards compatibility in terms of the runtime library\nand even the names chosen. Somewhere in the T 2 effort, Kent Pitman, another\nLisp wizard, came down to Yale from MIT. He and Jonathan poured an immense\namount of design effort into the language, and it was just really, really\n*clean*. Small (but difficult) things: they chose a standard set of lexemes\nand a regular way of assembling them into the names of the standard\nprocedures, so that you could easily remember or reconstruct names when you\nwere coding. (I have followed this example in the development of the SRFIs\nI've done for the Scheme community. It is not an easy task.)\nLarger, deeper things: they designed a beautiful object system that was\nintegrated into the assignment machinery — just as Common Lisp's SETF lets\nyou assign using accessors, e.g., in Common Lisp\n(setf (car x) y)\nis equivalent to\n(rplaca x y)\nin T,\n(set! (car x) y)\nwas shorthand for\n((setter car) x y)\nAccessor functions like CAR handled \"generic functions\" or \"messages\"\nlike SETTER — CAR returned the SET-CAR! procedure when sent the SETTER\nmessage. The compiler was capable of optimising this into the single\nVax store instruction that implements the SET-CAR! operation, but the\nsemantic machinery was completely general — you could define your own\naccessor procedures, give them SETTER methods, and then use them in SET!\nforms.\n(This turned out to be very nice in the actual implementation of the compiler.\nThe AST was a tree of objects, connected together in both directions —\nparents knew their children; children also had links to their parents.\nIf the optimiser changed the else-part of an if-node N with something\nlike this\n(set! 
(if-node:else n) new-child)\nwhich was really\n((setter if-node:else) n new-child)\nthe if-node:else's SETTER method did a lot of work for you — it disconnected\nthe old child, installed NEW-CHILD as N's else field, and set NEW-CHILD's\nparent field to be N. So you could never forget to keep all the links\nconsistent; it was all handled for you just by the single SET! assignment.)\nAround the time that Kent went back to MIT, new grad student Norman Adams\nhooked up w/Jonathan. T 2 and its compiler, TC, were produced after about a year\nof really hard, focussed work on the part of Jonathan, Kent and Norman. I\ngraduated from Yale and went off to CMU to be a grad student in AI. Jonathan\nstarted to think about the next compiler.\nDuring my first year as a grad student, Jonathan met Forrest Baskett, who was\nthe director of one of the top industrial CS labs, DEC's Western Research\nLab, where a lot of the important RISC work was done (e.g., you could\nargue that David Wall's work on interprocedural register allocation there\nkilled the architectural feature of overlapping register-set stacks that came\nout of Berkeley and wound up in the SPARC). Forrest liked Jonathan, and\ninvited him to bring a team out to WRL for the summer to implement T for the\nmachine they were building (an amazing-for-the-time RISC called the Titan).\nJonathan's team was himself, Norman, Jim Philbin, David Kranz, Richard Kelsey,\nJohn Lamping and myself. Lamping was at Stanford, I was at CMU, the rest were\ngrad students at Yale (except Jonathan, who was an employee at Yale).\nThis brings us to the summer of 1984. The mission was to build the world's\nmost highly-optimising Scheme compiler. We wanted to compete with C and\nFortran. The new system was T3, and the compiler was to be called Orbit. We\nall arrived at WRL and split up responsibility for the compiler. Norman was\ngoing to do the assembler. Philbin was going to handle the runtime (as I\nrecall). 
Jonathan was project leader and (I think) wrote the linker. Kranz was\nto do the back end. Kelsey, the front end. I had passed the previous semester\nat CMU becoming an expert on data-flow analysis, a topic on which I completely\ngrooved. All hot compilers do DFA. It is necessary for all the really cool\noptimisations, like loop-invariant hoisting, global register allocation,\nglobal common subexpression elimination, copy propagation, induction-variable\nelimination. I knew that no Scheme or Lisp compiler had ever provided these\nhot optimisations. I was burning to make it happen. I had been writing 3D\ngraphics code in T, and really wanted my floating-point matrix multiplies to\nget the full suite of DFA optimisation. Build a DFA module for T, and we would\ncertainly distinguish ourselves from the pack. So when we divided up the\ncompiler, I told everyone else to back off and loudly claimed DFA for my own.\nFine, everyone said. You do the DFA module. Lamping signed up to do it with\nme.\nLamping and I spent the rest of the summer failing. Taking trips to the\nStanford library to look up papers. Hashing things out on white boards.\nStaring into space. Writing little bits of experimental code. Failing. Finding\nout *why* no one had ever provided DFA optimisation for Scheme. In short, the\nfundamental item the classical data-flow analysis algorithms need to operate\nis not available in a Scheme program. It was really depressing. I was making\nmore money than I'd ever made in my life ($600/week). I was working with\n*great* guys on a cool project. I had never been to California before, so I\nwas discovering San Francisco, my favorite city in the US and second-favorite\ncity in the world. Silicon Valley in 1984 was beautiful, not like the crowded\nstrip-mall/highway hell hole it is today. Every day was perfect and beautiful\nwhen I biked into work. I got involved with a gorgeous redhead. 
And every day,\nI went in to WRL, failed for 8 hours, then went home.\nIt was not a good summer.\nAt the end of the summer, I slunk back to CMU with my tail between my legs,\nhaving contributed not one line of code to Orbit.\nEveryone else, however, completed. The compiler wasn't finished by summer's\nend, but it was completed the following year at Yale. And it was the world's\nmost highly optimising Scheme compiler (even though it did not do data-flow\nanalysis), a record it held for a *long* time — perhaps ten years?\nIt was also a massive validation of a thesis Steele had argued for his\nMaster's, which was that CPS was a great intermediate representation for a\ncompiler. Orbit was totally hard-core about this — the first thing the\ncompiler did was translate the user program into CPS, and that was the\nstandard form on which the compiler operated for the rest of its execution.\nAnd it turned out this approach scaled up from Rabbit to a production,\nnative-code compiler very successfully.\nDavid Kranz took the work he'd done on the back end, which was a very complex\npiece of code that did a lot of sophisticated analysis on data\nrepresentations, register allocation, and, in particular, lambdas, and turned\nit into his PhD thesis. Orbit produced code that actually beat the Pascal\nimplementation used by Apollo (a Sun-class workstation company) to implement\nthe *operating system* on that workstation; that was a huge coup. David then\nwent to MIT, where he brought his compiler technology to Bert Halstead's\nparallel Lisp project, before hooking up with Steve Ward to do the research\nproject that turned into Curl. When Ward spun Curl out into a company,\nHalstead & Kranz became the senior technical guys there.\nLet's call Kranz's dissertation PhD #1. 
Its title was *An Optimising Compiler\nfor Scheme,* which I took to be an in-reference to William Wulf's seminal\nBliss compiler, described in a book (my copy is signed) titled simply *The\nDesign of an Optimising Compiler*. Wulf's Bliss compiler was a model up to\nwhich we all looked — it held the title \"world's most highly optimising\ncompiler\" for a while.\n(Remember Bliss? Just to add more cross-links, Wulf had left CMU about then\nand spun out a company, Tartan Labs, to commercialise this compiler technology\nfor C. He took Guy Steele with him, who had just finished wrapping up leading\nthe Common Lisp definition while on the faculty at CMU. Tartan tanked, Wulf\nmoved on to a senior position at UVa & is now a big wheel at the national\nscience-policy level, e.g. leading National Academy inquiries into\ncounter-terrorism technology. Steele went to Thinking Machines, and then\nthrew his language-development skills behind the Java effort at Sun.)\nKranz' diss is a Yale Computer Science Dept. tech report. I would say it\nis required reading for anyone interested in serious compiler technology for\nfunctional programming languages. You could probably order or download one\nfrom a web page at a url that I'd bet begins with http://www.cs.yale.edu/.\nRichard Kelsey took his front end, which was a very aggressive CPS-based\noptimiser, and extended it all the way down to the ground to produce a\ncomplete, second compiler, which he called \"TC\" for the \"Transformational\nCompiler.\" His approach was simply to keep transforming the program from one\nsimple, CPS, lambda language to an even simpler one, until the language was so\nsimple it only had 16 variables... r1 through r15, at which time you could\njust kill the lambdas and call it assembler. It is a beautiful piece of work,\nand, like Kranz's dissertation, required reading for anyone who wants to do\ncompilers for functional programming languages. 
It had a big influence on\nAndrew Appel, at Princeton, who subsequently adopted a lot of the ideas in it\nwhen he and Dave MacQueen's group at Bell Labs built the SML/NJ compiler for\nSML; Andrew described this in the book he subsequently wrote on that compiler,\n*Compiling With Continuations.* However, unlike the SML/NJ compiler, Kelsey's\nCPS-based compiler compiled code that used a run-time stack for procedure\ncalls. He actually describes front ends in his diss for standard procedural\n\"non-lambda\" languages such as Basic.\nSo the lineage of the CPS-as-compiler-IR thesis goes from Steele's Rabbit\ncompiler through T's Orbit to SML/NJ. At which point Sabry & Felleisen at Rice\npublished a series of very heavy-duty papers dumping on CPS as a\nrepresentation and proposing an alternate called A-Normal Form. ANF has been\nthe fashionable representation for about ten years now; CPS is out of favor.\nThis thread then sort of jumps tracks over to the CMU ML community, where it\npicks up the important typed-intermediate-language track and heads to Cornell,\nand Yale, but I'm not going to follow that now. However, just to tell you\nwhere I am on this issue, I think the whole movement from CPS to ANF is a bad\nidea (though Sabry & Felleisen's technical observations and math are as\nrock solid as one would expect from people of their caliber).\nLet's call Kelsey's dissertation PhD #2.\nKelsey subsequently spent time as a prof at Northeastern, then left for NEC's\nprestige lab in Princeton, where he worked on the Kali distributed system. He\nalso got set down precisely on paper something all the CPS people knew at some\nintuitive level: that the whole \"SSA\" revolution in the classical compiler\ncommunity was essentially a rediscovery of CPS. (Harrumph.) 
NEC Princeton went\non to accumulate a very impressive collection of Scheme/ML hackers: Stephen\nWeeks & Andrew Wright from Rice, Kevin Lang (who built a little known but\nquite beautiful, elegant, free and portable object-oriented Scheme called\nOaklisp), Kelsey, Jim Philbin, Henry Cetjin, and Jeff Siskind. When NEC\nPrinceton became an insane toxic place, Kelsey, like almost everyone else in\nthat previous list, jumped out into startup land, where he did a startup with\nRees & an MIT alum, Patrick Sobalvarro, who achieved some early fame for work\non GC. That startup tanked in the dotcom meltdown last year, and Kelsey's\nnow on his second startup.\nNorman Adams turned his assembler into a master's degree. It also was a cool\npiece of software. His assembler didn't take a linear text stream; the\ncompiler handed it a *graph structure*. It serialised the graph on its own to\nminimise the spans of the jump instructions, and had other neat features\n(e.g., it was actually a portable framework for building assemblers). Then he\ntook his Masters and bailed out to Tektronix, where he developed a very\nhigh-performance Scheme implementation for the Motorola 88000 called \"screme,\"\nand then went to Xerox PARC, where he worked on ubiquitous computing and a\nScheme implementation called SchemeXeroX (a joke on \"Team Xerox\") with Pavel\nCurtis. He left Xerox at the beginning of the dotcom boom and was early in at\nthe startup company Ariba, which is why (1) Ariba's big product had a\nconfiguration system that is a Scheme built in Java and (2) he's a rich\ndude.\nLamping's story is perhaps the strangest. He went back to Stanford, and got\ninvolved in a very arcane, theoretical problem called optimal lambda\nreduction, which he completely solved for his PhD. This is an achievement of\nconsiderable note because pointy-headed theoretical semanticists had been\nstruggling to crack this problem for a long time in Europe. 
They'd been\nstruggling so hard, in fact, that they really seemed... annoyed when this\nhacker from Stanford just sat down and solved the problem. John seemed to be\ncompletely unqualified to solve the problem, bringing nothing to it but, uh,\nbrains. There was, for example, a snooty French paper that sort of dismissed\nLamping as an \"autodidact,\" before proceeding to build (with, let me be\ncareful to note, proper credit given to John) on his work. So Lamping has thus\nbeen permanently saddled with this hilarious title/term, by those who know &\nlike him. He'll never live it down. John Lamping, autodidact.\nJohn subsequently went to Xerox PARC, where he and Gregor Kiczales made a team\nworking on a wide array of interesting programming-language problems, of which\n\"Aspect-Oriented Programming\" is the most well known. Again, the story here\ndeparts from T, so I won't pursue it. For the same reason, we will not call\nJohn's dissertation \"PhD #3\" — it wasn't really connected to his work on the\nT project.\nAbout three years after the summer at WRL, I *finally* figured out how to do\ndata-flow analysis for Scheme, which ended a long, pretty unhappy period in my\nlife. I officially switched from being an AI student to being a PL student,\npicked up Peter Lee as a co-advisor (since my original advisor, Allen Newell,\nwhile certainly the greatest scholar I've ever personally known, was not a PL\nguy), and wrote it all up for *my* dissertation. This we can call PhD #3.\nBy the way, I'll add that the deepest and most powerful part of my diss, in my\nopinion, is the part (a) about which no one seems to know and (b) which is on\nthe shakiest theoretical ground: environment reflow analysis. 
I would surely\nlove it if some interested character one day takes that piece of my diss and\nreally takes it someplace.\nJim Philbin, like Kelsey, also went to NEC, where he built an operating system\ntuned for functional programming languages, STING (or perhaps it was spelled\n\"STNG\" — in any event it was pronounced \"sting\"). He built it in T, of\ncourse, and one could see that it had its roots in his work on T3.\n(Implementing the runtime for a functional language, in some sense, requires\nyou to implement a little virtual OS on top of the real, underlying OS.) Call\nthat PhD #4. Jim subsequently left Scheme, to do parallel processor & systems\nwork with Kai Li at Princeton.\nJonathan went to MIT as a graduate student, where he worked with Gerry Sussman\nand David Gifford. After working on a series of interesting problems, Jonathan\nalso wrote his dissertation on an operating system for functional languages,\nwhere you could use language safety as the fundamental protection mechanism.\nCall that PhD #5. Then he got interested in entomology (bugs, I mean — real\nbugs, not computer bugs), did a post-doc in Europe, then came back to the US\nand has sort of bounced between pursuing research topics that are as radical\nand unusual as T was in 1982 & startup companies.\n(Jonathan also wrote his dissertation *in Scheme* as well as *about* Scheme.\nHe built a little word processor for his diss in Scheme called \"markup\" that\nallowed you to write standard text, interspersed with commands that were\ndelimited with curly braces. (Hmm. Text and commands in curly braces. Sound\nfamiliar?) Commands were defined in Scheme; the markup processor had multiple\nback-ends, such as HTML & PostScript. Scott Draves later extended markup for\n*his* dissertation on partial-evaluation and high-performance graphics\nrendering at CMU.)\nI think that covers the entire T team. 
It is interesting to note that *five*\ndissertation-level chunks of work (and one Master's-level chunk) came out of a\nsingle summer project.\nI've spent a fair amount of time discussing T's implementation technology.\nHowever, it is also worth study as a language *design*, and here, Jonathan\nis the single greatest influence. T was, principally, his baby. It was\nquite a beautiful design.\nWhen the RISC revolution happened, Orbit was ported to the late-80's RISC\nprocessors: MIPS & SPARC. This is when the Clark GC was ripped out and\nreplaced with the Cheney collector. At CMU, I ported Orbit to an IBM precursor\nof their POWER architecture, called the ROMP or the RT/PC.\nOne of the limiting factors of Orbit was the complexity of the back end. It\nwas documented very well by Kranz' diss, and it was very sophisticated, but it\nwas also a big mess of code. Out of a reaction to this complexity was born\nScheme 48 — when Kelsey came to Northeastern as a prof, Jonathan was still at\nMIT; he and Jonathan built Scheme 48 together. Its first use was on an\nautonomous robot system that Jonathan had gotten involved with at Cornell. The\nname was intended to reflect the idea that the implementation should be so\nclear and simple that you could believe the system could have been written in\n48 hours (or perhaps, when S48 got complex enough, this was altered to \"could\nhave been *read* in 48 hours\"). Scheme 48 had very little technical overlap\nw/T3 and Orbit — no native code compiler, no object system, no CPS IR. Its\ninnovations were its module system, the language in which its VM was defined\n(\"pre-scheme\"), and its stack-management technology. These were all\ninteresting technical bits. The stack was managed not by push & pop, but by\npush & a generational gc. I believe Kelsey wrote a paper on this and its\nadvantages. 
The module system was somewhat like SML's, but allowed modular\nmacros and had another fairly cool feature: when you defined a module, clauses\nlet you specify which files held the module's source. But *other* clauses let\nyou specify which \"reader\" procedure to use to translate the character stream\nin the files to the s-expression tree handed to the compiler. So you could\nhandle files with different concrete syntax — R5RS syntax, scsh syntax, S48\nsyntax, PLT Scheme syntax, guile syntax, perhaps an infix syntax (as is so\noften discussed). That eliminated an annoying, low-level but persistent\nbarrier to sharing code across different implementations of Scheme.\nPre-scheme was quite interesting. Kelsey published a paper on it, as well, I\nbelieve. It was Scheme in the sense that you could load it into a Scheme\nsystem and run the code. But it was restrictive — it required you to write in\na fashion that allowed complete Hindley-Milner static type inference, and all\nhigher-order procedures were beta-substituted away at compile time, meaning\nyou could *straightforwardly* translate a prescheme program into \"natural\" C\ncode with C-level efficiency. That is, you could view prescheme as a really\npleasant alternative to C for low-level code. And you could debug your\nprescheme programs in the interactive Scheme development environment of your\nchoice, before flipping a switch and translating to C code, because prescheme\nwas just a restricted Scheme. The Scheme 48 byte-code interpreter was written\nin prescheme. 
Prescheme sort of died: beyond the academic paper he wrote,\nKelsey never quite had the time to document it and turn it into a standalone\ntool that other people could use (Ian Horswill's group at Northwestern is an\nexception to that claim; they have used prescheme for interesting work).\nThere are ideas there to be had, however.\nI subsequently picked up Scheme 48 around 1992 to build scsh, but we're\nbeginning to wander from T, so I'll leave that thread.\nThe tapestry of advanced language implementation work is a very rich and\ninterconnected one, the weaving of which is an incredibly interesting task\nthat can keep you happily occupied for a lifetime. I've only traced out one\nselected thread in that tapestry with this rambling post; there are many other\nimportant ones. But that, to the best of my knowledge, is the story of T.\nA cautionary note: the danger of writing history when all of the principals\nare still alive is that there are people around to catch you out in your\nerrors. I'm sure there *are* errors in my recollection, but I'm also\nreasonably sure I've got the broad strokes roughly correct. Someone like\nJonathan could certainly give a much more\n[authoritative](http://mumble.net/~jar/tproject/) account.\nHere are some references to papers I've mentioned.\nThis is the first paper published on T, based on T2:\nJonathan A. Rees and Norman I. Adams IV.\nT: A dialect of Lisp or, Lambda: The ultimate software tool.\nIn *Conference Record of the 1982 ACM Symposium on LISP and Functional\nProgramming,* pages 114-122, August 1982.\nThis is a general overview of T3's Orbit:\nORBIT: An optimizing compiler for Scheme.\nIn *Proceedings of the SIGPLAN '86 Symposium on Compiler Construction,*\npublished as *SIGPLAN Notices* 21(7), pages 219-233.\nAssociation for Computing Machinery, July 1986.\nThere is a later, third paper, written by Jonathan & Norman, on object systems\nin general and T's in particular. 
I do not have a reference.\nThe reference manual for T is also interesting reading, for information\non the features of the language:\nJonathan A. Rees, Norman I. Adams IV\nand James R. Meehan.\n*The T Manual.*\n4th edition, Yale University, Department of Computer Science, January 1984.\nI also could probably post PostScript source for it, if people care.\nKelsey's diss:\n*Compilation by Program Transformation.*\nPh.D. dissertation, Yale University, May 1989.\nResearch Report 702, Department of Computer Science.\nA conference-length version of this dissertation appears in *POPL 89*.\nKranz's diss:\nDavid Kranz.\n*ORBIT: An Optimizing Compiler for Scheme.*\nPh.D. dissertation, Yale University, February 1988.\nResearch Report 632, Department of Computer Science.", "url": "https://wpnews.pro/news/history-of-t", "canonical_source": "https://paulgraham.com/thist.html", "published_at": "2026-06-30 08:50:50+00:00", "updated_at": "2026-06-30 09:20:23.835060+00:00", "lang": "en", "topics": ["artificial-intelligence"], "entities": ["Jonathan Rees", "Dan Weld", "Olin Shivers", "Yale University", "Roger Schank", "Alan Perlis", "Richard Stallman", "Richard Fateman"], "alternates": {"html": "https://wpnews.pro/news/history-of-t", "markdown": "https://wpnews.pro/news/history-of-t.md", "text": "https://wpnews.pro/news/history-of-t.txt", "jsonld": "https://wpnews.pro/news/history-of-t.jsonld"}}