{"slug": "lobsters-interview-with-claudius", "title": "Lobsters Interview with Claudius", "summary": "Naver researcher Claudius, maintainer of LispE and TAMGU, discusses his career in symbolic AI and computational linguistics, including his work on the Xerox Incremental Parser and the limitations of rule-based language systems. He reflects on the futility of compressing language into rules despite achieving high-speed parsing and competition wins, leading to a shift toward neuro-symbolic approaches.", "body_md": "[@Claudius](https://lobste.rs/~Claudius) maintains [LispE](https://github.com/naver/lispe) and previously [TAMGU](https://github.com/naver/tamgu) at Naver, combining [array](https://github.com/naver/lispe/wiki/5.3-A-la-APL) and logic programming with [Haskell features](https://github.com/naver/lispe/wiki/5.4-A-la-Haskell). (N.b. the [wiki](https://github.com/naver/lispe/wiki) holds the documentation and articles.)\n\nIn this interview, we discuss Lisp and Prolog implementations, array languages, symbolic ([GOFAI](https://en.wikipedia.org/wiki/GOFAI)) and neuro-symbolic AI.\n\n**How did you discover programming, come to pursue a PhD etc.?**\n\nIt's not exactly a recent adventure; I started in 1980 when my father bought a computer for Christmas. Learning Basic, I faced a lot of problems because I didn't really speak English, which most of the documentation was written in. I spent a lot of time trying to understand what the `set` command did (put a cell on the screen, on this machine.) Then I learned to program the [Z80](https://en.wikipedia.org/wiki/Zilog_Z80) processor in machine language and decided to pursue computer science. I got a master's degree from [Paris VI](https://en.wikipedia.org/wiki/Pierre_and_Marie_Curie_University). In 1989, I moved to Montreal and started a PhD in computational linguistics. 
New symbolic ways of implementing grammars were really hot.\n\nI implemented a parser for my PhD thesis which was weird, because the rule system was not based on pure context-free grammars, but a set of categories which could appear on the right-hand side of your rule. People had been hunting for ways to speed this up and the solution was quite silly: Consider each category as a separate 64-bit vector, a long integer, and for each rule tag categories on the right, replacing the categories by their position on the bit vector where the value would become the index of the rule, so you could find the rule to apply from the index.\n\nFrom there, I was recruited by PARC's sister, the Xerox Research Centre Europe (XRCE) in Grenoble, where I still live. I spent 20 years with Xerox and another 10 with [Naver who bought the lab](https://www.news.xerox.com/news/NAVER-to-acquire-Xerox-Research-Centre-Europe).\n\n**What's it like working as a researcher? You've published many papers, hold patents etc. which isn't a common route in software.**\n\nCompanies only evaluate researchers in 3 ways:\n\nNow, patents or intellectual property aren't what most people think of. In industry, they function as tokens traded between companies for access to other technologies. But their importance is decreasing.\n\nI've implemented a lot of software across these years. With my PhD in linguistics, I worked with linguists to speed tools up. As an example, on top of my PhD I wrote the [Xerox Incremental Parser (XIP)](https://www.slideserve.com/marla/xerox-incremental-parsing) (summarized [here](https://string.hlt.inesc-id.pt/wiki/XIP) and implemented [here](https://github.com/clauderouxster/XIP)) which could parse 3,000 words per second.\n\n**I've been dwelling on a comment of yours:**\n\nI worked for more than 30 years on these systems and they could never work. I implemented a very fast NLP symbolic parser, for which the team I worked with created grammars for 8 languages, including Japanese. 
In 2007, with a grammar of 60,000 rules, we could parse at a speed of 3000 words/s (see https://github.com/clauderouxster/XIP for the Open Source version). The parser could extract syntactic dependencies, and could use ontologies. But language is like sand, the more you try to grab, the more you leak. There was a kind of futility in trying to compress languages into rules, nothing actually scaled up. Still, we managed to win competitions as late as 2016 with SemEval sentiment analysis, and in 2017, we also ranked first in a legal document extraction campaign organized by IBM, but to no avail. It was a lot of work, and the conclusion was very simple. We had to push our grammars as far as possible into lexical grammars, which eventually LLMs managed to really implement. We discovered very early that context was all that mattered. We tried to create grammars that would apply to a full paragraph instead of sentences, but then the performance would plummet. The reason why LLMs work is that at each step they compress the whole context into a meaningful vector, which they then use to guide the rest of the generation process. I spent my whole life in the pursuit of a perfect parser with very brilliant people, and I really find it hard to say that not only did we fail, but that LLM is the response we were looking for. -[Claude Roux]\n\nIt was a very nice journey, but sometimes you have to accept the reality of the world: Symbolic methods have been replaced. Even 10 years ago, I could only dream of the AI technologies we have today. I wanted to be able to talk to a computer, to create things just by speaking. Now, I'm trying to apply everything I learned to new methods.\n\nTalking about agents, I'm now implementing [PREDIBAG](https://github.com/naver/lispe/wiki/6.21-PREDIBAG), a [retrieval-augmented generation](https://en.wikipedia.org/wiki/Retrieval-augmented_generation) system (RAG), in LispE to help harness/constrain LLMs. 
It's deterministic with restricted unification.\n\nSo, XIP had dependency rules, connecting nodes from a tree into a dependency like a subject or direct-object dependency in the sentence tree. These were implemented on a first order logic engine, which inspired me to add Prolog to [TAMGU](https://github.com/naver/tamgu). But this was problematic and complicated, with Prolog's true unification process. Prolog has a problem: The closed-world problem. You can only deal with information in the environment, in the knowledge base. So I wanted something simpler with unification, but also backtracking.\n\nNow, compiling any language, you start with an AST and Lisp is already an AST... TAMGU's grammar required 400 rules (for `for`, for `while`, to instantiate a variable...) The more you want to add and experiment, the more the grammar becomes impossible to manage. (This is Python's problem. They have to concoct ever weirder notations to fit more features in.) But you can do anything in Lisp without modifying the parser at all. You just open a parenthesis, use a function and that's it! Some complain about parentheses, but most languages are just sugar coating on top, some complicated program translating another syntax into the AST. So, I wanted to show someone how TAMGU worked and thought Lisp would be a clearer way... And I discovered the joy of Lisp again! I can experiment with [APL](https://github.com/naver/lispe/wiki/5.3-A-la-APL) or [Haskell](https://github.com/naver/lispe/wiki/5.4-A-la-Haskell) in LispE, whatever I want, because I don't have to deal with extra formalisms!\n\n**What sort of projects did you work on before TAMGU?**\n\nWell, it emerged directly from XIP. We had a 20-linguist team building grammars for English (with 60,000 rules), French, Spanish, Japanese... But corpora were terrible back then, with distinct encodings etc. At first, I integrated the Python interpreter into XIP, with rules calling Python functions. 
Unfortunately, Python's very strict about mixing encodings with a string and would fail all the time, causing me to develop a language just to solve this... And scope crept until you could build rules on the fly or execute them on top of a grammar. I extracted this language from XIP, rewrote the interpreter a few times, renamed it a few times... In 2020, I wanted to incorporate PyTorch and got a trainee, who needed to know how TAMGU worked, leading me to LispE... In TAMGU, every instruction and data structure is an object (an instance of a C++ object), derived from the same class to live in the same vectors. Every function and every data structure has its own `eval`.\n\nGoing further back, when I started my PhD, Prolog was the way to work with grammars. But its inventor Alain Colmerauer was a friend of my PhD supervisor, so I discovered it from the implementation side first. I learned many tricks from them like indexing rules on the first argument. When you describe your rules, the first argument becomes an index. When you try to execute a new predicate, you try to see if you can use the predicate's first argument to find out which predicate to test (based on the index). When dealing with language, the first argument would often be a word or category (an atom or string); indexing on them speeds selection up a lot. The knowledge base was also implemented with indexes in the background, so instead of trying every element in your knowledge base, you'd only try the ones indexed on the word you were looking at. [WordNet](https://en.wikipedia.org/wiki/WordNet) is an interesting corpus with its own (inefficient) Prolog implementation; someone said it took about 2 minutes to load it into SWI-Prolog but using this technique I could load it into TAMGU in a few seconds! 
I don't use RDF and public knowledge bases much anymore though.\n\nI really loved implementing stuff in Prolog but unfortunately, Prolog couldn't efficiently handle my idea of associating every category with a position in a bit vector. I already worked on PREDIBAG with TAMGU (cf. [the wiki](https://github.com/naver/tamgu/wiki/3.4-PREDIBAG:--Building-Modern-AI-Agents-in-Tamgu's-Prolog)) reaching 98% accuracy for the [GSM8K](https://huggingface.co/datasets/openai/gsm8k) math dataset with Prolog and a model only able to reach 60%. The Prolog program would ask an LLM to create then answer a new question (creating knowledge before using capabilities), then output a Python program and test whether it outputs the dataset's expected values. PREDIBAG was all about using predicates to explore the implicit graph computed by the rules themselves. I'm now trying to bring this to the browser via LispE with its lighter, simpler rules. It's such a nice way of working with rules; backtracking is very powerful. It means that you have a single entry point (the name of your predicate) with different functions sharing that same name, which the system would sequentially try. To enrich them, you just add a new rule/implementation (instead of `if` `else` hell.)\n\n**How does LispE fit into the lab and your efforts?**\n\nI give regular presentations to the other researchers, but I have other tasks I get evaluated for. You have to remember you're getting paid by the company, so the trick's matching the company's goals with my own experiments. I made a proposal and management let me work on PREDIBAG. In the past, the goal was mostly machine translation. After the fall of the Berlin wall, machine translation seemed like the solution to welcoming new countries. I must say I hadn't come up with the best solution, but today machine translation's almost solved.\n\n**In the past I played with GOFAI grammars between Esperanto and Interlingua, two conlangs with regularized grammar. 
You made a conlang Lingvata with case endings etc. as a translation target/assist for machine translation. How did that go?**\n\nAt XRCE, we made [finite-state lexical transducers](https://www.redalyc.org/pdf/5157/515751735044.pdf) from dictionaries (e.g. from English to French), which were quite compact. (LispE has a [transducer library](https://github.com/naver/lispe/wiki/5.15-Transducer).) I studied Latin at school and thought declension (marking subject, object etc. at the end of the word) would help here. (In Spanish, for example, you don't have to use pronouns because the ends of the verb carry the meaning.) If the transducer could systematically identify the attributes of a word based on its ending... So I made a system with XIP to generate a Lingvata sentence!\n\nThese transducers are automatons, graphs. A lot of stuff is common to many branches of the graph, which you want to merge, compressing the lexicon into less than a MB. In the LispE transducer directory, you can create a document with surface and lemma forms. Giving that to the transducer compiler, you get a compressed system you can use throughout the library! I have transducers for many natural languages and can parse a sentence, returning e.g. (man plural noun) for \"men\".\n\n**How did you implement LispE's APL/array features? I was really happy to see function conforming (where you can e.g. (+ '(1 2 3) '(1 2 3) 3)).**\n\nIn 1984, I was studying computer science and had [Yves Escoufier](https://imag.umontpellier.fr/YvesEscoufier/) as my statistics professor. The largest factory in Montpellier was an IBM factory which collaborated with Escoufier to implement his statistical methods in APL. I joined that team! Now, we were implementing this on a *new* computer with a *floating-point chip!* So I implemented matrix multiplication in assembly for its APL version.\n\nIt was possible to make some very interesting programs on top of it and I kept it in the back of my mind. 
Implementing LispE, I replaced linked lists with vectors via indirection: A list is a pointer to a buffer, shareable by different lists. When sharing a buffer, you can have your own offset (start) into the buffer. Since I had those arrays, I investigated APL operators like rho, reduce etc. which were more complicated to implement than I expected.\n\n*Eh... Cabuchon! It's my cat. He's playing with... Oh no...*\n\nEvery year, I try to do some Advent of Code and with the APL operators, many problems become trivial. [Rho, rank, iota](https://github.com/naver/lispe/blob/457a5938807ae1872d098d5f609672bf2d8c5d80/examples/AdventOfCode2021/day13.lisp#L16) are so useful! Because it's Lisp and you don't have to deal with specific formalisms, this is all relatively easy. I did the [Game of Life](https://github.com/naver/lispe/wiki/6.20-Conway-Game-of-Life-in-LispE) in this APL-Lisp too.\n\n**You implemented LispE in C++ with classes, what inspired this approach?**\n\nBecause C++ provides you with vtables, which let you make an `eval` function for every different class. There's an isomorphism between Lisp and the subset of C++ I use.\n\nInstead of trying to implement something complicated, I tried to leverage vtables as much as possible.\n\nI implemented a [subset of Python](https://github.com/naver/lispe/wiki/6.22-Transpiling-Python-into-LispE) in LispE to execute Python within LispE, so I understand Python well. When Python was first implemented around 1990, C++ wasn't exactly a thing. So Guido reinvented the vtable; Python uses its own (inefficient) vtable-thing. But modern C++ has so many useful features for handling strings, vectors etc. so you don't have to reroll them yourself.\n\nI do try to steal features from elsewhere too. I find Rust's notion of borrowing very interesting and inspired by this, in the [LispE Torch](https://github.com/naver/lispe/tree/master/lispetorch) implementation, you can check a structure e.g. 
a list of integers and transform it into a tensor automatically with 0 copy.\n\n**What other sources of inspiration have you found? You have a la APL, Haskell...**\n\nProlog. Lisp, of course. (The naming conventions like `setq` come from [The Roots of Lisp](https://www.paulgraham.com/rootsoflisp.html).) [Ontology](https://github.com/naver/lispe/wiki/5.11-Ontologies) stuff based on bit-vectors. Many people work on cool things and I understand things better when I implement them myself.\n\nI also learned from others' mistakes. The Python API is extremely complex (requiring heroic effort to tame), so LispE's core API is very simple: Just isolate the description of your function (a string) with a pointer to an object and it just works.\n\nLibTorch is a gorgeous API, a real work of art. When I started wrapping it in LispE I was amazed by its quality.\n\n**I discovered LispE from an absolutely amazing video showing off its awesome shell capabilities. I presume this is your day to day shell. How do you develop APIs and chisel them to perfection?**\n\nThrough a lot of pain.\n\nI've used many APIs in my life and my personal adage is \"never trap yourself\". Implementing complicated things, we often hit a point where we suddenly don't understand what we're doing anymore, stuck in a complicated, intertwined blob of code... I've gone through that a lot. I often have the feeling that someone else wrote those messes, but in fact it was just another me with less experience... So I try to keep things simple. \"Will I be able to read it tomorrow?\"\n\nNow, the shell interface was very complicated. I didn't want to use `curses` because it doesn't work on all platforms.\n\nI don't mean to disparage Python at all, but I was very dissatisfied with the Python REPL; after entering some functions and reaching a happy result, I found myself wanting to create a file out of it. 
So you can do `lispe -e <file name>` to start an editor within the `lispe` interface, place breakpoints etc.\n\nIt's really simple, but I love being able to just type `!ls` to execute UNIX commands or `!v=ls -l` to bind `ls -l`'s output to `v`. I just wanted to do something for myself, to create something I want to use every day.\n\nAn example of how I try not to trap myself is with naming. There's a function `link` which lets you rename almost anything (e.g. `(link 'plus +)`.) (`let` also works.) But in [lispe.cxx](https://github.com/naver/lispe/blob/85be18784f0347be47256d8f2c8443a51b72e8c8/src/lispe.cxx#L378) (the entrance file to the whole language) `set_instruction` links surface bindings with numerical IDs (which the compiler actually uses). So you can simply change the names there and recompile LispE with new internal and external names.\n\n**What do you think about neuro-symbolic AI?**\n\nVerifiable rewards have become quite popular, where the system generates a solution verified by an external program, not just looking up the value in a table of outputs but actually executing code.\n\nIt's trendy to throw MCP everywhere (exchanging data with an external server.) In the case of [PREDIBAG](https://github.com/naver/lispe/wiki/6.21-PREDIBAG), instead of the LLM making all decisions, you can use an intermediary layer (implemented with `defpred` or LispE's pattern matching rules) tagging the decision to send something to an LLM *with a callback function* to check whether the output is valid, delegate execution to other components etc. If the system's trying to e.g. import curl, you can already intercept it there. And that's something you can only do with rules. Many people are using other LLMs to \"verify\" others, which carries no guarantee. In industry, you need guarantees that something won't crash your whole system and a traceable system.\n\nToday, the browser offers maximum security (handling auth, token IDs etc.) 
and you can use [LispE](https://github.com/naver/lispe/wiki/6.17-A-WebAssembly-version-of-LispE) as a WASM library too. The idea's that instead of running Python code in a Docker container etc. you can just execute code in the browser sandbox directly.\n\n**Where do you think classic symbolic logic's still a good fit?**\n\nFor low-memory text generation, I like generating via [grammars](https://github.com/naver/lispe/blob/master/examples/patterns/dcgfr.lisp). Relatedly, LispE supports Unicode (and can e.g. rename everything in [Greek](https://github.com/naver/lispe/wiki/6.15-Programming-is-Greek-to-me...-Literally).) Nowadays, LLMs can generate such a grammar for you but in the past it took a lot of time and linguists. When working on the Japanese grammar, they were very surprised to be able to use Japanese and quickly began to employ Japanese for dependency names, functions, categories etc. and I couldn't read a single thing!\n\n**You started with Basic, how did you learn C(++)?**\n\nAt the time of my PhD, C++ was also the only way to access Mac's graphical environment, so I started to learn it then decided to implement my parser in it too. Of course, I avoid nightmares like multiple inheritance in C++, but it's improved a lot over the years.\n\nWhen implementing TAMGU, I was able to reduce the interpreter's complexity step by step, by removing C++ features until it became a Lisp. It's always the same story: Lisp is just an AST. If you keep the AST live in your C++ code, transform it into an evaluation tree, don't try to implement it as byte code (trendy in the 90s) but keep a simple tree where each element will evaluate itself, you end up with something so simple and efficient! In fact, you're just compiling stuff into C++ instances which execute as (fast as) C++. There's a real elegance in Lisp here.\n\nTo illustrate this another way, the output of a (LLM) transformer is a list of probabilities (probable tokens), one of which you select. 
Complicated languages like C++ or Rust have more complicated probability distributions than Lisp, where the fewer tokens that can follow an open parenthesis mean lower entropy. You can have an opening parenthesis, token or closing parenthesis, that's it.\n\n**How do you leverage LLMs these days?**\n\nLLMs have a lot of knowledge but few competencies. If you constrain them to output knowledge and use that to further constrain results, you'll go far. For context management, I have the system generate `log.jsonl` and `log.py` (which queries the other document). Whenever an action is processed (an error's corrected etc.) the system adds something to `log.jsonl`. If it needs to know what happened, it uses `log.py` to query and display only the relevant/required information (like a date, errors or attempted fixes), reducing tokens.\n\nI use LLMs to generate JS UIs, but core LispE doesn't have any AI-generated code because I know the code by heart and LLMs would butcher it and work slower than me.\n\n**With infinite time, what would you like to add to LispE?**\n\nI don't see any ways of improving LispE performance right now; I need external eyes to help. After dealing with the same code over and over, it becomes difficult to push through the fog and find new ideas. The reason LispE's faster than Python is very simple: I provide a list of values. In Python, you only have a list of pointers or a dictionary of pointers; you have to use NumPy if you want vector values, but it's not user friendly.\n\nWhen using Python, you're working in 2 worlds at the same time. On the one hand, you have the virtual machine with tokens and bytecode. On the other hand, some C++ library will have no relation to that, making it really hard to understand what's going on, because in fact a large part of it is not actually accessible. In LispE, it will derive everything like the language itself. 
LispE's lists of values are a good example: to release a list of pointers in Python, you must traverse the list and release each element individually, while in LispE you can just delete the list and that's it. If something in Python was created by an external library, you have to delete the PyObject and the object within it. When creating a Python library or wrapper, for each object you create, you must make a table implementing the pointers to the functions to delete or create elements. I think this is a good way to validate LLM \"thinking\" before exploring a path further.\n\nPREDIBAG is my main focus at the moment. You have to forgive an old man, but **I love the poetic justice of bringing Prolog back to AI.**", "url": "https://wpnews.pro/news/lobsters-interview-with-claudius", "canonical_source": "https://alexalejandre.com/programming/interview-with-claude-roux/", "published_at": "2026-06-16 08:02:48+00:00", "updated_at": "2026-06-16 08:24:09.244967+00:00", "lang": "en", "topics": ["artificial-intelligence", "natural-language-processing", "ai-research", "ai-ethics", "generative-ai"], "entities": ["Naver", "Xerox Research Centre Europe", "Claudius", "LispE", "TAMGU", "Xerox Incremental Parser", "PARC", "IBM"], "alternates": {"html": "https://wpnews.pro/news/lobsters-interview-with-claudius", "markdown": "https://wpnews.pro/news/lobsters-interview-with-claudius.md", "text": "https://wpnews.pro/news/lobsters-interview-with-claudius.txt", "jsonld": "https://wpnews.pro/news/lobsters-interview-with-claudius.jsonld"}}