WarpMonkey

Mozilla's SpiderMonkey is the JavaScript and WebAssembly engine for Firefox, implementing ECMAScript and WebAssembly specs. It features a garbage collector, JS::Value and JSObject types, and a parser that generates bytecode via Stencil, with lazy parsing to optimize performance.

SpiderMonkey

SpiderMonkey is the JavaScript and WebAssembly implementation library of the Mozilla Firefox web browser. The implementation behaviour is defined by the ECMAScript (https://tc39.es/ecma262/) and WebAssembly (https://webassembly.org/) specifications.

Much of the internal technical documentation of the engine can be found throughout the source files themselves by looking for comments labelled with SMDOC (https://searchfox.org/mozilla-central/search?q=SMDOC&path=js%2F). Information about the team, our processes, and about embedding SpiderMonkey in your own projects can be found at https://spidermonkey.dev.

Specific documentation on a few topics is available at:

Components of SpiderMonkey

🧹 Garbage Collector

JavaScript is a garbage collected language and at the core of SpiderMonkey we manage a garbage-collected memory heap. Elements of this heap have a base C++ type of gc::Cell (https://searchfox.org/mozilla-central/search?q=SMDOC+GC+Cell). Each round of garbage collection will free up any Cell that is not referenced by a root or another live Cell in turn. See the GC overview (gc.html) for more details.

📦 JS::Value and JSObject

JavaScript values are divided into either objects or primitives (Undefined, Null, Boolean, Number, BigInt, String, or Symbol). Values are represented with the JS::Value type (https://searchfox.org/mozilla-central/search?q=SMDOC+JS%3A%3AValue+type&path=js%2F), which may in turn point to an object that extends from the JSObject type (https://searchfox.org/mozilla-central/search?q=SMDOC+JSObject+layout). Objects include both plain JavaScript objects and exotic objects representing various things from functions to ArrayBuffers to HTML Elements and more.

Most objects extend NativeObject (which is a subtype of JSObject), which provides a way to store properties as key-value pairs similar to a hash table. These objects hold their values and point to a Shape that represents the set of keys.
Similar objects point to the same Shape, which saves memory and allows the JITs to quickly work with objects similar to ones they have seen before. See the SMDOC Shapes comment (https://searchfox.org/mozilla-central/search?q=SMDOC+Shapes) for more details.

C++ and Rust code may create and manipulate these objects using the collection of interfaces we traditionally call the JSAPI.

🗃️ JavaScript Parser

In order to evaluate script text, we parse it using the Parser into a temporary Abstract Syntax Tree (AST, https://en.wikipedia.org/wiki/Abstract_syntax_tree) and then run the BytecodeEmitter (BCE) to generate Bytecode (https://en.wikipedia.org/wiki/Bytecode) and associated metadata. We refer to this resulting format as Stencil (https://searchfox.org/mozilla-central/search?q=SMDOC+Script+Stencil), and it has the helpful characteristic that it does not utilize the Garbage Collector. The Stencil can then be instantiated into a series of GC Cells that can be mutated and understood by the execution engines described below.

Each function, as well as the top-level itself, generates a distinct script. This is the unit of execution granularity, since functions may be set as callbacks that the host runs at a later time. There are both ScriptStencil and js::BaseScript forms of scripts.

By default, the parser runs in a mode called syntax or lazy parsing, where we avoid generating full bytecode for functions within the source that we are parsing. This lazy parsing is still required to check for all early errors that the specification describes. When such a lazily compiled inner function is first executed, we recompile just that function in a process called delazification. Lazy parsing avoids allocating the AST and bytecode, which saves both CPU time and memory. In practice, many functions are never executed during a given load of a webpage, so this delayed parsing can be quite beneficial.
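
The lazy parsing and delazification idea described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not SpiderMonkey code: the LazyFunction class and its fields are hypothetical names, and Python's own ast.parse and compile stand in for the syntax-only pass and the full bytecode emission respectively.

```python
import ast


class LazyFunction:
    """Toy stand-in for a lazily parsed function (hypothetical, not SpiderMonkey API)."""

    def __init__(self, name, source):
        # Syntax-only pass: ast.parse raises SyntaxError for malformed source,
        # loosely mirroring the early-error check that lazy parsing must still do.
        # Unlike SpiderMonkey's syntax parser, Python builds a tree here; we
        # simply discard it and keep only the source text.
        ast.parse(source)
        self.name = name
        self.source = source
        self.function = None      # no compiled form exists yet
        self.compile_count = 0    # how many times we paid for a full compile

    def delazify(self):
        # Full compile on demand: only now do we generate bytecode.
        code = compile(self.source, f"<{self.name}>", "exec")
        namespace = {}
        exec(code, namespace)
        self.function = namespace[self.name]
        self.compile_count += 1

    def __call__(self, *args):
        # First execution triggers delazification; later calls reuse the result.
        if self.function is None:
            self.delazify()
        return self.function(*args)


square = LazyFunction("square", "def square(x):\n    return x * x\n")
never_called = LazyFunction("unused", "def unused():\n    return 0\n")

square(4)
square(5)
print(square.compile_count, never_called.compile_count)
```

In this sketch a function that is never called never pays for the full compile, which is the saving that the lazy parsing mode aims for.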

⚙️ JavaScript Interpreter

The bytecode generated by the parser may be executed by an interpreter written in C++ that manipulates objects in the GC heap and invokes native code of the host (e.g., a web browser). See SMDOC Bytecode Definitions (https://searchfox.org/mozilla-central/search?q=SMDOC+Bytecode+Definitions&path=js%2F) for descriptions of each bytecode opcode and js/src/vm/Interpreter.cpp for their implementation.

⚡ JavaScript JITs

In order to speed up execution of bytecode, we use a series of Just-In-Time (JIT) compilers to generate specialized machine code (e.g., x86, ARM) tailored to the JavaScript that is run and the data that is processed.

As an individual script runs more times, or has a loop that runs many times, we describe it as getting hotter, and at certain thresholds we tier-up by JIT-compiling it. Each subsequent JIT tier spends more time compiling but aims for better execution performance.

Baseline Interpreter

The Baseline Interpreter is a hybrid interpreter/JIT that interprets the bytecode one opcode at a time, but attaches small fragments of code called Inline Caches (ICs) that speed up executing the same opcode the next time (if the data is similar enough). See the SMDOC JIT Inline Caches comment (https://searchfox.org/mozilla-central/search?q=SMDOC+JIT+Inline+Caches) for more details.

Baseline Compiler

The Baseline Compiler uses the same Inline Caches mechanism as the Baseline Interpreter but additionally translates the entire bytecode to native machine code. This removes dispatch overhead and performs minor local optimizations. This machine code still calls back into C++ for complex operations. The translation is very fast, but the BaselineScript uses memory and requires mprotect and flushing CPU caches.

WarpMonkey

The WarpMonkey JIT replaces the former IonMonkey engine and is the highest level of optimization for the most frequently run scripts.
It is able to inline other scripts and specialize code based on the data and arguments being processed. We translate the bytecode and Inline Cache data into a Mid-level Intermediate Representation (Ion MIR, https://en.wikipedia.org/wiki/Intermediate_representation). This graph is transformed and optimized before being lowered to a Low-level Intermediate Representation (Ion LIR). Register allocation is performed on this LIR, which is then used to generate native machine code in a process called Code Generation. See MIR Optimizations (./MIR-optimizations/index.html) for an overview of MIR optimizations.

The optimizations here assume that a script continues to see data similar to what has been seen before. The Baseline JITs are essential to success here because they generate ICs that match observed data. If, after a script is compiled with Warp, it encounters data that it is not prepared to handle, it performs a bailout. The bailout mechanism reconstructs the native machine stack frame to match the layout used by the Baseline Interpreter and then branches to that interpreter as though we were running it all along. Building this stack frame may use a special side-table saved by Warp to reconstruct values that are not otherwise available.

🟪 WebAssembly

In addition to JavaScript, the engine is also able to execute WebAssembly (WASM) sources.

WASM-Baseline (RabaldrMonkey)

This engine performs fast translation to machine code in order to minimize latency to first execution.

WASM-Ion (BaldrMonkey)

This engine translates the WASM input into the same MIR form that WarpMonkey uses and uses the IonBackend to optimize. These optimizations, and in particular the register allocation, generate very fast native machine code.
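
As a rough illustration of how the Shapes and Inline Caches described earlier fit together, the following toy model may help. It is a sketch under stated assumptions rather than a description of the engine's real data structures: the Shape, ToyObject, and GetPropIC classes are hypothetical names, the cache is monomorphic (it remembers one Shape only), and a missing property simply raises KeyError instead of producing undefined.

```python
class Shape:
    """Ordered set of property keys, shared by objects built the same way."""

    def __init__(self, keys=()):
        self.keys = tuple(keys)
        self.slot_of = {key: index for index, key in enumerate(self.keys)}
        self.transitions = {}  # key -> child Shape reached by adding that key

    def add(self, key):
        # Reuse an existing child so that similar objects end up sharing a Shape.
        child = self.transitions.get(key)
        if child is None:
            child = Shape(self.keys + (key,))
            self.transitions[key] = child
        return child


EMPTY_SHAPE = Shape()


class ToyObject:
    """Holds values in slots; the Shape says which key lives in which slot."""

    def __init__(self):
        self.shape = EMPTY_SHAPE
        self.slots = []

    def set(self, key, value):
        slot = self.shape.slot_of.get(key)
        if slot is None:
            # New key: move to the child Shape and append a slot for the value.
            self.shape = self.shape.add(key)
            self.slots.append(value)
        else:
            self.slots[slot] = value


class GetPropIC:
    """Monomorphic inline cache for one property-read site."""

    def __init__(self, key):
        self.key = key
        self.cached_shape = None
        self.cached_slot = None
        self.hits = 0
        self.misses = 0

    def get(self, obj):
        # Fast path: same Shape as last time, so the slot index is still valid.
        if obj.shape is self.cached_shape:
            self.hits += 1
            return obj.slots[self.cached_slot]
        # Slow path: generic lookup, then remember the result for next time.
        self.misses += 1
        slot = obj.shape.slot_of[self.key]
        self.cached_shape = obj.shape
        self.cached_slot = slot
        return obj.slots[slot]


def make_point(x, y):
    point = ToyObject()
    point.set("x", x)
    point.set("y", y)
    return point


site = GetPropIC("y")
for point in (make_point(1, 2), make_point(3, 4), make_point(5, 6)):
    site.get(point)
print(site.hits, site.misses)
```

Objects created through make_point share one Shape, so only the first read at the site takes the slow path. An object whose keys were added in a different order gets a different Shape and misses the cache, which is loosely analogous to the "data it is not prepared to handle" case that forces optimized code back onto a more general path.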