There Is Life Before Main in Rust

A Rust developer explores what happens before the main function executes, detailing the runtime initialization phase and introducing novel techniques for mutable data. The post highlights how the Rust runtime builds atop C’s runtime to configure panics, unwinding, and program arguments, and demonstrates the use of the linktime project for pre-main bootstrapping.

Disclosures

🧠 This post is 100% human-written (/assets/2026/06/remarkable life after main.pdf). Claude was used for feedback and to assist with the linker symbol diagram. Cursor was used for feedback and to ensure examples were compilable.

The author of this post is deeply interested in the topic of life-before-main: he is the author of the ctor crate (https://crates.io/crates/ctor), and the creator of the linktime project (https://github.com/mmastrac/linktime) that we’ll be using in the examples below.

Every Rust binary has one thing in common: fn main(). If you come from the C world, that might be more familiar as int main(argc, argv). Some platforms might obfuscate it a bit more, but under the hood, every binary has an entrypoint.

We’re going to discuss what happens before main and what interesting things we can do there. In addition, we’ll be showing some novel techniques for mutable data that aren’t in common use in the Rust ecosystem today.

This post is a deep dive into some technical details of how Rust source becomes a Rust binary. Some background knowledge may be helpful to the reader, including:

Before main

What might not be familiar to most developers is how you get into the main function. You see, under the hood of every language is the runtime. C has one: the C runtime that you might recognize as libc (https://en.wikipedia.org/wiki/C_standard_library). Rust also has its own runtime: the Rust standard library.
And because C is the lingua franca of runtimes (for most executable code), Rust builds its own runtime atop C’s, effectively building its own higher-level abstraction encapsulating C’s. [1]

A runtime is a bit fuzzy to define. It’s both the executable code that lives on disk and the compilable headers and libraries used at compile time. But the purpose of a runtime is always the same: integrating developer code with the platform’s operating system.

There’s an entire ecosystem of processing that happens before the function you declared as main starts up. C uses this to configure allocation, file access, thread-local storage and other C runtime services.

Rust uses this time to configure parts of its own language and runtime. Specifically, Rust has infrastructure to handle panics and unwinding. Rust also needs to translate the C-style program arguments [2] into its own std::env::args interface (https://doc.rust-lang.org/beta/std/env/fn.args.html). The machinery for all this is visible in the Rust compiler project (https://github.com/rust-lang/rust/blob/main/compiler/rustc_codegen_ssa/src/base.rs#L501).

Runtimes make use of this pre-main phase because it guarantees (1) running before user code, and (2) a single-threaded, highly-consistent and predictably-ordered environment, which allows for reliable and deterministic initialization.

By not taking advantage of this environment, you are missing out on a very useful bootstrapping phase. We’ll see later on in this post how we can build some useful primitives making use of life before main.

Entry Points

A binary starts when the operating system’s loader [3] - the part of the OS that loads the binary into memory and sets up the environment - hands off control. The runtime is responsible for accepting the hand-off from the loader. There’s a platform-specific hook on every OS that accepts the hand-off - to some extent this is the real main.
On Linux, the entry point is stored in the e_entry field (https://en.wikipedia.org/wiki/Executable_and_Linkable_Format#:~:text=e_entry) of the ELF header, and by default, the linker places the address of a symbol named _start there. A similar hook exists on Windows (https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#:~:text=AddressOfEntryPoint), and boots the executable in a function named WinMainCRTStartup (https://stackoverflow.com/questions/1583193/what-functions-does-winmaincrtstartup-perform). At this point the C runtime has a chance to configure itself, and the way that all runtimes do this is via initialization functions.

In early iterations of runtimes, bootstrapping was a static tree of function calls: initialize file I/O, initialize the allocator, etc. As runtimes became more complex, this tree of function calls became more complex, and binary sizes increased to absorb more C runtime functionality that they may or may not need.

Over time, linkers developed the ability to discard unused code before even writing the binary to disk (including unused parts of the C runtime), and with that came a need for a replacement for the static init call trees.

The most popular method [4] of declaring init code came from GCC: __attribute__((constructor)). The way this worked was to place a list of init functions into a contiguous chunk of the binary on disk. When the C runtime started, it could walk through each of these functions and call them, allowing various bits of the C runtime to request initialization without strongly coupling subsystems, and allowing the linker to jettison unused subsystems, init code and all.

Eventually the need for constructor ordering became important enough that constructors could be given a priority and run in a specific order, allowing the runtime to initialize subsystems before and after each other. E.g., the memory allocation (malloc) subsystem might be needed for buffered file I/O.
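The walk over a prioritized list of init functions can be sketched in portable Rust. This is only a simulation under stated assumptions, not how any real C runtime is implemented: the InitEntry type, the function names and the priorities are hypothetical, and the table is sorted at run time purely to keep the example self-contained.

```rust
/// One entry in a simulated constructor table (hypothetical type for this sketch).
struct InitEntry {
    /// Lower numbers run first.
    priority: u16,
    /// The init function; it records its name so the call order can be inspected.
    init: fn(&mut Vec<&'static str>),
}

fn init_allocator(log: &mut Vec<&'static str>) {
    log.push("allocator");
}

fn init_file_io(log: &mut Vec<&'static str>) {
    log.push("file_io");
}

fn init_locale(log: &mut Vec<&'static str>) {
    log.push("locale");
}

/// A table registered in arbitrary order, as independent subsystems would do.
fn example_table() -> Vec<InitEntry> {
    vec![
        InitEntry { priority: 200, init: init_file_io },
        InitEntry { priority: 100, init: init_allocator },
        InitEntry { priority: 300, init: init_locale },
    ]
}

/// Order the table by priority, then call every entry exactly once.
fn run_constructors(mut table: Vec<InitEntry>) -> Vec<&'static str> {
    table.sort_by_key(|entry| entry.priority);
    let mut log = Vec::new();
    for entry in &table {
        (entry.init)(&mut log);
    }
    log
}

fn main() {
    // prints ["allocator", "file_io", "locale"]
    println!("{:?}", run_constructors(example_table()));
}
```

The allocator is registered second but runs first because of its lower priority number, which is the property buffered file I/O would rely on.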
On most platforms [5], the linker was called in to do the priority work: each platform ended up with a way to prioritize the order in which data gets written to sections, which allowed for the C runtime to end up with a well-ordered list of function pointers. [6]

We can even build an example of this by hand in Rust using the #[unsafe(link_section = "...")] attribute (try it in the Rust Playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=61a54d3cde75e732db52558f4ef9381c):

    /// Linux example: the modern glibc runtime uses .init_array to hold function
    /// pointers, and a numeric suffix allows them to be ordered. Note that priorities
    /// less than or equal to 100 are reserved for the runtime itself, so any code that
    /// wants to use the C runtime must use a priority of 101 or higher.

    // On Linux, .init_array holds function *pointers*, not functions.
    // We can convert a function to a function pointer with one of the below
    // blocks which is equivalent to this:
    //
    // #[used] // <-- without this, Rust might decide the init function is unused and remove it
    // #[unsafe(link_section = ".init_array.NNNNN")] // <-- the section where we place the function pointer
    // static INIT_ARRAY_FN_PTR: extern "C" fn()
    //     = function; // <-- the function pointer data: we assign the function to it
    //
    // extern "C" fn function() { ... } // <-- the function itself

    #[used]
    #[unsafe(link_section = ".init_array.101")]
    static INIT_FN_FIRST: extern "C" fn() = const {
        extern "C" fn init() {
            println!("Initializing first!");
        }
        init
    };

    #[used]
    #[unsafe(link_section = ".init_array.201")]
    static INIT_FN_SECOND: extern "C" fn() = const {
        extern "C" fn init() {
            println!("Initializing second!");
        }
        init
    };

    fn main() {
        println!("Main!")
    }

linktime: ctor, link-section and more

The examples in this post will work on Linux and various BSDs, but are not designed to be cross-platform examples. For example, macOS has start and stop symbols, but they are named differently [7].
Windows does not support start and stop symbols, but has a set of rules for sorting sections that is effectively equivalent.

Because platforms are so widely variable, we’ll be introducing the ctor (https://crates.io/crates/ctor) and link-section (https://crates.io/crates/link-section) crates from the linktime project (https://github.com/mmastrac/linktime) as a way to abstract away platform-specific differences and hide the general complexity of linker work.

The excellent inventory (https://crates.io/crates/inventory) and linkme (https://crates.io/crates/linkme) are two other very popular crates built on the same principles, but have limitations that make them less suitable for the examples in this post. [8]

If you’d like to learn more, the link-section crate contains a detailed report on platform-specific behaviour (https://crates.io/crates/link-section#:~:text=Platform%20Support).

The ctor crate (https://crates.io/crates/ctor) is designed to handle all of the boilerplate of registering constructors in a cross-platform way. This allows us to simplify our examples above to:

    use ctor::ctor;

    #[ctor(unsafe, priority = 101)]
    fn init1() {
        println!("Initializing first!");
    }

    #[ctor(unsafe, priority = 201)]
    fn init2() {
        println!("Initializing second!");
    }

    fn main() {
        println!("Main!")
    }

Note that neither example explicitly calls the init functions. The linker organized them in a way that the C runtime called them for us.

Sections and Linker Scripts

The process in which constructors are linked isn’t mysterious, though. In fact, compilers allow you to name the location in the binary (on most platforms called a “section”) where you want to place any of your data and/or code. And by extension, and as we saw above, Rust allows this as well. The challenge, as we will see, is making use of this organizational feature.

Linkers have been the key to C’s ability to target any form of binary for some time.
Most linkers allow developers to provide linker scripts - text files that live alongside your source code (which is compiled to object files) and instruct the linker on how those object files are assembled. Using a linker script, a single C file might become a Linux executable, or a block of raw assembly that lives in the boot sector of a hard drive.

Linker scripts also allow for defining virtual symbols - that is, symbols that don’t exist in any source file but can be used by C code to access pointers to the underlying data in the loaded binary.

Linker scripts are a complex topic and beyond the scope of this post, but we can easily find examples (https://wiki.osdev.org/Linker_Scripts) of them in the wild:

    /* Adapted from https://wiki.osdev.org/Linker_Scripts */
    SECTIONS
    {
      .text.start (_KERNEL_BASE_) : {
        startup.o( .text )
      }

      .text : ALIGN(CONSTANT(MAXPAGESIZE)) {
        _TEXT_START_ = .;
        *(.text)
        _TEXT_END_ = .;
      }

      .data : ALIGN(CONSTANT(MAXPAGESIZE)) {
        _DATA_START_ = .;
        *(.data)
        _DATA_END_ = .;
      }
    }

In the above example, the virtual symbols _TEXT_START_ and _TEXT_END_ are explicitly defined to point to the beginning and end of the .text section, respectively. The period in _TEXT_START_ = .; is a special syntax that refers to a location counter (https://sourceware.org/binutils/docs/ld/Location-Counter.html) that resolves roughly to the current output address in the binary.

Linker Symbols

This trips up most developers that encounter it for the first time, but the linker is setting the address of the start and end symbols (and therefore where the static with the same name is placed), and not setting the value of symbols that are pointers. That is to say: the start and stop symbols aren’t a *const Type.

The start and stop symbols carry no data themselves and are used for their addresses only. The section consists of the range of data between the start (inclusive) and stop (exclusive) symbols.
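The half-open range between two boundary addresses can be sketched without any linker involvement. In this minimal, hedged example an ordinary array stands in for a section, and MY_NUMBERS, section_between and numbers are hypothetical names chosen for illustration; the point is only that the slice is rebuilt from two addresses, never from a value stored in a boundary symbol.

```rust
use std::slice;

// Stand-in for a linker section: four values laid out contiguously in memory.
static MY_NUMBERS: [u32; 4] = [11, 22, 33, 44];

/// Rebuild a slice from two boundary addresses: `start` inclusive, `stop` exclusive.
///
/// # Safety
/// Both pointers must bound the same contiguous, initialized, immutable run of
/// `u32`s that lives for the whole program, with `start <= stop`.
unsafe fn section_between(start: *const u32, stop: *const u32) -> &'static [u32] {
    // SAFETY: the caller upholds the contract documented above.
    unsafe {
        // The distance between the two addresses, in elements, is the length.
        let len = stop.offset_from(start) as usize;
        slice::from_raw_parts(start, len)
    }
}

/// Uses the array's own bounds in place of linker-provided boundary symbols.
fn numbers() -> &'static [u32] {
    let bounds = MY_NUMBERS.as_ptr_range();
    // SAFETY: both pointers come from the same static array, and `end` is its
    // one-past-the-end address.
    unsafe { section_between(bounds.start, bounds.end) }
}

fn main() {
    // prints [11, 22, 33, 44]
    println!("{:?}", numbers());
}
```

The stop address is one past the last element, so it is only ever compared against or subtracted from, never dereferenced.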
| Section | Static | Value | Linker symbol(s) |
|---|---|---|---|
| my_numbers | DATA_1 | 11 | DATA_1, __start_my_numbers |
| my_numbers | DATA_2 | 22 | DATA_2 |
| my_numbers | DATA_3 | 33 | DATA_3 |
| my_numbers | DATA_4 | 44 | DATA_4 |
| (past the end) | | | __stop_my_numbers |

Specifying start and end symbols for every section can be complex and tedious in linker scripts, so many linkers [9] eventually gained a feature where they could automatically define symbols bounding all sections in the executable. E.g., for GNU toolchains, a section named MY_SECTION will automatically have symbols __start_MY_SECTION and __stop_MY_SECTION defined. macOS has a similar pattern (https://discourse.llvm.org/t/lld-support-for-ld64-mach-o-linker-synthesised-symbols/45145) where it synthesizes a section$start and section$end symbol for each section.

In the GNU linker, those sections not explicitly defined in the linker script are called “orphan sections” [10].

One important thing to note: if and only if a section’s name is compatible with a C symbol name, the linker will automatically define a __start_- and __stop_-prefixed symbol for the section. In the example you’ll see below, the section name our_strings that we used works, but if we had chosen our.strings or .our_strings it would not have.

You’ll see in the example below that the start and stop symbols are MaybeUninit<()>. The boundary symbols contain no data, and only their address is significant. The ideal Rust type for these would be an “opaque external type” (this would be implemented by the extern types feature: https://doc.rust-lang.org/beta/unstable-book/language-features/extern-types.html). As these are not currently implemented in stable Rust, MaybeUninit is a stand-in. It signifies to the compiler that the data is uninitialized, and generally not safe to read via reference.
Since taking a &raw const pointer (https://blog.rust-lang.org/2024/10/17/Rust-1.82.0/#native-syntax-for-creating-a-raw-pointer) to a static item is always valid, however, we can still safely capture its address without ever reading its value.

Try it in the Rust Playground (https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=1696bdc67f02992cfde9de752e117e0e):

    use std::mem::MaybeUninit;

    #[used]
    #[unsafe(link_section = "our_strings")]
    static FIRST_STRING: &'static str = "Hello, ";

    #[used]
    #[unsafe(link_section = "our_strings")]
    static SECOND_STRING: &'static str = "world!";

    // Note: these are *not* pointers. Instead, the linker has placed
    // the boundary symbols STATIC_STRING_START and STATIC_STRING_END at
    // the start and end of the section
    unsafe extern "C" {
        #[link_name = "__start_our_strings"]
        static STATIC_STRING_START: MaybeUninit<()>;
        #[link_name = "__stop_our_strings"]
        static STATIC_STRING_END: MaybeUninit<()>;
    }

    fn main() {
        let strings: &'static [&'static str] = unsafe {
            // SAFETY: get the addresses of the start and end symbols without
            // reading them.
            let start = &raw const STATIC_STRING_START as *const &'static str;
            let end = &raw const STATIC_STRING_END as *const &'static str;
            std::slice::from_raw_parts(start, end.offset_from(start) as usize)
        };

        // "Hello, world!"
        println!("String: {}", strings.join(""));
    }

The link-section crate (https://crates.io/crates/link-section) is designed to abstract away the details of these linker sections and convert them into traditional Rust slices with all standard slice operations available.
We can use it to simplify the example above to:

    use link_section::{in_section, section};

    #[section(typed)]
    static OUR_STRINGS: link_section::TypedSection<&'static str>;

    #[in_section(OUR_STRINGS)]
    static FIRST_STRING: &'static str = "Hello, ";

    #[in_section(OUR_STRINGS)]
    static SECOND_STRING: &'static str = "world!";

    fn main() {
        println!("String: {}", OUR_STRINGS.join(""));
    }

In these examples we’re submitting items to the link section in a single module within a single crate, but that’s not a requirement. In fact, the power of link sections is that you can submit items to a link section from any crate that contributes code to a binary - the linker will gather them all together just before writing the final binary.

Dependency Injection

The registration pattern we’re about to build is Dependency Injection (https://en.wikipedia.org/wiki/Dependency_injection) by another name. This is a well-known pattern: frameworks like Dagger (https://dagger.dev/) and Spring (https://spring.io/) are built on the same principle that consumers of registration data should not be coupled to the providers of that data. A provider registers data at its definition site; a consumer simply reads the registry.

What’s somewhat different with linker sections versus traditional DI is that in DI the framework often needs to walk the module graph or scan loaded classes at startup to discover both providers and consumer sites. With linker sections, this magic is handled when the binary is written. The linker is the one that gathers all of the provider data and makes it trivially available to the consumer.

The example below uses a link_section::section to register CLI subcommands and is an instance of this pattern.
More complex projects like Turbopack (https://github.com/vercel/next.js/blob/canary/turbopack/) use this pattern to register string-pool constants, and the registration machinery used for serialization/deserialization and turbotask incremental compilation functions (https://web.archive.org/web/20250222021941/https://turbo.build/pack/docs/incremental-computation). A hypothetical webserver could make use of this pattern to register routes and middleware that are automatically collected at build time.

The core mechanism is the same: the contributors place data into a shared registration system from any crate in the dependency tree, and the consumer reads the collected data without having to know where it was provided from.

Using Sections for Registration

One advantage we have in doing work before main is that it is well-behaved. No threads are running unless we start them. This means we are able to avoid the complexity of locks and other synchronization primitives in many cases, and that we can cleanly split our data’s lifecycle into a writable phase and an immutable phase: before and after main.

And because of that, accessing data in the running program can become both simpler and more efficient by avoiding the need to acquire and release locks.

First, we’ll define our subcommand, a const constructor function, and a section to collect them:

    use std::collections::VecDeque;
    use std::path::Path;

    use link_section::{in_section, section};

    struct CliSubcommand {
        is_default: bool,
        name: &'static str,
        description: &'static str,
        f: fn(&Path, &[String]),
    }

    impl CliSubcommand {
        const fn new(name: &'static str, description: &'static str, f: fn(&Path, &[String])) -> Self {
            Self { is_default: false, name, description, f }
        }

        const fn new_default(name: &'static str, description: &'static str, f: fn(&Path, &[String])) -> Self {
            Self { is_default: true, name, description, f }
        }
    }

    #[section(typed)]
    static CLI_SUBCOMMANDS: link_section::TypedSection