Sp.h is the standard library that C deserves

The article introduces **sp.h**, a 15,000-line, single-header C99 library designed as a modern alternative to the standard C library (libc). It programs directly against system calls, has every allocating function take an explicit allocator rather than assuming a global heap, and replaces null-terminated strings with pointer + length structures to improve safety and performance. The author argues that libc is outdated and harmful for modern programming, particularly for asynchronous I/O and efficient string handling.

Over the past year, I’ve been working on fixing C by giving it a high quality, ultra portable standard library. It is not a simple wrapper on top of libc; it doesn’t depend on libc except when required to by the platform. To my knowledge, there is nothing like it.

The library is called sp.h [1]. It’s a 15,000 line, single header library written in plain C99. You can find the source code on GitHub, which includes the library itself, lots of example programs, and half a dozen baseball libraries [2] which extend the core.

If you prefer to read a few examples and look through the source, head to GitHub first. Otherwise, let’s get on with the pitch.

Principles

Program directly against syscalls

The fundamental idea is that any C standard library must be written directly against the lowest level primitives available [3]. It is neither useful nor productive to try to emulate, produce, or interface with the decades of cruft that have accumulated between the OS and the code that you yourself write.

Libc is actively harmful

It is tempting to conform to libc, because swaths of code promise to compile and run if you can simply provide an implementation of libc. But more and more, this is untrue. Libc does not provide a useful interface for any program. Simple programs would rather use a high level language. Sophisticated programs cannot be written with the primitives that it provides.

This has been exacerbated over the past decade as asynchronous programming has become more important. A “fast” program is becoming less about solving e.g. register allocation better than the other compiler and more about e.g. using the right kernel primitives to do IO.

Any interface upon which the fundamental unit of IO is FILE or upon which a substring is a malformed idea is not just annoying. It’s harmful. sp.h casts it aside [4].
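
To make the substring complaint concrete, here is a minimal sketch of a pointer + length view. The names `view_t`, `view_from_cstr`, `view_sub`, and `view_eq` are hypothetical and chosen for illustration; they are not sp.h's API, and the sketch leans on libc's `string.h` purely for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer + length view; NOT sp.h's actual string type. */
typedef struct {
  const char *data; /* bytes are not required to be null-terminated */
  size_t len;       /* length in bytes, available in O(1) */
} view_t;

/* Wrap an existing C string once, paying for strlen a single time. */
static view_t view_from_cstr(const char *s) {
  view_t v = { s, strlen(s) };
  return v;
}

/* Non-owning substring: no allocation, no copy, no terminator written. */
static view_t view_sub(view_t v, size_t start, size_t len) {
  assert(start <= v.len && len <= v.len - start);
  view_t out = { v.data + start, len };
  return out;
}

/* Byte-wise equality; lengths are compared first, so mismatches are cheap. */
static bool view_eq(view_t a, view_t b) {
  return a.len == b.len && memcmp(a.data, b.data, a.len) == 0;
}
```

With a null-terminated string, pulling `libc` out of `/usr/lib/libc.so` means copying into a fresh buffer or writing a terminator into the source. Here, `view_sub(view_from_cstr("/usr/lib/libc.so"), 9, 4)` returns a pointer into the original bytes plus a length, and nothing else changes.
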

There Is No Heap

These types underpin the entire library:

    typedef enum {
      SP_ALLOCATOR_MODE_ALLOC,
      SP_ALLOCATOR_MODE_FREE,
      SP_ALLOCATOR_MODE_RESIZE,
    } sp_mem_alloc_mode_t;

    SP_TYPEDEF_FN(void*, sp_allocator_fn_t, void* user_data, sp_mem_alloc_mode_t mode, u64 size, void* ptr);

    typedef struct sp_allocator_t {
      sp_allocator_fn_t on_alloc;
      void* user_data;
    } sp_mem_t;

They do so by forcing programs to accept that “the ability to allocate any amount of memory from the ether” is not a primitive; it is a fiction. Memory is not owned by “the runtime”. Memory is owned by your program.

Null-terminated strings are the devil’s work

I have written about this in the past. Null terminated strings mean you cannot:

- Return a non-owning substring
- Know the length of a string in O(1)
- Write lexers and parsers which return ergonomic views into source
- Build strings without invalid intermediate values

Plus, of course, the unfathomable number of bugs and security issues that arise from a missing null terminator. Step one to modernizing C is to completely ditch null terminated strings in favor of the humble `sp_str_t`.

The only downside, I believed, was that you were forced to make an extra copy to interface with any other C API you might come across. I have come to find that this is completely meaningless. A C standard library built natively around pointer + length strings is shockingly ergonomic. For example, a snippet from a wc clone:

    sp_str_t content = sp_zero;
    sp_io_read_file(mem, path, &content);

    sp_ht(sp_str_t, u32) counts = sp_zero;
    sp_str_ht_init(mem, counts);

    sp_da(sp_str_t) lines = sp_str_split_c8(mem, content, '\n');
    sp_da_for(lines, i) {
      sp_da(sp_str_t) words = sp_str_split_c8(mem, lines[i], ' ');
      sp_da_for(words, j) {
        u32* count = sp_str_ht_get(counts, words[j]);
        if (count) {
          *count = *count + 1;
        } else {
          sp_str_ht_insert(counts, words[j], 1);
        }
      }
    }

If your first reaction is “so what?”, then, yeah, that’s the point.
Here’s a piece of C code which reads roughly like any high level language but also never copies data from the source buffer while parsing. In other words, it’s both the most ergonomic version and the most performant version.

Be a part of your software, not aside from it

The library is meant to be read, modified, tweaked, rewritten, or whatever verb you might need to have it serve your purposes. I’ve worked very hard to this end:

- The core of the library is ~40 syscalls which are the only platform specific code [5]
- The library ships as a single file which needs no configuration
- The file is extremely organized, and tagged with `@tag`s for human or LLM search
- Every function is part of a namespace

Where the frustrating parts of C seek to hide the OS your program runs on behind an elaborate fiction, sp.h seeks to unify only those things which are true, as thinly as possible while being useful, and then builds functionality on top of the exact same primitives that it gives you.

Be extremely portable

sp.h is written in C99, and it compiles against any compiler and libc imaginable. It works on Linux, on Windows, on macOS. It works under a WASM host. It works in the browser. It works with MSVC and MinGW. It works with or without libc, or with weird ones like Cosmopolitan. It works with the big compilers and it works with TCC.

And, best of all, it does all of that because it’s small, not because it’s big.

Be explicit

Every time I’ve picked implicit over explicit, I’ve come to regret it and paid the price to fix it:

- Errors are always returned and handled by the caller
- Programs do not have mutable global state
- Functions which allocate take an allocator
- Memory is zero initialized

Non-goals

Conformance to existing interfaces

This is not libc. When required to, sp.h will respect libc, and it will always work unobtrusively and completely when embedded in a libc-using program. But it is not libc, and you should not expect it to act like it is.

Obscure architectures and OSes

I write code for x86_64 and aarch64. WASM is becoming more important, but is still secondary to native targets. I don’t care to bloat the library to support a tiny fraction of use cases.

That being said, if you’re interested in using the library on an unsupported platform, I’m more than happy to help, and if we can make the patch reasonable, to merge it.

Performance

The library’s stance, to put it simply, is that the juice ain’t worth the squeeze when it comes to low level, compute-bound performance. Designing software and data structures for performance against unknown use cases on unknown hardware is extremely difficult and the resulting code is much more complicated. Even then, it’s often better to use code written against your actual use case and hardware when performance is that critical.

Things that are off the table might be:

- SIMD
- A highly optimized hash table rewrite
- Figuring out where inlining or LIKELY causes the compiler to produce better code

Things that are on the table might be:

- Providing the correct abstractions to do optimized and/or zero copy IO
- Writing APIs that do not require copying data

Of course, doing fine-grained optimization where it’s hurting people is always on the table. Fixing bugs is always on the table. I am not anti-optimization; just busy.

A parting thought

The natural question one might have is: Why are you doing this? There have never been more or better languages for systems programming. Why not just use one?

The answer is that C holds a real niche, and one not wholly built on legacy. To my knowledge, it’s the only language which:

- Can be directly compiled to any machine code imaginable
- Has an ecosystem of state-of-the-art optimizing compilers
- Is written in the same language as the OS and most libraries
- You could write a reasonable compiler for as a personal project

In other words…

C is valuable because it’s simple

Of course, these are all unfair to varying degrees.
LLVM exists, so technically everyone has a SOTA compiler. Most languages have FFIs and tooling. The best systems languages are better at C than C is. And yet, to have something so well-supported, so optimized, so tied to the platforms upon which we write native code, and so approachable is magical.

I want to work with you

I would like nothing more than to make friends and/or help you work on this library, stranger. I’ll help you port it to your weird environment. I’ll explain any of it to you. I’ll listen politely while you tell me I’m terrible at programming. I am certainly no genius at systems programming; everything I have is the product of really bad misunderstandings about how software and computers work, followed by lots of hard work and fun and more software.

I’m on a Discord server or you can find me at sp on IRC. You can also email me. The domain’s the same as this site, and the handle is my last name [6].

[1] The first two letters of my last name. A little vanity never hurt, right?

[2] They add to a single header library, so they’re double header libraries. Doubleheader.

[3] Where “syscall” means “the lowest level primitive available”. On Linux, it’s always actual syscalls. On Windows, that’s usually NT. On macOS, it’s usually the syscall-wrapper subset of libc because you’re forced to link libc and it’s not quite as open as Linux (although there is a rich “undocumented” set of APIs and syscalls that are very interesting).

[4] There are some places where the library is still more POSIX-shaped than it ought to be in its lowest levels. But, hey, that’s what an alpha’s for, right?

[5] This is probably 85% true right now. There are a few stragglers; mostly things that I haven’t had time to properly design as the absolute minimum set of primitives and which therefore live outside the core.

[6] spader