I built a Redis clone in C: From blocking I/O to epoll to crash safe persistence

The article describes the author's experience building "C-dis," a Redis clone in C, to gain a practical understanding of networking and systems programming. It explains the progression from a simple blocking server to a high-performance single-threaded architecture using epoll for I/O multiplexing, which can handle tens of thousands of requests per second without threads. The project also involved implementing a custom hash table from scratch for data storage and crash-safe persistence.

I had been hearing about sockets, TCP connections, and networking for years without truly understanding what any of it meant at the code level. Building C-dis changed that. I learned the full socket lifecycle, what happens on both sides of a connection, how the kernel manages buffers, and how to handle non-blocking I/O states inside a manual event loop.What surprised me most was that a single-threaded architecture using epoll could handle tens of thousands of requests per second: no threads, no locking, no complexity. Architectural efficiency beats brute-force processing more often than people realize.The problem: One client at a timeMy first version was a simple blocking server. It worked, but the limitation became obvious immediately. while 1 { int client fd = accept server fd, struct sockaddr &client addr, &client len ; // Server is completely frozen here until this client finishes handle client client fd, ht ; close client fd ; } c While handle client is running, the server cannot accept new connections, cannot read from other clients, and cannot do anything else. One client holds the entire server hostage. A second client connecting during this time sits in the kernel's backlog queue, waiting silently with no feedback. In production, this means a slow client degrades the experience for every other user on the server.The fix isn't to add threads. The fix is to stop blocking.The data engine: Building a hash table from scratchBefore writing the network layer, I needed somewhere to store data. The obvious choice was a simple array with linear search, but that is O n lookup. For a database, that is unacceptable.I looked at how Redis 2.0 approached this. Hash tables. O 1 average lookup, regardless of how many keys are stored. I implemented one from scratch using the djb2 hash function: js static unsigned int hash const char key { unsigned long int hashval = 5381; int c; while c = key++ hashval = hashval << 5 + hashval + c; return hashval & HASH SIZE - 1 ; } The hashval << 5 + hashval is equivalent to hashval 33, a multiplier chosen because it distributes keys evenly across buckets with minimal clustering. The & HASH SIZE - 1 replaces modulo with a bitwise AND, which is faster when the table size is a power of two.When two keys hash to the same bucket a collision they chain together in a linked list. Lookup walks the chain comparing keys until it finds a match or hits NULL. With a good hash function and a reasonable load factor, chains stay short and lookup stays effectively O 1 .select vs epoll: The architecture that makes this fast to handle multiple clients without threads, I needed I/O multiplexing: a way to ask the kernel "which of these sockets has data ready?" and act only on those.I implemented select first, then replaced it with epoll. The difference taught me more about systems design than any other part of this project.The problem with select:Every time you call select, you rebuild the entire watch list from scratch and hand it to the kernel. The kernel scans every file descriptor in the set whether active or idle and returns a modified bitmask showing which ones are ready. With 1,000 connected clients and only 1 active, the kernel scans all 1,000 every time. That is O n per event loop iteration. while 1 { fd set readfds; FD ZERO &readfds ; FD SET server fd, &readfds ; for int i = 0; i < MAX CLIENTS; i++ if clients i = -1 FD SET clients i , &readfds ; // rebuild every iteration select max fd + 1, &readfds, NULL, NULL, NULL ; // now scan all fds again to find which ones triggered } How epoll fixes it:epoll moves the interest list into the kernel permanently. You register a file descriptor once with epoll ctl. The kernel maintains it. When you call epoll wait, it returns only the file descriptors that are actually ready, not all of them. C // register once epoll ctl epfd, EPOLL CTL ADD, client fd, &ev ; // wait: returns only ready fds int n = epoll wait epfd, events, 64, -1 ; for int i = 0; i < n; i++ { // every event here is guaranteed ready handle events i .data.fd ; } This is a synchronous benchmark. Each command waits for a response before sending the next. It measures worst case single client throughput. Pipelining multiple commands per connection would push this significantly higher.Zero memory leaks across 40,000 operations verified with valgrind --leak-check=full. Every malloc in the hash table has a matching free in ht destroy.What is nextThe next project is a container runtime: implementing Linux namespaces, cgroups, and pivot root in C to build a minimal docker run from scratch. Understanding how containers actually work at the kernel level, not just how to use them.The code for C-dis is on GitHub. If you are building something similar or have questions about the epoll implementation, I am happy to talk through it in the comments below