{"slug": "interesting-links-may-2026", "title": "Interesting links - May 2026", "summary": "Apache Kafka 4.3.0 has been released with new features detailed in the release announcement and a video from Sandon Jacobs. Jack Vanlightly introduced Dimster, a new performance benchmarking tool for Apache Kafka, and published several related blog posts. The Confluent Parallel Consumer library was marked as no longer maintained, prompting a LinkedIn discussion and a fork from original author Tony Stubbs.", "body_md": "Welcome to May’s *Interesting Links*! This month saw the Current conference in London with [the usual 5k run](https://rmoff.net/categories/5k-run/walk/), lots of familiar faces and friendly conversations, and plenty of excellent breakout sessions too. It seems live-tweeting conferences isn’t a thing any more, with only Thomas Cooper and me posting anything, but if you want you can review [the hashtag feed on BlueSky](https://bsky.app/search?q=%23current26) for some highlights of the conference.\n\nI got my first Hacker News front page hit with [AI Slop is Killing Online Communities](https://rmoff.net/2026/05/06/ai-slop-is-killing-online-communities/) (51k views and climbing!), and a nice little halo boost for another rant from earlier this year, [AI will fsck you up if you’re not on board](https://rmoff.net/2026/03/06/ai-will-fuck-you-up-if-youre-not-on-board/).\n\nOh, and I got involved in some thought leadering over on LinkedIn (*which a non-zero number of people thought was serious*) with my [shitposting about fried breakfasts](https://www.linkedin.com/posts/robinmoffatt_current26-current26-leadership-activity-7463189069041180672-CFR4?rcm=ACoAAAC2ckIBstmoM1I4uBi9Djg8B7e0JaBvqzQ).\n\n🔥 Apache Kafka 4.3.0 has been released. 
Check out the [release announcement](https://kafka.apache.org/blog/2026/05/22/apache-kafka-4.3.0-release-announcement/), as well as [a video from Sandon Jacobs](https://www.youtube.com/watch?v=lePgrOiX11U) covering the new features.\n\n🔥 After a few quiet months on his blog, Jack Vanlightly is back with a bang! He’s written a new tool, [Dimster, a performance benchmarking tool for Apache Kafka](https://jack-vanlightly.com/blog/2026/5/20/introducing-dimster-a-performance-benchmarking-tool-for-apache-kafka), and has written several more blog posts off the back of it:\n\n🔥 I had the absolute pleasure to watch Victor Rentea present at Devoxx UK earlier this month. This guy redefines what it means to be an entertaining, energetic, enthusiastic—and educational presenter. Whilst his specific talk, \"Event-Driven Architecture Pitfalls\" isn’t online yet, you can find [the slides here](https://victorrentea.ro/eda-pitfalls), and [a recording from Devoxx last year](https://www.youtube.com/watch?v=0SnuppAHOlQ) of a similar talk.\n\nThe Parallel Consumer library from Confluent has been marked as no longer maintained, prompting a [discussion](https://www.linkedin.com/posts/charles-larrieu-casias_nooooo-the-confluent-parallel-consumer-library-share-7465111475133685760-oBvc/) of alternatives (and the concept itself) on LinkedIn, as well as [a fork](https://github.com/astubbs/parallel-consumer) from one of the original authors, Tony Stubbs.\n\nMariano Gonzalez - [Benchmarking KPipe against the parallel-Kafka libraries you would actually pick](https://mariano-gonzalez.com/posts/post-5/).\n\nMichel Tricot - [Event-Driven vs. Polling Architectures for Agent Triggers](https://agentblueprint.substack.com/p/event-driven-vs-polling-architectures).\n\nAn interesting idea from Florent Ramiere and colleagues: what if you specify a set of interesting additions to Kafka’s functionality, with strict rules around the implementation, and then have LLMs take their best shot at it? 
You can see the ideas and results in [the branches of this repository](https://github.com/conduktor/current-london-2026).\n\nViquar Khan - [Architecting Cloud-Native Kafka: From Tiered Storage Towards a Diskless Future](https://www.infoq.com/articles/architecting-cloud-native-kafka/).\n\nElad Eldor - [Kafka’s Real Compression Problem Is Batch Depth](https://dev.to/eeldor/kafkas-real-compression-problem-is-batch-depth-515k), and [Kafka Compute Is Cheap. Network Is Not](https://dev.to/eeldor/kafka-compute-is-cheap-network-is-not-2bdh).\n\nKroxylicious [version 0.21.0 has been released](https://kroxylicious.io/blog/kroxylicious-proxy/releases/2026/05/15/release-0_21_0.html), and Sam Barker from the Kroxylicious project has been running some [benchmarks to look at the impact that the proxy has](https://kroxylicious.io/benchmarking/performance/2026/05/28/benchmarking-the-proxy.html), both pass-through and when encrypting records.\n\nAiven’s Juha Mynttinen explores why they think [Apache Kafka Deserves Topic Types](https://aiven.io/blog/kafka-deserves-topic-types).\n\nDetails of [a Coinbase outage](https://x.com/rwitoff/status/2052863502424133949) involving their Kafka provider, which based on blogs from [2022](https://www.coinbase.com/en-gb/blog/kafka-infrastructure-renovation) and [2023](https://aws.amazon.com/blogs/aws-cloud-financial-management/how-coinbase-built-a-cloud-center-of-excellence-to-optimize-their-cloud-costs-on-aws/) is MSK.\n\nAndy Muir - [Kafka Schema Registry doesn’t guarantee compatibility (and what actually does)](https://muirandy.wordpress.com/2026/04/30/kafka-schema-registry-doesnt-guarantee-compatibility-and-what-actually-does/).\n\nBruno Cadonna - [OpenData Buffer: HA pipelines without Kafka](https://www.opendata.dev/blog/buffer-ha-pipelines-without-kafka).\n\nJeffrey J. 
Jennings - [Kafka’s quiet observability superpower - Kafka Interceptors](https://medium.com/@jeffrey.j.jennings/kafkas-quiet-observability-superpower-kafka-interceptors-aca88c33867e).\n\nGrzegorz Kocur - [Do Kafka metrics have to be so difficult?](https://monedula.dev/blog/kafka-metrics-opentelemetry-otlp-monedula-metrics-reporter/)\n\nFlink’s Stateful Functions (StateFun) is not maintained by the project any more, so kzmlabs' Oleksandr Kazimirov [forked it](https://kzmlabs.github.io/flink-statefun/articles/forking-statefun/) to continue developing it.\n\nOlena Vodzianova - [How Chandy-Lamport Inspired Apache Flink Checkpointing](https://medium.com/@wizzywooz/how-chandy-lamport-inspired-apache-flink-checkpointing-256db84084ce).\n\n🔥 Two good posts from the team at Grab:\n\nDetails of [how Smartsheet use Flink](https://aws.amazon.com/blogs/big-data/how-smartsheet-built-real-time-dynamic-filtering-on-apache-flink-reducing-40k-month-in-messaging-costs/) for optimising both costs and performance by filtering messages.\n\n[flink-state-explorer](https://github.com/Eric-D/flink-state-explorer) is, as the name suggests, a tool for exploring Apache Flink 1.20 canonical savepoints interactively.\n\nA hands-on github repo from Patrick Neff showing off [Stream processing pipeline using dbt and Flink on Confluent Cloud](https://github.com/pneff93/dbt-cc-stream-processing).\n\nShuva Jyoti Kar - [Designing stateful serverless Agentic Loop](https://medium.com/google-cloud/designing-stateful-serverless-agentic-loop-bb73a63562b4) with Kafka and Flink.\n\nA couple of security issues for Flink to be aware of if you’re running it:\n\n🔥 Tristan Handy - [BI’s Second Unbundling](https://roundup.getdbt.com/p/bis-second-unbundling).\n\nA good writeup from Cloudflare’s James Morrison and Christian Endres about tracing performance issues in ClickHouse.\n\nSeveral posts from StarRocks covering new features in 4.1:\n\nTwo BigQuery optimisation/cost saving articles, from [Christophe 
Oudar](https://medium.com/teads-engineering/how-we-cut-bigquery-slot-usage-by-90-on-one-of-our-most-resource-hungry-service-after-an-outage-c491af09e77e) and [Azeem Jalageri](https://medium.com/@azeemjalageri/23fc5efc91a5?sk=2d8855c53c8d878b6afa7a839b30ef09).\n\nDaniel Beach - [Spark is Dead. Long Live DuckDB](https://www.confessionsofadataguy.com/spark-is-dead-long-live-duckdb/).\n\nAlibaba added DuckDB into their fork of MySQL, AliSQL, providing [storage and query for OLAP workloads](https://www.alibabacloud.com/blog/when-mysql-meets-the-columnar-storage-engine-duckdb-in-the-ai-era_603117).\n\nSimon Aubury - [I don’t need an untrusted LLM to tell me I’m spending too much on coffee](https://simonaubury.substack.com/p/i-dont-need-an-untrusted-llm-to-tell).\n\nThe DuckDB team announced [Quack: The DuckDB Client-Server Protocol](https://duckdb.org/2026/05/12/quack-remote-protocol.html).\n\nBen Fleis explores [DuckDB’s support for Delta and Unity Catalog](https://duckdb.org/2026/05/07/delta-uc-updates).\n\n🔥 I’ve been a fan of [Mark Litwintschik’s](https://tech.marksblogg.com/) no-nonsense blog posts showing current technologies and exploring interesting data sets for many years. 
In this one he uses DuckDB to analyse details of [10K+ Satellites in Space](https://tech.marksblogg.com/gcat-satellite-database.html).\n\n🔥 Nikola Ilic - [Data Modeling for Analytics Engineers: The Complete Primer](https://towardsdatascience.com/data-modeling-for-analytics-engineers-the-complete-primer/).\n\nAirTable’s Matthew Jin [details how they optimised their costs](https://medium.com/airtable-eng/how-we-reduced-archive-storage-costs-by-100x-and-saved-millions-21754b5a6c8e) by moving PBs of cold data from MySQL to S3, and wrote a query engine using Data Fusion to serve it.\n\nBrian Brunner and his colleagues at Cloudflare published details of [how they built Cloudflare’s data platform and an AI agent on top of it](https://blog.cloudflare.com/our-unified-data-platform/).\n\nCaesario Kisty - [A Practical Implementation of Medallion Architecture Using ClickHouse](https://blog.dataengineerthings.org/a-practical-implementation-of-medallion-architecture-using-clickhouse-484ec6dd960c).\n\nXinran Waibel - [Data Engineering Open Forum 2026 Recap](https://blog.dataengineerthings.org/data-engineering-open-forum-2026-recap-b0154b770315).\n\n🔥 After doing a bit of fairly naïve experimentation with Claude and dbt [earlier this year](https://rmoff.net/2026/03/11/claude-code-isnt-going-to-replace-data-engineers-yet/), I was very interested to read Jason Ganz’s article [What data agent benchmarks do and don’t tell us](https://roundup.getdbt.com/p/what-data-agent-benchmarks-do-and), and hope to try out the referenced [ADE-bench](https://github.com/dbt-labs/ade-bench#user-content-fn-1-43049741a33bb2b20904cc0f5298be23) (\"a framework for evaluating AI agents on data analyst tasks\") soon.\n\nWhilst Thijs Nieuwdorp’s article about [Handling Schema Issues in Polars](https://pola.rs/posts/schema-evolution/) is specific to Polars, it’s a useful reference for the kind of schema changes one will want to make in a data pipeline, and the challenges it can cause depending on how or if 
your implementation technology of choice supports it.\n\n🔥 Pedram Navid - [We need to talk about dbt](https://databased.pedramnavid.com/p/we-need-to-talk-about-dbt).\n\nA summary/re-write by Alex Yu (a.k.a. ByteByteGo) looking at [How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time](https://blog.bytebytego.com/p/how-figma-upgraded-data-pipeline) (based on [the original blog post by Yichao Zhao](https://www.figma.com/blog/figmas-data-pipeline-upgrade/) from last year).\n\nNetflix - [The Evolution of Cassandra Data Movement at Netflix](https://netflixtechblog.medium.com/the-evolution-of-cassandra-data-movement-at-netflix-6e13329c80a1).\n\nAlexey Makhotkin has a two-parter on Data Quality: [part 1](https://minimalmodeling.substack.com/p/my-take-on-data-quality) / [part 2](https://minimalmodeling.substack.com/p/my-take-on-data-quality-tier-2).\n\nChris Hillman - [Don’t Go Dark: Visibility Is a Data Engineering Skill](https://ghostinthedata.info/posts/2026/2026-05-23-dont-go-dark/).\n\nAfter the excellent survey and results that Joe Reis published about data engineering earlier this year, he’s now following up with a survey on [The Organizational State of Data Engineering](https://joereis.substack.com/p/the-organizational-state-of-data) (open for submissions until Sunday, June 21).\n\nMahendran Vasagam - [From SSH to REST: A Security-Driven Modernization of Slack’s EMR Data Pipelines](https://slack.engineering/from-ssh-to-rest-a-security-driven-modernization-of-slacks-emr-data-pipelines/).\n\nDana Rabba - [Building Self-Healing Data Pipelines at Halodoc](https://blogs.halodoc.io/building-self-healing-data-pipelines-at-halodoc/).\n\n[rocky](https://github.com/rocky-data/rocky) is a dbt alternative that looks quite interesting. It describes itself as \"the trust plane for your warehouse\", and targets Databricks users primarily, with Snowflake and BigQuery to follow. 
There’s a built-in playground feature that’s worth poking around in to get a feel for it.\n\nMarc Bowes describes how [Aurora DSQL’s CDC feature works](https://marc-bowes.com/dsql-coupler.html). If you want more, there are further details of its use from [Vijay Karumajji](https://aws.amazon.com/blogs/database/getting-started-with-change-data-capture-in-amazon-aurora-dsql/).\n\n🔥 George Zefko - [Building a CDC pipeline, part 2: Debezium Internals](https://georgioszefkilis.substack.com/p/building-a-cdc-pipeline-part-2-debezium) (I featured part 1 last month; if you missed it, it’s [here](https://georgioszefkilis.substack.com/p/building-a-cdc-pipeline-part-1-postgresql)).\n\nA couple of good posts from the Debezium team:\n\n[Apache Iceberg 1.11](https://iceberg.apache.org/releases/#1110-release) has been released (I even got some [small contributions](https://github.com/apache/iceberg/releases/tag/apache-iceberg-1.11.0) merged 🎉). There are more details of the release in these blog posts:\n\n🔥 The talks from Iceberg Summit 2026 are now [online](https://www.youtube.com/watch?v=4Bg64WnkfgE&list=PLkifVhhWtccxSA6VskdKdLnIwCJevOqFL).\n\nAlex Merced begins an epic 15-part series about Apache Iceberg by looking at [What Are Table Formats and Why Were They Needed?](https://medium.com/data-engineering-with-dremio/what-are-table-formats-and-why-were-they-needed-7d5ca69546a1)\n\nYelp’s Nick Del Nano looks at [How Partition Access Visualizations Reduced Data Lake S3 Cost by 33%](https://engineeringblog.yelp.com/2026/05/partition-access-visualizations.html).\n\nHonest words from Fresha’s Samuel Valente as he looks at the use of Iceberg with Snowflake in practice: [Snowflake with Iceberg: Lakekeeper, dbt, and some Sparks Flying](https://medium.com/fresha-data-engineering/snowflake-with-iceberg-lakekeeper-dbt-and-some-sparks-flying-a6231fcb35a7).\n\nDaniel Guzman-Burgos describes [bintrail which provides time-travel SQL for 
MySQL](https://blog.dbtrail.com/time-travel-sql-for-mysql-finally/). Renato Losio has a summary [on InfoQ](https://www.infoq.com/news/2026/05/bintrail-mysql-timetravel/).\n\nTeiva Harsanyi - [How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained](https://read.thecoder.cafe/p/linux-broke-postgresql).\n\nRadim Marek covers [the ORDER BY jungle](https://boringsql.com/posts/order-by-jungle/), as well as [PostgreSQL’s TOAST](https://boringsql.com/posts/postgresql-toast/).\n\nMarkus Winand also looks at [ORDER BY and the evolution of support in different RDBMS](https://modern-sql.com/blog/2026-05/order-by-history).\n\n🔥 James Blackwood-Sewell writes up details of [the benchmarking platform they built](https://www.paradedb.com/blog/what-we-think-about-when-we-think-about-benchmarking), whilst Ben Dicken muses on [benchmarking at PlanetScale](https://planetscale.com/blog/on-benchmarking) too.\n\nAn opinionated, and fairly concise, set of recommendations for the use of different [Open Standards for Modern Data Architecture](https://www.data-landscape.com/).\n\nLinkedIn’s Pratikmohan Srivastav writes about a performance troubleshooting experience - [The 58-Million-Key Freeze: What a HashMap Resize Taught Us About Memory Allocation at Scale](https://www.linkedin.com/blog/engineering/feed/the-58-million-key-freeze-what-a-hashmap-resize-taught-us-about-memory-allocation-at-scale).\n\nSem Sinchenko - [Same buffers, same instructions, same hardware. 
Where Is the JVM Tax?](https://semyonsinchenko.github.io/ssinchenko/post/jvm-tax/)\n\nGergely Orosz [shares some excerpts](https://newsletter.pragmaticengineer.com/p/designing-data-intensive-applications-book-excerpt) from Martin Kleppmann’s second edition of *Designing Data Intensive Applications*.\n\n*I warned you previously…this AI stuff is here to stay, and it’d be short-sighted to think otherwise.*\n\n🔥 Ben Evans - [AI Eats the World](https://www.ben-evans.com/presentations).\n\n🔥 TikTok is my guilty pleasure, but instead of dogs misbehaving in comical ways, here’s an excellent piece to camera from Scott Hanselman reflecting on [the impact of AI in our lives as software developers](https://vm.tiktok.com/ZNRW27cR2/).\n\nPro-tip: `yt-dlp` works great with TikTok, so you don’t have to actually open the page if you still wanna view the video.\n\nNate Berkopec - [Thoughts on LLMs in 2026](https://www.nateberkopec.com/blog/thoughts-on-llms-in-2026/).\n\nJulien Hurault - [Time for AI Coding to Turn Boring?](https://juhache.substack.com/p/time-for-ai-coding-to-turn-boring)\n\nKate Holterhoff - [AI Slop & the Vulnerability Treadmill](https://redmonk.com/kholterhoff/2026/05/05/ai-slop-vulnerability-treadmill/).\n\nPaulo Arruda - [What I Learned Building Multi-Agent Systems from Scratch at Shopify](https://www.infoq.com/presentations/multi-agent-system-lessons/).\n\n🔥 Loris Cro - [Contributor Poker and Zig’s AI Ban](https://kristoff.it/blog/contributor-poker-and-ai/).\n\nLucia Cerchie - [Why You Need More Than a SKILL.md](https://luciacerchie.dev/articles/why-you-need-more-than-a-skill-md/).\n\n*Nothing to do with data, but stuff that I’ve found interesting or has made me smile.*\n\n🔥 An oldie (2008!) 
but a goodie: Jeff Atwood - [Don’t Go Dark](https://blog.codinghorror.com/dont-go-dark/).\n\n🔥 Lara Hogan - [Be a thermostat, not a thermometer](https://larahogan.me/blog/be-a-thermostat-not-a-thermometer/).\n\nAs an IC, I endorse this pitch from Elena Verna ;-) [IC work is the new career flex](https://www.elenaverna.com/p/ic-work-is-the-new-career-flex).\n\n🔥 Ana Rodrigues - [It’s 2026 and women are still asked to teach others to think a little bit and not be a prick](https://ohhelloana.blog/woman-in-tech/).\n\nLeyla Kazim - [I did no work for a year and no one noticed](https://leylakazim.substack.com/p/i-did-no-work-for-a-year).\n\n🔥 Kevin Powell wrote [this article](https://www.kevinpowell.co/article/tell-someone-you-appreciate-them/) which resonated hard for me. I think it’s a boiling-frog situation; if I think about my motivation to write today, vs a year ago, vs 5, it’s definitely very different. AI noise drowns things out, kinda like SEO marketing 'content factories' did but on a bigger and more destructive scale, so as an author is it even worth writing original material? 
Is anyone even gonna read it?\n\nAn excellent writeup from Vicki Boykis about [Tagging my blog posts with BERTopic and LLMs](https://vickiboykis.com/2026/05/18/tagging-my-blog-posts-with-bertopic-and-llms/) - definitely need to try this.\n\nMike McQuaid - Open Source Resistance: [Keep OSS alive on company time](https://ossresistance.com/).\n\nVery cool idea for conference badges from Shy Ruparel, with [an excellent writeup](https://temporal.io/blog/badges-for-replay-and-i-havent-slept-since-december) to boot.\n\n*I couldn’t think of a good subheading for these :)*\n\nSome fun nostalgia, with screenshots of various old OSes at [typewritten.org](http://www.typewritten.org/Media/) and [The Virtual OS Museum](https://virtualosmuseum.org/more-screenshots/) (the latter even has, IIUC, runnable VMs for [download](https://virtualosmuseum.org/downloads/)!)\n\nDan Carlin is probably my favourite podcaster, and as well as his well-known [Hardcore Histories](https://www.dancarlin.com/hardcore-history-series/) he has occasional thoughts on more current affairs, including this one: [The Water in Which We Swim](https://pca.st/episode/5df3b4a2-666d-4a53-9741-6d46fc85d188).\n\n[The Middle Class Museum](https://www.ideagames.fun/middle-class-museum) (*A memorial to affordable living*).\n\n[Fast16: The Cyberweapon That Predates Stuxnet by Five Years](https://hackingpassion.com/fast16-pre-stuxnet-cyber-sabotage/).", "url": "https://wpnews.pro/news/interesting-links-may-2026", "canonical_source": "https://dev.to/rmoff/interesting-links-may-2026-3o0p", "published_at": "2026-05-29 10:00:19+00:00", "updated_at": "2026-05-29 10:12:13.316661+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-products", "ai-tools"], "entities": ["Robin Moffatt", "Thomas Cooper", "BlueSky", "Hacker News", "LinkedIn", "Apache Kafka", "Sandon Jacobs", "Current conference"], "alternates": {"html": 
"https://wpnews.pro/news/interesting-links-may-2026", "markdown": "https://wpnews.pro/news/interesting-links-may-2026.md", "text": "https://wpnews.pro/news/interesting-links-may-2026.txt", "jsonld": "https://wpnews.pro/news/interesting-links-may-2026.jsonld"}}