Effective use-cases for LLMs

A software engineer shares effective LLM use cases including searching customer conversations via RAG, triaging API endpoint alerts with an agent harness, and shortening long-form content. These applications cut triaging time from 15+ minutes to 1-2 minutes and enable evidence-backed product proposals.

There’s a lot of talk about the shortcomings of LLMs. They don’t actually reason. They’re expensive, especially when running in a loop. They’re quite slow at doing things. Still, there’s a narrow category of use cases that LLMs excel at, one of which is “sifting through the noise”. The noise is everything we have to process to get to what we really want.

Here are some use cases I haven’t heard others talk about that I’ve enjoyed as a software engineer.

Searching through Customer Conversations

A PM colleague uploaded the transcript of every call with our top customers into an embedding DB. Now their product proposals are deeply backed by evidence. We can say things like “40% of our top customers have mentioned this pain point.” The PM also identified a list of eager private beta customers to try out our new feature.

This is useful when the customer’s problem is abstract. Often, these issues don’t have clear solutions, or those solutions don’t have clear names. That makes filing Feature Requests hard, and organizing/deduping even harder.

Before LLMs, your best bet was that someone on your team had enough tenure to have seen this come up enough times, and that they remembered how to find all the links and connections. Now, it’s RAG.

Going from endpoint alert to log analysis

“Any large system is going to be operating most of the time in failure mode.” (John Gall, via Lorin Hochstein, Netflix)

When I’m on-call, one of my responsibilities is to triage failures on the API endpoints our team owns. These failures are reported as “high rate of HTTP 4XX/5XX”. Sometimes it’s noise, like a DB connection hiccup for the pod. Other times it’s signaling a bug, like customers can’t delete something anymore.

Triaging is tedious:

- The first step is searching for the canonical log lines (https://stripe.com/blog/canonical-log-lines) that mark the specific endpoint with the specific HTTP failure, filtered by time.
- Once I find the request that triggered the alert, I search by request ID to see the request from start to end. Based on the logs and my source code, I can usually guess what went wrong.
- Sometimes the stack trace is compiled JavaScript rather than TypeScript, so the line numbers don’t line up. I have to guess based on the name of the next function call.
- I double-check that I’m looking at a representative request. I quickly look at two or three more request IDs to make sure they all share the same root cause.
- For more difficult issues like DB connection timeouts, I’ll see if there’s clustering on the canonical log lines around timestamp, host machine, or customer ID. Maybe it’s not specifically my route, but an infra issue.

All in all, there’s a lot of stuff to sift through. There’s so much judgment required, and I haven’t even found the problem, let alone thought about a solution yet.

Yet, an agent harness is almost perfect for this. Given some alert and timestamp, point me in the right direction: logs, source code, or clustering. This has cut my triaging time from 15+ minutes to 1-2 minutes per issue. You don’t even need the SotA $$$$ models. Save your money, use a faster model.

I published this workflow as a skill for my teammates with the intention of sharing the actual human skill involved. The output names all the queries it tried, categorized into informative or non-informative, with links to dig deeper. I don’t want it to be magical, because I want my teammates to know how to think about triaging. I also want it to be a ramp to independent discovery.

Shortening Content

I specifically didn’t call this summarizing because:

“ChatGPT doesn’t summarise. When I asked ChatGPT to summarise this text, it instead shortened the text.” (https://ea.rna.nl/2024/05/27/when-chatgpt-summarises-it-actually-does-nothing-of-the-kind/)

But despite all that, I still find incredible value in shortening texts.

I’ll sometimes get recommended a podcast or video that’s over 1 hour long.
Sometimes, I’m hooked within the first 5 minutes. But for technical content, my interest is often buried deep in the video, maybe 30 minutes in for recorded talks. I don’t want to spend that much time figuring out if something is interesting to me, and LLMs greatly help with that.

In my experience, if there’s enough interesting content in the shortened version, there’s plenty in the unshortened version. One video casually mentioned east-coast vs west-coast programming in the US (https://youtu.be/I7fEsbksKRE?t=197). Without shortening, I would have stopped watching 19 seconds earlier out of disinterest.

Transcribing

Okay, summarizing is really useful to me, but how do I get it to work on videos and podcasts? I made myself a little automation that, given a link, will check:

- If there are subtitles, download them
- If it’s a video, download the audio for transcribing
- If it’s audio, transcribe it

Once I transform slow video or audio formats into text, I can summarize it.

I say all this with the caveat that maybe this is a coping skill for my ADHD. Attention is hard for me to maintain, specifically for audio. I literally have a test result saying I’m in the bottom 1% for auditory focus, consistency, AND stamina. These are three separate skills, and I’m statistically awful at all of them. So maybe what matters the most is the ability to transform audio into text, since I’m able to process text much better than audio (though statistically still average).

Instagram Video Search

This is a dream of mine: I want to index every video I’ve hearted on Instagram. I want to OCR the subtitles on the screen, transcribe the audio from the video, and object-detect from the thumbnails.

I want this because it’s so damn hard to find that one video I liked from 3 years ago. It’s just floating around somewhere on the internet, on Instagram or TikTok. I have no idea how to find it. But I think embeddings can (https://technicalwriting.dev/2024/10/embeddings/index.html).
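
A minimal sketch of what that kind of index could look like, under heavy assumptions. Each video is represented by the three text sources mentioned above (on-screen subtitles from OCR, the audio transcript, and detected objects), all of it is turned into one vector, and a query is matched by cosine similarity. Everything here is hypothetical: the video IDs, the sample data, and especially `toy_embed`, which is a plain word-count vector standing in for a real embedding model. Unlike a real embedding, it only matches exact words, so it shows the shape of the search rather than its quality.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(videos: dict) -> dict:
    """Join each video's text sources and embed them as one vector."""
    return {
        video_id: toy_embed(" ".join(fields.values()))
        for video_id, fields in videos.items()
    }


def search(index: dict, query: str, top_k: int = 3) -> list:
    """Return up to top_k video IDs with non-zero similarity, best first."""
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, vec), video_id) for video_id, vec in index.items()]
    scored.sort(reverse=True)
    return [video_id for score, video_id in scored[:top_k] if score > 0]


# Hypothetical sample data: in practice these fields would come from
# OCR, speech-to-text, and object detection respectively.
VIDEOS = {
    "reel_001": {
        "subtitles": "when the deploy fails on friday",
        "transcript": "oh no the build is broken again",
        "objects": "laptop desk coffee",
    },
    "reel_002": {
        "subtitles": "horse refuses to jump",
        "transcript": "come on buddy you can do it",
        "objects": "horse fence field",
    },
    "reel_003": {
        "subtitles": "",
        "transcript": "slow cooked ribs recipe",
        "objects": "grill meat plate",
    },
}

if __name__ == "__main__":
    index = build_index(VIDEOS)
    print(search(index, "horse jump"))
```

Swapping `toy_embed` for a real embedding model would leave `build_index` and `search` mostly unchanged, apart from the vector type, which is the appeal of the approach: the hard part is extracting the text, not searching it.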

Sure, Google wants to “organize the world’s information”. I appreciate that I can search “horse” on the Apple Photos app and get all the horses I’ve seen. But I want it for my memes, and I’m shocked at how behind Instagram is on this.

Closing

That’s all I got. I’m still very on the fence about LLMs. I enjoy using my open-weight models. I’m excited to see work on making inference affordable. And I’m terrified of their impact on our economy, social fabric, and individual psyche. But I got some fun goodies out of it (https://matthewbutterick.com/extinction-level-capitalism.html).

Yes, the planet got destroyed. But for a beautiful moment in time we created a lot of value for shareholders.