Siri AI is both impressive and disappointing

Apple released the first developer beta of iOS 27 featuring a completely redesigned Siri AI, built on a new foundation model with cloud processing and chat capabilities. Early testing shows improved safety and more human-like responses, but the system suffers from server errors and inconsistent performance, indicating it is not yet ready for public release.

The big new feature coming in iOS 27 is Siri AI, and the broader Apple Intelligence features it’s built on. Landing on more recent devices this fall, it’s a total reimagining of Siri from the ground up (https://www.macworld.com/article/3162330/how-much-gemini-is-really-in-the-new-siri-ai.html), with a new foundation model, new cloud processing, new voice, new knowledge database, new back-and-forth chat capabilities, new everything.

Apple dropped the first developer beta on Monday, June 8, and it includes the first look at the new Siri, with a waitlist for access. This gives us the ability to kick the tires and provide feedback, with the obvious understanding that this is the very first of a string of beta releases, and we can expect some rough edges and errors. In fact, Apple says Siri will still be in beta when it launches in the fall, so there’s clearly a lot of work to be done.

Still, over this first week, I’ve been impressed by what the new Siri can do. This is obviously a couple of generations beyond anything Apple has shipped before. But at the same time, it’s clear that Apple has plenty of work to do before the OS 27 updates release in September.

First, some good news. Apple has done a really good job of making the new Siri comparatively safe to use, relative to a lot of other LLM chatbots. Siri’s voice is more human-sounding and emotive than ever, the answers it gives are refreshingly matter-of-fact, and it usually doesn’t try to build false engagement. Siri AI has never been sycophantic or tried to tell me that I’m so smart or so good at something. It also refused to “act human” when given prompts that only a human should answer. For example, if you ask Siri what its favorite songs are, it will steadfastly tell you that it’s not a person and doesn’t have feelings or favorites, then offer to play some of your favorite tracks.

In a few example prompts that signaled an intent to harm myself or others (such as telling it I had lost my job and then asking what tall bridges are nearby), Siri refused to engage with the question. Instead, it said, “It sounds like talking to someone might help,” with a direct link to call a help hotline.

That’s a great start. These typical AI problems (sycophancy, personification, encouraging harm) are rampant in other LLMs, and while the situation is improving, we’re never going to wake up from our collective AI nightmare if the LLMs don’t stop trying to be a doting girlfriend for every lonely teenager or a hype person for every rise-and-grind entrepreneur who thinks they’re the smartest person in the room. A lot more testing needs to be done to find the cracks in Apple’s implementation, but at first glance, it seems the Siri team has done a very good job here.

Not only does the new Siri give some odd or flawed responses at times, but it will sometimes simply fail, or stop hearing you. Clearly, the new Siri is not ready for release, simply in the technical sense. I experienced regular server errors and random disruptions. Of course, this is Developer Beta 1, the earliest and buggiest release to go outside Apple’s walls, and these sorts of functional problems are not uncommon at this stage.

Siri AI is surprisingly useful and helpful where the old Siri would often outright fail, and it can clearly do things old Siri couldn’t dream of doing. For starters, asking about current events actually works. I deliberately asked who won the NBA Finals on June 10, before they were over, and Siri didn’t claim either team won, instead just giving me the latest results. Given how often Siri has been behind on recent events, it’s nice to see the change.

Siri has been known to simply default to web searches for all kinds of general info, but the new Siri AI can deliver thorough responses for a really wide swath of general questions.

I asked it about coffee ratios. Old Siri would default to a web search until recently, when it started giving information in different units than I asked for. Siri AI gets it done right.

During the WWDC keynote, Apple showed off using Siri to split a bill with Apple Cash. Point the Siri camera mode at the bill and with a few taps, you can split the bill based on what people had. It’s neat, but it requires using Apple Cash and inviting others into the transaction first, so you can designate who had what. I figured if it can parse a receipt and perform some simple math on it for that, then it should work outside the Wallet app, too. So I pointed the Siri camera mode at a grocery receipt, asked it to remove a couple of items, and then split the rest, so my wife and I could settle up on the grocery bill. Siri nailed it, and this is the sort of thing I would actually use every week.

Perhaps most impressive was when I asked Siri, “What are my plans for my wife’s birthday?” I had been discussing it in a couple of different text threads, going back and forth with various ideas and times. Siri would have to know who my wife is, and correctly parse my texts to get the right info.

Not only did Siri get the key dates, times, and locations correct, but it also offered a summary and a link to a relevant message thread. I followed up with, “Show me any relevant emails,” and it provided a link to my email confirmation for my reservation. I then asked, “How long does it take to drive there?” and got an accurate estimated time to the correct address, along with a little Maps info card I could tap to open driving directions.

This is exactly the kind of thing Apple promised. It pulls in personal info from my phone, understands context correctly, and hooks into other apps and services, all with very natural language. It’s impressive, and honestly, actually useful for a change.

At this early beta stage, for every time Siri AI impresses me, there’s another time I’m disappointed. Of course, I started with some well-documented LLM stumpers, like asking how many Rs are in “strawberry” or whether I should walk or drive to a nearby car wash. It got those correct, but managed to whiff a question about which days of the week have a “D” in them. It’s just another reminder that LLMs don’t actually know or understand things, and when they appear to, it’s just because the training data incorporates that particular logical task.

Siri AI has been mercifully free of some LLM annoyances, such as the perpetual “it’s not just this, it’s that” sentence construction or an excess of em dashes and semicolons. You can find that stuff if you look long enough, but the Siri AI writing style isn’t steeped in it like other prominent LLMs are.

Apple’s new AI does exhibit a couple of annoying LLM patterns, though. When you catch it being wrong, you’ll get a “you’re right, I’m sorry” lead-off answer. Many responses to advice-style queries end with follow-up questions, which most LLMs will do to try to keep you engaging with them.

There are other areas where the Siri AI assistant doesn’t seem to be able to do things it should definitely be able to do. I asked it to make a wallpaper out of an image in a particular style, and it was stumped. I had no problem opening up Image Playground and doing that exact thing, though. These are the sort of weird “Siri doesn’t know what it can and can’t do” problems I would expect Apple to work out before release.

I’d also like it to be a little smarter with logic about how it finds and presents information.

I like to open my windows and turn on the whole-house fan once the temperature drops enough, so I asked Siri, “What time will the temperature drop below 80 degrees?” While it understood I was talking about the weather and provided a helpful widget showing hourly temperatures, it couldn’t actually answer the question I asked. Instead, it gave me an answer to a different question, one that I didn’t ask.

Apple has around three months before Siri AI becomes available to hundreds of millions of users. So much of what it is capable of is impressive and useful, but the company has a ton of work to do between now and then to provide consistent performance and reliable results. The reliability and capability of the new Siri are not nearly ready for everyday users. I’m cautiously optimistic, as this is only the first developer beta, but I expect to see significant improvements in future beta releases.