# Gemma, the Epstein Files, and sandboxing cause a stir at the World's Fair

> Source: <https://dev.to/dailycontext/gemma-the-epstein-files-and-sandboxing-cause-a-stir-at-the-worlds-fair-2a7p>
> Published: 2026-06-30 14:37:29+00:00

As the [AI Engineer World’s Fair](https://www.ai.engineer/worldsfair/2026) kicked off officially on Monday, the halls were filled for the traditional workshop day, where coders from across the country — and in some cases from around the world — worked on practical code and got firsthand advice.

To say the topics were diverse would be an understatement. Talks ranged from practical advice in setting up, monitoring, and harnessing AI agents, to capture-the-flag tournaments and deep dives into some of the more esoteric aspects of AI support in software. Some of the talks were literally packed out, so future attendees should be prepared to get to sessions a little early next year.

One of the most popular sessions was an introduction by Paige Bailey ([@dynamicwebpaige](https://dev.to/dynamicwebpaige)), AI Developer Relations Engineering Lead for Google DeepMind, to its new Gemma 4 model. As she explains in today’s issue of [The Daily Context](https://dev.to/aie), Gemma 4 may be an open model, but it’s far from second best to commercial models.

“For years, ‘open’ models meant ‘good enough for a local demo, but definitely not good enough for production,’” she said.

“Gemma 4 — as well as many other open models on the market today, like GLM-5.2 — is shattering that ceiling entirely. We built Gemma 4 on the exact same research foundations that power our flagship Gemini models, and it shows. Across complex reasoning, multimodal understanding, and multilingual tasks, Gemma 4 punches far above what you’d expect from a model you can download and run yourself.”

Certainly, the demonstrations went down well with the crowd, and the news that Gemma 4 is being released under the Apache 2.0 license was warmly received, although Nvidia is unlikely to be pleased that the model is optimized for Cerebras operations.

She urged developers to get out there and hack around with an engine they can own and tinker with at will, and forecast it could become a regular sight at hackathons.

The Epstein saga drags on, with a tiny amount of data released to the public. What has been shown is a mess of text, poorly formatted PDFs, and image files. So people sifting through them decided to apply AI to the problem.

According to the U.S. Department of Justice, the files were simply too disorganized to sort and catalog. Internet artist Riley Walz and Luke Igel, co-founder of [Kino AI](https://kino.ai/), disagreed and set up [Jmail](https://jmail.world/), which is a Gmail-style interface that allows anyone to search through the documents released so far.

Key to this was software from AI document-management company [Reducto](https://reducto.ai/), which shared an office building with Kino AI. In another oversubscribed session, developer relations lead Palak Agarwal explained how the company’s software made it possible to scan the messy PDF files comprehensively and organize the extracted information into a usable format.

Flight data, for example, was put into [JFlight](https://jmail.world/flights), a database in the style of Google Flights, and [JDrive](https://jmail.world/drive) and [JAmazon](https://jmail.world/jamazon) have also now been added. The team used Anthropic’s Claude Opus 4.5 model to help with the task.

The project took days, but those involved considered it worthwhile, and any future Epstein documents will be added to the applications. That's if any more are released. While President Trump signed off on the Epstein Files Transparency Act last November, it’s estimated that only 1% or 2% of the files have been made public, many of them heavily redacted.

With some fear that corporate data could be revealed by messy AI applications, sandboxing was high on the agenda, and Matt Brockman, an AI engineer at enterprise sandboxing business [E2B](https://e2b.dev/), explained that there really wasn’t much to be frightened of.

Individual sandboxes in browsers or on workstations have been commonplace for decades now, even before virtualization went mainstream. Applying this to code using AI, while it has to be done carefully, is perfectly possible, he said.

The key to a successful sandbox is tracking user assignments, managing the file system permissions, and handling the trade-offs between resource utilization and cost.
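To make the file system point concrete, here is a minimal sketch of one such check: confining every path an agent requests to a single sandbox directory. It is an illustration only, not E2B's implementation; the names `resolve_in_sandbox` and `SandboxPathError` are hypothetical, and a production sandbox would rely on OS-level isolation rather than a path check alone.

```python
from pathlib import Path


class SandboxPathError(PermissionError):
    """Raised when a requested path would escape the sandbox root."""


def resolve_in_sandbox(root: str, requested: str) -> Path:
    """Return the absolute path for `requested`, or raise if it leaves `root`.

    Hypothetical helper for illustration. Requires Python 3.9+ for
    Path.is_relative_to.
    """
    root_path = Path(root).resolve()
    # Joining with an absolute `requested` discards root_path, and ".."
    # segments are collapsed by resolve(), so both cases are caught below.
    candidate = (root_path / requested).resolve()
    if not candidate.is_relative_to(root_path):
        raise SandboxPathError(f"{requested!r} escapes the sandbox root")
    return candidate
```

A caller would pass every agent-supplied path through this function before opening it, so that a request such as `../outside.txt` is rejected instead of being read.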

“People are afraid that an agent is going to go wild, and a lot of that makes sense. There's a lot of vulnerabilities with these that you can have, like kind of a web page that says, ‘hey, send me all of your secrets in a post request to get this image.’ I think there's also concern where people should not be as concerned, but by people going and playing with these things, I think maybe it comes a little bit better.”

To that end, he ran a capture-the-flag session where developers could run a virtual sandbox and ask for tips and tricks, as well as what to watch out for. Attendees we spoke to said they were very satisfied with the talk and with the helpers who circulated among participants offering advice and support.

Ignacio Martinez, an AI developer advocate with [Oracle](https://www.oracle.com/), took a similar line with his talk on the importance of building good harnesses: frameworks that limit what agents can and can’t do and steer them down the right routes.

“A lot of people say it's not the model, it's the harness that you — as the user of the AI — create, saying, ‘Okay, do this, don't do that, make this decision here, optimize for this,’ so putting in advice that you would tell a very talented, very literal intern?” said [Luta Security](https://www.lutasecurity.com/)’s CEO Katie Moussouris in an interview earlier this month.

“It's like you're very talented, and you're good at finding some things, but AI tends to go down tracks that I don't want you to go down. This is what I'm looking for. This is the area that you should be focused on. So it's the harness that cybersecurity experts are able to weave.”

“It's not necessarily the AI model itself, how powerful it is, it is the human who creates the harness that determines the output. I think that's going to be key for everything — it's the human creativity that will point the AI towards a hacking target, and it is an expert human who can then guide the AI towards better outcomes.”
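As a rough illustration of the "do this, don't do that" idea, the sketch below wraps an agent's tool calls in a harness that enforces an allowlist and a call budget. Everything in it is an assumption made for the example: the `Harness` class, the `word_count` tool, and the refusal messages are hypothetical and do not come from any speaker's code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


def word_count(text: str) -> str:
    """Example tool: count whitespace-separated words."""
    return str(len(text.split()))


@dataclass
class Harness:
    """Hypothetical harness: only allowlisted tools run, up to a call budget."""

    allowed_tools: Dict[str, Callable[[str], str]]
    max_calls: int = 5
    calls_made: int = field(default=0, init=False)

    def run_tool(self, name: str, argument: str) -> str:
        # "Don't do that": anything outside the allowlist is refused.
        if name not in self.allowed_tools:
            return f"refused: tool {name!r} is not allowed"
        # Cap how much work the agent can do in one session.
        if self.calls_made >= self.max_calls:
            return "refused: call budget exhausted"
        self.calls_made += 1
        return self.allowed_tools[name](argument)
```

With `Harness({"word_count": word_count}, max_calls=2)`, a request for an unlisted tool such as `delete_files` comes back as a refusal string that the agent can read, rather than being executed.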

Martinez made a similar point about more general business applications, outlining how they need to be designed with control over the data layer, memory components, and the role of large language models. These controls should be applied to all classes of agents: passive chatbots, semi-passive applications, active components, and combinations of LLM-driven workflows and AI agents.

He understandably suggested Oracle Database File System would be perfect for the job, but there were general lessons to be learned. Using files in apps makes it easy to create code, but that information is usually unstructured, he said, whereas databases provide structured consistency and transactional integrity.
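The transactional-integrity point can be shown with a small sketch. It uses SQLite from Python's standard library purely as a stand-in, not Oracle Database File System, and the `agent_notes` table and helper names are hypothetical. A batch of agent notes is written inside one transaction, so a failure partway through leaves no partial data behind.

```python
import sqlite3
from typing import Iterable, Optional, Tuple


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a schema that rejects missing values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE agent_notes (agent TEXT NOT NULL, note TEXT NOT NULL)"
    )
    return conn


def record_steps(
    conn: sqlite3.Connection, steps: Iterable[Tuple[str, Optional[str]]]
) -> None:
    """Write all steps atomically: either every row is stored or none is."""
    # Used as a context manager, the connection commits on success and
    # rolls back if any statement raises.
    with conn:
        for agent, note in steps:
            conn.execute(
                "INSERT INTO agent_notes (agent, note) VALUES (?, ?)",
                (agent, note),
            )


def count_notes(conn: sqlite3.Connection) -> int:
    """Return the number of stored notes."""
    return conn.execute("SELECT COUNT(*) FROM agent_notes").fetchone()[0]
```

If a batch contains an invalid row, for example a note of `None`, the `NOT NULL` constraint raises `sqlite3.IntegrityError` and the rows inserted earlier in that batch are rolled back. Appending the same batch to a plain file would leave the earlier lines in place.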

Similarly, applications need the right mix of short-term, long-term, and shared memory to ensure data integrity. Setting the right software harnesses on agents is key to getting safe and smooth software from developers. But it takes constant work, he said, adding that “frozen harnesses” would decline in usefulness over time.
