A Call for Better Type Hints in AI Safety Tooling

[ARTICLE · art-16994] src=lesswrong.com ↗ pub=2026-05-28T23:04Z topic=ai-safety verified=true sentiment=· neutral

Researchers and developers in AI safety are calling for improved type hinting in Python-based AI safety tooling, citing evidence that static typing reduces bugs and improves code maintainability. A 2024 study by Blinn et al. found that AI agents using static type checkers achieve better results, with types helping to "tame hallucinations" in LLM-generated code. The push for better type hints comes as AI safety libraries like Inspect demonstrate effective implementation, while others such as TransformerLens and HuggingFace datasets suffer from underspecified type annotations that create footguns for beginners.

Published May 28, 2026 · 4 min read

Good type hints lead to code that is more maintainable, easier to understand, and with fewer bugs. If you'd like a quick, general intro into why, see this article, but suffice it to say that types give us a way to automatically check assumptions and invariants [1]. There are ways to go further (see "Formal Methods", including the

What's that?

Ah, I see...

I come from a background where I used TypeScript frequently. TypeScript has, in my opinion, the best type system of any mainstream programming language by far. Python's type system isn't as good, but it isn't horrible either. We have the tools to do better than this! And to be clear, some AI Safety libraries do this well! Inspect is a great example. More should follow their lead.

Most common objections to static typing are well addressed in the article I referenced earlier, but there are a couple objections specific to AI Safety:

The idea here is that since AI can understand much larger sections of the codebase, we no longer need to work out the shape of our data ourselves when there are no types to tell us. We can just have the AI do it for us! But there is some evidence pointing in the opposite direction. A 2024 paper by Blinn et al. argues that "AIs need IDEs too", and that AI agents using static type checkers get better results. Types can "tame hallucinations" and provide the hill-climbing feedback that LLMs need to be successful at coding. Some have found that type hints lead to easier code reviews and more maintainable AI-generated code.

The pushback here is in two parts:

For number 1, what if you're just hacking something together that isn't going to make it into the final published repo? Won't types just slow you down then? In that case, yes, you may decide that full, well-specified types aren't worth it. But if you're planning to reuse any of the code, really at all, you'll probably end up being faster in the long run if you add good types.

For number 2, published research codebases shouldn't be thought of as one-off. Wolter and Veeramacheneni argue that good software engineering practices would benefit the ML research community through easier reproduction, and I would add, extension. Good types make it much easier for researchers who come after you (or even yourself, a few months later, or your coding agent) to understand what is going on in the codebase and reuse what you've done. Otherwise, we risk wasting a lot of valuable researcher time! The ultimate example of this is packages that are explicitly designed to be reused. If nothing else, these packages should be well typed!

While HuggingFace's libraries aren't AI Safety specific per se, they are very commonly used in AI Safety research, and I've found them to have particularly bad type hints. For example, HuggingFace's Dataset class isn't generic. It doesn't tell us anything about the shape of the data in the class! Some parts of the Dataset interface are difficult to type correctly with Python's type system (such as indexing on a column name), but others are relatively straightforward (such as indexing on a row, iterating the dataset, or using .map, mentioned earlier). I was frustrated enough by this that I've created a small package that wraps some common functions and methods from datasets, making them generic over a row TypedDict: https://github.com/Plyb/typed-datasets. It provides an escape hatch of going back to plain hf datasets for cases that aren't easily handled, but this is much better than nothing!
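To make "generic over a row TypedDict" concrete, here is a minimal sketch of the idea. It is not the API of typed-datasets or of HuggingFace datasets: the names TypedRows, SentimentRow, LengthRow, and add_length are hypothetical, and the wrapper holds a plain list of rows rather than a real Dataset so that the example stays self-contained. The point is only that row indexing, iteration, and map can all carry the row type through to the type checker.

```python
from typing import Callable, Generic, Iterator, TypedDict, TypeVar

RowT = TypeVar("RowT")
OutT = TypeVar("OutT")


class SentimentRow(TypedDict):
    # Hypothetical row schema, for illustration only.
    text: str
    label: int


class LengthRow(TypedDict):
    # Hypothetical schema of the rows produced by add_length below.
    text: str
    n_chars: int


class TypedRows(Generic[RowT]):
    """Hypothetical wrapper that remembers the row type of the data it holds."""

    def __init__(self, rows: list[RowT]) -> None:
        self._rows = rows

    def __len__(self) -> int:
        return len(self._rows)

    def __getitem__(self, index: int) -> RowT:
        # Indexing on a row returns a value of the declared row type.
        return self._rows[index]

    def __iter__(self) -> Iterator[RowT]:
        # Iterating yields rows of the declared row type.
        return iter(self._rows)

    def map(self, fn: Callable[[RowT], OutT]) -> "TypedRows[OutT]":
        # The result is generic over whatever row type fn returns.
        return TypedRows([fn(row) for row in self._rows])

    def unwrap(self) -> list[RowT]:
        # Escape hatch: hand back the underlying container.
        return self._rows


def add_length(row: SentimentRow) -> LengthRow:
    return {"text": row["text"], "n_chars": len(row["text"])}


rows: TypedRows[SentimentRow] = TypedRows(
    [{"text": "good", "label": 1}, {"text": "awful", "label": 0}]
)
lengths = rows.map(add_length)  # a checker can infer TypedRows[LengthRow]
```

With this shape, a type checker should be able to flag a misspelled key such as lengths[0]["n_char"] before the code ever runs, which is exactly the kind of footgun an untyped container lets through.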

Implementing good type hints for your code will speed up AI Safety research and make it more trustworthy. We are doing ourselves a huge disservice when we leave this powerful tool on the table. What can you do? At the bare minimum, annotate function parameters with basic types. Is this a dict or a tuple? Going further would be to specify the contents of compound types (see TypedDict) or to make your functions and classes generic. Finally, if you're planning on anyone else using your code in the future (including yourself!), include types that are as specific as you can get them (such as using @overload keyed on Literal flag parameters).
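As a rough illustration of the last two suggestions, the sketch below combines a TypedDict for the input with @overload keyed on a Literal flag, so that the return type depends on the flag's value. The Completion shape, the score function, and its return_details parameter are all hypothetical, made up for this example, and the "score" itself is a meaningless word-count ratio.

```python
from typing import Literal, TypedDict, Union, overload


class Completion(TypedDict):
    # Hypothetical record shape, for illustration only.
    prompt: str
    response: str


@overload
def score(item: Completion, return_details: Literal[False] = False) -> float: ...
@overload
def score(
    item: Completion, return_details: Literal[True]
) -> tuple[float, dict[str, int]]: ...


def score(
    item: Completion, return_details: bool = False
) -> Union[float, tuple[float, dict[str, int]]]:
    # Toy metric: ratio of response words to prompt words.
    n_prompt = len(item["prompt"].split())
    n_response = len(item["response"].split())
    ratio = n_response / max(n_prompt, 1)
    if return_details:
        return ratio, {"prompt_words": n_prompt, "response_words": n_response}
    return ratio


item: Completion = {"prompt": "say hi", "response": "hi there friend"}
plain = score(item)                    # a checker sees float
detailed, counts = score(item, True)   # a checker sees tuple[float, dict[str, int]]
```

Without the overloads, every caller would get the Union back and have to narrow it by hand; with them, the flag at the call site is enough for the checker to pick the right return type.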

Read original on lesswrong.com → www.lesswrong.com/posts/XPguz4hfjgovk4JHX/a-call…