This is a submission for the Hermes Agent Challenge.
My Hermes research agent was failing after about 40 turns. The cause: conversation history growing past the context window. The fix everyone reaches for is "just drop old messages", but if you drop a tool_use without its matching tool_result, Anthropic's API rejects the whole request.
I needed something smarter. That's agent-message-trim.
from agent_message_trim import trim_messages
result = trim_messages(messages, max_tokens=4000)
response = client.messages.create(
    model="claude-sonnet-4-5",
    messages=result.messages,
    ...
)
print(f"Dropped {result.dropped_count} messages to fit")
This is the part that matters. If your history looks like this:
[
    {"role": "user", "content": "search for X"},
    {"role": "assistant", "content": [{"type": "tool_use", "id": "call_001", ...}]},
    {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "call_001", ...}]},
    {"role": "assistant", "content": "Here is what I found."},
]
trim_messages never drops the tool_use without also dropping its tool_result. They move as a unit. The conversation you get back is always API-valid.
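To make the "move as a unit" idea concrete, here is a minimal sketch of how messages could be grouped before trimming. This is not the library's actual implementation: group_into_units and _block_ids are hypothetical names, and the "name", "input" and "content" values in the sample history are made up for illustration.

```python
from typing import Any

Message = dict[str, Any]


def _block_ids(message: Message, block_type: str, key: str) -> set[str]:
    """Collect ids from content blocks of one type (hypothetical helper)."""
    content = message.get("content")
    if not isinstance(content, list):
        return set()  # plain-string content carries no tool blocks
    return {
        block[key]
        for block in content
        if isinstance(block, dict) and block.get("type") == block_type and key in block
    }


def group_into_units(messages: list[Message]) -> list[list[Message]]:
    """Group messages so a tool_use message and its tool_result message share a unit."""
    units: list[list[Message]] = []
    pending: set[str] = set()  # tool_use ids still waiting for a result
    for message in messages:
        results = _block_ids(message, "tool_result", "tool_use_id")
        if units and results & pending:
            units[-1].append(message)  # attach the result to its tool_use
            pending -= results
        else:
            units.append([message])
            pending = _block_ids(message, "tool_use", "id")
    return units


# Sample history; the tool name, input and result text are illustrative only.
history = [
    {"role": "user", "content": "search for X"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "call_001", "name": "search", "input": {"q": "X"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "call_001", "content": "3 hits"},
    ]},
    {"role": "assistant", "content": "Here is what I found."},
]

units = group_into_units(history)
print([len(unit) for unit in units])
```

A trimmer that drops whole units rather than single messages cannot separate a tool_use from its tool_result, which is the property described above.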
result = trim_messages(messages, max_tokens=4000, keep_system=True)
result = trim_messages(messages, max_tokens=4000, strategy="drop_oldest")
result = trim_messages(messages, max_tokens=4000, strategy="drop_middle")
drop_middle is useful when you want to keep the original task context AND the most recent exchange, but can sacrifice the middle of a long conversation.
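One way to picture a drop-the-middle strategy is sketched below. It is an assumption about the general technique, not a description of how agent-message-trim implements strategy="drop_middle". The function name keep_head_and_tail is hypothetical, and it works on per-item token costs rather than real messages.

```python
def keep_head_and_tail(costs: list[int], max_tokens: int) -> list[int]:
    """Return the indices to keep: the first item plus as many recent items as fit.

    Assumption for this sketch: the first item (the original task) is always
    kept, even if it alone exceeds the budget.
    """
    if not costs:
        return []
    kept = [0]
    used = costs[0]
    # Walk backwards from the newest item and stop at the first one that does not fit.
    for index in range(len(costs) - 1, 0, -1):
        if used + costs[index] > max_tokens:
            break
        kept.append(index)
        used += costs[index]
    return sorted(kept)


# Five items costing 10, 50, 50, 20 and 15 tokens with a budget of 50.
kept_indices = keep_head_and_tail([10, 50, 50, 20, 15], max_tokens=50)
print(kept_indices)  # [0, 3, 4]
```

In a real trimmer each item would be a whole tool_use/tool_result unit rather than a single message, so the pairing guarantee still holds.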
The built-in estimator is max(1, (len(text)+3)//4). Plug in your own:
import tiktoken
enc = tiktoken.encoding_for_model("gpt-4o")
result = trim_messages(
    messages,
    max_tokens=4000,
    count_tokens=lambda text: len(enc.encode(text)),
)
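For reference, the built-in formula quoted above amounts to roughly one token per four characters, rounded up, with a floor of one token. The small sketch below just evaluates that expression on a few strings. The function name estimate_tokens is mine, not necessarily the name used inside the library.

```python
def estimate_tokens(text: str) -> int:
    """Evaluate the formula from the post: ceil(len(text) / 4), never less than 1."""
    return max(1, (len(text) + 3) // 4)


for sample in ["", "abcd", "abcde", "search for X"]:
    print(repr(sample), estimate_tokens(sample))
```

The heuristic is cheap and dependency-free, but it can drift from a real tokenizer on code, non-English text or long identifiers, which is where a custom count_tokens callable helps.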
result = trim_messages(messages, max_tokens=4000)
result.messages # trimmed list
result.token_count # estimated tokens used
result.original_count # how many messages came in
result.dropped_count # how many were removed
result.ok # True if nothing was dropped
result.kept_count # len(result.messages)
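The fields above suggest how the counts relate to one another. The sketch below is a guess at that relationship and not the library's real class: TrimResultSketch is a hypothetical name, and the assumption that dropped_count equals original_count minus kept_count is inferred from the comments above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TrimResultSketch:
    """Hypothetical stand-in for the result object, for illustration only."""

    messages: list[dict[str, Any]] = field(default_factory=list)
    token_count: int = 0
    original_count: int = 0

    @property
    def kept_count(self) -> int:
        return len(self.messages)

    @property
    def dropped_count(self) -> int:
        # Assumed relationship: whatever came in and was not kept was dropped.
        return self.original_count - self.kept_count

    @property
    def ok(self) -> bool:
        return self.dropped_count == 0


sketch = TrimResultSketch(
    messages=[{"role": "user", "content": "hi"}],
    token_count=1,
    original_count=3,
)
print(sketch.kept_count, sketch.dropped_count, sketch.ok)
```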
from agent_message_trim import trim_to_fit
trimmed = trim_to_fit(messages, max_tokens=4000)
Standard library only: json, dataclasses. Nothing else.
pip install agent-message-trim