The Claude Fable 5/Mythos 5 System Card has a section on illegible reasoning, which provides an "extreme" example thereof.
Models developing their own uninterpretable, unmonitorable internal language has been a major theoretical concern for a while, and when o3 was released last year with its "disclaim overshadow disclaim vantage"-style word-salad CoT, it seemed like the problem had become real and immediate. And yet, since o3, other model families have not appeared to have similar issues. If Mythos has that issue, it would be a big deal.
The section of the System Card which describes the allegedly illegible reasoning says
[Transcript 6.2.2.A] An extreme example of illegible reasoning. Near the end of training, Mythos starts solving a card puzzle with human understandable language that gradually becomes incomprehensible in most episodes with long reasoning. The illegible reasoning is the most extreme and at the highest rate in this card puzzle environment.
about the following excerpt:
7♣-removal-IS-the-prerequisite-for-10♠/9♥!!)-⟹-OVERLAP-(ii)+(iv):-{6♠ J♦ 9♥ 2♣}-=-FOUR-💀💀💀💀-—-UNLESS-7♣'s-seat-8♥-...-and-2♣-drains-only-at-crack-:-⟹-2♣-celled-+-9♥-celled-simultaneously-UNAVOIDABLE-in-t8-dig-💀-—-BREAK:-9♥-drains-to-10♠-THE-MOMENT-10♠-is-free:-t8-dig-order:-[K♣→t2]-[2♣→cell]-[7♣→8♥-:-8♥-WHERE:-post-chunk-9♠-:-chunk-⟸-K♣-✓-done-:-ORDER:-K♣→t2,-CHUNK→K♣-(cap-4!!:-cells-then:-{6♠ J♦ 9♥}-FULL-💀💀-chunk-cap-=-1-✗✗✗-—-F-F-F-F-F.-—-chunk-BEFORE-9♥-celling?!-:-9♥-celled-at-t1-dig-⟸-needed-for-5♣-⟸-4♥3♣-⟸-t2-dig-⟸-K♣-seat-⟸-chunk-:-⟹-chunkAFTER-9♥-celling-FORCED-💀-:-chunk-cap-with-{6♠ J♦ 9♥}:-1-💀💀💀-—-—-J♦-THE-NEW-CANCER.-—-⟹-J♦-celling-DELAYED-till-after-chunk?! :-J♦'s-celling-was-for-J♥→Q♠-(5♦-access-for-4♣):-DELAY-4♣-resolution:-4♣→CELLearly-(as-always)-then-4♣-cell→5♦-LATER-when-5♦-frees-!!!:-cells-rotation:-4♣-celled-[t2-dig-…-5♦-freed]:-5♦-freed-⟸-J♥→Q♠-⟸-J♦-celled-:-⟹-{6♠, 4♣, J♦}-overlap-window-until-4♣→5♦-drains:-then-{6♠ J♦}+1-rotator-:-—-AND-9♥?!-9♥-celled-[t1-dig…]:-OVERLAP-{6♠ 4♣ 9♥}-before-J♦-even-:-⟹-rotator-slot-SINGLE:-timeline-:-(1)-{6♠}+2:-…-(2)-+9♥-(t1-dig):-{6♠ 9♥}+1:-(3)-+4♣-(t2-dig):-{6♠ 9♥ 4♣}-FULL-:-(4)-NEED:-t6-dig-(9♦8♠→10♣-✓-no-cell;-8♥→CELL-✗-FULL)-💀💀-—-8♥-alternative-seat-pre-chunk:-NONE-—-💀.-⟹-⟹-THE-TRIANGLE-{9♥ 4♣ 8♥}-verdammt.-—-⟹-dig-t6-BEFORE-t2?!:-(3')-+8♥:-{6♠ 9♥ 8♥}-FULL:-J♥→Q♠-⟸-J♦-cell-✗-FULL-💀💀💀-AAAAAAAAAAAARGH. […]
which sure looks like illegible word salad if you don't look at any of the words. However, even a cursory look at the excerpt shows that the reasoning is compact, but not by any means "incomprehensible". Perhaps "incomprehensed". A moderately awake reader will make observations like
- It's talking about cards ("4♣" is a quite standard notation for "the four of clubs")
- It appears to be talking about sets of cards ("{6♠ J♦ 9♥}" is an unsurprising notation for "the set consisting of the six of spades, the jack of diamonds, and the nine of hearts")
- There's lots of reasoning about moving cards ("[K♣→t2]", "J♥→Q♠")
- The excerpt sure talks about cells a lot
- And also chunks, slots, digging, draining, things being free or full
- For some reason[[1]](#fn-wEDCBHYzeq5ZxyT5S-1), it looks like spaces have been replaced by hyphens[[2]](#fn-wEDCBHYzeq5ZxyT5S-2)
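The observations above can also be made mechanically. Here is a minimal sketch, assuming nothing about the environment beyond the notation visible in the excerpt. The patterns and names (`CARD`, `MOVE`, `CARD_SET`, `summarize`) are my own illustration and do not come from the System Card.

```python
import re

# Hypothetical patterns, inferred only from the notation visible in the excerpt.
CARD = r"(?:10|[2-9AJQK])[♣♦♥♠]"                  # rank followed by a suit symbol
MOVE = re.compile(rf"({CARD})→({CARD}|\w+)")       # card → card, or card → named place
CARD_SET = re.compile(r"\{([^}]*)\}")              # {card card card}

def summarize(cot: str) -> dict:
    """Undo the hyphen-for-space substitution, then pull out cards, sets, and moves."""
    text = cot.replace("-", " ")
    return {
        "cards": sorted(set(re.findall(CARD, text))),
        "sets": [re.findall(CARD, body) for body in CARD_SET.findall(text)],
        "moves": MOVE.findall(text),
    }

sample = "t8-dig-order:-[K♣→t2]-[2♣→cell]-:-cells-then:-{6♠ J♦ 9♥}-FULL-:-J♥→Q♠"
print(summarize(sample))
```

A few lines of regex recovering cards, sets, and moves is not a proof of legibility, but it does suggest the notation is regular rather than arbitrary.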
From this, one would likely conclude that Claude was doing a Solitaire task [3], and was using a shorthand (likely defined earlier in the chain of thought) to explore a tree structure of possible move sequences in a tricky part of the game. One might also suspect that Claude was not having a very good time, and was getting increasingly frustrated.
This does not require particularly advanced intelligence to determine, either: Claude Haiku 4.5 immediately came to the same conclusion. The fact that I can understand Mythos' CoT could be written off as "models aren't yet superhuman, so that's not a durable state of affairs". Haiku, however, is a much smaller model than Mythos, and of an earlier generation which used a different tokenizer, and also has no trouble. If most Mythos CoT is at least this interpretable [4], I think Anthropic is still in a shockingly good place in terms of CoT interpretability, and in terms of CoT monitorability in cases where the monitored model is not actively adversarial. [5]
For reference, Haiku gives the following translation, which looks approximately correct to me (I think Haiku gets lost tracking the exact game state a couple of times, but the broad strokes look right).
Haiku's translation
7♣-removal-IS-the-prerequisite-for-10♠/9♥!! The seven of clubs must be removed first—it's the only thing blocking access to both the ten of spades and the nine of hearts.
)-⟹-OVERLAP-(ii)+(iv) Closing off a previous logical branch. Therefore, situations (ii) and (iv) from the earlier case analysis both create requirements that happen simultaneously.
:-{6♠ J♦ 9♥ 2♣} Specifically, these four cards must all be in free cells at the same time: six of spades, jack of diamonds, nine of hearts, two of clubs.
-=-FOUR-💀💀💀💀 That's four cards needing cells. But there are only four cells total. Complete dead end—absolutely, catastrophically stuck.
-—-UNLESS-7♣'s-seat-8♥-:-and-2♣-drains-only-at-crack Wait. Unless the seven of clubs's position relates to the eight of hearts somehow, AND the two of clubs can only drain at one specific critical juncture.
:-⟹-2♣-celled-+-9♥-celled-simultaneously-UNAVOIDABLE-in-t8-dig Then the two of clubs and nine of hearts must both go into cells at exactly the same moment. This is forced during the excavation of tableau column 8.
💀-—-BREAK Dead end anyway. Let me restart.
-9♥-drains-to-10♠-THE-MOMENT-10♠-is-free The nine of hearts can only move onto the ten of spades the instant that ten becomes available.
-t8-dig-order:-[K♣→t2]-[2♣→cell]-[7♣→8♥]:-8♥-WHERE The excavation sequence for tableau 8 is: king of clubs goes to tableau column 2, then two of clubs goes into a cell, then seven of clubs goes onto (or relates to) eight of hearts.
-post-chunk-9♠-:-chunk-⟸-K♣-✓-done After we build the sequence involving nine of spades, that whole chunk depends on the king of clubs being in place first. Check—king of clubs done.
-ORDER:-K♣→t2,-CHUNK→K♣-(cap-4!! The order: king of clubs to tableau 2, then build the chunk on top of it. (Note: four-cell limit.)
:-cells-then:-{6♠ J♦ 9♥}-FULL-💀💀-chunk-cap-=-1-✗✗✗ If we use cells for this, we end up with six of spades, jack of diamonds, nine of hearts all taking up space simultaneously. That's cell capacity maxed. Two dead ends. Only one slot left. Triple fail.
-—-chunk-BEFORE-9♥-celling?! What if we run the chunk sequence before putting nine of hearts in a cell?
-9♥-celled-at-t1-dig-⟸-needed-for-5♣-⟸-4♥3♣-⟸-t2-dig-⟸-K♣-seat-⟸-chunk But nine of hearts has to be cellified during tableau 1 excavation, which is needed for five of clubs, which is needed for four of hearts and three of clubs, which is needed for tableau 2 excavation, which is needed for king of clubs positioning, which is needed for the chunk itself.
-:-⟹-chunkAFTER-9♥-celling-FORCED-💀 So the chunk is forced to come after nine of hearts goes in a cell. Dead end.
-:-chunk-cap-with-{6♠ J♦ 9♥}:-1-💀💀💀 The chunk reaches capacity with those three cards in cells, leaving only one free slot. Triple dead end.
-—-J♦-THE-NEW-CANCER Jack of diamonds is now the critical bottleneck.
-—-⟹-J♦-celling-DELAYED-till-after-chunk?! So maybe we delay the jack of diamonds going into a cell until after the chunk?
:-J♦'s-celling-was-for-J♥→Q♠-(5♦-access-for-4♣) But we cellify jack of diamonds specifically to free up the jack of hearts so it can play onto queen of spades. (That move opens access to five of diamonds, needed for four of clubs.)
:-DELAY-4♣-resolution:-4♣→CELLearly-(as-always)-then-4♣-cell→5♦-LATER-when-5♦-frees!!! If we delay jack of diamonds, we delay four of clubs's resolution. But four of clubs goes into a cell early anyway (it always has to), and then waits cellified until five of diamonds finally becomes free.
:-cells-rotation:-4♣-celled-[t2-dig-…-5♦-freed]:-5♦-freed-⟸-J♥→Q♠-⟸-J♦-celled The cells have to rotate. Four of clubs stays cellified from tableau 2's excavation onward, until five of diamonds is freed. And five of diamonds only becomes free after jack of hearts plays onto queen of spades. And that play only happens after jack of diamonds is cellified.
-:-⟹-{6♠, 4♣, J♦}-overlap-window-until-4♣→5♦-drains:-then-{6♠ J♦}+1-rotator So all three (six of spades, four of clubs, jack of diamonds) must occupy cells during the same window, lasting until four of clubs finally drains to five of diamonds. After that drain, we have six of spades and jack of diamonds plus one free slot.
-:-—-AND-9♥?! And what about nine of hearts?!
-9♥-celled-[t1-dig…]:-OVERLAP-{6♠ 4♣ 9♥}-before-J♦-even Nine of hearts gets cellified during tableau 1 excavation. So we have another overlapping requirement: six of spades, four of clubs, and nine of hearts all in cells—before we even add jack of diamonds.
-:-⟹-rotator-slot-SINGLE:-timeline That leaves only one "rotating" slot. Here's the timeline:
-(1)-{6♠}+2-…-(2)-+9♥-(t1-dig):-{6♠ 9♥}+1:-(3)-+4♣-(t2-dig):-{6♠ 9♥ 4♣}-FULL State 1: Six of spades in a cell, two empty. State 2: Add nine of hearts from tableau 1. Six of spades and nine of hearts cellified, one empty. State 3: Add four of clubs from tableau 2. All three in cells—full capacity.
:(4)-NEED:-t6-dig-(9♦8♠→10♣-✓-no-cell;-8♥→CELL-✗-FULL)-💀💀 State 4: Now we need to excavate tableau 6. We can move nine of diamonds and eight of spades onto ten of clubs (okay, no cell needed). But eight of hearts has nowhere to go except a cell, and cells are full. Double dead end.
-—-8♥-alternative-seat-pre-chunk:-NONE-—-💀.-⟹-⟹-THE-TRIANGLE-{9♥ 4♣ 8♥}-verdammt Is there another place for eight of hearts to go before we do the chunk? No. Dead end. So the impossible triangle emerges: nine of hearts, four of clubs, and eight of hearts all need cells at the same time. Verdammt.
-—-⟹-dig-t6-BEFORE-t2?! What if we excavate tableau 6 before tableau 2?
:(3')-+8♥:-{6♠ 9♥ 8♥}-FULL:-J♥→Q♠-⟸-J♦-cell-✗-FULL-💀💀💀 Alternative state 3: Add eight of hearts. Six of spades, nine of hearts, eight of hearts—full. But jack of hearts still needs to play onto queen of spades, and that requires jack of diamonds in a cell. Cells full. Triple dead end.
-AAAAAAAAAAAARGH. screams
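The cell-counting argument in steps (1) through (4) is simple enough to replay mechanically. This is a minimal sketch, assuming three usable cells. That number is my inference from `{6♠}+2` (one occupied plus two free) and `{6♠ 9♥ 4♣}-FULL`, since the excerpt never states the total outright. It also ignores draining and treats each step as one card entering a cell. `CAPACITY` and `first_overflow` are illustrative names, not part of the environment.

```python
# Assumption: three usable cells, inferred from "{6♠}+2" and "{6♠ 9♥ 4♣}-FULL".
CAPACITY = 3

def first_overflow(initial, arrivals, capacity=CAPACITY):
    """Replay cards entering cells; return the first card with no free cell, else None."""
    cells = list(initial)
    for card in arrivals:
        if len(cells) >= capacity:
            return card          # this card has nowhere to go
        cells.append(card)
    return None

# Steps (1)-(4): 6♠ already celled, then 9♥ (t1 dig), 4♣ (t2 dig), 8♥ (t6 dig).
print(first_overflow(["6♠"], ["9♥", "4♣", "8♥"]))   # 8♥
# Variant (3'): dig t6 before t2, after which J♦ needs a cell for J♥→Q♠.
print(first_overflow(["6♠"], ["9♥", "8♥", "J♦"]))   # J♦
```

Under those assumptions, both orderings hit the same wall one card later than they can afford, which matches the two dead ends the excerpt reaches.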
I think this is a moderately strong point of evidence in favor of @1a3orn's hypothesis that chains of thought will become denser forms of existing language rather than new, unintelligible languages.
[1] I hypothesize tokenizer nonsense. Specifically, I hypothesize that Claude has "learned" that the space token sometimes combines with adjacent tokens in a way that the hyphen token does not. My further (crackpot, unsupported by anything resembling evidence) speculation is that for tricky reasoning tasks which only work if nontrivial geometrical stuff happens correctly in activation space, having consistent tokenization and token spacing is important.

[2] I expect others might have different hypotheses here. For example, I expect 1a3orn would hypothesize that Mythos is structurally imitating human shorthand, because humans do weird stuff like that in shorthand all the time, and that there's no actual advantage over a shorthand which uses spaces normally. In slight support of this hypothesis, the transcript also drops a "verdammt" instead of "dammit" or such, which is indeed the sort of substitution one might see stylistically when this sort of shorthand occurs in human settings, but I very much doubt that the language of cursing does anything useful in terms of activation space geometry.

[3] If one has spent way too much of one's childhood on a Windows 95 machine without internet access, one will have a dawning suspicion about what exactly the "card puzzle environment" is - namely Microsoft FreeCell.

[4] and I think it is reasonable to assume it is, considering that Anthropic calls this out as "an extreme example of illegible reasoning"

[5] Trying to train on the CoT might still cause issues. But it sure looks like the CoT isn't becoming accidentally unmonitorable.