{"slug": "a-drop-in-replacement-chat-template-for-google-gemma-4-31b-it-tuned-for-open", "title": "A drop-in replacement chat template for google/gemma-4-31B-it tuned for open-source agentic coding harnesses.", "summary": "A developer has released a public fork of Google's Gemma 4 chat template that fixes four critical bugs affecting open-source agentic coding harnesses. The fork addresses issues including corrupted tool call arguments from nested JSON braces, dropped prior-turn reasoning, disabled thinking mode, and Python \"None\" strings appearing in place of JSON null values. The patched template ships with a conformance test suite and is designed for use with harnesses like OpenCode and Pi.", "body_md": "| {#--------------------------------------------------------------------- | |\n| custom_pub_chat_template_gemma4.jinja | |\n| ===================================== | |\n| A public, harness-friendly fork of Google's Gemma 4 chat template, | |\n| tuned for open-source agentic coding harnesses like: | |\n| - anomalyco/opencode (https://github.com/anomalyco/opencode) | |\n| - earendil-works/pi (https://github.com/earendil-works/pi) | |\n| - openclaw, OpenHarness, similar Claude-Code-style harnesses | |\n| WHY THIS FORK EXISTS | |\n| -------------------- | |\n| The upstream chat template at google/gemma-4-31B-it is correct for | |\n| chat use, but four real edge cases bite agentic coding harnesses: | |\n| 1. tool_call.arguments arriving as a JSON string (Vercel AI SDK and | |\n| several OpenAI-compatible adapters serialize this way) is silently | |\n| wrapped in extra braces by the upstream template, producing invalid | |\n| Gemma 4 DSL like call:fn{{\"city\":\"Tokyo\"}} — nested braces, JSON | |\n| colons, and quoted keys, none of which the model was trained on. | |\n| Symptom: degraded tool-call accuracy, mysterious arguments collapse | |\n| to {} on repeated calls. | |\n| 2. Prior-turn reasoning is dropped from history. 
The model card says | |\n| \"historical model output should only include the final response,\" | |\n| but agentic harnesses doing multi-step tool calls benefit | |\n| materially from keeping the prior reasoning visible. The Qwen3.6 | |\n| analogue of this bug is documented at: | |\n| https://github.com/earendil-works/pi/issues/3325 | |\n| Symptom (on Qwen and Gemma alike): after 2-3 turns, every tool call | |\n| collapses to arguments: {} even though the model's prior reasoning | |\n| correctly identified the parameters it needed. | |\n| 3. enable_thinking defaults to FALSE in the upstream template, and | |\n| most OpenAI-compatible adapters drop unknown request fields: | |\n| https://github.com/anomalyco/opencode/issues/24264 | |\n| So the harness ends up with thinking permanently off, agentic | |\n| tool-call accuracy suffers, and there's no obvious failure signal. | |\n| 4. JSON null values inside tool_call.arguments render as the bare | |\n| string \"None\" (Python repr of None survives Jinja). Optional | |\n| fields are very common in coding tools (find_files, search: | |\n| pattern=..., language=null) and this corrupts the prompt silently. | |\n| This public fork derives from a private engineering fork used in | |\n| internal harnesses; the public copy reuses the same five patches but | |\n| adds expanded comments, removes references to private design docs, | |\n| and ships with a self-contained pytest conformance suite. 
| |\n| PATCH INVENTORY (full details next to each patch site below) | |\n| ------------------------------------------------------------ | |\n| P1 format_argument: emit JSON null instead of bare \"None\" | |\n| P2 enable_thinking defaults to TRUE | |\n| P3 tool_call.arguments as string: RAISE instead of silent corruption | |\n| P4 preserve_thinking kwarg (default TRUE) keeps prior <|channel> | |\n| P5 fix HF discussion #62 turn-tag close asymmetry | |\n| INVARIANT | |\n| --------- | |\n| With enable_thinking=False AND preserve_thinking=False passed | |\n| explicitly, this template renders byte-for-byte identical to the | |\n| upstream verbatim template on every input that doesn't hit P1, P3, | |\n| or P5's bug sites. The conformance suite at | |\n| tests/test_custom_pub_chat_template.py | |\n| locks this in across 21 representative cases. | |\n| USAGE | |\n| ----- | |\n| Server side (e.g. vLLM or SGLang): | |\n| --chat-template /path/to/custom_pub_chat_template_gemma4.jinja | |\n| Harness side: no changes required for the common case. If you need | |\n| to force defaults off (e.g. to match upstream behaviour exactly): | |\n| { | |\n| \"extra_body\": { | |\n| \"chat_template_kwargs\": { | |\n| \"enable_thinking\": false, | |\n| \"preserve_thinking\": false | |\n| } | |\n| } | |\n| } | |\n| For opencode-style providers, this maps to the `chat_template_args` | |\n| field in models config; for pi, set thinkingFormat appropriately | |\n| in the provider's compat block and pi will inject these kwargs. 
| |\n| PINS | |\n| ---- | |\n| Forked from google/gemma-4-31B-it @ fcf2302760ae9c6e528a8dbba9dd636e56848237 | |\n| Fork date: 2026-05-22 | |\n| License: Apache 2.0 (same as upstream) | |\n| Maintainer: see repo README | |\n| ---------------------------------------------------------------------#} | |\n| {%- macro format_parameters(properties, required, filter_keys=false) -%} | |\n| {%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%} | |\n| {%- set ns = namespace(found_first=false) -%} | |\n| {%- for key, value in properties | dictsort -%} | |\n| {%- set add_comma = false -%} | |\n| {%- if not filter_keys or key not in standard_keys -%} | |\n| {%- if ns.found_first %},{% endif -%} | |\n| {%- set ns.found_first = true -%} | |\n| {{ key }}:{ | |\n| {%- if value['description'] -%} | |\n| description:<|\"|>{{ value['description'] }}<|\"|> | |\n| {%- set add_comma = true -%} | |\n| {%- endif -%} | |\n| {%- if value['type'] | upper == 'STRING' -%} | |\n| {%- if value['enum'] -%} | |\n| {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} | |\n| enum:{{ format_argument(value['enum']) }} | |\n| {%- endif -%} | |\n| {%- elif value['type'] | upper == 'ARRAY' -%} | |\n| {%- if value['items'] is mapping and value['items'] -%} | |\n| {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} | |\n| items:{ | |\n| {%- set ns_items = namespace(found_first=false) -%} | |\n| {%- for item_key, item_value in value['items'] | dictsort -%} | |\n| {%- if item_value is not none -%} | |\n| {%- if ns_items.found_first %},{% endif -%} | |\n| {%- set ns_items.found_first = true -%} | |\n| {%- if item_key == 'properties' -%} | |\n| properties:{ | |\n| {%- if item_value is mapping -%} | |\n| {{- format_parameters(item_value, value['items']['required'] | default([])) -}} | |\n| {%- endif -%} | |\n| } | |\n| {%- elif item_key == 'required' -%} | |\n| required:[ | |\n| {%- for req_item in item_value -%} | |\n| <|\"|>{{- 
req_item -}}<|\"|> | |\n| {%- if not loop.last %},{% endif -%} | |\n| {%- endfor -%} | |\n| ] | |\n| {%- elif item_key == 'type' -%} | |\n| {%- if item_value is string -%} | |\n| type:{{ format_argument(item_value | upper) }} | |\n| {%- else -%} | |\n| type:{{ format_argument(item_value | map('upper') | list) }} | |\n| {%- endif -%} | |\n| {%- else -%} | |\n| {{ item_key }}:{{ format_argument(item_value) }} | |\n| {%- endif -%} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| } | |\n| {%- endif -%} | |\n| {%- endif -%} | |\n| {%- if value['nullable'] %} | |\n| {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} | |\n| nullable:true | |\n| {%- endif -%} | |\n| {%- if value['type'] | upper == 'OBJECT' -%} | |\n| {%- if value['properties'] is defined and value['properties'] is mapping -%} | |\n| {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} | |\n| properties:{ | |\n| {{- format_parameters(value['properties'], value['required'] | default([])) -}} | |\n| } | |\n| {%- elif value is mapping -%} | |\n| {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} | |\n| properties:{ | |\n| {{- format_parameters(value, value['required'] | default([]), filter_keys=true) -}} | |\n| } | |\n| {%- endif -%} | |\n| {%- if value['required'] -%} | |\n| {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} | |\n| required:[ | |\n| {%- for item in value['required'] | default([]) -%} | |\n| <|\"|>{{- item -}}<|\"|> | |\n| {%- if not loop.last %},{% endif -%} | |\n| {%- endfor -%} | |\n| ] | |\n| {%- endif -%} | |\n| {%- endif -%} | |\n| {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} | |\n| type:<|\"|>{{ value['type'] | upper }}<|\"|>} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| {%- endmacro -%} | |\n| {%- macro format_function_declaration(tool_data) -%} | |\n| declaration:{{- tool_data['function']['name'] -}}{description:<|\"|>{{- 
tool_data['function']['description'] -}}<|\"|> | |\n| {%- set params = tool_data['function']['parameters'] -%} | |\n| {%- if params -%} | |\n| ,parameters:{ | |\n| {%- if params['properties'] -%} | |\n| properties:{ {{- format_parameters(params['properties'], params['required']) -}} }, | |\n| {%- endif -%} | |\n| {%- if params['required'] -%} | |\n| required:[ | |\n| {%- for item in params['required'] -%} | |\n| <|\"|>{{- item -}}<|\"|> | |\n| {{- ',' if not loop.last -}} | |\n| {%- endfor -%} | |\n| ], | |\n| {%- endif -%} | |\n| {%- if params['type'] -%} | |\n| type:<|\"|>{{- params['type'] | upper -}}<|\"|>} | |\n| {%- endif -%} | |\n| {%- endif -%} | |\n| {%- if 'response' in tool_data['function'] -%} | |\n| {%- set response_declaration = tool_data['function']['response'] -%} | |\n| ,response:{ | |\n| {%- if response_declaration['description'] -%} | |\n| description:<|\"|>{{- response_declaration['description'] -}}<|\"|>, | |\n| {%- endif -%} | |\n| {%- if response_declaration['type'] | upper == 'OBJECT' -%} | |\n| type:<|\"|>{{- response_declaration['type'] | upper -}}<|\"|>} | |\n| {%- endif -%} | |\n| {%- endif -%} | |\n| } | |\n| {%- endmacro -%} | |\n| {%- macro format_argument(argument, escape_keys=True) -%} | |\n| {#- P1 (public fork): emit JSON null for None values rather than the | |\n| bare string \"None\". Jinja's default coercion of Python's None | |\n| goes through str(None) -> \"None\", which then leaks into the | |\n| Gemma 4 DSL as a literal token the model has never been trained | |\n| on. Common bite path: a coding tool's optional argument | |\n| (language=null in a find-files call, after=null in a search, | |\n| etc.) → upstream emits after:None in the DSL → model | |\n| confusion. We emit after:null instead, matching the JSON wire | |\n| format the model has actually seen. 
| |\n| Branch ordering: `is none` must precede `is string`, `is | |\n| mapping`, `is sequence`, etc., because None matches NONE of | |\n| them in Jinja's type tests but the final else-branch | |\n| ({{ argument }}) would otherwise stringify it. -#} | |\n| {%- if argument is none -%} | |\n| {{- 'null' -}} | |\n| {%- elif argument is string -%} | |\n| {{- '<|\"|>' + argument + '<|\"|>' -}} | |\n| {%- elif argument is boolean -%} | |\n| {{- 'true' if argument else 'false' -}} | |\n| {%- elif argument is mapping -%} | |\n| {{- '{' -}} | |\n| {%- set ns = namespace(found_first=false) -%} | |\n| {%- for key, value in argument | dictsort -%} | |\n| {%- if ns.found_first %},{% endif -%} | |\n| {%- set ns.found_first = true -%} | |\n| {%- if escape_keys -%} | |\n| {{- '<|\"|>' + key + '<|\"|>' -}} | |\n| {%- else -%} | |\n| {{- key -}} | |\n| {%- endif -%} | |\n| :{{- format_argument(value, escape_keys=escape_keys) -}} | |\n| {%- endfor -%} | |\n| {{- '}' -}} | |\n| {%- elif argument is sequence -%} | |\n| {{- '[' -}} | |\n| {%- for item in argument -%} | |\n| {{- format_argument(item, escape_keys=escape_keys) -}} | |\n| {%- if not loop.last %},{% endif -%} | |\n| {%- endfor -%} | |\n| {{- ']' -}} | |\n| {%- else -%} | |\n| {{- argument -}} | |\n| {%- endif -%} | |\n| {%- endmacro -%} | |\n| {%- macro strip_thinking(text) -%} | |\n| {%- set ns = namespace(result='') -%} | |\n| {%- for part in text.split('<channel|>') -%} | |\n| {%- if '<|channel>' in part -%} | |\n| {%- set ns.result = ns.result + part.split('<|channel>')[0] -%} | |\n| {%- else -%} | |\n| {%- set ns.result = ns.result + part -%} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| {{- ns.result | trim -}} | |\n| {%- endmacro -%} | |\n| {%- macro format_tool_response_block(tool_name, response) -%} | |\n| {{- '<|tool_response>' -}} | |\n| {%- if response is mapping -%} | |\n| {{- 'response:' + tool_name + '{' -}} | |\n| {%- for key, value in response | dictsort -%} | |\n| {{- key -}}:{{- format_argument(value, 
escape_keys=False) -}} | |\n| {%- if not loop.last %},{% endif -%} | |\n| {%- endfor -%} | |\n| {{- '}' -}} | |\n| {%- else -%} | |\n| {{- 'response:' + tool_name + '{value:' + format_argument(response, escape_keys=False) + '}' -}} | |\n| {%- endif -%} | |\n| {{- '<tool_response|>' -}} | |\n| {%- endmacro -%} | |\n| {%- set ns = namespace(prev_message_type=None) -%} | |\n| {%- set loop_messages = messages -%} | |\n| {#- P2 (public fork): default enable_thinking to TRUE. | |\n| Why: Gemma 4's upstream template defaults enable_thinking to False | |\n| (or undefined). This is wrong for agentic coding harnesses for two | |\n| reasons: | |\n| 1. Google's own model card: thinking \"significantly enhances | |\n| function-calling accuracy\" — and tool calling IS the core | |\n| contract that coding harnesses use the model for. Defaulting it | |\n| off means most opencode/pi users see degraded tool accuracy and | |\n| have no obvious way to fix it. | |\n| 2. Most OpenAI-compatible SDKs (notably Vercel AI SDK used by | |\n| opencode) strip unknown request fields, so a harness that tries | |\n| to pass chat_template_kwargs.enable_thinking=true per request | |\n| has it silently dropped. See: | |\n| https://github.com/anomalyco/opencode/issues/24264 | |\n| Flipping the SERVER-SIDE default to True makes \"the agentic | |\n| happy-path\" the default and lets harnesses that explicitly want | |\n| chat-only behaviour override it to false per request: | |\n| {\"extra_body\":{\"chat_template_kwargs\":{\"enable_thinking\":false}}} | |\n| After this `set`, enable_thinking is unconditionally defined as a | |\n| bool, so downstream `is defined` guards are dropped. 
-#} | |\n| {%- set enable_thinking = enable_thinking | default(true) -%} | |\n| {{- bos_token -}} | |\n| {#- Handle System/Tool Definitions Block -#} | |\n| {%- if enable_thinking or tools or messages[0]['role'] in ['system', 'developer'] -%} | |\n| {{- '<|turn>system\\n' -}} | |\n| {#- Inject Thinking token at the very top of the FIRST system turn -#} | |\n| {%- if enable_thinking -%} | |\n| {{- '<|think|>\\n' -}} | |\n| {%- set ns.prev_message_type = 'think' -%} | |\n| {%- endif -%} | |\n| {%- if messages[0]['role'] in ['system', 'developer'] -%} | |\n| {%- if messages[0]['content'] is string -%} | |\n| {{- messages[0]['content'] | trim -}} | |\n| {%- elif messages[0]['content'] is sequence -%} | |\n| {%- for item in messages[0]['content'] -%} | |\n| {{- item['text'] | trim + ' '-}} | |\n| {%- endfor -%} | |\n| {%- endif -%} | |\n| {%- set loop_messages = messages[1:] -%} | |\n| {%- endif -%} | |\n| {%- if tools -%} | |\n| {%- for tool in tools %} | |\n| {{- '<|tool>' -}} | |\n| {{- format_function_declaration(tool) | trim -}} | |\n| {{- '<tool|>' -}} | |\n| {%- endfor %} | |\n| {%- set ns.prev_message_type = 'tool' -%} | |\n| {%- endif -%} | |\n| {{- '<turn|>\\n' -}} | |\n| {%- endif %} | |\n| {#- P4 (public fork): preserve_thinking kwarg, default TRUE. | |\n| Why: upstream's reasoning re-emission gate fires only when an | |\n| assistant message (a) carries `reasoning`/`reasoning_content`, | |\n| (b) has tool_calls, AND (c) is AFTER the last user message. That | |\n| third clause is what causes the canonical multi-turn-tool-loop | |\n| breakage: | |\n| User: \"find files matching '*.py' in src\" | |\n| Assistant: (reasoning=...calling find_files...) tool_call: | |\n| find_files(pattern='*.py', dir='src') | |\n| Tool: [result list] | |\n| User: \"now look for '*.ts' too\" | |\n| Assistant: (reasoning=...) 
tool_call: find_files(pattern={}, dir={}) | |\n| ↑↑↑ arguments collapse to empty here because the prior | |\n| reasoning the model would have learned to imitate is | |\n| invisible — the previous-turn <|channel> was dropped. | |\n| The same shape was reported on Qwen3.6 and resolved by the | |\n| preserve_thinking kwarg there: | |\n| https://github.com/earendil-works/pi/issues/3325 | |\n| Gemma 4's model card says \"historical model output should only | |\n| include the final response\" — that guidance is correct for plain | |\n| chat but actively harmful for multi-turn agentic tool calling. P4 | |\n| optionally drops the (c) gate so prior reasoning stays visible to | |\n| the model on subsequent turns. | |\n| Set preserve_thinking=false to recover upstream behaviour exactly | |\n| (used by the conformance suite to verify byte-identity). -#} | |\n| {%- set preserve_thinking = preserve_thinking | default(true) -%} | |\n| {#- Pre-scan: find last user message index for reasoning guard -#} | |\n| {%- set ns_turn = namespace(last_user_idx=-1) -%} | |\n| {%- for i in range(loop_messages | length) -%} | |\n| {%- if loop_messages[i]['role'] == 'user' -%} | |\n| {%- set ns_turn.last_user_idx = i -%} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| {#- Loop through messages -#} | |\n| {%- for message in loop_messages -%} | |\n| {%- if message['role'] != 'tool' -%} | |\n| {%- set ns.prev_message_type = None -%} | |\n| {%- set role = 'model' if message['role'] == 'assistant' else message['role'] -%} | |\n| {#- Detect continuation: suppress duplicate <|turn>model when previous non-tool message was also assistant -#} | |\n| {%- set prev_nt = namespace(role=None, found=false) -%} | |\n| {%- if loop.index0 > 0 -%} | |\n| {%- for j in range(loop.index0 - 1, -1, -1) -%} | |\n| {%- if not prev_nt.found -%} | |\n| {%- if loop_messages[j]['role'] != 'tool' -%} | |\n| {%- set prev_nt.role = loop_messages[j]['role'] -%} | |\n| {%- set prev_nt.found = true -%} | |\n| {%- endif -%} | |\n| 
{%- endif -%} | |\n| {%- endfor -%} | |\n| {%- endif -%} | |\n| {%- set continue_same_model_turn = (role == 'model' and prev_nt.role == 'assistant') -%} | |\n| {%- if not continue_same_model_turn -%} | |\n| {{- '<|turn>' + role + '\\n' }} | |\n| {%- endif -%} | |\n| {#- Render reasoning/reasoning_content as thinking channel. | |\n| Upstream gate (all three required to re-emit): | |\n| (a) the message carries reasoning or reasoning_content, | |\n| (b) the message has tool_calls, | |\n| (c) the message is after the last user message in history. | |\n| P4 (public fork): when preserve_thinking is true (default), drop | |\n| clause (c) so prior assistant turns' <|channel> blocks survive. | |\n| See the long P4 comment above the pre-scan for why this matters | |\n| for agentic tool loops. The (b) gate stays — re-emitting a | |\n| <|channel> on a finalised text-only assistant turn is not in | |\n| the model's training distribution. -#} | |\n| {%- set thinking_text = message.get('reasoning') or message.get('reasoning_content') -%} | |\n| {%- set thinking_gate = (loop.index0 > ns_turn.last_user_idx) or preserve_thinking -%} | |\n| {%- if thinking_text and thinking_gate and message.get('tool_calls') -%} | |\n| {{- '<|channel>thought\\n' + thinking_text + '\\n<channel|>' -}} | |\n| {%- endif -%} | |\n| {%- if message['tool_calls'] -%} | |\n| {%- for tool_call in message['tool_calls'] -%} | |\n| {%- set function = tool_call['function'] -%} | |\n| {{- '<|tool_call>call:' + function['name'] + '{' -}} | |\n| {%- if function['arguments'] is mapping -%} | |\n| {%- set ns_args = namespace(found_first=false) -%} | |\n| {%- for key, value in function['arguments'] | dictsort -%} | |\n| {%- if ns_args.found_first %},{% endif -%} | |\n| {%- set ns_args.found_first = true -%} | |\n| {{- key -}}:{{- format_argument(value, escape_keys=False) -}} | |\n| {%- endfor -%} | |\n| {%- elif function['arguments'] is none -%} | |\n| {#- P3 (public fork): None / missing arguments is | |\n| valid 
(means: call this tool with no args). | |\n| Nothing is emitted in this branch; the literal | |\n| braces around it render an empty {}. -#} | |\n| {%- else -%} | |\n| {#- P3 (public fork): refuse string (or any other | |\n| non-mapping) arguments rather than silently | |\n| corrupting the prompt. | |\n| Bug surface: many OpenAI-compatible SDKs (most | |\n| notably Vercel AI SDK, used by opencode) hand | |\n| tool_call.arguments back as a JSON-encoded | |\n| STRING (e.g. '{\"city\":\"Tokyo\"}') rather | |\n| than the already-deserialized object. The | |\n| upstream Gemma 4 template silently emits this | |\n| string verbatim inside an extra pair of | |\n| braces, producing invalid Gemma 4 DSL: | |\n| call:fn{{\"city\":\"Tokyo\"}} | |\n| (nested braces, JSON colons, quoted keys, | |\n| none of which the model has been trained on). | |\n| The model usually still produces a plausible | |\n| response, which makes the bug INSIDIOUS: it | |\n| looks like a quality problem with the model, | |\n| not a prompt-corruption bug in the harness. | |\n| Fix: harnesses MUST deserialize | |\n| tool_calls[].function.arguments | |\n| exactly once on ingest and store the object. | |\n| See the canonical pi-side discussion: | |\n| https://github.com/earendil-works/pi/issues/3325 | |\n| We raise here so the bug surfaces at the | |\n| server (an obvious HTTP error to debug) | |\n| rather than as a quiet model-output | |\n| regression. -#} | |\n| {{- raise_exception( | |\n| \"custom_pub_chat_template_gemma4: \" | |\n| \"tool_calls[].function.arguments must be a JSON \" | |\n| \"object (mapping). Got a \" | |\n| ~ (function['arguments'] | string | length | string) | |\n| ~ \"-char \" | |\n| ~ (function['arguments'].__class__.__name__ if function['arguments'].__class__ is defined else 'non-mapping') | |\n| ~ \". This is almost always the harness handing back \" | |\n| \"a JSON-encoded STRING rather than the deserialized \" | |\n| \"object. Deserialize once on ingest and store the \" | |\n| \"object. 
See: github.com/earendil-works/pi/issues/3325\" | |\n| ) -}} | |\n| {%- endif -%} | |\n| {{- '}<tool_call|>' -}} | |\n| {%- endfor -%} | |\n| {%- set ns.prev_message_type = 'tool_call' -%} | |\n| {%- endif -%} | |\n| {%- set ns_tr_out = namespace(flag=false) -%} | |\n| {%- if message.get('tool_responses') -%} | |\n| {#- Legacy: tool_responses embedded on the assistant message (Google/Gemma native) -#} | |\n| {%- for tool_response in message['tool_responses'] -%} | |\n| {{- format_tool_response_block(tool_response['name'] | default('unknown'), tool_response['response']) -}} | |\n| {%- set ns_tr_out.flag = true -%} | |\n| {%- set ns.prev_message_type = 'tool_response' -%} | |\n| {%- endfor -%} | |\n| {%- elif message.get('tool_calls') -%} | |\n| {#- OpenAI Chat Completions: forward-scan consecutive role:tool messages -#} | |\n| {%- set ns_tool_scan = namespace(stopped=false) -%} | |\n| {%- for k in range(loop.index0 + 1, loop_messages | length) -%} | |\n| {%- if ns_tool_scan.stopped -%} | |\n| {%- elif loop_messages[k]['role'] != 'tool' -%} | |\n| {%- set ns_tool_scan.stopped = true -%} | |\n| {%- else -%} | |\n| {%- set follow = loop_messages[k] -%} | |\n| {#- Resolve tool_call_id to function name -#} | |\n| {%- set ns_tname = namespace(name=follow.get('name') | default('unknown')) -%} | |\n| {%- for tc in message['tool_calls'] -%} | |\n| {%- if tc.get('id') == follow.get('tool_call_id') -%} | |\n| {%- set ns_tname.name = tc['function']['name'] -%} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| {#- Handle content as string or content-parts array -#} | |\n| {%- set tool_body = follow.get('content') -%} | |\n| {%- if tool_body is string -%} | |\n| {{- format_tool_response_block(ns_tname.name, tool_body) -}} | |\n| {%- elif tool_body is sequence and tool_body is not string -%} | |\n| {%- set ns_txt = namespace(s='') -%} | |\n| {%- for part in tool_body -%} | |\n| {%- if part.get('type') == 'text' -%} | |\n| {%- set ns_txt.s = ns_txt.s + (part.get('text') | 
default('')) -%} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| {{- format_tool_response_block(ns_tname.name, ns_txt.s) -}} | |\n| {%- for part in tool_body -%} | |\n| {%- if part.get('type') == 'image' -%} | |\n| {{- '<|image|>' -}} | |\n| {%- elif part.get('type') == 'audio' -%} | |\n| {{- '<|audio|>' -}} | |\n| {%- elif part.get('type') == 'video' -%} | |\n| {{- '<|video|>' -}} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| {%- else -%} | |\n| {{- format_tool_response_block(ns_tname.name, tool_body) -}} | |\n| {%- endif -%} | |\n| {%- set ns_tr_out.flag = true -%} | |\n| {%- set ns.prev_message_type = 'tool_response' -%} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| {%- endif -%} | |\n| {%- set captured_content -%} | |\n| {%- if message['content'] is string -%} | |\n| {%- if role == 'model' -%} | |\n| {{- strip_thinking(message['content']) -}} | |\n| {%- else -%} | |\n| {{- message['content'] | trim -}} | |\n| {%- endif -%} | |\n| {%- elif message['content'] is sequence -%} | |\n| {%- for item in message['content'] -%} | |\n| {%- if item['type'] == 'text' -%} | |\n| {%- if role == 'model' -%} | |\n| {{- strip_thinking(item['text']) -}} | |\n| {%- else -%} | |\n| {{- item['text'] | trim -}} | |\n| {%- endif -%} | |\n| {%- elif item['type'] == 'image' -%} | |\n| {{- '<|image|>' -}} | |\n| {%- set ns.prev_message_type = 'image' -%} | |\n| {%- elif item['type'] == 'audio' -%} | |\n| {{- '<|audio|>' -}} | |\n| {%- set ns.prev_message_type = 'audio' -%} | |\n| {%- elif item['type'] == 'video' -%} | |\n| {{- '<|video|>' -}} | |\n| {%- set ns.prev_message_type = 'video' -%} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| {%- endif -%} | |\n| {%- endset -%} | |\n| {{- captured_content -}} | |\n| {%- set has_content = captured_content | trim | length > 0 -%} | |\n| {#- P5 (public fork): symmetric continuation close-suppression | |\n| for HF discussion #62. 
| |\n| The bug: upstream's open suppression at the top of this | |\n| iteration drops the `<|turn>model\\n` header when the | |\n| previous non-tool message was also assistant — but the | |\n| close below ALWAYS emits `<turn|>\\n`. Two back-to-back | |\n| text-only assistant messages therefore render as: | |\n| <|turn>model\\npart 1<turn|>\\npart 2<turn|>\\n | |\n| That's one open, two closes — malformed. The model | |\n| (Google-confirmed in HF discussion #62) sees it as a | |\n| truncated and re-opened turn, which destabilises long | |\n| multi-step agentic histories that accumulate consecutive | |\n| assistant messages. | |\n| Fix: forward-scan for the next non-tool message. If it is | |\n| another assistant AND this iteration is a TEXT-ONLY | |\n| assistant message (no tool_calls, no tool_responses), the | |\n| next iteration will continue this same turn frame, so | |\n| suppress this iteration's close and emit a single `\\n` so | |\n| the two contents don't byte-glue together. | |\n| The narrowing condition (`not message.get('tool_calls') | |\n| and not ns_tr_out.flag`) is critical: the tool-call + | |\n| tool-response chain MUST close normally so the model still | |\n| sees a balanced turn frame around the `<|tool_response>` | |\n| block. Conformance test T13 locks this in. 
-#} | |\n| {%- set next_nt = namespace(role=None, found=false) -%} | |\n| {%- for j in range(loop.index0 + 1, loop_messages | length) -%} | |\n| {%- if not next_nt.found -%} | |\n| {%- if loop_messages[j]['role'] != 'tool' -%} | |\n| {%- set next_nt.role = loop_messages[j]['role'] -%} | |\n| {%- set next_nt.found = true -%} | |\n| {%- endif -%} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| {%- set continues_into_next = ( | |\n| role == 'model' | |\n| and next_nt.role == 'assistant' | |\n| and not message.get('tool_calls') | |\n| and not ns_tr_out.flag | |\n| ) -%} | |\n| {%- if ns.prev_message_type == 'tool_call' and not ns_tr_out.flag -%} | |\n| {{- '<|tool_response>' -}} | |\n| {%- elif continues_into_next -%} | |\n| {{- '\\n' -}} | |\n| {%- elif not (ns_tr_out.flag and not has_content) -%} | |\n| {{- '<turn|>\\n' -}} | |\n| {%- endif -%} | |\n| {%- endif -%} | |\n| {%- endfor -%} | |\n| {%- if add_generation_prompt -%} | |\n| {%- if ns.prev_message_type != 'tool_response' and ns.prev_message_type != 'tool_call' -%} | |\n| {{- '<|turn>model\\n' -}} | |\n| {#- When thinking is disabled, the upstream contract is to | |\n| pre-fill an empty `<|channel>thought\\n<channel|>` block so | |\n| the model skips reasoning. After P2's set at the top of | |\n| the file, `enable_thinking` is unconditionally a bool, so | |\n| the upstream `| default(false)` is unnecessary. (It also | |\n| had a Jinja precedence trap: `|` binds tighter than `not`, | |\n| parsing as `not (enable_thinking | default(false))`. The | |\n| simple `not enable_thinking` form is equivalent and | |\n| clearer.) 
-#} | |\n| {%- if not enable_thinking -%} | |\n| {{- '<|channel>thought\\n<channel|>' -}} | |\n| {%- endif -%} | |\n| {%- endif -%} | |\n| {%- endif -%} |", "url": "https://wpnews.pro/news/a-drop-in-replacement-chat-template-for-google-gemma-4-31b-it-tuned-for-open", "canonical_source": "https://gist.github.com/jscott3201/ad69c4ffbd79f18b11a0f6a94c94fadf", "published_at": "2026-05-23 03:01:26+00:00", "updated_at": "2026-05-30 17:14:00.905636+00:00", "lang": "en", "topics": ["large-language-models", "ai-agents", "ai-tools", "artificial-intelligence", "natural-language-processing"], "entities": ["Google", "Gemma 4", "Vercel AI SDK", "OpenAI", "anomalyco/opencode", "earendil-works/pi", "Qwen3.6", "Claude-Code"], "alternates": {"html": "https://wpnews.pro/news/a-drop-in-replacement-chat-template-for-google-gemma-4-31b-it-tuned-for-open", "markdown": "https://wpnews.pro/news/a-drop-in-replacement-chat-template-for-google-gemma-4-31b-it-tuned-for-open.md", "text": "https://wpnews.pro/news/a-drop-in-replacement-chat-template-for-google-gemma-4-31b-it-tuned-for-open.txt", "jsonld": "https://wpnews.pro/news/a-drop-in-replacement-chat-template-for-google-gemma-4-31b-it-tuned-for-open.jsonld"}}