{"slug": "a-drop-in-replacement-chat-template-for-qwen-qwen3-6-27b-tuned-for-open-source", "title": "A drop-in replacement chat template for Qwen/Qwen3.6-27B tuned for open-source agentic coding harnesses.", "summary": "A developer has released a drop-in replacement chat template for Qwen/Qwen3.6-27B that fixes six critical bugs affecting open-source agentic coding harnesses. The fork addresses issues including multi-turn tool argument collapse caused by the upstream template's `preserve_thinking=false` default, rejection of the `developer` message role used by modern coding tools, and crashes from tool call arguments arriving as JSON strings. The template also patches the upstream's failure to recognize malformed `<tool_call>` tags and wasteful token usage from passing OpenAI envelope tool definitions verbatim.", "body_md": "| {#--------------------------------------------------------------------- | |\n| custom_pub_chat_template_qwen36.jinja | |\n| ===================================== | |\n| A public, harness-friendly fork of Qwen's Qwen3.6-27B chat template, | |\n| tuned for open-source agentic coding harnesses like: | |\n| - anomalyco/opencode (https://github.com/anomalyco/opencode) | |\n| - earendil-works/pi (https://github.com/earendil-works/pi) | |\n| - openclaw, OpenHarness, similar Claude-Code-style harnesses | |\n| WHY THIS FORK EXISTS | |\n| -------------------- | |\n| The upstream chat template at `Qwen/Qwen3.6-27B` is correct for chat | |\n| use, but six real edge cases bite agentic coding harnesses pointing | |\n| at a self-hosted SGLang / vLLM / llama.cpp endpoint serving Qwen3.6: | |\n| 1. Multi-turn tool argument collapse. After 2-3 turns of calling the | |\n| same tool, the model emits arguments: {} despite its prior | |\n| reasoning correctly identifying the parameters. 
Root cause: the | |\n| upstream template defaults preserve_thinking=false, which means | |\n| prior-turn <think> blocks are silently dropped from history; the | |\n| model loses its own trace of \"how did I pick the parameters last | |\n| time?\" and degenerates. Documented at: | |\n| https://github.com/earendil-works/pi/issues/3325 | |\n| The Qwen3.6 model card explicitly states the model was post- | |\n| trained for \"Thinking Preservation\" in agent scenarios — the | |\n| preserve_thinking-FALSE default is wrong for our use case. | |\n| 2. The `developer` role rejected. Modern coding harnesses | |\n| (opencode, Claude Code, openclaw, Continue) send a `developer` | |\n| role for reasoning-capable models, following OpenAI's Responses | |\n| API convention. Upstream raises \"Unexpected message role\" — | |\n| crashing the entire request. Reported and documented at: | |\n| https://gist.github.com/sudoingX/c2facf7d8f7608c65c1024ef3b22d431 | |\n| (\"Qwen 3.5 GGUF templates reject the developer role sent by | |\n| OpenCode, Claude Code, and other modern agent tools.\") | |\n| 3. tool_call.arguments arriving as a JSON string crashes with a | |\n| cryptic Jinja error (\"Can only get item pairs from a mapping\"). | |\n| The Vercel AI SDK (used by opencode) and several other OpenAI- | |\n| compatible adapters hand arguments back as a JSON-encoded | |\n| STRING rather than the deserialized object. Diagnosing this | |\n| from the upstream error message is painful. Documented at: | |\n| https://github.com/earendil-works/pi/issues/3325 | |\n| https://github.com/anomalyco/opencode/issues/24264 | |\n| 4. The opening `<tool_call>` tag is sometimes omitted by the model | |\n| (documented at https://github.com/QwenLM/Qwen3-Coder/issues/475) | |\n| and `<tool_call>` can appear inside an unclosed `<think>` block | |\n| (https://github.com/ollama/ollama/issues/14493). 
The upstream | |\n| template's content-parsing only recognizes `</think>` and only | |\n| when properly closed, so reasoning bleeds into the conversation | |\n| content channel. Whitespace variants of `</think>` aren't | |\n| recognized either. | |\n| 5. The OpenAI envelope around tool definitions | |\n| ({\"type\":\"function\",\"function\":{...}}) is passed verbatim | |\n| through `tool | tojson`, wasting tokens and diverging from | |\n| what the model expects. Qwen's own most recent coder model, | |\n| Qwen3-Coder-Next, unwraps this envelope in its own canonical | |\n| chat template: | |\n| https://huggingface.co/Qwen/Qwen3-Coder-Next/blob/main/chat_template.jinja | |\n| (lines 35-37). The Qwen3.6-27B upstream template just hasn't | |\n| caught up to the newer convention. | |\n| 6. The upstream IMPORTANT instructions block is missing three | |\n| bullets that address the most common public Qwen3-Coder | |\n| failure modes: | |\n| - Omitting the opening <tool_call> tag (Qwen3-Coder #475) | |\n| - Indenting <tool_call> with leading whitespace | |\n| (https://github.com/block/goose/issues/6883) | |\n| - Nesting <tool_call> blocks instead of emitting parallel | |\n| calls | |\n| PATCH INVENTORY (full details next to each patch site below) | |\n| ------------------------------------------------------------ | |\n| Q1 preserve_thinking default flipped FALSE→TRUE | |\n| Q2 `developer` role accepted as alias for `system` | |\n| Q3 Raise a clear, debuggable error on string tool_call.arguments | |\n| Q4 Robust </think> variant handling + unclosed-think rescue | |\n| Q5 Unwrap OpenAI tool envelope to inner function spec (gated) | |\n| Q6 Strengthened IMPORTANT instructions block (gated) | |\n| INVARIANTS | |\n| ---------- | |\n| 1. 
STRICT-EQUIVALENCE: With kwargs | |\n| preserve_thinking=false, (recovers Q1) | |\n| unwrap_tool_envelope=false, (recovers Q5) | |\n| verbose_tool_instructions=false (recovers Q6) | |\n| AND inputs that don't exercise Q2 (no `developer` role), | |\n| Q3 (no string-typed arguments), or Q4 (no `</thinking>` or | |\n| whitespace variants of `</think>`), this template renders | |\n| byte-for-byte identical to upstream. The conformance suite | |\n| at tests/test_custom_pub_chat_template_qwen36.py locks this in | |\n| across the simple-input matrix. | |\n| 2. STRICT-SAFETY: For every input upstream handles without error, | |\n| this template handles correctly with semantically equivalent | |\n| or strictly safer output. The strict-where-upstream-silent | |\n| patches (Q3, Q4) only fire on inputs that hit the documented | |\n| bug surfaces. | |\n| USAGE | |\n| ----- | |\n| Server side (e.g. SGLang or vLLM): | |\n| # SGLang | |\n| python -m sglang.launch_server \\ | |\n| --model-path Qwen/Qwen3.6-27B \\ | |\n| --chat-template /path/to/custom_pub_chat_template_qwen36.jinja \\ | |\n| --tool-call-parser qwen3_coder \\ | |\n| --reasoning-parser qwen3 | |\n| # vLLM | |\n| vllm serve Qwen/Qwen3.6-27B \\ | |\n| --chat-template /path/to/custom_pub_chat_template_qwen36.jinja \\ | |\n| --tool-call-parser qwen3_coder \\ | |\n| --reasoning-parser qwen3 \\ | |\n| --enable-auto-tool-choice | |\n| Harness side: no changes required for the common case. The | |\n| defaults are tuned for agentic coding out of the box. 
If you need | |\n| to recover the upstream defaults explicitly: | |\n| { | |\n| \"extra_body\": { | |\n| \"chat_template_kwargs\": { | |\n| \"enable_thinking\": true, | |\n| \"preserve_thinking\": false, | |\n| \"unwrap_tool_envelope\": false, | |\n| \"verbose_tool_instructions\": false | |\n| } | |\n| } | |\n| } | |\n| For opencode-style providers, this maps to chat_template_args in | |\n| the model config; for pi, use compat.thinkingFormat=\"qwen-chat- | |\n| template\" and pi will inject the kwargs correctly. | |\n| PINS | |\n| ---- | |\n| Forked from Qwen/Qwen3.6-27B/chat_template.jinja | |\n| Upstream MD5: 52b6d51ae5b203cb67e64b648494dad2 (153 lines) | |\n| Fork date: 2026-05-25 | |\n| License: Apache 2.0 (same as upstream) | |\n| Maintainer: see repo README | |\n| ---------------------------------------------------------------------#} | |\n| {#- Vision counters (identical to upstream). -#} | |\n| {%- set image_count = namespace(value=0) %} | |\n| {%- set video_count = namespace(value=0) %} | |\n| {#- ============================================================================ | |\n| Content rendering macro. | |\n| Functionally identical to upstream's macro of the same name. The only | |\n| cosmetic difference is the `add_vision_id is defined and add_vision_id` | |\n| guard instead of upstream's bare `if add_vision_id` — a defensive | |\n| rewrite that prevents undefined-variable errors in some minijinja | |\n| runtimes (llama.cpp, MLX). No rendering-time behavior change for | |\n| Python Jinja2 (SGLang/vLLM) since both runtimes treat undefined as | |\n| falsy. 
| |\n| ============================================================================ -#} | |\n| {%- macro render_content(content, do_vision_count, is_system_content=false) %} | |\n| {%- if content is string %} | |\n| {{- content }} | |\n| {%- elif content is iterable and content is not mapping %} | |\n| {%- for item in content %} | |\n| {%- if 'image' in item or 'image_url' in item or item.type == 'image' %} | |\n| {%- if is_system_content %} | |\n| {{- raise_exception('System message cannot contain images.') }} | |\n| {%- endif %} | |\n| {%- if do_vision_count %} | |\n| {%- set image_count.value = image_count.value + 1 %} | |\n| {%- endif %} | |\n| {%- if add_vision_id is defined and add_vision_id %} | |\n| {{- 'Picture ' ~ image_count.value ~ ': ' }} | |\n| {%- endif %} | |\n| {{- '<|vision_start|><|image_pad|><|vision_end|>' }} | |\n| {%- elif 'video' in item or item.type == 'video' %} | |\n| {%- if is_system_content %} | |\n| {{- raise_exception('System message cannot contain videos.') }} | |\n| {%- endif %} | |\n| {%- if do_vision_count %} | |\n| {%- set video_count.value = video_count.value + 1 %} | |\n| {%- endif %} | |\n| {%- if add_vision_id is defined and add_vision_id %} | |\n| {{- 'Video ' ~ video_count.value ~ ': ' }} | |\n| {%- endif %} | |\n| {{- '<|vision_start|><|video_pad|><|vision_end|>' }} | |\n| {%- elif 'text' in item %} | |\n| {{- item.text }} | |\n| {%- else %} | |\n| {{- raise_exception('Unexpected item type in content.') }} | |\n| {%- endif %} | |\n| {%- endfor %} | |\n| {%- elif content is none or content is undefined %} | |\n| {{- '' }} | |\n| {%- else %} | |\n| {{- raise_exception('Unexpected content type.') }} | |\n| {%- endif %} | |\n| {%- endmacro %} | |\n| {#- ============================================================================ | |\n| Q1 (public fork): preserve_thinking default flipped FALSE → TRUE. 
| |\n| Why: upstream's preserve_thinking gate at the assistant-rendering site | |\n| is: | |\n| {%- if (preserve_thinking is defined and preserve_thinking is true) | |\n| or (loop.index0 > ns.last_query_index) %} | |\n| With preserve_thinking unset, prior-turn <think> blocks (assistant | |\n| turns at indices <= last_query_index) are dropped from history. The | |\n| model loses its own trace of how it chose tool arguments on prior | |\n| turns and degenerates after 2-3 multi-turn calls of the same tool. | |\n| The canonical public bug-report on this exact failure mode for | |\n| Qwen3.6 is `earendil-works/pi#3325`: | |\n| https://github.com/earendil-works/pi/issues/3325 | |\n| \"Qwen3.6 tool calls loop with empty arguments: qwen-chat-template | |\n| missing preserve_thinking ... After 2-3 turns every tool call has | |\n| arguments: {}.\" | |\n| The Qwen3.6 model card explicitly states (verbatim): | |\n| \"Qwen3.6 has been additionally trained to preserve and leverage | |\n| thinking traces from historical messages ... particularly | |\n| beneficial for agent scenarios.\" | |\n| So this is not just a workaround — preserve_thinking=true is the | |\n| model-card-recommended setting for agentic harnesses. The public | |\n| fork makes it the default. | |\n| Recover upstream behavior: pass preserve_thinking=false explicitly. | |\n| ============================================================================ -#} | |\n| {%- if preserve_thinking is not defined %} | |\n| {%- set preserve_thinking = true %} | |\n| {%- endif %} | |\n| {#- Q5 / Q6 (public fork): both gated by kwargs, default true. See the | |\n| patch sites below for the full rationale and citations. 
-#} | |\n| {%- if unwrap_tool_envelope is not defined %} | |\n| {%- set unwrap_tool_envelope = true %} | |\n| {%- endif %} | |\n| {%- if verbose_tool_instructions is not defined %} | |\n| {%- set verbose_tool_instructions = true %} | |\n| {%- endif %} | |\n| {%- if not messages %} | |\n| {{- raise_exception('No messages provided.') }} | |\n| {%- endif %} | |\n| {#- ============================================================================ | |\n| Q2 (public fork): `developer` role accepted as an alias for `system`. | |\n| Upstream's role check (in the index-0 system handling AND in the | |\n| main message loop) only accepts `system`; a `developer` role | |\n| raises \"Unexpected message role\" and crashes the request. | |\n| Modern coding harnesses (opencode, Claude Code, openclaw, Continue) | |\n| emit a `developer` role for reasoning-capable models, following | |\n| OpenAI's Responses API convention. This causes the entire request | |\n| to fail when pointed at a stock Qwen3.6 server. | |\n| Reference (gist documenting the bite for OpenCode + Qwen3.5): | |\n| https://gist.github.com/sudoingX/c2facf7d8f7608c65c1024ef3b22d431 | |\n| Below: we normalize the index-0 role for the upcoming system-block | |\n| decision, then in the main message loop we treat both as system. | |\n| The change is invisible for inputs that only use `system`. | |\n| ============================================================================ -#} | |\n| {%- if tools and tools is iterable and tools is not mapping %} | |\n| {{- '<|im_start|>system\\n' }} | |\n| {{- \"# Tools\\n\\nYou have access to the following functions:\\n\\n<tools>\" }} | |\n| {%- for tool in tools %} | |\n| {{- \"\\n\" }} | |\n| {#- Q5 (public fork): unwrap the OpenAI envelope. | |\n| Background: harnesses speaking OpenAI tool-call protocol send | |\n| tool definitions wrapped in {\"type\":\"function\",\"function\":{...}}. 
| |\n| Upstream passes the WHOLE wrapper through `tool | tojson`, | |\n| emitting an extra layer the model has to mentally peel off, | |\n| and wasting ~12 tokens per tool. | |\n| Qwen's own most recent coder model unwraps this envelope in | |\n| its canonical chat template: | |\n| https://huggingface.co/Qwen/Qwen3-Coder-Next/blob/main/chat_template.jinja | |\n| (lines 35-37: `{%- if tool.function is defined %}{%- set tool = | |\n| tool.function %}{%- endif %}`). | |\n| Qwen3.6-27B's upstream template predates that change; this | |\n| patch backports the unwrap behavior so Qwen3.6 sees the same | |\n| tool format Qwen3-Coder-Next was trained on. | |\n| Recover upstream behavior: pass unwrap_tool_envelope=false. -#} | |\n| {%- if unwrap_tool_envelope and tool.function is defined %} | |\n| {{- tool.function | tojson }} | |\n| {%- else %} | |\n| {{- tool | tojson }} | |\n| {%- endif %} | |\n| {%- endfor %} | |\n| {{- \"\\n</tools>\" }} | |\n| {#- Q6 (public fork): strengthened IMPORTANT instructions block. | |\n| Upstream's IMPORTANT block has 4 bullets. The strengthened | |\n| version adds three bullets that address documented public Qwen | |\n| coder failure modes: | |\n| - \"Do NOT omit the opening <tool_call> tag\": | |\n| https://github.com/QwenLM/Qwen3-Coder/issues/475 | |\n| - \"MUST be at the very beginning of a new line, with NO leading | |\n| spaces or indentation\": | |\n| https://github.com/block/goose/issues/6883 | |\n| - \"Do NOT nest <tool_call> blocks inside one another\": | |\n| same #6883 + Roo Code custom-XML interaction patterns | |\n| These bullets are pure additive guidance to the model; they | |\n| don't change tool-call wire-format behavior for well-formed | |\n| outputs, but they reduce error rates on the documented edge | |\n| cases. | |\n| Recover upstream behavior: pass verbose_tool_instructions=false. 
-#} | |\n| {%- if verbose_tool_instructions %} | |\n| {{- '\\n\\nIf you choose to call a function ONLY reply in the following format with NO suffix:\\n\\n<tool_call>\\n<function=example_function_name>\\n<parameter=example_parameter_1>\\nvalue_1\\n</parameter>\\n<parameter=example_parameter_2>\\nThis is the value for the second parameter\\nthat can span\\nmultiple lines\\n</parameter>\\n</function>\\n</tool_call>\\n\\n<IMPORTANT>\\nReminder:\\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags.\\n- Do NOT omit the opening <tool_call> tag. Every function call MUST be wrapped in a complete <tool_call>...</tool_call> block.\\n- The <tool_call> and <function> tags MUST be at the very beginning of a new line, with NO leading spaces or indentation.\\n- Required parameters MUST be specified.\\n- To call multiple functions, output a separate, completely closed <tool_call></tool_call> block for EACH function. Do NOT nest <tool_call> blocks inside one another.\\n- You may provide reasoning inside <think>...</think> blocks BEFORE the <tool_call>, but NOT after. 
After a tool call there must be NO suffix on the same turn.\\n- If no function call is needed, answer the question normally and do not mention function calls.\\n</IMPORTANT>' }} | |\n| {%- else %} | |\n| {{- '\\n\\nIf you choose to call a function ONLY reply in the following format with NO suffix:\\n\\n<tool_call>\\n<function=example_function_name>\\n<parameter=example_parameter_1>\\nvalue_1\\n</parameter>\\n<parameter=example_parameter_2>\\nThis is the value for the second parameter\\nthat can span\\nmultiple lines\\n</parameter>\\n</function>\\n</tool_call>\\n\\n<IMPORTANT>\\nReminder:\\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\\n- Required parameters MUST be specified\\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\\n</IMPORTANT>' }} | |\n| {%- endif %} | |\n| {#- Q2 (public fork): accept developer role at index 0. -#} | |\n| {%- if messages[0].role == 'system' or messages[0].role == 'developer' %} | |\n| {%- set content = render_content(messages[0].content, false, true)|trim %} | |\n| {%- if content %} | |\n| {{- '\\n\\n' + content }} | |\n| {%- endif %} | |\n| {%- endif %} | |\n| {{- '<|im_end|>\\n' }} | |\n| {%- else %} | |\n| {#- Q2 (public fork): accept developer role at index 0. -#} | |\n| {%- if messages[0].role == 'system' or messages[0].role == 'developer' %} | |\n| {%- set content = render_content(messages[0].content, false, true)|trim %} | |\n| {{- '<|im_start|>system\\n' + content + '<|im_end|>\\n' }} | |\n| {%- endif %} | |\n| {%- endif %} | |\n| {#- last_query_index walk (identical to upstream). 
When preserve_thinking=true | |\n| (the public fork's default), the index produced here is not consulted: | |\n| the assistant-render guard below checks preserve_thinking first and | |\n| short-circuits when it is true. -#} | |\n| {%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %} | |\n| {%- for message in messages[::-1] %} | |\n| {%- set index = (messages|length - 1) - loop.index0 %} | |\n| {%- if ns.multi_step_tool and message.role == \"user\" %} | |\n| {%- set content = render_content(message.content, false)|trim %} | |\n| {%- if not(content.startswith('<tool_response>') and content.endswith('</tool_response>')) %} | |\n| {%- set ns.multi_step_tool = false %} | |\n| {%- set ns.last_query_index = index %} | |\n| {%- endif %} | |\n| {%- endif %} | |\n| {%- endfor %} | |\n| {%- if ns.multi_step_tool %} | |\n| {{- raise_exception('No user query found in messages.') }} | |\n| {%- endif %} | |\n| {%- for message in messages %} | |\n| {%- set content = render_content(message.content, true)|trim %} | |\n| {%- if message.role == \"system\" or message.role == \"developer\" %} | |\n| {#- Q2 (public fork): both roles are valid at the start; upstream | |\n| rejected `developer` here. The system block was already rendered | |\n| above; nothing to emit per-message. -#} | |\n| {%- if not loop.first %} | |\n| {{- raise_exception('System/developer message must be at the beginning.') }} | |\n| {%- endif %} | |\n| {%- elif message.role == \"user\" %} | |\n| {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }} | |\n| {%- elif message.role == \"assistant\" %} | |\n| {#- ---------------------------------------------------------------- | |\n| Q4 (public fork): robust </think> variant handling + unclosed- | |\n| think rescue. | |\n| Upstream's content parsing only recognizes `</think>`, and only | |\n| when it appears with a properly opened `<think>` somewhere | |\n| earlier in the content. 
Three documented failure modes leak: | |\n| - The model emits `</thinking>` (long form) — upstream treats | |\n| the entire content as non-reasoning text, then `<think>` and | |\n| `</thinking>` literals leak into the model's view of history. | |\n| - Whitespace variants `</ think>` and `</think >` happen with | |\n| some quantization runtimes (especially older llama.cpp builds). | |\n| - `<tool_call>` appears INSIDE an unclosed `<think>` block (the | |\n| model started reasoning, decided to call a tool, and forgot | |\n| to close the think block first). | |\n| The Ollama equivalent of this bug: | |\n| https://github.com/ollama/ollama/issues/14493 | |\n| \"tool calls in the Qwen 3 and Qwen 3.5 model families would | |\n| not be parsed correctly if emitted during thinking\" | |\n| (fixed in Ollama 0.17.3). | |\n| The Qwen3-Coder equivalent (model omitting opening tag): | |\n| https://github.com/QwenLM/Qwen3-Coder/issues/475 | |\n| Q4 handles all four cases. The strict-improvement contract: | |\n| for any input upstream parses correctly (only `</think>`, | |\n| properly closed), behavior here is identical. 
| |\n| ---------------------------------------------------------------- #} | |\n| {%- set reasoning_content = '' %} | |\n| {%- if message.reasoning_content is string %} | |\n| {%- set reasoning_content = message.reasoning_content %} | |\n| {%- else %} | |\n| {%- set think_end = '' %} | |\n| {%- if '</think>' in content %} | |\n| {%- set think_end = '</think>' %} | |\n| {%- elif '</thinking>' in content %} | |\n| {%- set think_end = '</thinking>' %} | |\n| {%- elif '</ think>' in content %} | |\n| {%- set think_end = '</ think>' %} | |\n| {%- elif '</think >' in content %} | |\n| {%- set think_end = '</think >' %} | |\n| {%- endif %} | |\n| {%- if think_end %} | |\n| {%- set parts = content.split(think_end) %} | |\n| {%- set reasoning_content = parts[0] %} | |\n| {%- set content = parts[1:] | join(think_end) %} | |\n| {%- if '<think>' in reasoning_content %} | |\n| {%- set reasoning_content = reasoning_content.split('<think>')[1:] | join('<think>') %} | |\n| {%- endif %} | |\n| {%- elif '<think>' in content %} | |\n| {#- Unclosed think; rescue when followed by <tool_call> | |\n| (ollama#14493 pattern). -#} | |\n| {%- set prefix = content.split('<think>')[0] %} | |\n| {%- set think_part = content.split('<think>')[1:] | join('<think>') %} | |\n| {%- if '<tool_call>' in think_part %} | |\n| {%- set reasoning_content = think_part.split('<tool_call>')[0] %} | |\n| {%- set content = prefix ~ '\\n<tool_call>' ~ think_part.split('<tool_call>')[1:] | join('<tool_call>') %} | |\n| {%- else %} | |\n| {%- set reasoning_content = think_part %} | |\n| {%- set content = prefix %} | |\n| {%- endif %} | |\n| {%- endif %} | |\n| {%- endif %} | |\n| {%- set reasoning_content = reasoning_content | trim %} | |\n| {%- set content = content | trim %} | |\n| {#- Strip any leaked <tool_call> text from content; real tool_calls | |\n| come from the dedicated field. (Identical to upstream's intent | |\n| but expressed inline rather than relying on upstream's regex.) 
-#} | |\n| {%- if message.tool_calls and message.tool_calls is iterable and message.tool_calls is not mapping %} | |\n| {%- if '<tool_call>' in content %} | |\n| {%- set content = content.split('<tool_call>')[0] | trim %} | |\n| {%- endif %} | |\n| {%- endif %} | |\n| {#- Reasoning-emission gate. Mirrors upstream's structure exactly, | |\n| but with the Q1 default flip in effect: preserve_thinking | |\n| defaults true, so prior-turn <think> blocks survive. -#} | |\n| {%- if (preserve_thinking is defined and preserve_thinking is true) or (loop.index0 > ns.last_query_index) %} | |\n| {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content + '\\n</think>\\n\\n' + content }} | |\n| {%- else %} | |\n| {{- '<|im_start|>' + message.role + '\\n' + content }} | |\n| {%- endif %} | |\n| {%- if message.tool_calls and message.tool_calls is iterable and message.tool_calls is not mapping %} | |\n| {%- for tool_call in message.tool_calls %} | |\n| {%- if tool_call.function is defined %} | |\n| {%- set tool_call = tool_call.function %} | |\n| {%- endif %} | |\n| {%- if loop.first %} | |\n| {%- if content|trim %} | |\n| {{- '\\n\\n<tool_call>\\n<function=' + tool_call.name + '>\\n' }} | |\n| {%- else %} | |\n| {{- '<tool_call>\\n<function=' + tool_call.name + '>\\n' }} | |\n| {%- endif %} | |\n| {%- else %} | |\n| {{- '\\n<tool_call>\\n<function=' + tool_call.name + '>\\n' }} | |\n| {%- endif %} | |\n| {%- if tool_call.arguments is defined %} | |\n| {#- ---------------------------------------------------- | |\n| Q3 (public fork): debuggable raise on string args. | |\n| Upstream uses `tool_call.arguments | items` (line 120 | |\n| of upstream/chat_template.jinja). 
When arguments | |\n| is a JSON-encoded STRING — which is the wire-format | |\n| the OpenAI spec defines, and what some harness | |\n| adapters (notably the Vercel AI SDK used by | |\n| opencode) hand back to the harness — `.items` | |\n| raises: | |\n| \"Can only get item pairs from a mapping\" | |\n| which is impossible to debug without reading the | |\n| Jinja runtime source. | |\n| Q3 type-checks first and raises a clear error that | |\n| names the bug surface and links to the canonical | |\n| discussion. Harnesses MUST deserialize the JSON- | |\n| encoded arguments string exactly once on ingest | |\n| and store the resulting dict. See: | |\n| https://github.com/earendil-works/pi/issues/3325 | |\n| https://github.com/anomalyco/opencode/issues/24264 | |\n| For inputs where arguments is already a dict (the | |\n| well-formed case), behavior is identical to upstream. | |\n| ---------------------------------------------------- #} | |\n| {%- if tool_call.arguments is mapping %} | |\n| {%- for args_name, args_value in tool_call.arguments|items %} | |\n| {{- '<parameter=' + args_name + '>\\n' }} | |\n| {%- set args_value = args_value | string if args_value is string else args_value | tojson | safe %} | |\n| {{- args_value }} | |\n| {{- '\\n</parameter>\\n' }} | |\n| {%- endfor %} | |\n| {%- elif tool_call.arguments is string %} | |\n| {{- raise_exception( | |\n| \"custom_pub_chat_template_qwen36: \" | |\n| \"tool_call.arguments must be a JSON object \" | |\n| \"(mapping). Got a string. This is almost \" | |\n| \"always the harness handing back a JSON-\" | |\n| \"encoded STRING rather than the deserialized \" | |\n| \"object (common with Vercel AI SDK). \" | |\n| \"Deserialize once on ingest and store the \" | |\n| \"object. 
See: \" | |\n| \"github.com/earendil-works/pi/issues/3325\" | |\n| ) }} | |\n| {%- endif %} | |\n| {%- endif %} | |\n| {{- '</function>\\n</tool_call>' }} | |\n| {%- endfor %} | |\n| {%- endif %} | |\n| {{- '<|im_end|>\\n' }} | |\n| {%- elif message.role == \"tool\" %} | |\n| {%- if loop.previtem and loop.previtem.role != \"tool\" %} | |\n| {{- '<|im_start|>user' }} | |\n| {%- endif %} | |\n| {{- '\\n<tool_response>\\n' }} | |\n| {{- content }} | |\n| {{- '\\n</tool_response>' }} | |\n| {%- if not loop.last and loop.nextitem.role != \"tool\" %} | |\n| {{- '<|im_end|>\\n' }} | |\n| {%- elif loop.last %} | |\n| {{- '<|im_end|>\\n' }} | |\n| {%- endif %} | |\n| {%- else %} | |\n| {{- raise_exception('Unexpected message role.') }} | |\n| {%- endif %} | |\n| {%- endfor %} | |\n| {%- if add_generation_prompt %} | |\n| {{- '<|im_start|>assistant\\n' }} | |\n| {%- if enable_thinking is defined and enable_thinking is false %} | |\n| {{- '<think>\\n\\n</think>\\n\\n' }} | |\n| {%- else %} | |\n| {{- '<think>\\n' }} | |\n| {%- endif %} | |\n| {%- endif %} |", "url": "https://wpnews.pro/news/a-drop-in-replacement-chat-template-for-qwen-qwen3-6-27b-tuned-for-open-source", "canonical_source": "https://gist.github.com/jscott3201/e4b155885cc68c038d6ac8909a3bd9fe", "published_at": "2026-05-25 17:08:27+00:00", "updated_at": "2026-05-25 18:34:39.937169+00:00", "lang": "en", "topics": ["large-language-models", "artificial-intelligence", "ai-agents", "ai-tools", "ai-infrastructure"], "entities": ["Qwen", "Qwen3.6-27B", "anomalyco/opencode", "earendil-works/pi", "SGLang", "vLLM", "llama.cpp", "OpenHarness"], "alternates": {"html": "https://wpnews.pro/news/a-drop-in-replacement-chat-template-for-qwen-qwen3-6-27b-tuned-for-open-source", "markdown": "https://wpnews.pro/news/a-drop-in-replacement-chat-template-for-qwen-qwen3-6-27b-tuned-for-open-source.md", "text": "https://wpnews.pro/news/a-drop-in-replacement-chat-template-for-qwen-qwen3-6-27b-tuned-for-open-source.txt", "jsonld": 
"https://wpnews.pro/news/a-drop-in-replacement-chat-template-for-qwen-qwen3-6-27b-tuned-for-open-source.jsonld"}}