{"slug": "langgraph-vs-locus-stategraph-same-workflow-same-model-gpt-4o-mini-side-by-side", "title": "LangGraph vs Locus StateGraph — same workflow, same model (gpt-4o-mini), side by side. LangGraph→real OpenAI; Locus→OCI GenAI via BOAT-OC1.", "summary": "A developer built and ran an identical multi-step research workflow using both LangGraph (with OpenAI's real API) and Locus StateGraph (with OCI GenAI via BOAT-OC1), both calling gpt-4o-mini. The side-by-side test verified API equivalence claims against actual Locus source code, surfacing four documentation and API gaps that were collected into a follow-up issue list.", "body_md": "| #!/usr/bin/env python3 | |\n| \"\"\"LangGraph vs Locus StateGraph — same workflow, same model, side by side. | |\n| Workflow (in both frameworks) | |\n| ============================= | |\n| START → research → write → review →┐ | |\n| ▲ │ | |\n| └── if confidence < 0.85 ──┘ (max 3 iterations) | |\n| │ | |\n| END | |\n| State (in both) | |\n| =============== | |\n| topic: str — user input | |\n| notes: list[str] — accumulates across research iterations (REDUCER) | |\n| draft: str | |\n| confidence: float — review node's score | |\n| iter: int — loop counter | |\n| Model | |\n| ===== | |\n| Both sides call ``gpt-4o-mini``: | |\n| - LangGraph → real OpenAI API (``OPENAI_API_KEY``) | |\n| - Locus → OCI GenAI ``openai.gpt-4o-mini`` (BOAT-OC1 session token, | |\n| saasobservai compartment, ``us-chicago-1``) | |\n| Plus, this script verifies the API equivalence claims from the prose | |\n| comparison I wrote — every claim is exercised and any failures are | |\n| collected into ISSUES at the end. 
| |\n| Install | |\n| ======= | |\n| uv pip install \"locus-sdk[oci]==0.2.0b19\" langgraph langchain-openai | |\n| Env | |\n| === | |\n| OPENAI_API_KEY=sk-proj-… | |\n| OCI_PROFILE=BOAT-OC1 | |\n| OCI_COMPARTMENT_ID=ocid1.compartment.oc1..… # saasobservai prod | |\n| OCI_REGION=us-chicago-1 | |\n| \"\"\" | |\n| from __future__ import annotations | |\n| import asyncio | |\n| import operator | |\n| import os | |\n| import time | |\n| from typing import Annotated, Any, TypedDict | |\n| from pydantic import BaseModel, Field | |\n| # ---------- LangGraph ------------------------------------------------------ | |\n| from langgraph.graph import StateGraph as LGStateGraph, START as LG_START, END as LG_END | |\n| from langgraph.checkpoint.memory import MemorySaver as LGMemorySaver | |\n| from langchain_openai import ChatOpenAI | |\n| # ---------- Locus ---------------------------------------------------------- | |\n| from locus.multiagent import StateGraph as LocusStateGraph | |\n| from locus.multiagent.graph import START as LOCUS_START, END as LOCUS_END | |\n| from locus.models import get_model as locus_get_model | |\n| from locus.core.reducers import Reducer # noqa: F401 — exists check | |\n| TOPIC = \"Should a backend team standardising on Python adopt LangGraph in production?\" | |\n| TARGET_CONFIDENCE = 0.85 | |\n| MAX_ITER = 3 | |\n| RESEARCH_PROMPT = ( | |\n| \"You are a tight research analyst. Topic: {topic}\\n\" | |\n| \"Existing notes so far ({n_existing}): {notes_so_far}\\n\" | |\n| \"Produce TWO new, non-overlapping bullet-point findings (≤ 25 words each). \" | |\n| \"Numeric or factual claims preferred. Output the two bullets only.\" | |\n| ) | |\n| WRITE_PROMPT = ( | |\n| \"You are a technical writer. Topic: {topic}\\n\" | |\n| \"Research notes:\\n{notes}\\n\\n\" | |\n| \"Write a 3-sentence verdict paragraph. Be concrete and decisive.\" | |\n| ) | |\n| REVIEW_PROMPT = ( | |\n| \"You are a sceptical editor. 
Topic: {topic}\\nDraft:\\n{draft}\\n\\n\" | |\n| \"Score the draft from 0.0 to 1.0 on (factual confidence + decisiveness + \" | |\n| \"coverage). Reply with ONLY the number, e.g. '0.82'. No explanation.\" | |\n| ) | |\n| # =========================================================================== | |\n| # Model adapters: give each framework a sync `complete(prompt) -> str` callable (a closure, not a lambda) | |\n| # =========================================================================== | |\n| def _make_openai_complete(): | |\n| llm = ChatOpenAI(model=\"gpt-4o-mini\", api_key=os.environ[\"OPENAI_API_KEY\"], temperature=0.2) | |\n| def complete(prompt: str) -> str: | |\n| return llm.invoke(prompt).content.strip() | |\n| return complete | |\n| def _make_oci_complete(): | |\n| model = locus_get_model( | |\n| \"oci:openai.gpt-4o-mini\", | |\n| profile=os.environ.get(\"OCI_PROFILE\", \"BOAT-OC1\"), | |\n| compartment_id=os.environ[\"OCI_COMPARTMENT_ID\"], | |\n| region=os.environ.get(\"OCI_REGION\", \"us-chicago-1\"), | |\n| max_tokens=400, | |\n| temperature=0.2, | |\n| ) | |\n| # locus.ModelProtocol exposes async `complete(messages, tools=None)`.
| |\n| from locus.core.messages import Message | |\n| def complete(prompt: str) -> str: | |\n| msgs = [Message(role=\"user\", content=prompt)] | |\n| resp = asyncio.run(model.complete(msgs)) | |\n| # ModelResponse → text | |\n| return (resp.content or \"\").strip() | |\n| return complete | |\n| def _parse_score(text: str) -> float: | |\n| for tok in text.replace(\",\", \" \").split(): | |\n| try: | |\n| v = float(tok) | |\n| if 0.0 <= v <= 1.0: | |\n| return v | |\n| except ValueError: | |\n| continue | |\n| return 0.5 # fallback | |\n| # =========================================================================== | |\n| # IMPLEMENTATION A — LangGraph | |\n| # =========================================================================== | |\n| class LGState(TypedDict, total=False): | |\n| topic: str | |\n| notes: Annotated[list[str], operator.add] # reducer = list append | |\n| draft: str | |\n| confidence: float | |\n| iter: int | |\n| def build_langgraph(): | |\n| complete = _make_openai_complete() | |\n| def research(state: LGState) -> dict: | |\n| notes_so_far = \"\\n\".join(state.get(\"notes\", [])) or \"(none)\" | |\n| out = complete(RESEARCH_PROMPT.format( | |\n| topic=state[\"topic\"], n_existing=len(state.get(\"notes\", [])), | |\n| notes_so_far=notes_so_far, | |\n| )) | |\n| new_notes = [l.strip(\"-• \").strip() for l in out.splitlines() if l.strip()] | |\n| return {\"notes\": new_notes[:2], \"iter\": state.get(\"iter\", 0) + 1} | |\n| def write(state: LGState) -> dict: | |\n| notes = \"\\n\".join(f\"- {n}\" for n in state[\"notes\"]) | |\n| return {\"draft\": complete(WRITE_PROMPT.format(topic=state[\"topic\"], notes=notes))} | |\n| def review(state: LGState) -> dict: | |\n| score_text = complete(REVIEW_PROMPT.format(topic=state[\"topic\"], draft=state[\"draft\"])) | |\n| return {\"confidence\": _parse_score(score_text)} | |\n| def decide(state: LGState) -> str: | |\n| if state[\"confidence\"] >= TARGET_CONFIDENCE or state.get(\"iter\", 0) >= MAX_ITER: | 
|\n| return LG_END | |\n| return \"research\" | |\n| g = LGStateGraph(LGState) | |\n| g.add_node(\"research\", research) | |\n| g.add_node(\"write\", write) | |\n| g.add_node(\"review\", review) | |\n| g.add_edge(LG_START, \"research\") | |\n| g.add_edge(\"research\", \"write\") | |\n| g.add_edge(\"write\", \"review\") | |\n| g.add_conditional_edges(\"review\", decide, {\"research\": \"research\", LG_END: LG_END}) | |\n| return g.compile(checkpointer=LGMemorySaver()) | |\n| # =========================================================================== | |\n| # IMPLEMENTATION B — Locus StateGraph | |\n| # =========================================================================== | |\n| class LocusState(BaseModel): | |\n| topic: str = \"\" | |\n| notes: list[str] = Field(default_factory=list) | |\n| draft: str = \"\" | |\n| confidence: float = 0.0 | |\n| iter: int = 0 | |\n| def build_locus_graph(): | |\n| complete = _make_oci_complete() | |\n| def research(state: dict) -> dict: | |\n| notes_so_far = \"\\n\".join(state.get(\"notes\", []) or []) or \"(none)\" | |\n| out = complete(RESEARCH_PROMPT.format( | |\n| topic=state[\"topic\"], n_existing=len(state.get(\"notes\", []) or []), | |\n| notes_so_far=notes_so_far, | |\n| )) | |\n| new_notes = [l.strip(\"-• \").strip() for l in out.splitlines() if l.strip()][:2] | |\n| # Manual append — Locus' reducer behavior is verified separately below. 
| |\n| merged = list(state.get(\"notes\", []) or []) + new_notes | |\n| return {\"notes\": merged, \"iter\": (state.get(\"iter\", 0) or 0) + 1} | |\n| def write(state: dict) -> dict: | |\n| notes = \"\\n\".join(f\"- {n}\" for n in (state.get(\"notes\") or [])) | |\n| return {\"draft\": complete(WRITE_PROMPT.format(topic=state[\"topic\"], notes=notes))} | |\n| def review(state: dict) -> dict: | |\n| score_text = complete(REVIEW_PROMPT.format(topic=state[\"topic\"], draft=state[\"draft\"])) | |\n| return {\"confidence\": _parse_score(score_text)} | |\n| def decide(state: dict) -> str: | |\n| if (state.get(\"confidence\") or 0.0) >= TARGET_CONFIDENCE or (state.get(\"iter\") or 0) >= MAX_ITER: | |\n| return LOCUS_END | |\n| return \"research\" | |\n| g = LocusStateGraph(state_schema=LocusState) | |\n| g.add_node(\"research\", research) | |\n| g.add_node(\"write\", write) | |\n| g.add_node(\"review\", review) | |\n| g.add_edge(LOCUS_START, \"research\") | |\n| g.add_edge(\"research\", \"write\") | |\n| g.add_edge(\"write\", \"review\") | |\n| g.add_conditional_edges(\"review\", decide, {\"research\": \"research\", LOCUS_END: LOCUS_END}) | |\n| return g.compile() | |\n| # =========================================================================== | |\n| # Driver | |\n| # =========================================================================== | |\n| def run_langgraph(graph) -> dict: | |\n| t0 = time.monotonic() | |\n| state = graph.invoke( | |\n| {\"topic\": TOPIC, \"notes\": [], \"iter\": 0}, | |\n| config={\"configurable\": {\"thread_id\": \"t1\"}, \"recursion_limit\": 50}, | |\n| ) | |\n| return {**state, \"_elapsed\": time.monotonic() - t0} | |\n| def run_locus(graph) -> dict: | |\n| # NOTE: docs example uses `graph.compile().run_sync(...)` but only async | |\n| # entry points exist (`ainvoke`, `astream`, `execute`, `stream`). 
Adapter: | |\n| t0 = time.monotonic() | |\n| result = asyncio.run(graph.ainvoke({\"topic\": TOPIC, \"notes\": [], \"iter\": 0})) | |\n| state = result.final_state if hasattr(result, \"final_state\") else result | |\n| if hasattr(state, \"model_dump\"): | |\n| state = state.model_dump() | |\n| return {**state, \"_elapsed\": time.monotonic() - t0} | |\n| # =========================================================================== | |\n| # Equivalence verifier — runs my prose claims past reality | |\n| # =========================================================================== | |\n| def verify_equivalence_claims() -> list[str]: | |\n| \"\"\"Returns a list of failure strings — empty means every claim checks out.\"\"\" | |\n| fails: list[str] = [] | |\n| add = fails.append | |\n| # Claim: builder name + signature parity | |\n| try: | |\n| from locus.multiagent import StateGraph as LG_locus # noqa: N814 | |\n| LG_locus(state_schema=LocusState) | |\n| except Exception as e: # noqa: BLE001 | |\n| add(f\"StateGraph(state_schema=…) builder shape failed in Locus: {e!r}\") | |\n| # Claim: Send / Command / interrupt importable from claimed paths | |\n| try: | |\n| from locus.core.send import Send # noqa: F401 | |\n| from locus.core.command import Command # noqa: F401 | |\n| from locus.core.interrupt import interrupt # noqa: F401 | |\n| except Exception as e: # noqa: BLE001 | |\n| add(f\"Send/Command/interrupt import paths failed: {e!r}\") | |\n| # Claim: Functional API (@entrypoint / @task) | |\n| try: | |\n| from locus.multiagent.functional import entrypoint, task # noqa: F401 | |\n| except Exception as e: # noqa: BLE001 | |\n| add(f\"@entrypoint / @task not importable from locus.multiagent.functional: {e!r}\") | |\n| # Claim: Per-node RetryPolicy + CachePolicy | |\n| try: | |\n| from locus.multiagent.graph import RetryPolicy, CachePolicy # noqa: F401 | |\n| except Exception as e: # noqa: BLE001 | |\n| add(f\"RetryPolicy/CachePolicy not at locus.multiagent.graph: {e!r}\") 
| |\n| # Claim: draw_mermaid + draw_ascii both ship | |\n| try: | |\n| from locus.multiagent.visualize import draw_mermaid, draw_ascii # noqa: F401 | |\n| except Exception as e: # noqa: BLE001 | |\n| add(f\"draw_mermaid/draw_ascii not importable: {e!r}\") | |\n| # Claim: GraphConfig holds interrupt_before/interrupt_after/checkpointer | |\n| try: | |\n| from locus.multiagent.graph import GraphConfig | |\n| cfg = GraphConfig(interrupt_before=[\"a\"], interrupt_after=[\"b\"], checkpointer=None) | |\n| assert cfg.interrupt_before == [\"a\"] and cfg.interrupt_after == [\"b\"] | |\n| except Exception as e: # noqa: BLE001 | |\n| add(f\"GraphConfig interrupt_before/after wiring failed: {e!r}\") | |\n| # Claim: StreamMode enum has values / updates / nodes / custom | |\n| try: | |\n| from locus.multiagent.graph import StreamMode | |\n| expected = {\"values\", \"updates\", \"nodes\", \"custom\"} | |\n| present = {m.value for m in StreamMode} | |\n| missing = expected - present | |\n| if missing: | |\n| add(f\"StreamMode missing modes: {missing} (got {present})\") | |\n| except Exception as e: # noqa: BLE001 | |\n| add(f\"StreamMode not importable: {e!r}\") | |\n| # Claim: Reducer-from-Pydantic extraction (extract_reducers_from_model) | |\n| try: | |\n| from locus.core.reducers import extract_reducers_from_model # noqa: F401 | |\n| except Exception as e: # noqa: BLE001 | |\n| add(f\"extract_reducers_from_model missing from locus.core.reducers: {e!r}\") | |\n| # Build a compiled graph once for the next several checks. 
| |\n| from locus.multiagent.graph import START as _S, END as _E | |\n| g0 = LocusStateGraph(state_schema=LocusState) | |\n| g0.add_node(\"n\", lambda s: {}) | |\n| g0.add_edge(_S, \"n\"); g0.add_edge(\"n\", _E) | |\n| compiled = g0.compile() | |\n| # Claim from docs/concepts/multi-agent/graph.md (line ~73): | |\n| # `result = graph.compile().run_sync({...})` | |\n| if not hasattr(compiled, \"run_sync\"): | |\n| add(\"docs/concepts/multi-agent/graph.md shows `compiled.run_sync(...)` \" | |\n| \"but no such method exists on the compiled graph \" | |\n| \"(only async: ainvoke/astream/execute/stream).\") | |\n| # Claim from same doc: `graph.compile().get_mermaid()` | |\n| if not hasattr(compiled, \"get_mermaid\"): | |\n| add(\"docs/concepts/multi-agent/graph.md shows `compiled.get_mermaid()` \" | |\n| \"but the actual API is `from locus.multiagent.visualize import \" | |\n| \"draw_mermaid; draw_mermaid(compiled)`.\") | |\n| # Claim from prose: sync `invoke()` parity with LangGraph's CompiledGraph | |\n| if not hasattr(compiled, \"invoke\"): | |\n| add(\"No sync `compiled.invoke(...)` — LangGraph users expect this \" | |\n| \"as the standard sync entry. Locus exposes only async ainvoke().\") | |\n| # Verify get_graph() returns something useful for rendering. | |\n| if hasattr(compiled, \"get_graph\"): | |\n| sub = compiled.get_graph() | |\n| if not hasattr(sub, \"draw_mermaid\"): | |\n| add(\"`compiled.get_graph()` returns a StateGraph rather than a \" | |\n| \"render-capable object — LangGraph's CompiledGraph.get_graph() \" | |\n| \"returns a Graph with `.draw_mermaid()` / `.draw_ascii()` / \" | |\n| \"`.draw_png()`. 
Convenience gap.\") | |\n| return fails | |\n| # =========================================================================== | |\n| # Main | |\n| # =========================================================================== | |\n| def main() -> None: | |\n| for var in (\"OPENAI_API_KEY\", \"OCI_COMPARTMENT_ID\"): | |\n| if not os.environ.get(var): | |\n| raise SystemExit(f\"{var} not set\") | |\n| print(\"=\" * 76) | |\n| print(\" LANGGRAPH vs LOCUS — same workflow, same model family (gpt-4o-mini)\") | |\n| print(f\" topic : {TOPIC}\") | |\n| print(f\" target conf : {TARGET_CONFIDENCE} max iter: {MAX_ITER}\") | |\n| print(\"=\" * 76) | |\n| # ----- Equivalence claim check ---------------------------------------- | |\n| print(\"\\n— API equivalence claims (verified against current Locus source) —\") | |\n| fails = verify_equivalence_claims() | |\n| if not fails: | |\n| print(\" ✓ Every claim from the prose comparison resolves cleanly.\") | |\n| else: | |\n| print(f\" ✗ {len(fails)} claim(s) didn't hold up:\") | |\n| for f in fails: | |\n| print(f\" • {f}\") | |\n| # ----- Run both graphs ------------------------------------------------- | |\n| print(\"\\n— Building + running LangGraph (real OpenAI gpt-4o-mini) —\") | |\n| lg_graph = build_langgraph() | |\n| lg_state = run_langgraph(lg_graph) | |\n| print(f\" iters={lg_state['iter']} notes={len(lg_state['notes'])} \" | |\n| f\"conf={lg_state['confidence']:.2f} elapsed={lg_state['_elapsed']:.1f}s\") | |\n| print(\"\\n— Building + running Locus StateGraph (OCI gpt-4o-mini via BOAT-OC1) —\") | |\n| locus_graph = build_locus_graph() | |\n| locus_state = run_locus(locus_graph) | |\n| print(f\" iters={locus_state['iter']} notes={len(locus_state['notes'])} \" | |\n| f\"conf={locus_state['confidence']:.2f} elapsed={locus_state['_elapsed']:.1f}s\") | |\n| # ----- Side-by-side ---------------------------------------------------- | |\n| print(\"\\n\" + \"=\" * 76) | |\n| print(\" SIDE-BY-SIDE FINAL STATE\") | |\n| print(\"=\" * 
76) | |\n| rows: list[tuple[str, Any, Any]] = [ | |\n| (\"iters\", lg_state[\"iter\"], locus_state[\"iter\"]), | |\n| (\"notes\", len(lg_state[\"notes\"]), len(locus_state[\"notes\"])), | |\n| (\"confidence\", f\"{lg_state['confidence']:.2f}\", | |\n| f\"{locus_state['confidence']:.2f}\"), | |\n| (\"elapsed s\", f\"{lg_state['_elapsed']:.2f}\", | |\n| f\"{locus_state['_elapsed']:.2f}\"), | |\n| ] | |\n| print(f\" {'metric':<14}{'LangGraph':<20}{'Locus':<20}\") | |\n| for name, a, b in rows: | |\n| print(f\" {name:<14}{str(a):<20}{str(b):<20}\") | |\n| print(\"\\n— LangGraph draft —\") | |\n| print(\" \" + lg_state[\"draft\"].replace(\"\\n\", \"\\n \")) | |\n| print(\"\\n— Locus draft —\") | |\n| print(\" \" + locus_state[\"draft\"].replace(\"\\n\", \"\\n \")) | |\n| # ----- Mermaid from each ---------------------------------------------- | |\n| print(\"\\n— Mermaid (LangGraph) —\") | |\n| try: | |\n| print(lg_graph.get_graph().draw_mermaid()) | |\n| except Exception as e: # noqa: BLE001 | |\n| print(f\" (LangGraph Mermaid raised: {e!r})\") | |\n| print(\"— Mermaid (Locus) —\") | |\n| try: | |\n| from locus.multiagent.visualize import draw_mermaid as locus_draw_mermaid | |\n| print(locus_draw_mermaid(locus_graph)) | |\n| except Exception as e: # noqa: BLE001 | |\n| print(f\" (Locus Mermaid raised: {e!r})\") | |\n| # ----- Verdict --------------------------------------------------------- | |\n| print(\"\\n\" + \"=\" * 76) | |\n| if fails: | |\n| print(f\" RESULT: workflow ran in both; {len(fails)} doc/API gap(s) for Locus follow-up:\") | |\n| print(\"=\" * 76) | |\n| for i, f in enumerate(fails, 1): | |\n| print(f\"\\n Issue #{i}\\n {f}\") | |\n| else: | |\n| print(\" RESULT: workflow ran in both; every doc claim verified.\") | |\n| print(\"=\" * 76) | |\n| if __name__ == \"__main__\": | |\n| main() |", "url": "https://wpnews.pro/news/langgraph-vs-locus-stategraph-same-workflow-same-model-gpt-4o-mini-side-by-side", "canonical_source": 
"https://gist.github.com/fede-kamel/cb45aeac259bb995f8be23a5d7ca6965", "published_at": "2026-05-22 00:59:29+00:00", "updated_at": "2026-05-26 00:34:33.230291+00:00", "lang": "en", "topics": ["large-language-models", "ai-agents", "ai-tools", "ai-infrastructure", "generative-ai"], "entities": ["LangGraph", "Locus StateGraph", "OpenAI", "OCI GenAI", "BOAT-OC1", "gpt-4o-mini", "Locus SDK", "saasobservai"], "alternates": {"html": "https://wpnews.pro/news/langgraph-vs-locus-stategraph-same-workflow-same-model-gpt-4o-mini-side-by-side", "markdown": "https://wpnews.pro/news/langgraph-vs-locus-stategraph-same-workflow-same-model-gpt-4o-mini-side-by-side.md", "text": "https://wpnews.pro/news/langgraph-vs-locus-stategraph-same-workflow-same-model-gpt-4o-mini-side-by-side.txt", "jsonld": "https://wpnews.pro/news/langgraph-vs-locus-stategraph-same-workflow-same-model-gpt-4o-mini-side-by-side.jsonld"}}