How to Build a Document Intelligence Backend with iii Using Workers, Functions, and Cron Triggers

A new tutorial demonstrates how to build a document intelligence backend using the iii framework, combining Workers, Functions, and Cron Triggers for text analysis workflows. The guide walks through installing the iii engine, connecting Python workers, and registering functions for text normalization, tokenization, sentiment analysis, keyword extraction, reporting, and heartbeat tracking. Developers can then execute the same logic through direct invocation, HTTP endpoints, fire-and-forget execution, or scheduled cron triggers, creating a production-like backend system rather than a static notebook demo.

In this tutorial, we build a document-intelligence workflow with iii (https://github.com/iii-hq/iii). We begin by installing the iii engine and Python SDK, then start the engine as a background process and connect a Python worker to it. After the setup, we register separate functions for text normalization, tokenization, sentiment analysis, keyword extraction, reporting, and heartbeat tracking. We then combine these functions into a single analysis pipeline and run the same logic via direct invocation, an HTTP endpoint, fire-and-forget execution, and a scheduled cron trigger. Along the way, we also track basic runtime state, making the workflow feel closer to a real backend system than a static notebook demo. The full code is available here: https://github.com/MARKTECHPOST-AI-MEDIA-INC/AI-Agents-Projects-Tutorials/blob/main/Distributed%20Systems/iii_live_document_intelligence_backend_marktechpost.py

```python
import os, sys, subprocess, time, socket, json, threading
from collections import Counter

# Make sure the user-local bin directory is on PATH so the iii binary is found.
HOME = os.path.expanduser("~")
BIN_DIR = f"{HOME}/.local/bin"
os.environ["PATH"] = BIN_DIR + os.pathsep + os.environ.get("PATH", "")

def sh(cmd):
    """Echo a shell command, run it, and raise if it fails."""
    print(f"$ {cmd}")
    subprocess.run(cmd, shell=True, check=True)

# Install the engine only if it is not already present.
if not os.path.exists(f"{BIN_DIR}/iii"):
    sh(f"curl -fsSL https://install.iii.dev/iii/main/install.sh | BIN_DIR={BIN_DIR} sh")
sh(f"{sys.executable} -m pip install -q iii-sdk requests")

III = f"{BIN_DIR}/iii"
sh(f"{III} --version")
```

We start by importing the required Python modules and setting up the local binary path for the iii engine. We define a small helper function to run shell commands and install the iii engine if it is not already available. We also install the Python SDK and the requests package, then verify the iii installation by checking its version.
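The "install only if missing, then verify" step above can be written as a reusable check. The following is a minimal sketch that uses only the standard library. The helper name `binary_version` and its `extra_dir` parameter are illustrative assumptions, not part of the tutorial or of iii.

```python
import os
import shutil
import subprocess
import sys

def binary_version(name, extra_dir=None):
    """Return the first line of `<name> --version`, or None if the binary is not found.

    `extra_dir` is an optional directory searched before the regular PATH
    (for example a user-local bin directory).
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        search_path = extra_dir + os.pathsep + search_path
    path = shutil.which(name, path=search_path)
    if path is None:
        return None
    out = subprocess.run([path, "--version"], capture_output=True, text=True, check=True)
    # Some tools print their version to stderr instead of stdout.
    text = (out.stdout or out.stderr).strip()
    return text.splitlines()[0] if text else ""

# A name that should not exist yields None, which would signal "install needed".
missing = binary_version("no-such-binary-xyz-123")
# The running Python interpreter is used here as a binary that certainly exists.
present = binary_version(sys.executable)
print(missing, present)
```

In the tutorial's setting one would call something like `binary_version("iii", extra_dir=BIN_DIR)` and run the installer only when the result is `None`.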
```python
WS_URL, HTTP_URL = "ws://localhost:49134", "http://localhost:3111"

# Start the engine in the background and send its output to a log file.
engine_log = open("/tmp/iii-engine.log", "w")
engine = subprocess.Popen([III, "--use-default-config"], stdout=engine_log, stderr=subprocess.STDOUT)

def wait_port(host, port, timeout=90):
    """Poll until a TCP connection to (host, port) succeeds or the timeout expires."""
    end = time.time() + timeout
    while time.time() < end:
        with socket.socket() as s:
            s.settimeout(1)
            try:
                s.connect((host, port))
                return True
            except OSError:
                time.sleep(0.5)
    return False

assert wait_port("localhost", 49134), "engine never came up - see /tmp/iii-engine.log"
print(f"✓ engine up - WS {WS_URL} | HTTP {HTTP_URL}")

from iii import register_worker
try:
    from iii import TriggerAction
except Exception:
    TriggerAction = None  # older SDK builds may not expose fire-and-forget actions

worker = register_worker(WS_URL)

# Shared in-memory state, guarded by a lock because handlers may run concurrently.
STATE = {"docs_analyzed": 0, "heartbeats": 0, "keyword_totals": Counter()}
LOCK = threading.Lock()
POSITIVE = {"good","great","love","excellent","happy","fast","reliable","amazing","best","win"}
NEGATIVE = {"bad","terrible","hate","slow","broken","sad","worst","bug","crash","fail"}
```

We launch the iii engine as a background process and wait for its WebSocket port to become available. We then connect a Python worker to the running engine and prepare optional support for fire-and-forget triggers. We also define a shared in-memory state, a thread lock, and simple positive and negative word sets for sentiment analysis.
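The port-polling idea behind `wait_port` can be tried without the engine. The sketch below is a standalone variant under a few assumptions: it uses `socket.create_connection` and `time.monotonic`, adds an `interval` parameter that the tutorial's version does not have, and uses a throwaway local listener as a stand-in for the engine's WebSocket port.

```python
import socket
import time

def wait_port(host, port, timeout=5.0, interval=0.1):
    """Return True once a TCP connect to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            # Nothing is listening yet: back off briefly and retry.
            time.sleep(interval)
    return False

# Throwaway listener on an OS-assigned free port, standing in for the engine.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
open_port = server.getsockname()[1]

up = wait_port("127.0.0.1", open_port, timeout=2)

# After the listener is closed, the same port should refuse connections,
# so the helper is expected to give up once the short timeout elapses.
server.close()
down = wait_port("127.0.0.1", open_port, timeout=0.5)
print(up, down)
```

Polling the port rather than sleeping for a fixed time lets the script continue as soon as the engine is ready while still failing with a clear message if it never starts.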
```python
def normalize(data):
    return {"text": (data.get("text") or "").strip().lower()}

def tokenize(data):
    text = data.get("text", "")
    # Replace punctuation with spaces, then split on whitespace.
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in text)
    tokens = [t for t in cleaned.split() if t]
    return {"tokens": tokens, "count": len(tokens)}

def sentiment(data):
    toks = data.get("tokens", [])
    pos = sum(t in POSITIVE for t in toks)
    neg = sum(t in NEGATIVE for t in toks)
    score = pos - neg
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {"label": label, "score": score, "pos": pos, "neg": neg}

def keywords(data):
    toks = data.get("tokens", [])
    stop = {"the","a","an","is","it","to","of","and","in","for","on","how"}
    freq = Counter(t for t in toks if t not in stop and len(t) > 2)
    return {"keywords": freq.most_common(data.get("top_n", 5))}

def analyze(data):
    # Every step is routed through the engine rather than called directly.
    norm = worker.trigger({"function_id": "text::normalize", "payload": {"text": data.get("text", "")}})
    toks = worker.trigger({"function_id": "text::tokenize", "payload": norm})
    sent = worker.trigger({"function_id": "text::sentiment", "payload": toks})
    keys = worker.trigger({"function_id": "text::keywords", "payload": {**toks, "top_n": data.get("top_n", 5)}})
    with LOCK:
        STATE["docs_analyzed"] += 1
        for k, c in keys["keywords"]:
            STATE["keyword_totals"][k] += c
        n = STATE["docs_analyzed"]
    return {"tokens": toks["count"], "sentiment": sent, "keywords": keys["keywords"], "docs_analyzed": n}

def report(data):
    with LOCK:
        return {"docs_analyzed": STATE["docs_analyzed"], "heartbeats": STATE["heartbeats"],
                "top_keywords_all_docs": STATE["keyword_totals"].most_common(5)}

def http_analyze(data):
    body = data.get("body") or {}
    result = worker.trigger({"function_id": "pipeline::analyze", "payload": body})
    return {"status_code": 200, "body": result, "headers": {"Content-Type": "application/json"}}

def heartbeat(data):
    with LOCK:
        STATE["heartbeats"] += 1
    return {"ok": True}

for fid, fn in [
    ("text::normalize", normalize), ("text::tokenize", tokenize), ("text::sentiment", sentiment),
    ("text::keywords", keywords), ("pipeline::analyze", analyze), ("stats::report", report),
    ("http::analyze", http_analyze), ("cron::heartbeat", heartbeat),
]:
    worker.register_function(fid, fn)
```

We define the core functions used in the text-analysis workflow, including normalization, tokenization, sentiment detection, and keyword extraction. We then create an analysis function that routes each step through the iii engine instead of calling everything directly. We also add reporting, HTTP handling, and heartbeat functions before registering all of them with the worker.

```python
worker.register_trigger({"type": "http", "function_id": "http::analyze",
                         "config": {"api_path": "/analyze", "http_method": "POST"}})

cron_ok = False
try:
    # Assumed six-field expression (every 2 seconds); verify against the engine's cron format.
    worker.register_trigger({"type": "cron", "function_id": "cron::heartbeat",
                             "config": {"schedule": "*/2 * * * * *"}})
    cron_ok = True
except Exception as e:
    print("cron trigger skipped:", e)

try:
    worker.connect()
except Exception:
    pass
time.sleep(2)
```

We register an HTTP trigger so that the analysis pipeline can be invoked via a POST request. We also try to register a cron trigger that runs the heartbeat function on a fixed schedule, while safely skipping it if the engine build does not support that schema. We then connect the worker and pause briefly so the registered functions and triggers are ready to use.
```python
print("\n=== [A] Direct invocation - orchestrated through the engine ===")
docs = [
    "iii makes the backend amazing and fast, I love how reliable it is",
    "The legacy gateway was slow and broken, a terrible buggy experience",
    "Workers register functions and triggers; the engine routes every call",
]
for d in docs:
    r = worker.trigger({"function_id": "pipeline::analyze", "payload": {"text": d, "top_n": 4}})
    print(f"  {r['sentiment']['label']:>8}  tokens={r['tokens']:>2}  keywords={r['keywords']}")

print("\n=== [B] The SAME function over HTTP (:3111) - zero handler changes ===")
import requests
try:
    resp = requests.post(f"{HTTP_URL}/analyze",
                         json={"text": "great great product, best ever", "top_n": 3}, timeout=10)
    print("  HTTP", resp.status_code, "->", resp.json())
except Exception as e:
    print("  HTTP call failed (engine HTTP module/version?):", e)

print("\n=== [C] Fire-and-forget invocation ===")
if TriggerAction:
    worker.trigger({"function_id": "pipeline::analyze",
                    "payload": {"text": "async win, no waiting"},
                    "action": TriggerAction.Void()})
    print("  dispatched (no result awaited)")
else:
    print("  TriggerAction not in this SDK build - skipping")

print("\n=== [D] Cron trigger firing on its own ===")
if cron_ok:
    time.sleep(5)  # give the scheduler time to fire a few heartbeats
    print("  heartbeats so far:",
          worker.trigger({"function_id": "stats::report", "payload": {}})["heartbeats"])
else:
    print("  cron not registered on this engine build")

print("\n=== [E] Aggregate state report ===")
print(json.dumps(worker.trigger({"function_id": "stats::report", "payload": {}}), indent=2))

print("\nTraces/metrics: run iii console locally, or scrape Prometheus at :9464")
print("engine log tail:")
print(subprocess.run(["tail", "-n", "8", "/tmp/iii-engine.log"], capture_output=True, text=True).stdout)
```

We test the complete iii workflow by sending sample text documents through the registered analysis pipeline. We then call the same logic through HTTP, try fire-and-forget execution, and check whether the cron heartbeat is running.
Finally, we print the aggregate state report and show the engine log tail for basic runtime visibility.

In conclusion, we have a working iii system that processes text using modular, registered functions rather than a single fixed script. We analyzed sample documents, exposed the pipeline through HTTP, tested async-style execution, tracked heartbeat activity, and printed an aggregate state report. The tutorial keeps the example readable while showing the main working pattern of iii: define functions once, register them with a worker, and reuse them through different triggers and execution paths. It also shows how small functions can be cleanly connected as the workflow grows into something more production-ready.

Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.