{"slug": "i-built-a-3b-lease-risk-scanner-that-runs-without-an-external-llm-api", "title": "I built a 3B lease risk scanner that runs without an external LLM API", "summary": "A developer built Lease Lens, a 3B-parameter contract risk scanner that runs entirely without an external LLM API, for the Hugging Face Build Small Hackathon. The fine-tuned Llama 3.2 3B model achieved a 242% relative F1 improvement over the base model and outperformed an 8B fine-tune on legal clause extraction. Lease Lens analyzes leases for risky clauses, highlights them in the source text, and drafts negotiation emails, all while keeping private data local.", "body_md": "I built **Lease Lens** for the [Hugging Face Build Small Hackathon](https://huggingface.co/build-small-hackathon).\n\nThe idea is simple: most people sign contracts they do not really read.\n\nThat is true for apartment leases, freelance agreements, gym memberships, SaaS terms, and small-business office leases. The risk is not that every contract is malicious. The risk is that a normal person can miss a renewal clause, late-fee stack, deposit condition, indemnity clause, repair burden, or arbitration waiver until it is too late.\n\nLease Lens is a small-model contract review assistant. 
It reads a lease or contract, finds risky clauses, quotes the exact language, highlights it in the source text, scores the contract, and drafts a plain-English negotiation email.\n\nDemo: [https://youtu.be/M-v3OAKO5-k](https://youtu.be/M-v3OAKO5-k)\n\nSpace: [https://huggingface.co/spaces/build-small-hackathon/lease-lens](https://huggingface.co/spaces/build-small-hackathon/lease-lens)\n\nGitHub: [https://github.com/bO-05/lease-lens](https://github.com/bO-05/lease-lens)\n\nModel: [https://huggingface.co/giladam01/lease-lens-legal-3b](https://huggingface.co/giladam01/lease-lens-legal-3b)\n\nGGUF: [https://huggingface.co/giladam01/lease-lens-legal-3b-gguf](https://huggingface.co/giladam01/lease-lens-legal-3b-gguf)\n\nFor this problem, the small-model constraint is not just a hackathon rule. It is part of the product.\n\nContracts can contain private addresses, payments, business terms, and personal details. A user should not have to send that text to a closed external LLM API just to understand whether a lease contains obvious risk.\n\nLease Lens runs the model inside the Hugging Face Space and also ships a GGUF build for local llama.cpp / Ollama usage. The app does **not** call an external LLM API.\n\nThat gives the project a clear target:\n\nThe app checks for common contract risk categories:\n\nFor every accepted flag, Lease Lens shows:\n\nThen it can draft a negotiation email from the grounded flags.\n\nIt is not legal advice. 
It is a review assistant: evidence first, user judgment second.\n\nThe shipped model is a fine-tuned **Llama 3.2 3B** legal extraction model.\n\nI fine-tuned on CUAD-style legal clause extraction and evaluated on 100 held-out CUAD extraction items with the same setup across models.\n\nThe headline result:\n\n| Model | F1 | Exact match |\n|---|---|---|\n| Llama 3.2 3B base | 0.119 | 0.010 |\n| Lease Lens 3B | 0.406 | 0.280 |\n| Llama 3.1 8B base | 0.206 | 0.020 |\n| my 8B fine-tune | 0.357 | 0.230 |\n\nThe 3B fine-tune improved F1 by about **+242% relative over the base 3B model** and even beat my own 8B fine-tune on the same held-out items.\n\nThat is the part I like most about the project: small did not mean worse by default. For a specific extraction task, a tuned 3B model was enough to become useful.\n\nThe first version had an important failure mode: when trained mostly on positive examples, the bare model over-extracted on absent clause types. In other words, it was too eager to find something.\n\nSo the app does not trust generation alone.\n\nLease Lens wraps the model with deterministic guards:\n\nFor long contracts, the app reads the first 80k characters, splits the text into overlapping windows, routes each clause category only to windows containing relevant keywords, and runs the checks as a batched generation call.\n\nThis makes the output less magical, but much more inspectable. A user can look at the quote, look at the highlighted source text, and decide whether it matters.\n\nThe Space includes real executed commercial leases from SEC EDGAR filings.\n\nThat matters because benchmark scores are not enough. 
A demo can look good on short synthetic examples and then fall apart on actual legal documents.\n\nThe built-in examples include:\n\nThe Boston example is a good quick demo: Lease Lens finds 3 grounded flags and catches the exact `$125,301.33` security-deposit clause.\n\nThe Addison example is a stress test: long text, partial coverage, and enough complexity to show why the UI needs to be evidence-first instead of just a chatbot answer.\n\nI started with a Gradio app, but the final submission needed to feel less like a stock demo and more like a focused tool.\n\nThe current UI is a \"redline legal evidence desk\":\n\nThe goal is that a judge can understand the whole product path in under a minute:\n\nI used Modal for the v2.5 training path and smoke verification.\n\nThe smoke run used an A100-40GB, loaded a CUAD smoke split of 400 positives and 100 synthesized NONE examples, trained for 60 steps, and completed cleanly in about 160 seconds. I kept the run as `--no-push` evidence so it verified the Modal path without overwriting the published model.\n\nThe repo also includes the training script:\n\n[https://github.com/bO-05/lease-lens/blob/main/training/finetune_legal_3b_modal_v2.py](https://github.com/bO-05/lease-lens/blob/main/training/finetune_legal_3b_modal_v2.py)\n\nFor local usage, I published a GGUF build:\n\n```\nollama pull hf.co/giladam01/lease-lens-legal-3b-gguf\n```\n\nI also built/finalized the submission with OpenAI Codex as my coding agent. The public GitHub history contains Codex-attributed commits, and the repo includes a Codex build log:\n\n[https://github.com/bO-05/lease-lens/blob/main/docs/codex-build-log.md](https://github.com/bO-05/lease-lens/blob/main/docs/codex-build-log.md)\n\nThere are still obvious next steps:\n\nThe big lesson for me was that a small legal model can be useful if the product does not ask it to be a lawyer.\n\nAsk it to extract. Ground the quote. Highlight the evidence. Show the limitation. 
Let the human decide.\n\nThat is the shape of Lease Lens.\n\nDemo: [https://youtu.be/M-v3OAKO5-k](https://youtu.be/M-v3OAKO5-k)\n\nLive Space: [https://huggingface.co/spaces/build-small-hackathon/lease-lens](https://huggingface.co/spaces/build-small-hackathon/lease-lens)\n\nGitHub: [https://github.com/bO-05/lease-lens](https://github.com/bO-05/lease-lens)\n\nModel: [https://huggingface.co/giladam01/lease-lens-legal-3b](https://huggingface.co/giladam01/lease-lens-legal-3b)\n\nField notes: [https://huggingface.co/blog/giladam01/lease-lens-article](https://huggingface.co/blog/giladam01/lease-lens-article)", "url": "https://wpnews.pro/news/i-built-a-3b-lease-risk-scanner-that-runs-without-an-external-llm-api", "canonical_source": "https://dev.to/asynchronope/i-built-a-3b-lease-risk-scanner-that-runs-without-an-external-llm-api-170a", "published_at": "2026-06-14 19:36:59+00:00", "updated_at": "2026-06-14 20:11:06.933919+00:00", "lang": "en", "topics": ["large-language-models", "artificial-intelligence", "developer-tools", "ai-products"], "entities": ["Hugging Face", "Llama 3.2 3B", "Lease Lens", "CUAD", "SEC EDGAR", "llama.cpp", "Ollama"], "alternates": {"html": "https://wpnews.pro/news/i-built-a-3b-lease-risk-scanner-that-runs-without-an-external-llm-api", "markdown": "https://wpnews.pro/news/i-built-a-3b-lease-risk-scanner-that-runs-without-an-external-llm-api.md", "text": "https://wpnews.pro/news/i-built-a-3b-lease-risk-scanner-that-runs-without-an-external-llm-api.txt", "jsonld": "https://wpnews.pro/news/i-built-a-3b-lease-risk-scanner-that-runs-without-an-external-llm-api.jsonld"}}