Show HN: Model Due Diligence

A developer released model-due-diligence, an open-source Python CLI tool that performs static supply-chain security checks on local AI model files and repositories before they are imported into runtimes like Ollama or llama.cpp. The tool scans for unsafe serialization, suspicious content, exposed secrets, and weak provenance, generating reports to help users identify obvious risks without loading or executing the models.

model-due-diligence is a Python command-line tool for performing static supply-chain due diligence on local AI model files and cloned model repositories before they are imported into runtimes such as Ollama, llama.cpp, LM Studio or Transformers.

It is designed to help answer one practical question:

“Is there obvious static evidence that this model artefact or repository should not be trusted, loaded or run without further review?”

It reduces practical risk from unsafe serialisation, suspicious repository content, weak provenance, exposed secrets, unexpected binaries, unsafe dependency files and malformed model metadata.

It does not prove that a model is safe. A clean report means only that this tool did not identify the specific static artefact risks it is designed to detect. It must not be treated as proof that model weights, repository content, runtime behaviour or downstream use are benign.
- What the tool does
- What the tool does not do
- Architecture
- Scanner coverage
- Risk scoring
- Install
- Quick start
- CLI reference
- Example workflows
- Reports and outputs
- Recommended operating model
- Development workflow
- Testing and quality gates
- Repository structure
- Security posture
- Standards alignment
- Limitations
- Roadmap
- Contributing
- Licence

model-due-diligence statically inspects a local path and generates reviewable evidence. It checks:

- file inventory, SHA-256 hashes, permissions and symlinks;
- high-risk serialisation formats such as pickle, `.pt`, `.pth`, `.bin`, `.joblib` and H5;
- lower-risk model formats such as `.gguf`, `.safetensors` and `.onnx`;
- GGUF magic bytes and version metadata;
- safetensors header metadata;
- suspicious text and binary strings;
- Python AST indicators such as `eval`, `exec`, `compile`, `pickle.loads`, `os.system` and `subprocess`;
- `trust_remote_code=True` usage in Python and text files;
- risky pickle-like byte markers in high-risk serialisation formats;
- high-entropy non-model files;
- Git provenance, origin remote, current commit, dirty worktree and Git LFS listing where available;
- external scanner output from ModelScan, Semgrep, Bandit, pip-audit and detect-secrets;
- optional quality self-checks using Ruff, Pyright and mypy.

The tool produces:

- a human-readable Markdown report;
- a deterministic JSON report for automation;
- an optional SARIF report for code-scanning workflows;
- raw external scanner outputs where external tools are run.

The tool is intentionally static.
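To illustrate what static inspection can mean in practice, here is a minimal sketch of GGUF and safetensors metadata checks that read only a few bytes from disk and never load tensors. It relies on the published file layouts: a GGUF file starts with the ASCII bytes `GGUF` followed by a little-endian 32-bit version, and a safetensors file starts with a little-endian 64-bit header length followed by a JSON header. The function names and the `MAX_HEADER_BYTES` cap are assumptions made for this example, not the tool's actual implementation.

```python
import json
import struct
from pathlib import Path
from typing import Optional

GGUF_MAGIC = b"GGUF"
# Hypothetical sanity cap so a corrupt length field cannot trigger a huge read.
MAX_HEADER_BYTES = 100 * 1024 * 1024


def read_gguf_version(path: Path) -> Optional[int]:
    """Return the GGUF version, or None if the magic bytes do not match."""
    with path.open("rb") as fh:
        head = fh.read(8)
    if len(head) < 8 or head[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack("<I", head[4:8])
    return version


def read_safetensors_header(path: Path) -> Optional[dict]:
    """Return the parsed JSON header, or None if it is missing or malformed."""
    with path.open("rb") as fh:
        raw_len = fh.read(8)
        if len(raw_len) < 8:
            return None
        (header_len,) = struct.unpack("<Q", raw_len)
        if header_len > MAX_HEADER_BYTES:
            return None
        raw = fh.read(header_len)
    if len(raw) != header_len:
        return None  # file is shorter than its declared header
    try:
        header = json.loads(raw)
    except ValueError:  # covers invalid JSON and invalid text encoding
        return None
    return header if isinstance(header, dict) else None
```

A `None` result from either function would be a reason to flag the file for review, not evidence of malice on its own.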
During normal scanning it does not:

- load model weights;
- import untrusted repository code;
- execute model-specific scripts;
- run model inference;
- send artefacts to external services;
- require network access for local scanning;
- decide automatically that a model is safe.

Static scanning cannot reliably detect:

- malicious behaviour encoded directly into model weights;
- sleeper-agent or trigger-based backdoors;
- training-data poisoning;
- benchmark-specific manipulation;
- malicious behaviour that appears only after fine-tuning;
- malicious behaviour that appears only after tools are connected;
- prompt-injection obedience in downstream RAG or agent workflows;
- data exfiltration behaviour that only appears at runtime;
- vulnerabilities in local model runtimes;
- all unsafe deserialisation evasions.

Use it as a risk-reduction gate, not as a trust oracle.

The project uses a modular monolith architecture. This keeps installation and local execution simple while maintaining clear internal boundaries between CLI, orchestration, scanners, risk scoring and reports.
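As a rough illustration of what one native scanner behind those boundaries might look like, the sketch below parses Python source with the standard `ast` module and reports the call indicators the tool says it looks for. It parses the text only and never imports or executes it. The names `Finding`, `RISKY_CALLS`, `RISKY_MODULES` and `scan_python_source` are hypothetical and are not taken from the project's source.

```python
import ast
from dataclasses import dataclass
from typing import List

# Indicators named in the README; the exact rule set used by the tool may differ.
RISKY_CALLS = {"eval", "exec", "compile", "pickle.loads", "os.system"}
RISKY_MODULES = {"subprocess"}


@dataclass(frozen=True)
class Finding:
    rule: str
    line: int


def dotted_name(node: ast.AST) -> str:
    """Rebuild a name such as 'os.system' from a call target, or '' if dynamic."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""


def scan_python_source(source: str) -> List[Finding]:
    """Return findings sorted by line; the source is parsed, never executed."""
    try:
        tree = ast.parse(source)
    except (SyntaxError, ValueError):
        return [Finding("unparseable-python", 0)]
    findings = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        if name in RISKY_CALLS or name.split(".")[0] in RISKY_MODULES:
            findings.append(Finding(f"call:{name}", node.lineno))
        for kw in node.keywords:
            is_true = isinstance(kw.value, ast.Constant) and kw.value.value is True
            if kw.arg == "trust_remote_code" and is_true:
                findings.append(Finding("kwarg:trust_remote_code=True", node.lineno))
    return sorted(findings, key=lambda f: (f.line, f.rule))
```

Keeping each scanner as a pure function from input to a list of findings is one way to let an orchestrator combine results without the scanners knowing about scoring or report formats. Name-based matching like this is easy to evade (for example through aliasing or `getattr`), which is consistent with the limitations listed above.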
    flowchart LR
        user[User / CI] --> cli[CLI]
        cli --> app[Application Orchestrator]
        app --> inventory[File Inventory]
        app --> native[Native Static Scanners]
        app --> external[External Scanner Adapters]
        app --> risk[Risk Scorer]
        risk --> report_model[Audit Report Model]
        app --> report_model
        report_model --> markdown[Markdown Report]
        report_model --> json[JSON Report]
        report_model --> sarif[SARIF Report]
        native --> text[Text Patterns]
        native --> ast[Python AST]
        native --> binary[Binary Strings]
        native --> entropy[Entropy]
        native --> metadata[Model Metadata]
        native --> pickle[Pickle Heuristics]
        native --> git[Git Provenance]
        external --> modelscan[ModelScan]
        external --> semgrep[Semgrep]
        external --> bandit[Bandit]
        external --> pipaudit[pip-audit]
        external --> secrets[detect-secrets]
        external --> quality[Quality Self-Checks]

    sequenceDiagram
        participant U as User / CI
        participant C as CLI
        participant A as App
        participant I as Inventory
        participant N as Native Scanners
        participant E as External Scanners
        participant R as Risk Scorer
        participant W as Report Writers
        U->>C: mdd