| defmodule MyApp.Prompts.Audit do | |
| @moduledoc """ | |
| Prompts for the audit pipeline. Two entry points: | |
| * audit_file/4 - embeds a single source file in the prompt and | |
| runs MyApp.CodingAgent against it. Style is :simple or | |
| :deep; the executor picks based on audit.strategy. | |
| * audit_directory/2 - whole-package audit. Spawns the agent with | |
| :cwd set to the source dir so it can use Read/Grep/Bash. | |
| """ | |
| alias MyApp.CodingAgent | |
| @per_file_timeout to_timeout(minute: 10) | |
| @whole_timeout to_timeout(hour: 1) | |
| @default_effort "max" | |
| @sink_classes """ | |
| Sink classes - every place dangerous logic could live, regardless of whether | |
| the input currently looks hostile. Enumerate first, judge second. | |
| * Code execution - eval, dynamic dispatch on a computed name (apply, | |
| Code.eval_*, :erlang.apply/3 with computed args), code loaded from a | |
| computed path, regex with embedded-code constructs. | |
| * Command execution - System.cmd, :os.cmd, Port.open({:spawn, …}), | |
| shelling out where args are built by concatenation rather than passed as | |
| a list. | |
| * File operations - File.read/write/rm/cp/ln/chmod where the path is | |
| computed; Code.require_file / Code.eval_file with dynamic paths. | |
| * Path handling - Path.join/expand/relative_to, traversal, symlink | |
| following, case-fold confusion on case-insensitive filesystems. | |
| * Archive extraction - :erl_tar, :zip, any unpack where entry names | |
| become filesystem paths (zip-slip). | |
| * Deserialisation - :erlang.binary_to_term/1 (no :safe), | |
| Plug.Crypto.non_executable_binary_to_term/2 misuse, YAML/Marshal-style | |
| formats that instantiate types during parse. | |
| * Template / interpolation - values reaching another interpreted context | |
| without escaping for it: HTML, SQL via raw fragments, EEx/Phoenix | |
| raw/1, shell, regex, format strings, log lines. | |
| * Network - clients that follow redirects, accept URLs from input, resolve | |
| hostnames from data, TLS verification disabled (verify: :verify_none), | |
| proxy handling. | |
| * Validation - predicates whose contract is "this is safe": the sink is | |
| the return value, the danger is returning the wrong answer. | |
| * Cryptography - KDF parameters, IV reuse, mode/padding, MAC verification, | |
| == on secrets instead of Plug.Crypto.secure_compare/2. | |
| * Memory safety - Rust unsafe, raw pointers, unchecked indexing, FFI, | |
| transmute. For NIFs: lifetime/aliasing across the BEAM boundary. | |
| * Shared mutable state - Application.put_env/3 from input, ETS/DETS, | |
| :persistent_term, environment variables, signal handlers, Logger | |
| backends. One input poisoning what another sees. | |
| * Concurrency - check-then-act sequences a racer can interleave: file | |
| existence before open, permission before access, GenServer state read | |
| then written without serialisation. | |
| * Resource consumption - atom leaks (String.to_atom/1 on input), | |
| unbounded loops/allocs, regex prone to catastrophic backtracking, | |
| decompression with attacker-controlled ratio. | |
| * Reflection / metaprogramming gadgets the library installs into the | |
| caller - __using__ macros, @before_compile, telemetry handler | |
| attaches, Logger backends, monkeypatched callbacks. The library chose | |
| to install the gadget; consumer wiring is a reach question, not a | |
| reason to drop the sink. | |
| * Round-trip integrity - pairs meant to be inverses: encode/decode, | |
| parse/serialize, marshal/unmarshal. The sink is the pair. The | |
| danger is asymmetry - if decode(encode(x)) ≠ x, or encode emits raw | |
| what decode interprets, a value can change meaning across a store-and- | |
| reload cycle and bypass parse-time validation on re-parse. | |
| """ | |
| @per_file_deep_methodology """ | |
| ## Methodology | |
| Two phases. Don't skip phase 1 - skipping it is what makes audits miss bugs. | |
| Phase 1 - inventory. List every sink in this file using the sink classes | |
| below. Don't judge any of them yet - a sink is dangerous-if-input-is-hostile, | |
| regardless of whether you currently think the input is hostile. Grep | |
| exhaustively for the language's primitives in each class. | |
| Phase 2 - for each sink in your inventory, in order: | |
| 1. Trace - where does the value come from? If it's a hardcoded constant | |
| or internal data only, write "internal" and stop. | |
| 2. Boundary - does it originate from a function parameter exposed | |
| publicly, or some other source crossing a trust boundary? The | |
| library's caller is not the attacker - but data the caller | |
| forwards from the network, from disk, or from deserialisation is. | |
| 3. Validate - sketch a one-paragraph reproduction (input → effect). | |
| If a guard in the file rules it out, name the guard and stop. | |
| 4. Impact - what is the real-world impact? What can an attacker who | |
| exploits this actually do? Explain this in simple terms and plain language. | |
| 5. Rate - Critical / High / Medium / Low. | |
| Every sink ends up either as a finding or in ## Ruled out with the | |
| step that disqualified it. | |
| """ | |
| @whole_methodology """ | |
| ## Methodology | |
| Two phases. Phase 1 is an inventory - write it down before judging anything. | |
| Two runs against the same source should produce the same inventory. | |
| ### Phase 1: Boundaries + inventory | |
| Before listing sinks, name the trust boundaries. For a small library this | |
| is one or two lines: who calls it, what they pass, where external data | |
| enters. Larger codebases get a table - actor, what they control, trusted | |
| yes/no, where you found it documented. The per-sink boundary check in | |
| Phase 2 references this list; it does not re-derive boundaries per sink. | |
| Then enumerate every sink. For each: file, line, sink class, what it | |
| consumes. Don't judge any of them yet - a sink is dangerous-if-input-is- | |
| hostile, regardless of whether you currently think the input is hostile. | |
| Grep exhaustively for the language's primitives in each class. | |
| ### Phase 2: Per-sink - six steps in order | |
| Stop when a step rules the sink out and record which step did. Every | |
| inventory sink ends up either in findings or in ruled_out. | |
| 1. Trace - backwards from sink to a boundary. Name each hop. If the | |
| value never crosses a boundary, write "internal" and stop. | |
| 2. Boundary - which boundary from Phase 1 does it cross? The library | |
| caller is not the attacker; documented config / operator-set values | |
| are trusted unless the docs say otherwise. Cite the doc. Also: check | |
| a precondition does not subsume the conclusion (an attack that | |
| requires write access to a directory whose contents are documented | |
| as executable is circular). | |
| 3. Validate - write a reproduction script. For Elixir, a short .exs | |
| under scripts/{package_name}/{short_description}.exs runnable via | |
| Mix.install is ideal. DO NOT execute it; the human will. Paste the | |
| script in the validation field. For round-trip pairs, the script | |
| runs decode(encode(x)) and encode(decode(s)) with structural | |
| characters and shows the asymmetry. | |
| 4. Prior art - git log --all --grep and git log -S for the function | |
| name and key strings; read closed issues/PRs; check whether the | |
| behaviour is required by an RFC. If a maintainer already declined, | |
| quote the comment. | |
| 5. Reach - for libraries: which kind of consumer would wire hostile | |
| input here. You don't have dependents data; reason about plausible | |
| call patterns. "No plausible exposed caller" is data, not a verdict. | |
| 6. Rate - severity + confidence. Critical = works on a fresh install, | |
| no preconditions. High = realistic preconditions a normal deployment | |
| satisfies. Medium = significant attacker positioning, unusual config, | |
| or a chain. Low = unrealistic preconditions or narrow impact. | |
| """ | |
| @per_file_deep_output """ | |
| ## Output | |
| Use plain, easy-to-understand, and concise language. Focus on the real-world | |
| impact of the findings. | |
| If the file has no sinks at all (truly nothing dangerous-looking to even | |
| consider), output exactly: | |
| No findings. | |
| Otherwise, for each finding output one block in this format: | |
| ### <Short title> | |
| Severity: Critical | High | Medium | Low | |
| Location: <relative/path>:<line> | <relative/path>:<line_start>-<line_end> | |
| Class: <sink class> | |
| Trace: <one short paragraph backwards from sink to where the | |
| value enters this file> | |
| Boundary: <which trust boundary the input crosses, or "internal"> | |
| Impact: <a short paragraph on the impact of the finding> | |
| Validation: <one short paragraph reproduction sketch - input that | |
| would trigger the sink and what dangerous behaviour follows. If a | |
| guard in the file blocks it, name the guard.> | |
| Suggested fix: <one or two sentences> | |
| Then, if any sinks were considered and dropped, append: | |
| ## Ruled out | |
| - <file>:<line> (<sink class>, step N) - <one-sentence reason> | |
| Listing ruled-out sinks is required when phase 1 found any - it's how the | |
| audit demonstrates it considered them. No preamble, no overall summary. | |
| """ | |
| @whole_output """ | |
| ## Output | |
| Always output the full report - boundaries and inventory must be present | |
| even when nothing rises to a finding. Format: | |
| ## Trust boundaries | |
| | Actor | Trusted | Controls | Source | | |
| |-------|---------|----------|--------| | |
| | <name> | yes/no/conditional | <what they control> | <doc citation> | | |
| ## Inventory | |
| | ID | Location | Class | Consumes | | |
| |----|----------|-------|----------| | |
| | S1 | <rel/path>:<line> or <rel/path>:<line_start>-<line_end> | <sink class> | <what it consumes> | | |
| ## Findings | |
| ### F1 - <short title> | |
| Severity: Critical | High | Medium | Low | |
| CWE: CWE-NNN | |
| Location: <rel/path>:<line> | <rel/path>:<line_start>-<line_end> | |
| Sinks: S1[, S2…] | |
| Trace: <markdown> | |
| Boundary: <markdown> | |
| Validation: <markdown - include the reproduction script verbatim | |
| under a fenced code block. Do NOT execute it; the human will.> | |
| Prior art: <markdown - git log / issues / RFC citations> | |
| Reach: <markdown - plausible exposed callers> | |
| Rating: <markdown - severity + confidence rationale> | |
| Suggested fix: <one or two sentences> | |
| ## Ruled out | |
| - S2, S3 (step N) - <one or two sentences> | |
| Use ## Findings\\n\\n_None._ for a clean report - never omit the section. | |
| Every inventory sink ID must appear in either Findings → Sinks: or in | |
| the Ruled out list. No preamble, no overall summary, no closing notes. | |
| """ | |
| @always_flag """ | |
| ## Always-flag | |
| Some sinks are dangerous enough on sight that the trace/boundary check is | |
| skipped - flag every occurrence as a finding even if you can't trace where | |
| the input comes from. | |
| * :erlang.binary_to_term/1, or :erlang.binary_to_term/2 without | |
| :safe in the options list. Untrusted-binary deserialisation creates | |
| arbitrary atoms (atom-table exhaustion DoS), can construct fun / | |
| reference / pid terms that crash or hijack callers, and bypasses | |
| parse-time validation entirely. The safe alternatives are | |
| :erlang.binary_to_term(bin, [:safe]) and | |
| Plug.Crypto.non_executable_binary_to_term/2. Severity: Critical. | |
| Report once per call site. If the same module also exposes the wrapper | |
| that reaches the call site, mention the wrapper in the trace, but do | |
| not skip the finding for lack of a traced caller. | |
| * :erlang.binary_to_term/2 with :safe. :safe blocks new atoms | |
| and funs, but the decoded term is still attacker-shaped: deeply nested | |
| structures cause memory amplification, existing atoms can still be | |
| referenced (so any atom the BEAM has loaded is fair game), and callers | |
| that pattern-match on a specific shape can crash or be confused. Worth | |
| a note so reviewers can confirm the caller validates the result. | |
| Severity: Low. | |
| """ | |
| @simple_prompt """ | |
| You are a senior application security engineer auditing one source file from | |
| an open-source Elixir/Erlang or Rust library. Find real, exploitable | |
| vulnerabilities only - no style, no speculation. | |
| You see this one file in isolation. Flag only bugs you can argue from this | |
| file alone. Skim the file with the vector list below in mind and report | |
| what's actually dangerous; don't write up an inventory or methodology. | |
| #{@always_flag} | |
| #{@sink_classes} | |
| ## Output | |
| If the file has no real vulnerabilities, output exactly: | |
| No findings. | |
| Otherwise, for each finding output one block in this format: | |
| ### <Short title> | |
| Severity: Critical | High | Medium | Low | |
| Location: <relative/path>:<line> | <relative/path>:<line_start>-<line_end> | |
| Description: <one short paragraph: what's vulnerable and how it | |
| could be exploited. If a guard in the file blocks the obvious attack, | |
| name the guard.> | |
| Suggested fix: <one or two sentences> | |
| No preamble, no overall summary, no ruled-out section. | |
| """ | |
| @deep_prompt """ | |
| You are a senior application security engineer auditing one source file from | |
| an open-source Elixir/Erlang or Rust library. Find real, exploitable bugs | |
| only - no style, no speculation. | |
| You see this one file in isolation. You cannot trace inputs across modules | |
| or check reach. Flag only bugs you can argue from this file alone. | |
| #{@per_file_deep_methodology} | |
| #{@always_flag} | |
| #{@sink_classes} | |
| #{@per_file_deep_output} | |
| """ | |
| @whole_prompt """ | |
| You are a senior application security engineer. Audit the open-source | |
| Elixir/Erlang or Rust library in the current working directory for real, | |
| exploitable vulnerabilities. | |
| Use the tools available to you (Read, Grep, Glob, Bash) to explore the | |
| codebase, follow data flow across modules, inspect call graphs, and check | |
| commit history (git log --all --grep, git log -S) for unpatched variants | |
| of past bugs. Spend effort proportional to the package's risk surface. | |
| #{@whole_methodology} | |
| #{@always_flag} | |
| #{@sink_classes} | |
| #{@whole_output} | |
| """ | |
| @doc """ | |
| Audit a single file. style is :simple or :deep; opts may | |
| override :effort and :timeout_ms, and may pass :agent through. | |
| """ | |
| def audit_file(rel_path, content, style, opts \\ []) | |
| when is_binary(rel_path) and is_binary(content) and style in [:simple, :deep] do | |
| CodingAgent.run(build_for_file(style, rel_path, content), | |
| effort: Keyword.get(opts, :effort, @default_effort), | |
| timeout_ms: Keyword.get(opts, :timeout_ms, @per_file_timeout), | |
| agent: Keyword.get(opts, :agent) | |
| ) | |
| end | |
| @doc """ | |
| Audit a whole package. cwd is the source directory the agent runs | |
| in. opts may override :effort and :timeout_ms, and may pass :agent through. | |
| """ | |
| def audit_directory(cwd, opts \\ []) when is_binary(cwd) do | |
| CodingAgent.run(@whole_prompt, | |
| cwd: cwd, | |
| effort: Keyword.get(opts, :effort, @default_effort), | |
| timeout_ms: Keyword.get(opts, :timeout_ms, @whole_timeout), | |
| agent: Keyword.get(opts, :agent) | |
| ) | |
| end | |
| defp build_for_file(style, rel_path, content) do | |
| Enum.join( | |
| [base_for(style), "", "File path: #{rel_path}", "", content, ""], | |
| "\n" | |
| ) | |
| end | |
| defp base_for(:simple), do: @simple_prompt | |
| defp base_for(:deep), do: @deep_prompt | |
| end |
The Prompts I use for finding Vulnerabilities in Elixir/Erlang projects
A developer has created a structured prompt system for auditing Elixir and Erlang projects, defining two entry points: `audit_file/4` for single-file analysis and `audit_directory/2` for whole-package audits. The system enumerates 16 sink classes (including code execution, command execution, file operations, and deserialization) as categories where vulnerabilities may reside, regardless of whether input appears hostile. Per-file audits run in a `:simple` or `:deep` style; the `:deep` style and the whole-package audit follow a two-phase methodology (inventory every sink first, then analyze each one), and both entry points hand the assembled prompt to `MyApp.CodingAgent`.
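The prompts' always-flag rule rates `:erlang.binary_to_term/1` as Critical partly because decoding can create atoms that did not exist before. The sketch below illustrates that one point, assuming a hand-built binary in the Erlang external term format (version byte 131, then tag 119 for a small UTF-8 atom). `TermDecodeDemo` and its helper names are hypothetical and not part of the audit module.

```elixir
defmodule TermDecodeDemo do
  @moduledoc false

  # Build an external-term-format binary that encodes an atom with the given
  # name: 131 = version byte, 119 = SMALL_ATOM_UTF8_EXT, 1-byte length, name.
  def atom_binary(name) when is_binary(name) and byte_size(name) < 256 do
    <<131, 119, byte_size(name), name::binary>>
  end

  # True if an atom with this name is already in the atom table.
  def atom_exists?(name) do
    _ = String.to_existing_atom(name)
    true
  rescue
    ArgumentError -> false
  end

  # Decode with :safe; the runtime raises ArgumentError on unknown atoms.
  def safe_decode(bin) do
    {:ok, :erlang.binary_to_term(bin, [:safe])}
  rescue
    ArgumentError -> :error
  end
end

# The name is generated at runtime so no atom literal pre-creates it.
name = "audit_demo_#{System.unique_integer([:positive])}"
bin = TermDecodeDemo.atom_binary(name)

before_decode = TermDecodeDemo.atom_exists?(name)
safe_result = TermDecodeDemo.safe_decode(bin)
after_safe = TermDecodeDemo.atom_exists?(name)

# Unsafe decode: the atom is created as a side effect of decoding.
decoded = :erlang.binary_to_term(bin)
after_unsafe = TermDecodeDemo.atom_exists?(name)

IO.inspect({before_decode, safe_result, after_safe, after_unsafe})
```

Atoms are never garbage collected, so a caller that feeds untrusted binaries to the unsafe form lets the sender grow the atom table. As the prompt notes, `:safe` does not make the decoded term trustworthy; it only blocks new atoms and funs.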