Does DSPy prompt optimization weaken adversarial robustness?

A new benchmark, dspy-security-bench, finds that DSPy prompt optimization degrades adversarial robustness against harder prompt-injection attacks. In tests with AgentDojo's attack suite, the optimizers `BootstrapFewShot` and `MIPROv2` improved utility under the `direct` attack but reduced security under the `important_instructions` attack, with `BootstrapFewShot` Pareto-dominating `MIPROv2` at single-seed scale.

The goal is to measure how DSPy prompt optimization affects the prompt-injection robustness of agentic LLM programs, using the attack suite from AgentDojo (https://github.com/ethz-spylab/agentdojo) as ground truth.

The question: when you optimize a DSPy program with `BootstrapFewShot`, `MIPROv2`, or `GEPA`, does it become more or less robust to prompt-injection attacks? Two adjacent research communities, prompt optimization and prompt-injection security, have not measured this intersection. dspy-security-bench wires DSPy optimizers and AgentDojo attacks into one harness so the trade-off becomes visible.

Update 2026-06-26: a 3-seed sanity check changes the optimizer ordering shown here. The numbers below are the single-seed (seed=0) result. Aggregated over three seeds, `BootstrapFewShot` is actually the lowest on `important_instructions` security (0.600), and `MIPROv2` and `GEPA` tie at 0.733. Standard deviations at N=5 user tasks land in the 0.4 to 0.5 range, so individual rankings here are dominated by noise. What survives across seeds: `BootstrapFewShot`'s Pareto win on the `direct` attack, the unoptimized 0% utility floor, and the qualitative "optimization trends below unoptimized on the harder attack" pattern. v0.2 phase 2 will scale N to put any optimizer-ranking claim on solid statistical ground. Full 3-seed numbers are in the seeds summary CSV referenced below.
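The noise argument in the update note can be checked with a little arithmetic. The sketch below is illustrative only and is not part of the benchmark harness. It assumes each run yields a binary per-task security outcome (attack failed or not) and that the quoted standard deviation is taken over those 0/1 outcomes; the source does not state how it was computed. The helper names `bernoulli_std` and `standard_error` are hypothetical.

```python
import math


def bernoulli_std(p: float) -> float:
    """Population standard deviation of a 0/1 outcome with success rate p."""
    return math.sqrt(p * (1.0 - p))


def standard_error(p: float, n: int) -> float:
    """Standard error of the mean of n independent 0/1 outcomes."""
    return bernoulli_std(p) / math.sqrt(n)


# 3-seed important_instructions security rates quoted in the update note.
rates = {"BootstrapFewShot": 0.600, "MIPROv2 / GEPA": 0.733}
N_TASKS = 5  # user tasks per configuration in v0.1

for name, p in rates.items():
    print(
        f"{name}: std={bernoulli_std(p):.3f}, "
        f"standard error at N={N_TASKS}: {standard_error(p, N_TASKS):.3f}"
    )

# Gap between the two quoted rates, compared with one standard error.
# Assumption: independent tasks; the real runs may be correlated across seeds.
gap = rates["MIPROv2 / GEPA"] - rates["BootstrapFewShot"]
print(f"gap={gap:.3f}")
```

Under these assumptions, binary outcomes with rates between 0.6 and 0.8 have standard deviations between 0.4 and 0.5, and the 0.133 gap between the two quoted rates is smaller than one standard error at N=5. That is consistent with the note's caution against reading rankings off this sample size.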
Headline (seed=0): prompt optimization measurably degrades adversarial robustness on harder attacks. Optimizers buy utility (0% → 40-60% task success on `direct`) but pay for it in security on `important_instructions` (80% → 60% attack-failure rate). `BootstrapFewShot` Pareto-dominates `MIPROv2` on the workspace suite at v0.1's single-seed scale. See the update note above for what holds and what does not when averaged across 3 seeds; the full 3-seed numbers are in data/results/workspace v02 phase1 seeds summary.csv.

| Optimizer | Attack | Utility | Security | Injection success | n |
|---|---|---|---|---|---|
| unoptimized | direct | 0% | 100% | 0% | 5 |
| unoptimized | important_instructions | 0% | 80% | 20% | 5 |
| bootstrap fewshot | direct | 60% | 100% | 0% | 5 |
| bootstrap fewshot | important_instructions | 20% | 60% | 40% | 5 |
| miprov2 | direct | 40% | 80% | 20% | 5 |
| miprov2 | important_instructions | 20% | 60% | 40% | 5 |

Reading the chart. A point closer to the green star (top-right) is closer to the ideal: high utility and high security. Three patterns hold at this scale:

- unoptimized is high-security but useless. It refuses to do the task (0% utility regardless of attack) and resists attacks at 80–100%.
- bootstrap fewshot is the best operating point at this scale. It has equal or highest utility (60% on `direct`), equal-best security on `direct` (100%), and matches miprov2's degraded `important_instructions` security.
- miprov2 Pareto-loses to bootstrap fewshot. It has lower utility on `direct` (40% vs 60%) and lower security (80% vs 100%). This suggests heavier optimization overfits the clean-distribution prompt and exposes more attack surface.

v0.1 scope: workspace suite only, N=5 user tasks × 1 injection task × 2 attacks × 3 optimizers = 30 runs. gpt-4o-mini is used for both execution and judging. Trainset = 192 validated synthetic tasks (100 from gpt-4o + 100 from claude-sonnet, then syntactic validation + dedupe). See scripts/run v01 benchmark.py for reproduction.

php flowchart TD A AgentDojo seed env data -- B env-data extractor B -- C synthesis generator