22:12
2026-06-24
arxiv.org
large-language-models
LLMs use "safety"-specific neuron layers to identify vulnerabilities in code
Researchers at an undisclosed institution analyzed Gemma-2-2b's internal circuits and found that the LLM detects vulnerable code by relying on safety detectors that recognize safe coding patterns, rat…