# Notes on adversarial paraphrasing: a paper review

> Source: <https://dev.to/deitch83919/notes-on-adversarial-paraphrasing-a-paper-review-214o>
> Published: 2026-06-24 03:24:40+00:00

Just finished reading Saha et al. (arXiv:2506.07001) on adversarial paraphrasing for evading AI-text detectors.

Key claim: detector-guided paraphrasing, with a RoBERTa-based detector supplying the reward signal, reduces the true positive rate (TPR) by 87.88 percent across Binoculars, Fast-DetectGPT, Ghostbuster, RADAR, and GPTZero. The attack is presented as universal and training-free.
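To make "detector-guided" concrete for myself, here is a toy sketch of one way such guidance could work: at each step a paraphraser offers a few candidate continuations, and the one the detector scores as least AI-like is kept. This is my own illustration and not necessarily the paper's exact procedure. `guided_paraphrase`, `toy_detector`, and the `AI_FLAVORED` word list are hypothetical stand-ins; a real setup would use a language model to propose candidates and a trained classifier to score them.

```python
from typing import Callable, List, Sequence


def guided_paraphrase(
    candidates_per_step: Sequence[Sequence[str]],
    detector_score: Callable[[str], float],
) -> str:
    """Greedily keep, at each step, the candidate the detector scores lowest.

    candidates_per_step is a hypothetical stand-in for what a paraphrasing
    model would propose at each decoding step.
    """
    chosen: List[str] = []
    for candidates in candidates_per_step:
        prefix = " ".join(chosen)
        # Score every possible continuation of the text built so far and
        # keep the one that looks least machine-written to the detector.
        best = min(
            candidates,
            key=lambda c: detector_score((prefix + " " + c).strip()),
        )
        chosen.append(best)
    return " ".join(chosen)


# Toy detector (assumption, for illustration only): the fraction of words
# drawn from a small list of stereotypically "AI-sounding" words.
AI_FLAVORED = {"delve", "moreover", "utilize", "furthermore"}


def toy_detector(text: str) -> float:
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in AI_FLAVORED for w in words) / len(words)


steps = [
    ["Moreover", "Also"],
    ["we"],
    ["utilize", "use"],
    ["simple", "furthermore"],
    ["tools"],
]
result = guided_paraphrase(steps, toy_detector)
print(result)  # Also we use simple tools
```

Nothing is trained or fine-tuned in this loop, which is how I read the "training-free" part: the detector only ranks candidates that the paraphraser would have produced anyway.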

What surprised me: the approach works even on detectors that were trained with adversarial examples baked in. This suggests the signal a discriminator learns is fundamentally narrower than the space of text a generator can produce.

Open questions: