{"slug": "notes-on-adversarial-paraphrasing-a-paper-review", "title": "Notes on adversarial paraphrasing: a paper review", "summary": "A paper by Saha et al. (arXiv 2506.07001) demonstrates that detector-guided paraphrasing using RoBERTa as a reward reduces the true positive rate of AI-generated text detectors by 87.88% across Binoculars, Fast-DetectGPT, Ghostbuster, RADAR, and GPTZero. The approach is universal and training-free, and it remains effective even against detectors trained with adversarial examples, suggesting the discriminator signal is narrower than the generator space.", "body_md": "Just finished reading Saha et al. (arXiv 2506.07001) on adversarial paraphrasing for AI detector evasion.\n\nKey claim: detector-guided paraphrasing with RoBERTa as the reward reduces TPR by 87.88% across Binoculars, Fast-DetectGPT, Ghostbuster, RADAR, and GPTZero. The approach is universal and training-free.\n\nWhat surprised me: the approach works even on detectors that were trained with adversarial examples baked in. This suggests the discriminator signal is fundamentally narrower than the generator space.\n\nOpen questions:", "url": "https://wpnews.pro/news/notes-on-adversarial-paraphrasing-a-paper-review", "canonical_source": "https://dev.to/deitch83919/notes-on-adversarial-paraphrasing-a-paper-review-214o", "published_at": "2026-06-24 03:24:40+00:00", "updated_at": "2026-06-24 03:43:46.426402+00:00", "lang": "en", "topics": ["large-language-models", "natural-language-processing", "ai-safety", "ai-research"], "entities": ["Saha et al.", "RoBERTa", "Binoculars", "Fast-DetectGPT", "Ghostbuster", "RADAR", "GPTZero"], "alternates": {"html": "https://wpnews.pro/news/notes-on-adversarial-paraphrasing-a-paper-review", "markdown": "https://wpnews.pro/news/notes-on-adversarial-paraphrasing-a-paper-review.md", "text": "https://wpnews.pro/news/notes-on-adversarial-paraphrasing-a-paper-review.txt", "jsonld": "https://wpnews.pro/news/notes-on-adversarial-paraphrasing-a-paper-review.jsonld"}}