
Redact or Keep? A Fully Local AI Cascade for Educational Dialogue De-Identification

Researchers propose a fully local AI cascade for de-identifying educational dialogue that reaches 0.958 macro F1 on math tutoring transcripts, outperforming both a commercial API (0.706) and a same-family LLM-only baseline (0.767) while running entirely on a single laptop. The system reframes de-identification as constrained privacy triage: a recall-first proposer over-generates candidate spans, and a context-aware reviewer decides whether each one is a personal name to redact or a curricular term like 'Riemann' to keep.

Published Jun 18, 2026

arXiv:2606.18372v1

Abstract: Educational dialogue is a valuable but sensitive resource for research: the same transcripts that capture authentic learning often capture personally identifiable information (PII) entangled with curricular content, where "Riemann" may refer to a real student or to a mathematical concept. Existing approaches force a tradeoff between governance and accuracy. Commercial Large Language Models (LLMs) can handle this ambiguity but require sending student data to third parties, while local named entity recognition (NER) systems preserve governance but over-redact curricular terms. We propose a fully local cascade framework that reframes de-identification from open-ended entity recognition to constrained privacy triage. A recall-first union proposer combines two lightweight encoders with deterministic rules to over-generate candidate spans; a context-aware reviewer then makes a binary Redact/Keep decision for each candidate using surrounding dialogue and speaker role. We evaluate three reviewer configurations against same-family LLM-only baselines and a commercial API on math tutoring transcripts from two large platforms. The strongest local configuration reaches 0.958 macro F1, compared with 0.767 for a same-family LLM-only baseline and 0.706 for the commercial API, while running entirely on a single laptop. On a targeted challenge set of curricular-personal name ambiguity, the same configuration degrades by only 0.03 F1 versus 0.19 to 0.25 for smaller reviewers. These results suggest that for educational de-identification, problem formulation matters more than model scale.
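The propose-then-review structure described in the abstract can be illustrated with a minimal Python sketch. Nothing here is the authors' implementation: the two proposers are regex stand-ins for the lightweight encoders and deterministic rules, the reviewer is a toy keyword heuristic standing in for the context-aware model, and every name (Span, union_propose, toy_reviewer, deidentify, the cue lists) is hypothetical. The sketch only shows the control flow: over-generate candidate spans, then make one binary Redact/Keep decision per candidate.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Span:
    start: int
    end: int
    text: str


Proposer = Callable[[str], Iterable[Span]]
Reviewer = Callable[[Span, str, str], str]


def capitalized_word_proposer(utterance: str) -> List[Span]:
    # Assumption: stand-in for an encoder-based NER model. It flags every
    # capitalized word, which deliberately over-generates candidates.
    return [Span(m.start(), m.end(), m.group())
            for m in re.finditer(r"\b[A-Z][a-z]+\b", utterance)]


def email_rule_proposer(utterance: str) -> List[Span]:
    # Assumption: one example of a deterministic rule (email addresses).
    return [Span(m.start(), m.end(), m.group())
            for m in re.finditer(r"[\w.+-]+@[\w-]+\.[\w.-]+", utterance)]


def union_propose(utterance: str, proposers: Iterable[Proposer]) -> List[Span]:
    # Recall-first union: keep any span that at least one proposer emits.
    candidates = set()
    for propose in proposers:
        candidates.update(propose(utterance))
    return sorted(candidates, key=lambda s: (s.start, s.end))


# Hypothetical cue lists for the toy reviewer below.
CURRICULAR_CUES = {"sum", "integral", "hypothesis", "theorem"}
ADDRESS_CUES = {"hi", "hello", "thanks"}


def toy_reviewer(span: Span, utterance: str, speaker_role: str) -> str:
    # Assumption: placeholder for the context-aware reviewer. A real reviewer
    # would read surrounding dialogue turns and use speaker_role; this toy
    # heuristic looks only at adjacent words and ignores speaker_role.
    if "@" in span.text:
        return "Redact"
    following = re.match(r"\W*([A-Za-z]+)", utterance[span.end:])
    if following and following.group(1).lower() in CURRICULAR_CUES:
        return "Keep"  # e.g. "Riemann sum" is curricular content
    preceding = re.findall(r"[A-Za-z]+", utterance[:span.start])
    if preceding and preceding[-1].lower() in ADDRESS_CUES:
        return "Redact"  # e.g. "Hi Riemann" addresses a person
    return "Keep"


def deidentify(utterance: str, speaker_role: str,
               proposers: Iterable[Proposer], reviewer: Reviewer) -> str:
    to_redact = [s for s in union_propose(utterance, proposers)
                 if reviewer(s, utterance, speaker_role) == "Redact"]
    # Merge overlapping or touching spans so each region is replaced once.
    merged: List[List[int]] = []
    for s in to_redact:
        if merged and s.start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], s.end)
        else:
            merged.append([s.start, s.end])
    # Replace right to left so earlier offsets stay valid.
    for start, end in reversed(merged):
        utterance = utterance[:start] + "[REDACTED]" + utterance[end:]
    return utterance


PROPOSERS = [capitalized_word_proposer, email_rule_proposer]
EXAMPLE = ("Hi Riemann, can you compute the Riemann sum? "
           "Email me at tutor@example.com")

print(deidentify(EXAMPLE, "tutor", PROPOSERS, toy_reviewer))
# prints: Hi [REDACTED], can you compute the Riemann sum? Email me at [REDACTED]
```

In this toy run the proposer stage flags "Hi", both occurrences of "Riemann", "Email", and the email address. The reviewer keeps the curricular "Riemann sum" and the ordinary capitalized words while redacting the addressed name and the email. The point of the split is that the proposer can afford to be noisy, because precision is recovered by a narrow binary decision rather than by open-ended entity recognition.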
