{"slug": "closing-the-feedback-loop-how-mistake-classification-drives-adaptive-problem-in", "title": "Closing the feedback loop: how mistake classification drives adaptive problem selection in NumPath", "summary": "The NumPath AI math tutor for children with dyscalculia fixed a flaw in its adaptive engine: although the system classified student mistakes (e.g., digit reversal, borrowing errors), it ignored these classifications when selecting the next problem. The solution was a 60-line code change that implements a rule-based system: if a specific mistake type appears at least twice in the student's last three recorded mistakes, the system reduces the difficulty of the associated knowledge component. This change closes the feedback loop between error diagnosis and problem selection, making the system a true Intelligent Tutoring System rather than a simple difficulty slider.", "body_md": "NumPath is an AI math tutor for children with dyscalculia. At its core is an adaptive engine that picks the next problem for each student based on their Bayesian Knowledge Tracing (BKT) mastery estimate. In this post I'll walk through a problem we had, and solved, in the rule-based phase: classified mistakes were being logged but completely ignored by the selection engine.\nThe fix was a 60-line change across two files. The research implication is significant.\nOur MistakeClassifier already tagged every wrong answer with a structured code: BORROW_SKIP when a student adds instead of subtracting with borrowing, DIGIT_REVERSAL when they write 51 for 15, MAGNITUDE_MISJUDGE when they pick the smaller number as the larger. These MistakeEvent records were hitting the database on every incorrect attempt.\nBut GetNextProblemUseCase, the code that decides what problem a student gets next, never read them. The engine was selecting problems purely on BKT p_mastery. 
A student could hit BORROW_SKIP three sessions in a row and still receive problems at the same difficulty, on the same skill, with zero response to the pattern.\nThis violates what MacLellan et al. call the \"Error as Diagnostic Signal\" principle: mistakes should trigger targeted remediation, not generic retry.\nThe core question was: when should a mistake pattern trigger a response, and what should that response be?\nWe settled on three rules, each encoded as a named constant:\nMISTAKE_WINDOW = 3 # look back this many MistakeEvents\n# threshold = ceil(MISTAKE_WINDOW / 2) = 2; dominant code must appear ≥ 2× in window\nMISTAKE_KC_MAP = {\n\"DIGIT_REVERSAL\": \"PLACE_VALUE\",\n\"BORROW_SKIP\": \"SUB_BORROW\",\n\"MAGNITUDE_MISJUDGE\": \"PLACE_VALUE\",\n\"PLACE_VALUE_CONFUSION\": \"PLACE_VALUE\",\n\"OPERATION_CONFUSION\": \"OPERATION_SIGN\",\n}\nWhen _detect_mistake_signal() fires, two things happen:\n- […] p_mastery.\n- Difficulty on the associated KC steps DIFFICULTY_STEP (0.2) down, floored at ENTRY_DIFFICULTY (0.3) to prevent over-scaffolding students who are already at entry level.\nWhat we explicitly rejected: resetting difficulty to zero (too harsh for students who've been making progress), and weighting by mistake severity (too complex for Phase 1 with no real data to calibrate against).\nThe reason field on every NextProblemResponse now names the triggering pattern:\n\"Remediation: BORROW_SKIP detected 2× on SUB_BORROW (p_mastery=0.41)\"\nThis is the explainability requirement. A teacher looking at this in the dashboard can understand exactly why the system chose what it did.\nThe central claim of NumPath's randomized controlled trial (RCT) will be that adaptive, mistake-aware tutoring produces better outcomes than static worksheets for dyscalculic learners. Before this change, we had a system that adapted difficulty based on streaks but ignored the type of error a student was making. 
That's not meaningfully different from a worksheet that repeats problems when you get them wrong.\nClosing this loop (mistake code → KC target → difficulty adjustment → reason field) is what makes the system an Intelligent Tutoring System rather than a difficulty slider. Every MistakeEvent record is now a longitudinal data point that shapes the student's next experience, and that chain of causality is fully traceable.\nThe implementation was straightforward. The harder question was the threshold: why 2 of 3, not 3 of 3? Three-of-three is too strict: a student who makes BORROW_SKIP, then DIGIT_REVERSAL, then BORROW_SKIP again has a clear pattern, but the strict threshold misses it. Two-of-three catches the pattern earlier at the cost of occasional false positives. We don't yet have real student data to validate this choice; it's a hypothesis. We've logged it as a research note for Phase 4.\nThe one thing I'd do differently: add the MistakeEvent index to the model on day one. It was missing and only caught during the performance review pass. 
A composite index on (student_id, created_at) is obvious in hindsight for any table you're going to query with ORDER BY created_at DESC LIMIT N.\nNext up: wiring the KC states into the teacher dashboard so educators can see p_mastery per student, not just 7-day accuracy, the final piece of the MacLellan \"Teacher-in-the-Loop\" principle.\n- Wiring MistakeEvent into select_next_problem() is a 60-line change with a meaningful research impact\n- The reason field is not a nice-to-have: every adaptive decision must be explainable to a teacher; string-formatted rationale on each NextProblemResponse is the minimum viable explainability\n- MISTAKE_WINDOW, FRUSTRATION_WINDOW, MASTERY_WINDOW sit side by side; when we have real data to calibrate thresholds, we change one line each", "url": "https://wpnews.pro/news/closing-the-feedback-loop-how-mistake-classification-drives-adaptive-problem-in", "canonical_source": "https://dev.to/orieken/closing-the-feedback-loop-how-mistake-classification-drives-adaptive-problem-selection-in-numpath-5ce9", "published_at": "2026-05-21 03:50:28+00:00", "updated_at": "2026-05-21 04:02:04.863732+00:00", "lang": "en", "topics": ["artificial-intelligence", "machine-learning", "research", "products", "data"], "entities": ["NumPath", "Bayesian Knowledge Tracing", "MistakeClassifier", "MistakeEvent", "GetNextProblemUseCase", "MacLellan"], "alternates": {"html": "https://wpnews.pro/news/closing-the-feedback-loop-how-mistake-classification-drives-adaptive-problem-in", "markdown": "https://wpnews.pro/news/closing-the-feedback-loop-how-mistake-classification-drives-adaptive-problem-in.md", "text": "https://wpnews.pro/news/closing-the-feedback-loop-how-mistake-classification-drives-adaptive-problem-in.txt", "jsonld": "https://wpnews.pro/news/closing-the-feedback-loop-how-mistake-classification-drives-adaptive-problem-in.jsonld"}}