{"slug": "announcing-mutation-testing-in-haskell", "title": "Announcing Mutation Testing in Haskell", "summary": "Mutation testing is now generally available in the Haskell testing framework Sydtest, providing developers with an automated tool to verify test suite thoroughness by mutating code and checking whether tests detect the changes. The feature aims to address declining confidence in AI-generated code by establishing an objective, non-cheatable criterion for test coverage that operates independently of any project's subjective standards.", "body_md": "Mutation testing is now generally available in [sydtest](https://github.com/NorfairKing/sydtest). This is a major step towards a saner development workflow in the age of AI-generated code.\n\n## What is mutation testing?\n\nMutation testing aims to improve a test suite by automatically mutating code and asserting that the tests start failing.\n\nAlternatively:\n\nMutation testing is like a type-system for your tests. It asserts that the tests test the code thoroughly.\n\n### Example\n\nConsider this simple function:\n\n``` haskell\ncanCastFireball :: Int -> Int -> Bool\ncanCastFireball level mana =\n  level >= 5\n    && mana >= 10\n```\n\nwith a corresponding test suite:\n\n``` haskell\nspec :: Spec\nspec = do\n  describe \"canCastFireball\" $ do\n    it \"allows powerful wizards\" $\n      canCastFireball 10 50 `shouldBe` True\n    it \"rejects exhausted powerful wizards\" $\n      canCastFireball 10 0 `shouldBe` False\n    it \"rejects weak wizards\" $\n      canCastFireball 1 10 `shouldBe` False\n```\n\nWould you say this is a good test suite for this code? 
How can you tell?\n\nWe could argue that a good test suite catches more of the mistakes you make.\n\nMutation testing consists of simulating those mistakes and checking that the test suite would indeed catch them.\n\nFor this example, it might generate a mutation like this:\n\n``` diff\ncanCastFireball :: Int -> Int -> Bool\ncanCastFireball level mana =\n  level >= 5\n<     && mana >= 10\n---\n>     && mana > 10\n```\n\nWhen we run the same test suite again, all of the tests still pass. This means that if you had made this exact mistake, your tests wouldn't have caught it. It is called a **surviving mutation** and it is *undesired*.\n\nWhen a mutation survives, you can add a test to **cover** it. For example, this test could cover it:\n\n``` haskell\nspec :: Spec\nspec = do\n  describe \"canCastFireball\" $ do\n    it \"allows barely-energetic wizards\" $\n      canCastFireball 8 10 `shouldBe` True\n```\n\nNow when we run the test suite on the mutated code, this test will fail. This means that if you had made this exact mistake, the (new) test suite would have caught it. It is called a **killed mutation** and it is *desired*. (Don't get me started on how confusing and violent this terminology is.)\n\nA mutation testing engine automatically generates mutations and runs the corresponding tests. Ideally it would generate many mutations, none of which survive.\n\nFor maximum assurance, you would cover every mutation. Realistically, you would disable some.\n\n## Why start mutation testing now?\n\nI've been using a coding agent (Claude) for a while now and noticed that I have less and less confidence in the code it produces. This is not necessarily related to it being less intelligent than I am (it's often not), but rather to the sheer volume of code it can produce in the same amount of time.\n\nI have good instructions in place to have it write tests, regression tests, and property tests, but often it completely ignores my instructions or writes useless tests. 
The only thing that really saves me is a [non-AI-based CI system](https://nix-ci.com) that tells me when any of my checks fail.\n\nSo my aim was to produce a check that would fail if a change were insufficiently tested, without relying on any subjective criterion for determining what \"sufficient testing\" means. Mutation testing lets me have a **completely objective** criterion that is **independent of my project**, defined in another repository, so that my agent **cannot cheat**.\n\n## How can I try it?\n\nMutation testing is now officially available as a part of [Sydtest](https://github.com/NorfairKing/sydtest).\n\n### Nix Check\n\nYou can add a mutation check to your `flake.nix`'s `checks` like this:\n\n``` nix\n  checks.x86_64-linux.mutation = pkgs.haskellPackages.sydtest.mutationCheck {\n    name = \"my-mutation-check\";\n    packages = [\n      \"my-package\"\n      \"my-other-package\"\n    ];\n  };\n```\n\nSydtest takes care of the rest and produces nice reports, both human-readable and machine-readable:\n\n``` json\n{\n  \"outcome\": \"uncovered\",\n  \"mutation\": {\n    \"id\": [\"Money.Amount\", \"Cmp\", \"801\", \"79\", \"92\", \"<\", \"1\" ],\n    \"operator\": \"Cmp\",\n    \"original\": \">\",\n    \"replacement\": \"<\",\n    \"module\": \"Money.Amount\",\n    \"source_file\": \"src/Money/Amount.hs\",\n    \"line\": 801,\n    \"end_line\": 801,\n    \"col_start\": 79,\n    \"col_end\": 92,\n    \"context_before\": [\n      \"\",\n      \"-- | Validate that an 'Amount' is strictly positive. I.e. 
not 'zero'.\",\n      \"validateStrictlyPositive :: Amount -> Validation\"\n    ],\n    \"source_lines\": [\n      \"validateStrictlyPositive amount = declare \\\"The Amount is strictly positive\\\" $ amount > zero\"\n    ],\n    \"mutated_lines\": [\n      \"validateStrictlyPositive amount = declare \\\"The Amount is strictly positive\\\" $ amount < zero\"\n    ],\n    \"context_after\": [],\n    \"covering_tests\": {\n      \"really-safe-money-autodocodec-test\": [],\n      \"really-safe-money-test\": []\n    },\n    \"timeout_micros\": 30000000\n  }\n}\n```\n\n### Disabling mutations\n\nSometimes you don't care whether a piece of code is fully mutation tested. A good example (in my opinion) is debug logging:\n\n``` haskell\ndoAThing = do\n    logDebug \"Doing a thing\"\n    doTheThing\n```\n\nRemoving the `logDebug` line is a valid mutation, but I just don't care to test it.\n\nIn this case I can add an annotation:\n\n``` haskell\n{-# ANN doAThing (\"DisableMutationsFor logDebug\" :: String) #-}\ndoAThing = do\n    logDebug \"Doing a thing\"\n    doTheThing\n```\n\nThere are other annotations available to disable mutations per-module, per-mutation, or per-binding.\n\n## Conclusion\n\nMutation testing in Haskell is ready to try out. I'm already using it in [NixCI](https://nix-ci.com) and the latest version of [really-safe-money](https://github.com/NorfairKing/really-safe-money/tree/master) is already fully mutation tested.\n\nPlease let me know if you end up trying it. 
I'd love to nerd out about this.", "url": "https://wpnews.pro/news/announcing-mutation-testing-in-haskell", "canonical_source": "https://cs-syd.eu/posts/2026-06-03-mutation-testing-in-haskell", "published_at": "2026-06-04 04:28:44+00:00", "updated_at": "2026-06-04 05:17:19.054467+00:00", "lang": "en", "topics": ["ai-tools", "ai-research"], "entities": ["sydtest", "NorfairKing"], "alternates": {"html": "https://wpnews.pro/news/announcing-mutation-testing-in-haskell", "markdown": "https://wpnews.pro/news/announcing-mutation-testing-in-haskell.md", "text": "https://wpnews.pro/news/announcing-mutation-testing-in-haskell.txt", "jsonld": "https://wpnews.pro/news/announcing-mutation-testing-in-haskell.jsonld"}}