{"slug": "why-do-decision-trees-have-high-variance", "title": "Why Do Decision Trees Have High Variance?", "summary": "A developer explains that decision trees have high variance because a small change in training data can completely reshape the tree, altering the root feature, splits, and predictions. This sensitivity, not inaccuracy, is the source of high variance, which motivated ensemble methods like bagging and random forest.", "body_md": "Every Machine Learning course eventually says this:\n\n\"Decision Trees have high variance.\"\n\nWhen I first heard that, I accepted it and moved on.\n\nBut later, I stopped and asked myself a simple question:\n\nWhat does that actually mean?\n\nNot the textbook definition.\n\nWhat is the model really doing that makes everyone call it a \"high variance\" algorithm?\n\nThat question completely changed how I understood Decision Trees.\n\nSuppose you have a dataset with 10,000 customer records.\n\nYou train a Decision Tree.\n\nNow imagine removing just a few hundred records and training the model again.\n\nYou might expect the new tree to look almost identical.\n\nAfter all, the vast majority of the records are unchanged.\n\nSurprisingly, that's often **not** what happens.\n\nThe new tree may choose a different root feature.\n\nDifferent splits.\n\nDifferent branches.\n\nDifferent predictions.\n\nA tiny change in the training data can completely reshape the tree.\n\nThat isn't a bug.\n\nIt's the nature of Decision Trees.\n\nA Decision Tree builds itself one split at a time.\n\nAt every step, it asks:\n\n\"Which feature gives me the best split right now?\"\n\nSometimes two features are almost equally good.\n\nA small change in the training data can make Feature A slightly better than Feature B.\n\nOnce the root node changes, everything below it changes as well.\n\nIt's like taking a different road at the first intersection.\n\nEven though the destination is the same, the entire journey becomes different.\n\nOne small decision near the top creates a completely different 
tree.\n\nThink about a family tree.\n\nIf the first branch changes, every branch below it changes too.\n\nDecision Trees behave in a similar way.\n\nA different root node leads to different child nodes.\n\nDifferent child nodes lead to different grandchildren.\n\nOne early decision affects the entire structure.\n\nThat's why even a small change in the data can produce a very different model.\n\nImagine predicting whether a customer will buy a product.\n\nYou train one Decision Tree today.\n\nTomorrow, you collect a little more data and train it again.\n\nNow the predictions change noticeably.\n\nThe model isn't stable.\n\nIt reacts strongly to changes in the training data.\n\nThat instability is exactly what machine learning calls **high variance**.\n\nThe issue isn't that Decision Trees are inaccurate.\n\nThe issue is that they're sensitive.\n\nNone of this means Decision Trees are a bad algorithm.\n\nDecision Trees are powerful because they can learn complex patterns without requiring feature scaling or linear relationships.\n\nThe trade-off is that this flexibility makes them more likely to overfit the training data.\n\nThey're excellent learners.\n\nSometimes they're just a little too eager to memorize.\n\nOnce I understood why Decision Trees have high variance, another question came to mind.\n\nIf the problem is instability, why not train many Decision Trees instead of trusting just one?\n\nThat simple question led me to Bagging and, eventually, Random Forest.\n\nAnd that's exactly where the next article begins.\n\nA Decision Tree has high variance not because it is a poor algorithm, but because it is highly sensitive to the data it learns from.\n\nEven a small change in the training data can produce a completely different tree.\n\nUnderstanding that single idea makes it much easier to understand why Bagging and Random Forest were created.", "url": "https://wpnews.pro/news/why-do-decision-trees-have-high-variance", "canonical_source": 
"https://dev.to/pavan_pothuganti/why-do-decision-trees-have-high-variance-lj8", "published_at": "2026-07-04 14:24:44+00:00", "updated_at": "2026-07-04 14:48:48.317771+00:00", "lang": "en", "topics": ["machine-learning", "artificial-intelligence"], "entities": ["Decision Trees", "Random Forest", "Bagging"], "alternates": {"html": "https://wpnews.pro/news/why-do-decision-trees-have-high-variance", "markdown": "https://wpnews.pro/news/why-do-decision-trees-have-high-variance.md", "text": "https://wpnews.pro/news/why-do-decision-trees-have-high-variance.txt", "jsonld": "https://wpnews.pro/news/why-do-decision-trees-have-high-variance.jsonld"}}