{"slug": "hunting-digital-chameleons-how-we-defeated-botnets-in-laravel-v2-4-0", "title": "Hunting Digital Chameleons: How We Defeated Botnets in Laravel v2.4.0", "summary": "The developer of the VisitAnalytics package for Laravel v2.4.0 implemented a bot detection system that analyzes Client Hints headers (Sec-CH-*) to distinguish human users from sophisticated bots. By detecting missing or logically inconsistent headers, the system assigns a bot score and flags non-human traffic, leading to a major architectural refactoring.", "body_md": "In the world of web traffic, there’s a simple rule: if it looks like a regular user, walks like a user, and even brings its favorite cookies along, that doesn't always mean there’s a human on the other side. Sometimes, it’s just a very diligent bot that happened to read the `User-Agent` documentation yesterday.\n\nIn this article, we’ll share how our traffic analysis tool evolved from naive trust in headers to a paranoid level of verification, and how that led to a \"spring cleaning\" of our architecture.\n\n(For more on the project's first deep refactoring, read our article: [Refactoring Laravel Visit Analytics: The Path to Version 2.0.0](https://oleant.dev/en/blog/refactoring-laravel-visit-analytics-the-path-to-version-200))\n\nOnce upon a time, we were young and naive. We believed in the `User-Agent` string with all our hearts. We looked at it like a passport: *\"Oh, is that Chrome 128 on Windows 11? Welcome, honored user!\"* But the statistics from our VisitAnalytics package quickly knocked that romantic nonsense right out of us.\n\nWe began to see strange patterns: thousands of \"different\" devices visiting the site, all with perfectly calibrated, \"squeaky-clean\" UA strings. But upon closer inspection, it turned out that the behavior of these \"people\" was suspiciously uniform. 
They were like soldiers in identical uniforms, marching through a desert where there was no one else but them.\n\nWe didn’t jump straight to active defense. At first, we just started collecting data. Our gut told us that not all users were who they seemed to be. Bots had evolved, learning to spoof their User-Agent strings so well that they were indistinguishable from real browsers. But they had an Achilles' heel: Client Hints (the `Sec-CH-*` headers).\n\n**Humans don’t \"optimize\" headers.**\n\nA real user's Chromium-based browser sends a baseline set of `Sec-CH-*` headers automatically (brand and major version, mobile flag, platform) and can supply finer detail, such as the full version or processor architecture, when the server asks for it. This is \"living\" information that changes along with updates. Furthermore, the `Accept-Language` header of a normal human being, for example `\"Accept-Language\":\"en-US,en;q=0.9,fr;q=0.8,es;q=0.7\"`, tends to be richer than the typical bot equivalent `\"accept-language\":\"en-US,en;q=0.9\"`.\n\n**Bots are lazy or overthink it.**\n\nAnalyzing our package's statistics, we noticed that bot creators either forget about `Sec-CH-*` entirely, leaving a void where a whole stack of data should be, or they \"over-optimize\" them. They try to generate them programmatically, which leads to logical inconsistencies. It’s like a person in a tuxedo wearing rubber boots: individually, it’s all fine, but together, it makes you question the \"tailor.\"\n\nHere are two examples from the log. 
The first is a typical human visit:\n\n```\n{\n  \"id\": 1234,\n  \"ip_address\": \"2003:c1:d71c:fe1f::\",\n  \"user_agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36\",\n  \"target_headers\": \"{\\\"sec-ch-ua\\\":\\\"\\\\\\\"Not)A;Brand\\\\\\\";v=\\\\\\\"8\\\\\\\", \\\\\\\"Chromium\\\\\\\";v=\\\\\\\"138\\\\\\\", \\\\\\\"Google Chrome\\\\\\\";v=\\\\\\\"138\\\\\\\"\\\",\\\"sec-ch-ua-platform\\\":\\\"\\\\\\\"Windows\\\\\\\"\\\",\\\"sec-ch-ua-mobile\\\":\\\"?0\\\",\\\"sec-fetch-site\\\":\\\"none\\\",\\\"sec-fetch-dest\\\":\\\"document\\\",\\\"sec-fetch-mode\\\":\\\"navigate\\\",\\\"accept-language\\\":\\\"en-US,en;q=0.9,fr;q=0.8,es;q=0.7\\\",\\\"accept-encoding\\\":\\\"gzip, br\\\"}\",\n  \"url\": \"https://oleant.dev/blog/freelancer-vertrage-fur-webentwickler-in-deutschland-so-schutzt-du-dich-rechtlich\",\n  \"referer\": \"www.google.com\",\n  \"payload\": null,\n  \"processed_at\": \"2026-05-25 10:10:03\",\n  \"anonymized_at\": \"2026-05-25 11:10:02\",\n  \"bot_score\": 15,\n  \"is_bot\": 0,\n  \"is_official_bot\": 0,\n  \"bot_reasons\": \"[\\\"single_page_scan\\\"]\",\n  \"bot_evidence\": \"{\\\"single_page_scan\\\":{\\\"visit_depth\\\":\\\"1_page_only\\\"},\\\"analyzed_at\\\":\\\"2026-05-25 10:10:03\\\"}\",\n  \"created_at\": \"2026-05-25 10:01:26.133\"\n}\n```\n\nThe only thing is that they didn't browse the site; they only read one article. 
Now, here is the second example, a clear bot:\n\n```\n{\n  \"id\": 1235,\n  \"ip_address\": \"14.165.179.0\",\n  \"user_agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36\",\n  \"target_headers\": \"{\\\"accept-encoding\\\":\\\"gzip, br\\\"}\",\n  \"url\": \"https://oleant.dev/en/blog/how-to-write-a-resume-for-german-companies\",\n  \"referer\": null,\n  \"payload\": null,\n  \"processed_at\": \"2026-05-25 10:00:02\",\n  \"anonymized_at\": \"2026-05-25 11:00:02\",\n  \"bot_score\": 85,\n  \"is_bot\": 1,\n  \"is_official_bot\": 0,\n  \"bot_reasons\": \"[\\\"suspicious_minimal_headers\\\",\\\"missing_mandatory_header_accept-language\\\",\\\"missing_mandatory_header_sec-fetch-dest\\\",\\\"missing_mandatory_header_sec-fetch-site\\\",\\\"missing_mandatory_header_sec-fetch-mode\\\",\\\"missing_mandatory_header_sec-ch-ua\\\",\\\"missing_mandatory_header_sec-ch-ua-platform\\\",\\\"missing_mandatory_header_sec-ch-ua-mobile\\\"]\",\n  \"bot_evidence\": \"{\\\"suspicious_minimal_headers\\\":{\\\"found_count\\\":1,\\\"required_count\\\":5},\\\"analyzed_at\\\":\\\"2026-05-25 10:00:02\\\"}\",\n  \"created_at\": \"2026-05-25 09:51:05.092\"\n}\n```\n\nThis \"client\" gave itself away precisely because of the missing headers. But we didn't arrive at this realization immediately. Let's look at how our understanding evolved.\n\nWe began to notice that bots give themselves away through internal contradictions. For example, a User-Agent claims to be Windows, while the `Sec-CH-UA-Platform` header timidly points to Android.\n\nAt that moment, we realized: stop trusting the facade. Statistics showed us that for accurate identification, you shouldn't just read the headers, but look for cognitive dissonance within them. 
We stopped simply \"recording\" visits and started analyzing their integrity, turning our log files from a simple table into a real dossier on every \"digital chameleon.\" This realization was the first step toward creating a system that later allowed us to move from passive observation to the effective hunting of botnets.\n\nThe User-Agent alone wasn't enough. We quickly understood that bots had become virtuoso mimics, swapping this string to suit any task. However, while observing the logs, we noticed a pattern: botnets often use proxies to rotate IP addresses, hoping to remain unnoticed. But they forget one detail: the \"environment\" of the request.\n\nWe saw that despite constant IP changes (likely through proxy farms), the combination of `User-Agent` + `Client Hints` for bots is suspiciously stable. It's their signature. They can change their \"face\" (`IP`), but their \"digital skeleton\" (`headers`) remains unchanged for the entire network. To expose them, we created a Fingerprint: a hash of that combination, which became our main weapon. In Laravel 11, we implemented this directly in the `TrackVisits` middleware, turning a set of headers into a stable identifier:\n\n``` php\n// Hash generation: linking UA with critical headers\n$targetHeaders = $this->extractTargetHeaders($request);\n// Even if the IP changes, the hash content remains a constant for the botnet\n$fingerprintInput = $request->userAgent() . '|' . json_encode($targetHeaders);\n$fingerprintHash = hash('sha256', $fingerprintInput);\n```\n\nWhen the same hash started appearing from 50,000 different IP addresses within an hour, we knew: this is a botnet. Previously, we stored this data in the `botnet_fingerprints` table, but it quickly turned into a \"graveyard\" of useless records. We realized: we don't need an archive, we need real-time reactions. 
We rewrote the `BotnetAnalyzer` to search for anomalies \"on the fly,\" analyzing activity within the current window:\n\n``` php\n// Looking for anomalies in the current window without querying archive tables\n$window = now()->subMinutes($params['analysis_window_minutes'] ?? 10);\n// Looking for hash matches from different IP addresses\n// (exists() is true as soon as one other IP shares the hash inside the window)\n$isCluster = VisitLog::where('fingerprint_hash', $log->fingerprint_hash)\n    ->where('created_at', '>=', $window)\n    ->where('ip_address', '!=', $log->ip_address)\n    ->exists();\nif ($isCluster) {\n    // The entire \"pack\" of bots is marked automatically at the moment of appearance\n    $this->markAsBotnet($log->fingerprint_hash);\n}\n```\n\nWhen we removed `botnet_fingerprints` from the database, the system accelerated instantly. We stopped hoarding the history of \"dead\" proxies and moved to detecting the botnet \"conductor\" by their handwriting. If hundreds of different IPs arrive with the same fingerprint, it doesn't matter how often they rotate proxies: we see it's one and the same \"army.\"\n\nThe hunt for the botnet started successfully, but our \"digital trophy room\" began to suffocate us. We were storing every suspicious hash in the `botnet_fingerprints` table. With every passing day, it grew like yeast, turning from a security tool into a database bottleneck.\n\nWe realized we had fallen into the \"collection trap.\" We were trying to store attack history when, in reality, we only needed to know what was happening right this second. So, we took a radical step: we deleted `botnet_fingerprints` from our database schema.\n\nThe outcome of our efforts exceeded all expectations:\n\n**Database load dropped by 40%**. Heavy JOINs and endless SELECTs on a table with millions of rows are gone.\n\n**Reaction speed**. Suspicion checks now happen almost instantaneously. 
Thanks to our first line of header analyzers, 95% of bots are filtered out before reaching more expensive checks (such as network-based PTR record lookups). Only humans and as-yet-uncaught bots pass all 11 analyzers, and together they account for just a few percent. The rest are \"branded\" by one of the analyzers in the check queue.\n\n**Clear conscience**. We stopped being \"archivists of evil\" and became digital minimalists.\n\nNow, our system no longer suffers from accumulated \"digital baggage.\" It lives in the moment: it analyzes the request, compares it against \"hot\" patterns, and, if necessary, instantly flags the threat. We’ve learned that for botnet protection, it's not the depth of history that matters but the speed of decision-making here and now.\n\nWhen we started hunting bots via hashes, we faced an ethical dilemma. That same `fingerprint_hash` that helped us identify a botnet had essentially become a \"digital footprint\" of real people. A SHA-256 hash cannot be decrypted, but if it can be recomputed and matched back to the original headers, we are effectively storing personal data. And we are all about privacy!\n\nThe goal became clear: we need to see botnet activity without seeing the identities of the users.\n\nWe have implemented the `FingerprintAnonymizerService`. Its logic is simple: when the log is saved, the system keeps the data it needs for analysis, which is far too detailed to count as anonymous. After the analysis, once we understand who is before us, human or bot, we no longer need their fingerprints. We pack the bot, along with its fingers and other parts, into solitary confinement, while we welcome the worthy citizen to the site with full honors. All sensitive data is cleaned up in the process. The waiter (the web service) is, after all, not a policeman or a security guard: if the guests are already at the disco, they are our guests, their personal data no longer matters to us, and we gladly serve them. 
But a thief (read: bot) must sit in jail, as the movie line goes. He also gets his personal prison number, but without special amenities: the bot John Johnson becomes simply inmate №245.\n\n``` php\npublic function handle(VisitLog $log): array {\n    // Assumption: anonymization settings live on the service; `$this->config` is a hypothetical property\n    $config = $this->config ?? [];\n    $updates = [];\n    // Transforming the complex User-Agent into a simple client \"portrait\"\n    if ($config['anonymize_ua'] ?? true) {\n        $updates['user_agent'] = $this->anonymizeUserAgent($log);\n    }\n    // Replacing detailed headers with a simple list of keys\n    if ($config['anonymize_headers'] ?? true) {\n        $updates['target_headers'] = $this->anonymizeHeaders($log->target_headers);\n    }\n    // Wiping the original hash if analytics are complete\n    if ($config['anonymize_fingerprint_hash'] ?? true) {\n        $updates['fingerprint_hash'] = 'anonym-sha256-ready';\n    }\n    return $updates;\n}\n```\n\nThe most interesting transformation happens inside `anonymizeUserAgent`. We no longer store the raw UA string. Instead, we use Client Hints (if available) to extract general parameters (browser, OS, and platform) and discard unique identifiers.\n\n**Before anonymization:**\n\nWe saw the specific engine version, processor architecture, and a full set of parameters that, combined, could \"fingerprint\" a unique user.\n\n**After anonymization:**\n\nWe see only Chrome / Windows (Desktop).\n\nWe applied a similar approach to headers: the `anonymizeHeaders` method simply returns an array of keys (`array_keys`), stripping away any values that might contain cookies or specific session tokens. The result? Our logs now look like a set of \"statistical generalizations.\" We still see that a botnet is attacking the site, and we still flag it in the system, but now we are much better protected against accusations of privacy violations. 
We transformed a detailed trace of every visitor's behavior into a safe stream of aggregated statistics.\n\nNow, even if the log database falls into the wrong hands, it would appear as cryptographic junk to anyone wanting to de-anonymize our users. This is true engineering minimalism: protecting the system without harming people.\n\nAll these adventures, from the disappointment in the \"honesty\" of the User-Agent to the deletion of archive tables and the implementation of deep anonymization, culminated in release 2.4.0. This is not just a minor update. It is the transformation of our product from a \"hobbyist detective\" who simply keeps logs into a professional protection system that has learned the internet's most important lesson: you can't trust anyone, not even the headers.\n\n**Performance.**\n\nDitching unnecessary tables and switching to real-time analysis allowed us to reduce database load by **40%** and stop worrying about log scalability issues.\n\n**Privacy-First.**\n\nThanks to the `FingerprintAnonymizerService`, we are no longer \"secret keepers.\" We analyze threat patterns while leaving users' personal data off our analyzer's radar.\n\n**Smart Detection.**\n\nWe now catch botnets not by IP, but by their \"digital signature,\" making proxy rotation attempts meaningless.\n\nWe continue to evolve. We are already looking into implementing Bloom filters for even faster, on-the-fly verification of \"suspicious\" fingerprints. Regarding the interface, we have postponed plans for deep integration with Filament for now. Yes, we want to see beautiful dashboards and attack graphs right in the Laravel admin panel, but our current priority is maximum data purity and detection accuracy. Filament is the storefront, but we are still organizing the \"warehouse.\" But rest assured: botnet visualization in the admin panel is only a matter of time, and this item is bolded at the top of our backlog, waiting for its turn.\n\nVersion 2.4.0 is the foundation. 
We have learned to see the invisible and have cleared the system of excess \"digital trash.\" Onward—faster, bigger, and more efficient.\n\n**Stay tuned, we’re still on the hunt.**", "url": "https://wpnews.pro/news/hunting-digital-chameleons-how-we-defeated-botnets-in-laravel-v2-4-0", "canonical_source": "https://dev.to/alantalex/hunting-digital-chameleons-how-we-defeated-botnets-in-laravel-v240-fha", "published_at": "2026-06-27 07:30:00+00:00", "updated_at": "2026-06-27 07:33:44.010230+00:00", "lang": "en", "topics": ["developer-tools", "machine-learning", "ai-products"], "entities": ["Laravel", "VisitAnalytics", "Chrome", "Windows", "Google"], "alternates": {"html": "https://wpnews.pro/news/hunting-digital-chameleons-how-we-defeated-botnets-in-laravel-v2-4-0", "markdown": "https://wpnews.pro/news/hunting-digital-chameleons-how-we-defeated-botnets-in-laravel-v2-4-0.md", "text": "https://wpnews.pro/news/hunting-digital-chameleons-how-we-defeated-botnets-in-laravel-v2-4-0.txt", "jsonld": "https://wpnews.pro/news/hunting-digital-chameleons-how-we-defeated-botnets-in-laravel-v2-4-0.jsonld"}}