AI crawlers are inflating your view counts

AI crawlers inflate view counts. On one production app they made roughly 100,000 requests per day, so server-side analytics measured scraper traffic rather than human engagement. The developer moved tracking to a client-side JavaScript beacon. When crawlers then found the tracking endpoint and started hitting it directly, a robots.txt rule and a bot-detection guard were added on top.

Your most-viewed page might be one no human has ever opened. That is what AI crawlers have done to view tracking in 2026.

I ran into this problem on a production app that needed engagement tracking. The first version tracked everything server-side, the way Rails apps have done analytics for years. It broke within a day.

The problem: crawlers inflate every count

We used Ahoy (https://github.com/ankane/ahoy) for tracking. Each controller action called ahoy.track while rendering the page, and every event rolled up into a denormalized counter column with counter_culture.

The issue is that server-side tracking fires on every request, including bots. AI crawlers like Meta-ExternalAgent, Bytespider, and Baiduspider were making roughly 100,000 requests per day. They were not attacking the site, just reading to feed training pipelines.

Ahoy has bot detection built in. It uses the device_detector gem to check user agents and skips known bots. That list catches Googlebot and older crawlers, but it misses the new wave of AI crawlers. As a result, every one of those requests created an Ahoy::Event row and incremented the corresponding counters.

Our view counts were not measuring human interest. They were measuring how hungry the scrapers were that week.

Fix one: require JavaScript

Chasing user agent strings is a losing game. New crawlers appear faster than blocklists update. But there is one thing AI crawlers reliably do not do, and that is execute JavaScript.

So we moved view tracking out of the controllers. Pages declare what is trackable as a data attribute, and a small Stimulus controller fires a beacon after the page loads.

    connect() {
      // Skip if this element has already fired its beacon.
      if (this.element.dataset.viewTrackerFired === "true") return
      this.element.dataset.viewTrackerFired = "true"

      const fire = () => this.fire()
      if ("requestIdleCallback" in window) {
        // Wait for idle time, but fire within 2 seconds regardless.
        requestIdleCallback(fire, { timeout: 2000 })
      } else {
        setTimeout(fire, 500)
      }
    }

A few details mattered here:

- requestIdleCallback defers the beacon until the browser is idle, so tracking never competes with rendering. The 2-second timeout guarantees it still fires on busy pages.
- keepalive: true on the fetch lets the request survive the user navigating away immediately.
- The fired flag guards against Turbo reconnecting the controller and double-counting.

Crawlers fetch the HTML and move on. Real browsers run the beacon and get counted. View counts dropped sharply the day this deployed. That was the fix landing, not a regression.

Fix two: the bots found the beacon

Three days later, the tracking endpoint /track/events was the most-crawled path on the site.

Crawlers do not execute JavaScript, but they do parse it. The endpoint URL sits in the markup as a data attribute, so the scrapers extracted it and started requesting it directly. None of those requests created events, but they still burned through the full Rails stack for nothing.

The fix was two cheap layers. First, robots.txt for the well-behaved bots:

    Disallow: /track/

Second, a guard in the controller for everyone else:

    class TrackingEventsController < ApplicationController
      before_action :reject_bots

      private

      # Return an empty 204 to any bot user agent before the action runs.
      def reject_bots
        head :no_content if DeviceDetector.new(request.user_agent).bot?
      end
    end

Any request with a bot user agent gets a 204 before the action runs. No parsing, no resource lookups, no database work. The well-behaved crawlers respect robots.txt and never arrive, and the rest get the cheapest possible response.

The takeaway

Server-side analytics was built for a web that no longer exists.
In 2026, a meaningful share of your traffic comes from AI crawlers, so counting views on the server measures scraper appetite, not audience. The defense is not one clever trick. It is stacked cheap layers: robots.txt for the bots that ask permission, a user agent check that returns early for the ones that announce themselves, and a JavaScript beacon for the bots that do neither. Check your own numbers. If your view counts have never had a suspicious cliff in them, the bot tax is probably still baked in.
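
For a rough first pass at checking your own numbers, here is a minimal Ruby sketch that estimates what fraction of requests in an access log come from self-announcing crawlers. It is not what we ran in production. It assumes a combined-format access log (such as nginx writes by default), where the user agent is the last double-quoted field. The pattern list is illustrative and incomplete, and the names crawler_share and user_agent_from are made up for this example.

```ruby
# Illustrative, incomplete list of crawler user agent fragments (an assumption).
# The first three are the crawlers named earlier in this post.
CRAWLER_PATTERNS = /Meta-ExternalAgent|Bytespider|Baiduspider|GPTBot|ClaudeBot/i

# Extracts the user agent from a combined-log-format line, where it is the
# last double-quoted field. Returns nil when the line has no quoted fields.
def user_agent_from(line)
  line.scan(/"([^"]*)"/).flatten.last
end

# Returns the fraction (0.0 to 1.0) of parseable log lines whose user agent
# matches CRAWLER_PATTERNS. Lines without a quoted field are ignored.
def crawler_share(lines)
  agents = lines.map { |line| user_agent_from(line) }.compact
  return 0.0 if agents.empty?

  agents.count { |ua| ua.match?(CRAWLER_PATTERNS) }.fdiv(agents.size)
end
```

To try it, pass any enumerable of log lines, for example crawler_share(File.foreach("log/access.log")), where the path is a placeholder for wherever your web server writes its log. This only counts bots that announce themselves in the user agent, so treat the result as a lower bound.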