Source: utcc.utoronto.ca · topic: large-language-models

You can't always trust a BMC's inventory of the server's hardware

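A blog administrator reports that high-volume crawlers using old browser User-Agent strings, apparently in part to gather data for LLM training, have led him to block browser versions that look suspicious, which can also catch legitimate users. Readers arriving through Inoreader, Feedly, or archive.* services may see the block page because those services fetch pages with old browser user agents. The administrator advises users to update their browsers, adjust Vivaldi's settings, or contact him (or the relevant service's support) to resolve access issues.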

Published Jun 26, 2026 · 2 min read

You're probably reading this page because you've attempted to access some part of [my blog (Wandering Thoughts)](../space/blog/) or [CSpace](../space/), the wiki thing it's part of. Unfortunately you're using a browser version that my anti-crawler precautions consider suspicious, most often because it's too old (this usually applies to versions of Chrome). As of early 2025 there's a plague of high volume crawlers (apparently in part to gather data for LLM training) that use a variety of old browser user agents, especially Chrome user agents. To reduce the load on Wandering Thoughts I'm experimenting with (attempting to) block all of them, and you've run into this.
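As an illustration only (this is not necessarily how the checks on this site work), a "too old Chrome" test on the User-Agent header can be sketched as below. The cutoff `MIN_CHROME_MAJOR` and the function names are hypothetical; the notice above does not say which versions count as too old.

```python
import re
from typing import Optional

# Hypothetical cutoff, chosen only for this example.
MIN_CHROME_MAJOR = 120

# Matches the "Chrome/<major>." token that Chrome-like browsers send.
_CHROME_RE = re.compile(r"\bChrome/(\d+)\.")


def chrome_major(user_agent: str) -> Optional[int]:
    """Return the Chrome major version claimed by a User-Agent, or None."""
    match = _CHROME_RE.search(user_agent)
    return int(match.group(1)) if match else None


def looks_too_old(user_agent: str, min_major: int = MIN_CHROME_MAJOR) -> bool:
    """True if the User-Agent claims a Chrome version below the cutoff."""
    major = chrome_major(user_agent)
    return major is not None and major < min_major
```

A check like this only sees what the client claims to be, which is why a current browser that reports an old or borrowed User-Agent can be caught by it.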

If this is in error and you're using a current version of your browser of choice, you can contact me at [my current place at the university](https://www.cs.toronto.edu/~cks/) (you should be able to work out the email address from that). If possible, please let me know what browser you're using and so on, ideally with its exact User-Agent string.

I am not blocking Inoreader's feed fetcher or considering it to be too old, and it routinely fetches feeds from me. I don't know why Inoreader is showing you this page. It is possible that they're periodically trying to fetch feeds or pages with an old browser HTTP User-Agent (or an actual old browser) and taking the results of that fetch (this page) as what they should show people instead of the results of their syndication feed fetcher agents. This is a bad mistake today; the results of modern HTTP fetches depend partly on the HTTP User-Agent used.

Much like Inoreader, Feedly is periodically fetching my syndication feeds with a fake, old browser HTTP User-Agent header, which fails, and is then grimly latching on to the results for their actual feed fetching with their regular Feedly HTTP User-Agent. There is nothing I can do about this; you should contact Feedly support, if you can find them. See this comment of mine in Wandering Thoughts for more details.

Due to an ongoing attack, you may need to change the "User Agent Brand Masking" setting so that your Vivaldi identifies itself as Vivaldi instead of Google Chrome. This applies even to the current version of Vivaldi.

You may be seeing this through archive.today, archive.ph, archive.is, and so on. Unfortunately, archive.* crawls pages to archive in a way that is impossible to distinguish from malicious actors. They use old Chrome User-Agent values, crawl from IP address blocks that are widely distributed and not clearly identified as theirs, and some of their IP addresses have falsified reverse DNS entries that claim they are googlebot IP addresses (which is something that is normally done only by quite bad actors). I suggest that you use archive.org, which is a better behaved archival crawler and can crawl my blog (Wandering Thoughts).
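A falsified reverse DNS entry of the kind described above can usually be caught with a forward-confirmed reverse DNS check: look up the name for the IP address, then confirm that the name resolves back to the same IP. The sketch below is a minimal illustration, not the check used on this site. The function names and the injectable `reverse` and `forward` parameters are hypothetical, and the domain suffixes are an assumption about which names genuine Googlebot addresses use.

```python
import socket

# Assumed suffixes for genuine Googlebot reverse DNS names.
GOOGLEBOT_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip):
    """PTR lookup: return the primary host name for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward(host):
    """Forward lookup: return the set of IP addresses a name resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def is_confirmed_googlebot(ip, reverse=_reverse, forward=_forward):
    """True only if reverse DNS claims Googlebot AND forward DNS agrees."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # no usable PTR record
    if not host.endswith(GOOGLEBOT_SUFFIXES):
        return False  # does not even claim to be Googlebot
    try:
        # Whoever controls an IP block controls its PTR record, so the
        # claim counts only if the name maps back to this same address.
        return ip in forward(host)
    except OSError:
        return False
```

The resolver functions are passed in as parameters so the logic can be exercised without network access; in real use the defaults perform live DNS lookups.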
