{"slug": "experimenting-with-the-proposed-cross-origin-storage-api-in-transformers-js", "title": "Experimenting with the proposed Cross-Origin Storage API in Transformers.js", "summary": "Google Chrome engineer Thomas Steiner is experimenting with the proposed Cross-Origin Storage API to enable Transformers.js models to be shared across different origins, reducing redundant downloads and storage. Currently, when multiple web apps use the same AI model, each must download and cache it separately, leading to significant bandwidth and storage waste. The new API aims to allow cross-origin caching of model resources and WebAssembly runtime files.", "body_md": "# Experimenting with the proposed Cross-Origin Storage API in Transformers.js\n\nBy [Thomas Steiner](https://blog.tomayac.com/) from the Chrome team at Google.\n\nTransformers.js provides Web developers with a simple way to use the power of transformers in their Web apps through task-specific pipelines. To run inference in the browser, developers create an instance of [`pipeline()`](https://huggingface.co/docs/transformers.js/en/api/pipelines) and specify a task they want to use the pipeline for. As a concrete example, the following snippet shows how to set up an automatic speech recognition (ASR) pipeline.\n\n``` js\nimport { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@4.2.0';\n\nconst asr = await pipeline(\n  'automatic-speech-recognition',\n  'Xenova/whisper-tiny.en',\n  { device: 'webgpu' },\n);\nconst result = await asr('jfk.wav');\nconsole.log(result);\n```\n\n## The cache challenge\n\nYou will notice in the source code that I specified [`Xenova/whisper-tiny.en`](https://huggingface.co/Xenova/whisper-tiny.en) as the model, which is a very decent choice for common English automatic speech recognition tasks. 
In fact, it's even *the* default model according to the Transformers.js [default model resolution](https://github.com/huggingface/transformers.js/blob/main/packages/transformers/src/pipelines/index.js), as per the linked [excerpt](https://github.com/huggingface/transformers.js/blob/bc9cf7400f4f2c8695016699f879e31026ff0313/packages/transformers/src/pipelines/index.js#L151-L158).\n\n### Model resources\n\nWhen you [run this example in the browser](https://googlechrome.github.io/samples/transformersjs-automatic-speech-recognition/index.html), Transformers.js automatically takes care of downloading and caching the relevant model resources and Wasm files. The following screenshot shows the Chrome DevTools [Cache storage](https://developer.chrome.com/docs/devtools/storage/cache) section after visiting the app. When you reload the page, the resources are served from the [Cache API](https://developer.mozilla.org/en-US/docs/Web/API/Cache), and the model returns results almost instantly.\n\nHowever, with `Xenova/whisper-tiny.en` being a popular model (and, as mentioned before, even being *the* ASR default model in Transformers.js), you can well imagine that more than one app you visit would use it. To simulate this situation, here's the same example app from before, but served from a [different origin](https://rawcdn.rawgit.net/GoogleChrome/samples/c4192bd7a3c66fc288a7b22b77acb935df00b8a1/transformersjs-automatic-speech-recognition/index.html). When you visit this different-origin app, rather than the app being usable almost instantly, the browser has to download and cache all the model resources again, even though they're byte-for-byte the same as before. Even in this toy example, this adds up to 177 MB of duplicate download and storage, as you can examine in the **Storage** section of the Chrome DevTools [Application panel](https://developer.chrome.com/docs/devtools/application#open_the_application_panel). 
You can imagine that this quickly adds up.\n\n### Wasm runtime resources\n\nBut it gets worse. Let's add a second pipeline to the toy example: sentiment analysis. Sentiment analysis [by default](https://github.com/huggingface/transformers.js/blob/bc9cf7400f4f2c8695016699f879e31026ff0313/packages/transformers/src/pipelines/index.js#L65) uses the [`Xenova/distilbert-base-uncased-finetuned-sst-2-english`](https://huggingface.co/Xenova/distilbert-base-uncased-finetuned-sst-2-english) model. If you don't specify the model, Transformers.js' default model resolution automatically picks it for you.\n\n``` js\nconst classifier = await pipeline('sentiment-analysis');\nconst sentiment = await classifier(result.text);\npre.append('\\n\\n' + JSON.stringify(sentiment, null, 2));\n```\n\nThese are two entirely different AI models, but they depend on the same 4,733 kB `ort-wasm-simd-threaded.asyncify.wasm` WebAssembly (Wasm) runtime file [from the underlying ONNX Runtime library](https://onnxruntime.ai/docs/api/js/interfaces/Env.WasmFilePaths.html#wasm) that Transformers.js is built on top of. Open the [extended demo on a different origin](https://rawcdn.rawgit.net/GoogleChrome/samples/d47114a15637383015c274e7bdcd81e1a17b0ccf/transformersjs-automatic-speech-recognition/index2.html), and you will notice in the [Network tab](https://developer.chrome.com/docs/devtools/network#load) how the Wasm runtime also gets downloaded and cached again.\n\nSo even if you run apps that don't share the same AI models, your browser still makes redundant requests for shared Wasm resources you already have, and on top of that also caches them again, which consumes space on your hard disk.\n\n### Cache isolation\n\n#### AI model resources serving\n\nBy default, **AI model resources** come from the [Hugging Face Hub](https://huggingface.co/docs/hub/en/models-the-hub), and ultimately the Hugging Face CDN. 
The browser makes a request for a resource like [`https://huggingface.co/Xenova/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/config.json`](https://huggingface.co/Xenova/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/config.json), which then gets redirected to the final CDN URL, like [`https://huggingface.co/api/resolve-cache/models/Xenova/distilbert-base-uncased-finetuned-sst-2-english/0b6928efcb76139cae2c6881d49cda67fe119f42/config.json?%2FXenova%2Fdistilbert-base-uncased-finetuned-sst-2-english%2Fresolve%2Fmain%2Fconfig.json=&etag=%223c36342ef1f74de2797d667c68c6b7b988d0b87c%22`](https://huggingface.co/api/resolve-cache/models/Xenova/distilbert-base-uncased-finetuned-sst-2-english/0b6928efcb76139cae2c6881d49cda67fe119f42/config.json?%2FXenova%2Fdistilbert-base-uncased-finetuned-sst-2-english%2Fresolve%2Fmain%2Fconfig.json=&etag=%223c36342ef1f74de2797d667c68c6b7b988d0b87c%22) in this case.\n\n#### Wasm runtime resources serving\n\nThe **Wasm runtime resources** are served from the [jsDelivr CDN](https://www.jsdelivr.com/) by default. For example, `ort-wasm-simd-threaded.asyncify.wasm` comes from [`https://cdn.jsdelivr.net/npm/onnxruntime-web@1.26.0-dev.20260416-b7804b056c/dist/ort-wasm-simd-threaded.asyncify.wasm`](https://cdn.jsdelivr.net/npm/onnxruntime-web@1.26.0-dev.20260416-b7804b056c/dist/ort-wasm-simd-threaded.asyncify.wasm) at the time of this writing.\n\nNow you may say that if different apps, even though running on different origins, in the end serve their resources from the same CDN URLs, caching shouldn't be a problem, as long as the final URLs are the same. Unfortunately, this has not been how caching works in browsers for a long time. 
The article [Gaining security and privacy by partitioning the cache](https://developer.chrome.com/blog/http-cache-partitioning) goes into all the details, but essentially, **caches are isolated by origin** to prevent timing attacks: the time a website takes to respond to HTTP requests can reveal that the browser has accessed the same resource in the past, which makes the browser vulnerable to security and privacy leaks.\n\n#### Chrome's implementation\n\nThe concrete implementation may vary by browser, but in Chrome, cached resources are keyed using a Network Isolation Key in addition to the **resource URL**. The Network Isolation Key is composed of the **top-level site** and the **current-frame site**. Take the previous toy examples hosted on the origins `https://googlechrome.github.io` and `https://rawcdn.rawgit.net`. If they both use the Wasm runtime from `https://cdn.jsdelivr.net/npm/onnxruntime-web@1.26.0-dev.20260416-b7804b056c/dist/ort-wasm-simd-threaded.asyncify.wasm`, their cache keys look like the ones in the following table.\n\n| Network Isolation Key: top-level site | Network Isolation Key: current-frame site | Resource URL |\n|---|---|---|\n| `https://googlechrome.github.io` | `https://googlechrome.github.io` | `https://cdn.jsdelivr.net/npm/onnxruntime-web@1.26.0-dev.20260416-b7804b056c/dist/ort-wasm-simd-threaded.asyncify.wasm` |\n| `https://rawcdn.rawgit.net` | `https://rawcdn.rawgit.net` | `https://cdn.jsdelivr.net/npm/onnxruntime-web@1.26.0-dev.20260416-b7804b056c/dist/ort-wasm-simd-threaded.asyncify.wasm` |\n\nSo even if the resource URLs are exactly the same, since the Network Isolation Keys don't match, there's no cache hit, which means duplicate download and duplicate storage. 
This is the challenge that the Cross-Origin Storage proposal aims to tackle.\n\n## Enter the Cross-Origin Storage API\n\n💡 Note: The Cross-Origin Storage API is an early-stage proposal that isn't final. While the proposed API is not yet natively implemented in any browser, you don't have to wait to experiment with it. Install the [Cross-Origin Storage extension](https://chromewebstore.google.com/detail/cross-origin-storage/denpnpcgjgikjpoglpjefakmdcbmlgih) to inject the `navigator.crossOriginStorage` polyfill on all pages and test the complete flow.\n\nThe proposed **Cross-Origin Storage (COS) API** introduces a dedicated `navigator.crossOriginStorage` interface through which web apps can store and retrieve large files across origin boundaries, identified not by a URL, but by a cryptographic hash.\n\nThat last point about cryptographic hashes is key. Because COS identifies files by their **hash** rather than by their URL or origin, the same `ort-wasm-simd-threaded.asyncify.wasm` Wasm runtime you downloaded while visiting `https://googlechrome.github.io` is recognized as identical to the one `https://rawcdn.rawgit.net` is about to request, no matter where either of the two origins fetched it from. See the following code snippet that illustrates the basic flow.\n\n``` js\nconst hash = {\n  algorithm: 'SHA-256',\n  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',\n};\n\ntry {\n  const handle = await navigator.crossOriginStorage.requestFileHandle(hash);\n  // Cache hit! Get the file as a Blob and use it directly.\n  const fileBlob = await handle.getFile();\n} catch (err) {\n  // Cache miss. 
Download from network, then store for next time.\n  const fileBlob = await fetch('https://cdn.jsdelivr.net/.../ort-wasm-simd-threaded.asyncify.wasm')\n    .then(r => r.blob());\n  const handle = await navigator.crossOriginStorage.requestFileHandle(\n    hash,\n    { create: true, origins: '*' },\n  );\n  const writableStream = await handle.createWritable();\n  await writableStream.write(fileBlob);\n  await writableStream.close();\n}\n```\n\nIf the resource is in COS, you get back a [`FileSystemFileHandle`](https://developer.mozilla.org/en-US/docs/Web/API/FileSystemFileHandle) from which you can read the blob directly via [`getFile()`](https://developer.mozilla.org/en-US/docs/Web/API/FileSystemFileHandle/getFile) (the resulting [`File`](https://developer.mozilla.org/en-US/docs/Web/API/File) inherits from [`Blob`](https://developer.mozilla.org/en-US/docs/Web/API/Blob)). If the resource is not in COS, you fall back to the network, and write the resource into COS for the next app that needs it, which could be your app, or another unrelated app, potentially on a completely different origin.\n\nThe API is deliberately shaped after the [File System Standard](https://fs.spec.whatwg.org/)'s [`FileSystemDirectoryHandle.getFileHandle()`](https://developer.mozilla.org/en-US/docs/Web/API/FileSystemDirectoryHandle/getFileHandle) you likely are familiar with from the [Origin Private File System](https://developer.mozilla.org/en-US/docs/Web/API/File_System_API/Origin_private_file_system) (OPFS) API. The `hash` parameter plays the same role as the `name` parameter in OPFS: uniquely identifying a resource. The `options.create` flag works the same way: absent or `false` for read-only access, `true` when you intend to write.\n\n### Control who can read what\n\nNot every resource should be globally shared. 
COS gives developers precise control over visibility through the `origins` option when storing a file.\n\n- Setting `origins: '*'` makes a file **globally available**. Any origin can find it by hash. This is the right choice for AI model resources or the Wasm runtime in the Transformers.js example: the whole point is that every app on the Web benefits from a single cached copy.\n- Passing a specific list of origins, like `origins: ['https://write.example.com', 'https://calculate.example.com']`, **restricts** access to those sites. This works well for proprietary resources shared across a company's own properties that shouldn't be discoverable by anyone else, like a proprietary proofreading AI model used in a commercial office suite.\n- Omitting `origins` entirely makes the file available only to [same-site](https://web.dev/articles/same-site-same-origin#same-site-cross-site) origins. This is a sensible default for resources shared across all of an organization's subdomains, but not intended to cross organizational boundaries.\n\nOne important rule: visibility can be upgraded but never downgraded. If a file is already globally available, a later attempt to store it with a restricted `origins` list is silently ignored. This prevents a malicious actor from re-storing a public resource and narrowing its availability. The reverse is possible: a file initially stored with a restricted `origins` list can later be made more permissive. Any site, not just the original storer, can call `requestFileHandle()` for the same hash (hashes are not a secret) with `create: true` and a broader `origins` value, and, provided the browser verifies that the hash matches, the resource becomes available to the wider audience from that point on. Note that the upgrading site **must** still write the full file through the returned handle. 
This requirement exists to prevent sites from exploiting the upgrade path as a side-channel to detect whether a particular file was already stored in COS.\n\n### Integrity by design\n\nA subtle but important property of COS is that the browser **verifies the hash** when you write a file. If the data you write doesn't match the declared hash, the write fails with an error. This makes integrity checking automatic: an app reading a file from COS can be confident it's getting exactly the bytes it expected, the same guarantee it would have had if it had computed the hash itself after a network download.\n\nThis turns out to be doubly useful in the Transformers.js scenario. Today, after downloading model weights, most apps have no practical way to verify that the CDN served the right bytes. With COS, every file in the store is implicitly verified on write, no matter where it came from, be it the official Hugging Face CDN or a random site's self-hosted mirror.\n\n### Privacy without sacrificing utility\n\nOf course, a cross-origin shared cache raises the same question as the partitioned HTTP cache in reverse: if any site can probe for the presence of a file by hash, couldn't an attacker learn something about the user's browsing history by checking whether, say, a game engine Wasm module is cached?\n\nCOS addresses this through two complementary mechanisms:\n\n- First, the `origins` field: proprietary resources that shouldn't be globally probeable simply shouldn't be stored with `origins: '*'`. Through **developer education**, developers are encouraged to consider this whenever it makes sense.\n- Second, **availability gating**: even for globally declared files, the browser may suppress confirmation of a file's presence if it hasn't been encountered across a sufficient number of distinct origins. 
A file that only appears on one or two sites could still serve as a cross-site identifier, so the browser may return an error as if the file weren't there at all, regardless of what's physically on disk. On the Chrome team, we are conscious of the possible privacy leaks that uncommon resources could cause and plan to mitigate them generally by restricting which exact resources can be cached. The concrete mitigations are still being fleshed out.\n\nCrucially, this means an error is not a definitive answer. It might mean \"not stored\", or it might mean \"stored, but the browser isn't telling you\". Apps should always handle it the same way: fall back to the network.\n\n### What this means for the Transformers.js example\n\nGoing back to the toy examples from before: the `ort-wasm-simd-threaded.asyncify.wasm` runtime weighs in at 4,733 kB and is shared by every Transformers.js-powered app regardless of which AI model it uses. With COS, the first app to load it downloads it once and stores it under its SHA-256 hash with `origins: '*'`. Every subsequent app, whether on `https://googlechrome.github.io`, on `https://rawcdn.rawgit.net`, or any other origin, finds it in COS immediately. The 177 MB of duplicate Whisper model weights? Same story: `Xenova/whisper-tiny.en` gets downloaded once, recognized by hash the second time around, and served from COS in milliseconds. And of course, the same also happens for `Xenova/distilbert-base-uncased-finetuned-sst-2-english`.\n\nTransformers.js itself is already piloting the COS API at the library level. [Pull request #1549](https://github.com/huggingface/transformers.js/pull/1549) introduced an experimental COS cache backend behind an opt-in flag. 
Enabling it takes a single line before you set up your pipeline:\n\n``` js\nimport { env, pipeline } from \"https://cdn.jsdelivr.net/npm/@huggingface/transformers@4.2.0\";\n\n// 👇 Opt in to the experimental Cross-Origin Storage cache backend.\nenv.experimental_useCrossOriginStorage = true;\n\nconst asr = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en', { device: 'webgpu' });\nconst result = await asr('jfk.wav');\nconsole.log(result);\n```\n\nWith that flag set, Transformers.js resolves the SHA-256 hash for each [Xet-tracked](https://huggingface.co/docs/hub/en/xet/index) model file (the large ONNX weight files) by fetching the raw Xet pointer ([example raw pointer file](https://huggingface.co/Xenova/whisper-tiny.en/raw/main/onnx/decoder_model.onnx)) and extracting its `oid sha256:` field. It then uses that hash as the key for `navigator.crossOriginStorage`. If the model is already in COS (because another site stored it there first), it's served instantly without a network round-trip. If not, it falls back to a regular download and stores the result in COS for the next caller. With the toy example, the advantage in practice is that `Xenova/whisper-tiny.en` and `Xenova/distilbert-base-uncased-finetuned-sst-2-english` (and of course `ort-wasm-simd-threaded.asyncify.wasm`) only ever need to cross the ether once, regardless of how many different origins ask for them.\n\nNote the `experimental_` prefix on the flag. It's intentional and signals that the underlying browser API has not yet been standardized and may change without a major version bump.\n\n### Try it today\n\nThe COS API is not yet natively implemented in any browser, but you don't have to wait to experiment with it. Install the [Cross-Origin Storage extension](https://chromewebstore.google.com/detail/cross-origin-storage/denpnpcgjgikjpoglpjefakmdcbmlgih) to inject the `navigator.crossOriginStorage` polyfill on all pages and test the complete flow. 
You can inspect the [source code of the extension](https://github.com/web-ai-community/cross-origin-storage-extension) and follow the [usage instructions](https://github.com/web-ai-community/cross-origin-storage-extension?tab=readme-ov-file#usage) to get started.\n\nWith the extension installed, you can try the full end-to-end experience right now: open the first [toy example with COS enabled](https://googlechrome.github.io/samples/transformersjs-automatic-speech-recognition/index3.html), let it load `Xenova/whisper-tiny.en`, then open the [toy example with COS enabled from the second origin](https://rawcdn.rawgit.net/GoogleChrome/samples/1e4f2b8c10adc394352c6ec8327bb503bac7aba1/transformersjs-automatic-speech-recognition/index3.html). Instead of the 177 MB re-download you saw before, the model is served from COS in milliseconds. When you open the extension's popup window, you can see COS in action. If you select **View by Resource**, you can see the resource with the SHA-256 hash `950978b1dbcbf250335358c1236053ba19a7f7849b33dc777f4421b72b7626fa` shared across `https://googlechrome.github.io` and `https://rawcdn.rawgit.net`. It may not be obvious, but as you can verify by comparing the SHA-256 hash on Hugging Face, you're looking at [`https://huggingface.co/Xenova/whisper-tiny.en/blob/main/onnx/decoder_model_merged.onnx`](https://huggingface.co/Xenova/whisper-tiny.en/blob/main/onnx/decoder_model_merged.onnx). For now, the extension is mostly aimed at power users like you. Once implemented in the browser, there will be a friendlier integration in the browser's **Settings** page. 
The screenshot below shows the extension's popup window with the **View by Resource** tab active, where you can see the shared resource with its hash and the two origins that have it in their COS cache.\n\n## Call to action\n\nIf you're building your own Transformers.js app, the call to action is simple: add `env.experimental_useCrossOriginStorage = true` before your first `pipeline()` call, install the extension, and watch the duplicate downloads disappear from your Network tab. Every site that opts in makes the experience faster and cheaper for every other site's users. Opting in is completely risk-free: if the COS API isn't supported because the user doesn't have the COS extension installed, the code just falls back to the default path (the [Web Cache](https://developer.mozilla.org/en-US/docs/Web/API/Cache) API).\n\nTransformers.js is not alone in experimenting with COS. [WebLLM](https://webllm.mlc.ai/) (opt-in, see [documentation](https://webllm.mlc.ai/docs/user/advanced_usage.html#using-cross-origin-storage-cache)) and [wllama](https://github.com/ngxson/wllama) (automatic, see [PR](https://github.com/ngxson/wllama/pull/248)) are likewise excited about this proposed API.\n\nOn the Chrome team, we're [considering implementing the COS API](https://chromestatus.com/feature/5163371507875840) natively in the browser. Since this is an early-stage proposal, we welcome feedback on the API and on the shape of the proposal itself. 
The [Cross-Origin Storage repository](https://github.com/WICG/cross-origin-storage) is the place to file issues, [express support](https://github.com/WICG/cross-origin-storage/labels/expression%20of%20support), or open PRs.", "url": "https://wpnews.pro/news/experimenting-with-the-proposed-cross-origin-storage-api-in-transformers-js", "canonical_source": "https://huggingface.co/blog/cross-origin-storage", "published_at": "2026-06-23 00:00:00+00:00", "updated_at": "2026-06-23 23:49:54.315408+00:00", "lang": "en", "topics": ["machine-learning", "ai-tools", "developer-tools"], "entities": ["Google", "Chrome", "Thomas Steiner", "Transformers.js", "Hugging Face", "Xenova/whisper-tiny.en", "Cache API", "WebAssembly"], "alternates": {"html": "https://wpnews.pro/news/experimenting-with-the-proposed-cross-origin-storage-api-in-transformers-js", "markdown": "https://wpnews.pro/news/experimenting-with-the-proposed-cross-origin-storage-api-in-transformers-js.md", "text": "https://wpnews.pro/news/experimenting-with-the-proposed-cross-origin-storage-api-in-transformers-js.txt", "jsonld": "https://wpnews.pro/news/experimenting-with-the-proposed-cross-origin-storage-api-in-transformers-js.jsonld"}}