
Web development tooling keeps evolving as applications grow more sophisticated. Transformers.js gives web developers a straightforward way to run transformer models inside their web applications. It is built around task-specific pipelines, which make tasks such as automatic speech recognition (ASR) available directly in the browser.
For instance, setting up an ASR pipeline only requires creating a pipeline instance and specifying the task. A common choice for English ASR is the Xenova/whisper-tiny.en model. The first time such a pipeline runs, Transformers.js downloads and caches all the necessary model resources and WebAssembly (Wasm) files.
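A minimal sketch of such a pipeline follows. The `pipeline` function and the `Xenova/whisper-tiny.en` model ID belong to Transformers.js; the `transcribe` helper and its `createPipeline` parameter are hypothetical names introduced here, and the factory is passed in rather than imported so the snippet can also be run against a stub.

```javascript
// Sketch: build an ASR pipeline and transcribe one audio file.
// `createPipeline` is assumed to be the `pipeline` export of
// @huggingface/transformers; injecting it is a choice made for this example.
async function transcribe(createPipeline, audioUrl) {
  // The task name selects the pipeline type; the model ID selects the weights.
  const transcriber = await createPipeline(
    'automatic-speech-recognition',
    'Xenova/whisper-tiny.en',
  );
  // The ASR pipeline resolves to an object whose `text` field holds the transcript.
  const output = await transcriber(audioUrl);
  return output.text;
}
```

In a page, this could be called as `transcribe(pipeline, url)` after `import { pipeline } from '@huggingface/transformers';`. The first call is the one that triggers the model and Wasm downloads.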
The Hidden Cache Challenge
On the application’s first run, your browser’s Cache Storage will populate with these resources. Subsequent visits to the same application will load them almost instantly, thanks to the Cache API. This efficiency is great for individual applications, but what happens when multiple applications on different websites use the exact same model?
This is where the cache challenge emerges. If you visit a different website that also uses the popular Xenova/whisper-tiny.en model, your browser will download and cache the same 177 MB of model resources again. Even if the files are byte-for-byte identical, the browser treats them as new downloads, leading to redundant data transfers and storage consumption. This isn’t just about AI models; Wasm runtime files, like the 4,733 kB ort-wasm-simd-threaded.asyncify.wasm file used by the underlying ONNX Runtime library, face the same issue.
Even if two AI models are entirely different, they might depend on the same foundational Wasm runtime. So, if your browser encounters these shared Wasm resources from different origins, it downloads and caches them again, once for each origin. This behavior, while seemingly inefficient, is the side effect of a critical security measure.
Why Caches Are Isolated
You might wonder why browsers don’t simply reuse cached resources from the same CDN, even if they’re accessed from different websites. The answer lies in security and privacy. For a long time, browser caches have been isolated by origin to prevent “timing attacks.”
These attacks could exploit the time it takes for a website to load resources, revealing whether your browser has previously accessed certain content. This could expose your browsing history and personal information. To combat this, Chrome, for example, uses a Network Isolation Key in addition to the resource URL when caching. This key comprises both the top-level site and the current-frame site, ensuring that resources from the same URL are only reused if the full context matches.
Consequently, even if two different origins request the identical Wasm runtime from the same CDN, their unique Network Isolation Keys prevent a cache hit. This results in duplicate downloads and storage, highlighting the core problem the Cross-Origin Storage API aims to solve.
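The effect of keying on more than the URL can be illustrated with a toy model. This is a simplified simulation, not Chrome's implementation: the `PartitionedCache` class, the site names, and the CDN URL are all hypothetical.

```javascript
// Toy model of a partitioned cache. Entries are keyed by
// (top-level site, frame site, URL) instead of by URL alone.
class PartitionedCache {
  constructor() {
    this.entries = new Map();
    this.downloads = 0; // counts simulated network fetches
  }

  // Build the composite key; JSON.stringify keeps the three parts unambiguous.
  static key(topLevelSite, frameSite, url) {
    return JSON.stringify([topLevelSite, frameSite, url]);
  }

  // Returns 'hit' if this exact context has cached the URL, otherwise 'miss'.
  fetch(topLevelSite, frameSite, url) {
    const k = PartitionedCache.key(topLevelSite, frameSite, url);
    if (this.entries.has(k)) return 'hit';
    this.downloads += 1; // simulate the download
    this.entries.set(k, true);
    return 'miss';
  }
}

const cache = new PartitionedCache();
// Hypothetical CDN URL for the shared Wasm runtime.
const wasmUrl = 'https://cdn.example/ort-wasm-simd-threaded.asyncify.wasm';

const results = [
  cache.fetch('https://a.example', 'https://a.example', wasmUrl), // first visit to site A
  cache.fetch('https://a.example', 'https://a.example', wasmUrl), // repeat visit to site A
  cache.fetch('https://b.example', 'https://b.example', wasmUrl), // site B, same URL
];
console.log(results, cache.downloads);
```

The repeat visit to site A is a hit, but site B misses even though the URL is identical, so the simulated file is downloaded twice: once per site.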
Introducing the Cross-Origin Storage API
The proposed Cross-Origin Storage (COS) API is an early-stage proposal designed to tackle this very problem. While it is not yet natively implemented in browsers, you can experiment with it using a dedicated extension that polyfills the navigator.crossOriginStorage interface. COS introduces a novel way for web apps to store and retrieve large files across different origins, identifying them not by their URL but by their cryptographic hash.
This hash-based identification is what makes sharing possible. It means that the ort-wasm-simd-threaded.asyncify.wasm file, once downloaded and stored by one origin, can be recognized as identical by any other origin that requests it, regardless of where that origin would otherwise have fetched it from. If a resource is already in COS, the application receives a FileSystemFileHandle to directly access the file. If not, the application falls back to a network download, then stores the resource in COS for future use by any application, even unrelated ones.
The API is modeled after the File System Standard’s FileSystemDirectoryHandle.getFileHandle(). The hash parameter serves the same purpose as the name parameter in the Origin Private File System (OPFS): it uniquely identifies a resource. The options.create flag also works the same way: access is read-only by default, and setting the flag to true enables write access.
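The lookup-then-fallback flow described above can be sketched as follows. Because COS is only a proposal, treat the details as assumptions: the `requestFileHandles` method name and the `{ algorithm, value }` hash shape are taken from the proposal as currently drafted and may change, while `loadViaCos`, `storage`, and `download` are hypothetical names. `storage` stands in for `navigator.crossOriginStorage`.

```javascript
// Sketch of the COS flow: try shared storage first, then the network.
// `storage` is assumed to behave like navigator.crossOriginStorage
// (available today only through the polyfill extension).
// `download` is a hypothetical callback that fetches the resource from
// the network and resolves to a Blob.
async function loadViaCos(storage, hash, download) {
  try {
    // Read-only request: succeeds only if the file is already available.
    const [handle] = await storage.requestFileHandles([hash]);
    return { source: 'cos', blob: await handle.getFile() };
  } catch {
    // Not available (or not permitted): fall through to the network.
  }

  const blob = await download();

  try {
    // create: true asks for a writable handle so the file can be stored
    // for later use by this or any other origin.
    const [handle] = await storage.requestFileHandles([hash], { create: true });
    const writable = await handle.createWritable();
    await writable.write(blob);
    await writable.close();
  } catch {
    // Storing is best-effort; the downloaded blob is still usable.
  }
  return { source: 'network', blob };
}
```

Treating the write as best-effort keeps the application working when COS is missing or the write is rejected; the only cost is that the next visitor downloads the file again.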
- Integrity by Design: A key feature of COS is its inherent integrity checking. When a file is written, the browser verifies its data against the declared hash. Mismatches result in an error, guaranteeing that any file read from COS is exactly what was expected. This is particularly beneficial for Transformers.js, as it provides automatic verification for model weights, regardless of their source.
- Privacy Considerations: Sharing resources across origins naturally raises privacy concerns. COS addresses this by obfuscating whether a file is present. If a request for a file’s hash doesn’t result in a direct error, it means the browser *might* have the file, but it won’t explicitly confirm it. This prevents timing attacks by forcing apps to always fall back to the network if COS doesn’t immediately provide the resource.
With COS, the 4,733 kB Wasm runtime, essential for every Transformers.js-powered app, would be downloaded just once. The first app to load it would store it under its SHA-256 hash with global visibility (origins: '*'). Every subsequent app, regardless of its origin, would then find it in COS immediately, eliminating redundant downloads. The same principle applies to the 177 MB Whisper model weights, effectively turning what was once duplicate download and storage into a single, efficient operation.
Source: Hugging Face Blog