Silo: Using Hashing and Delta Update to Improve Today’s Browsers
On Tuesday Microsoft Researcher James Mickens discussed Silo, a framework for using hashing and delta-updates to dramatically reduce the number of round-trips to the server needed when loading a website. The technology works in today’s browsers without the need for plugins.
Due to flaws in the current design of HTTP, requests for related files are not automatically batched. Requesting files individually, especially small files, can result in serious performance problems due to network latency. As you know, even high-capacity networks can suffer from latency issues. It can also cause problems for the server if it needs to authenticate and/or authorize each request.
There are three options for reducing the round-trips to the server.
Silo introduces a third option: use DOM storage for caching instead of the browser’s cache.
Normally a website has little control over caching. There is simply no exposed API beyond the basic hints about cache expiration. But with DOM storage developers have complete control. They can decide not only whether a file is cached, but also how it is cached.
In these systems, chunk boundaries are induced by special byte sequences in the data. LBFS popularized a chunking method in which applications push a sliding window across a data object and declare a chunk boundary if the Rabin hash value of the window has N zeroes in the lower-order bits. By varying N, applications control the expected chunk size. With content-based hashing, an edit may create or delete several chunks, but it will not cause an unbounded number of chunk invalidations throughout the object.
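The boundary rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Silo's code: a simple polynomial rolling hash stands in for the Rabin hash, and the names `WINDOW`, `BASE`, `MOD`, `chunk_boundaries`, `split_chunks` and `chunk_ids` are all hypothetical. If the low-order hash bits are roughly uniform, a boundary occurs with probability about 2^-N per position, so the expected chunk size is on the order of 2^N bytes.

```python
import hashlib
import random
from typing import List

WINDOW = 16          # sliding-window width in bytes (assumed value)
BASE = 257           # polynomial base for the rolling hash (assumed value)
MOD = (1 << 61) - 1  # large prime modulus (assumed value)


def chunk_boundaries(data: bytes, n_bits: int = 6) -> List[int]:
    """Return chunk end offsets. A boundary is declared wherever the rolling
    hash of the last WINDOW bytes has n_bits zero low-order bits."""
    if not data:
        return []
    mask = (1 << n_bits) - 1
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    h = 0
    cuts = []
    for i, b in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + b) % MOD                    # add the newest byte
        if i + 1 >= WINDOW and (h & mask) == 0:
            cuts.append(i + 1)
    if not cuts or cuts[-1] != len(data):
        cuts.append(len(data))  # the trailing bytes form the last chunk
    return cuts


def split_chunks(data: bytes, n_bits: int = 6) -> List[bytes]:
    """Split data at the content-defined boundaries."""
    out, start = [], 0
    for end in chunk_boundaries(data, n_bits):
        out.append(data[start:end])
        start = end
    return out


def chunk_ids(data: bytes, n_bits: int = 6) -> List[str]:
    """Name each chunk by the SHA-1 hash of its contents."""
    return [hashlib.sha1(c).hexdigest() for c in split_chunks(data, n_bits)]


if __name__ == "__main__":
    rng = random.Random(0)
    data = bytes(rng.getrandbits(8) for _ in range(20000))
    edited = data[:10000] + b"HELLO" + data[10000:]  # small edit mid-object
    old, new = chunk_ids(data), chunk_ids(edited)
    print(len(old), "chunks;", len(set(new) - set(old)), "changed by the edit")
```

Because boundaries depend only on the bytes inside the window, an insertion shifts later chunks without changing their contents, so their ids stay the same. Only chunks near the edit receive new ids.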
When cookies are enabled, everything can be done in a single round-trip. If the cache is cold, everything needed to render the page is sent back in response to a single request. The browser can then break the objects into chunks and cache them. If the cache is warm, a cookie containing the list of cache ids (based on the hashes of the chunks) is sent to the server with the original request. The server uses the cache ids to determine which chunks are needed and sends just those.
Caching is fine-grained. Since Silo caches chunks instead of whole files, only the deltas need to be sent when information changes.
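The cold-cache and warm-cache exchange can be modeled roughly as follows. This is a hedged sketch rather than Silo's actual protocol: the function names `server_respond` and `client_rebuild` are hypothetical, a plain Python dict stands in for DOM storage, the list of ids passed to the server stands in for the cookie, and the page is assumed to be already split into chunks.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def chunk_id(chunk: bytes) -> str:
    """Cache id for a chunk: here, the SHA-1 hash of its bytes (assumed)."""
    return hashlib.sha1(chunk).hexdigest()


def server_respond(
    page_chunks: List[bytes], client_ids: Iterable[str]
) -> Tuple[List[str], Dict[str, bytes]]:
    """Return the ordered list of chunk ids that make up the page, plus the
    bytes of only those chunks the client did not report having."""
    known = set(client_ids)  # ids the client sent with its request
    recipe = [chunk_id(c) for c in page_chunks]
    missing = {cid: c for cid, c in zip(recipe, page_chunks) if cid not in known}
    return recipe, missing


def client_rebuild(
    recipe: List[str], missing: Dict[str, bytes], cache: Dict[str, bytes]
) -> bytes:
    """Store newly received chunks, then reassemble the page from the cache."""
    cache.update(missing)
    return b"".join(cache[cid] for cid in recipe)


if __name__ == "__main__":
    cache: Dict[str, bytes] = {}  # stand-in for DOM storage
    v1 = [b"<html>", b"<body>hello</body>", b"</html>"]
    recipe, missing = server_respond(v1, cache.keys())  # cold cache
    print(len(missing), "chunks sent;", client_rebuild(recipe, missing, cache))

    v2 = [b"<html>", b"<body>goodbye</body>", b"</html>"]
    recipe, missing = server_respond(v2, cache.keys())  # warm cache
    print(len(missing), "chunks sent;", client_rebuild(recipe, missing, cache))
```

In this model the first visit transfers every chunk, while a later visit to a slightly changed page transfers only the chunk whose contents differ.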
You can learn more about Silo on Microsoft Research and Channel 9. Currently there are no active plans to turn this into a product, but discussions to that effect are ongoing between Microsoft and Microsoft Research.