HTML5 appcache

john allsopp @johnallsopp webdirections

W3Conf 2011

with the appcache

One of the main criticisms of web applications is that they need a connection to work. Appcache addresses this by enabling a site or app to work even when the user is not connected (provided, of course, that they have visited it before).

the appcache manifest

At the heart of appcache is the cache manifest. It's simply a text file, with one or more sections, which specifies what must be cached, what must not be cached, and fallbacks for when resources aren't available.

The manifest file must begin with the string "CACHE MANIFEST". It has one or more sections. There are three kinds of section:

CACHE sections specify what must always be cached (these resources must always be used from the cache).

Resources in the NETWORK section must never be cached: the online version must always be used.

The FALLBACK section specifies how missing resources should be handled.

We can comment our manifest with single-line comments, which simply begin with a #.
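
To make the structure concrete, here is a minimal Python sketch of how a manifest could be split into its sections. The function name parse_manifest and the SAMPLE text are illustrative assumptions, not part of any browser or library API, and the sketch skips details a real browser handles (URL resolution, for instance).

```python
def parse_manifest(text):
    """Split a cache manifest into its CACHE, NETWORK and FALLBACK entries."""
    lines = text.splitlines()
    if not lines or not lines[0].strip().startswith("CACHE MANIFEST"):
        raise ValueError('manifest must begin with "CACHE MANIFEST"')

    sections = {"CACHE": [], "NETWORK": [], "FALLBACK": []}
    current = "CACHE"  # entries before any section header are treated as CACHE
    for raw in lines[1:]:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and single-line comments
        if line.endswith(":") and line[:-1] in sections:
            current = line[:-1]  # a header such as "NETWORK:" switches section
            continue
        if current == "FALLBACK":
            # FALLBACK entries are pairs: pattern, then replacement
            sections[current].append(tuple(line.split()))
        else:
            sections[current].append(line)
    return sections


# Hypothetical sample manifest used only for illustration.
SAMPLE = """CACHE MANIFEST
# version 1.0
CACHE:
#images
/images/image1.png
/images/image2.png
NETWORK:
*
FALLBACK:
/images/ /images/missing.png
"""

parsed = parse_manifest(SAMPLE)
```

Because a section header simply switches the current section, the same kind of section can appear more than once and its entries accumulate.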

the CACHE: section

A CACHE section (there can be more than one of each type of section) begins with the string "CACHE:". It then lists one or more URLs specifically (either absolute, or relative to the manifest file). In the CACHE section, we can't use partial URLs or wildcards, only fully resolved URLs.

CACHE MANIFEST
CACHE:
#images
/images/image1.png
http://somedomain.com/images/image2.png

Here we have a manifest file with one CACHE section, which lists two images to be cached. Because we have only this one section, we could have omitted the CACHE: line.

the NETWORK: section

The NETWORK section specifies resources that must always be used online, and never used from a cache (the appcache, or other browser cache).

Like the CACHE section, it uses absolute and relative URLs to specify resources. Unlike CACHE, it can use partial matches, which specify multiple resources.

There's also a wildcard, which effectively says "any resources not explicitly cached should never be cached".

NETWORK:
signup.html
payments/pay.html
/payments/
*

So, in this rather artificial example (with the wildcard, there's no need to explicitly specify other resources), we have:

The resource "signup.html" at the root of the site (but not every file called signup.html)

The specific file called pay.html, found relative to the manifest file

Any resources located in the directory payments found at the root of the site (including resources found in subdirectories of this directory)
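
The matching rules for the NETWORK section can be sketched roughly in Python as below. This is a simplified illustration, not the browser's actual algorithm: the function name uses_network is hypothetical, and it assumes every entry and every requested path has already been resolved to a root-relative path. It also ignores the check for whether a resource is already in the cache.

```python
def uses_network(path, network_entries):
    """Return True if `path` matches an entry in the NETWORK section."""
    for entry in network_entries:
        if entry == "*":
            return True  # wildcard: anything not explicitly cached goes online
        if path.startswith(entry):
            return True  # exact match, or a partial (prefix) match
    return False


# Hypothetical entries, already resolved to root-relative paths.
ENTRIES = ["/signup.html", "/payments/"]

in_payments = uses_network("/payments/2011/receipt.html", ENTRIES)
other_signup = uses_network("/about/signup.html", ENTRIES)
with_wildcard = uses_network("/index.html", ENTRIES + ["*"])
```

Here a file deep inside /payments/ matches by prefix, a signup.html in another directory does not match, and adding the wildcard makes every path match.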

the FALLBACK: section

In the FALLBACK section, we specify resources to be replaced, as well as the resources to replace them. The first of each pair can be a URL or a prefix match pattern. The second must specifically identify a resource to replace any matching the first pattern. There's no wildcard for the FALLBACK section.

FALLBACK:
/images/ /images/missing.png
/ /sorry.html

So, here, anything in the directory images, located at the root of the site (or its subdirectories), which has not been cached will be replaced with the image missing.png.
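
As a rough illustration of how a fallback could be chosen, here is a Python sketch. The function name resolve_fallback is hypothetical, and preferring the longest matching prefix when several patterns match is an assumption of this sketch rather than something stated in these notes.

```python
def resolve_fallback(path, fallbacks):
    """Return the replacement for `path`, or None if no pattern matches."""
    matches = [
        (prefix, replacement)
        for prefix, replacement in fallbacks
        if path.startswith(prefix)
    ]
    if not matches:
        return None
    # Assumption: the most specific (longest) prefix wins.
    return max(matches, key=lambda pair: len(pair[0]))[1]


# The two pairs from the FALLBACK example above.
FALLBACKS = [("/images/", "/images/missing.png"), ("/", "/sorry.html")]

for_image = resolve_fallback("/images/photos/cat.png", FALLBACKS)
for_page = resolve_fallback("/about.html", FALLBACKS)
```

With these two pairs, a missing image resolves to /images/missing.png, while any other missing resource resolves to /sorry.html.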

CACHE MANIFEST
# version 1.0
CACHE:
#images
/images/image1.png
/images/image2.png
NETWORK:
*
FALLBACK:
/images/ /images/missing.png

And here's a nice simple appcache manifest. We'll see why I've included a version number shortly.

Using the appcache manifest

Now that we've created our manifest, how do we use it? We simply link to it from the html element of any HTML document, using the manifest attribute. It's recommended that we use the HTML5 doctype for documents which use the appcache.

serving the appcache manifest

Manifest files must be served as text/cache-manifest. Because this is a relatively new specification, and still not entirely stable, servers' default settings will likely not include this. Manifests served with the wrong MIME type are ignored.
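
For local testing, one way to get the right MIME type is a small Python static server. This is a sketch under stated assumptions: the .appcache file extension and the names ManifestHandler and serve are illustrative choices, since nothing here fixes a particular extension for manifest files. Production servers would be configured through their own settings instead.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


class ManifestHandler(SimpleHTTPRequestHandler):
    """Static file handler that serves manifests as text/cache-manifest."""

    # Extend the default extension-to-type map with the manifest type.
    # Assumption: manifest files use the .appcache extension.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".appcache": "text/cache-manifest",
    }


def serve(port=8000):
    """Serve the current directory until interrupted."""
    HTTPServer(("", port), ManifestHandler).serve_forever()
```

Calling serve() from the site's root directory serves its files, with any file ending in .appcache sent as text/cache-manifest.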

gotchas & tips

Cache failure

If any of the resources specified in the CACHE section aren't available, nothing will be cached.

Persistence

Effectively, appcaches don't expire. You can't override them with HTTP headers. This has particular implications for developing with appcaches. We'll look at this in a moment.

stylesheets & other resources

Resources linked to in style sheets or JavaScript are not cached. Resources must be specifically included in the manifest.
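
Since these resources have to be listed by hand, a small script can help find them. The Python sketch below pulls url(...) references out of a style sheet so they can be added to the CACHE section. The name css_resources and the sample CSS are hypothetical, and the regular expression is a simplification that will not handle every valid CSS construct.

```python
import re

# Matches url(...) with optional quotes and surrounding whitespace.
CSS_URL = re.compile(r"""url\(\s*['"]?([^'")]+?)['"]?\s*\)""")


def css_resources(css_text):
    """Return the distinct URLs referenced in a style sheet, in order."""
    seen = []
    for url in CSS_URL.findall(css_text):
        # data: URLs are inline, so there is nothing to cache for them
        if url not in seen and not url.startswith("data:"):
            seen.append(url)
    return seen


# Hypothetical style sheet used only for illustration.
SAMPLE_CSS = """
body { background: url(/images/bg.png); }
h1 { background: url("/images/logo.png"); }
.again { background: url('/images/bg.png'); }
"""

resources = css_resources(SAMPLE_CSS)
```

Each URL found this way still needs to be resolved relative to the style sheet before it goes into the manifest.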

lazy caching

There's no need to list an HTML document that links to a manifest in that manifest: any such document is cached automatically (even when it is included in the NETWORK section). This means we don't have to list the pages of our site in the manifest, which would cause them all to be cached the first time a visitor came to our site, potentially impacting performance.

mime type

Just to reiterate, it is really important to serve manifest files with the correct MIME type. They will not work otherwise.

development and appcaching

Because caches persist almost indefinitely, developing with appcaching turned on is really painful, and I strenuously recommend you avoid it.

Here, the fact that manifests served with the wrong MIME type are ignored is your friend. Simply set up the wrong MIME type for manifest files on your server while developing, then change this to the correct type for production sites.

updating the cache

So, when is the cache updated? When the manifest file is changed. The browser won't reload all cached files, however, only those which have changed since they were cached. Using version numbers in a comment can help here. Each time we want the cache to be refreshed, we simply update the version number.
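
Bumping the version comment is easy to automate as part of a deploy step. Here is a minimal Python sketch, assuming the comment has the form "# version N" as in the earlier manifest. The name bump_version is hypothetical, and only the leading integer is incremented.

```python
import re

# A comment line such as "# version 1.0"; the first integer is the part we bump.
VERSION_LINE = re.compile(r"^(#\s*version\s+)(\d+)(.*)$", re.MULTILINE)


def bump_version(manifest_text):
    """Return the manifest with its first version comment incremented."""

    def increment(match):
        next_number = int(match.group(2)) + 1
        return f"{match.group(1)}{next_number}{match.group(3)}"

    # Only the first version comment is touched; other text is left alone.
    return VERSION_LINE.sub(increment, manifest_text, count=1)


before = "CACHE MANIFEST\n# version 1.0\nCACHE:\n/images/image1.png\n"
after = bump_version(before)
```

Any change to the manifest's bytes is enough to count as a change, so a timestamp or content hash in the comment would serve equally well.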

browser support

browser cache size limits

Introducing ManifestR

http://westciv.com/tools/manifestR/

ManifestR

Let's make a manifest

W3Conf.org

Thanks!
