Recently I’ve been playing around with Netlify, and as a result I’m becoming more familiar with caching strategies commonly used by content delivery networks (CDNs). One such strategy makes use of ETag identifiers for web resources.
In short, an ETag identifier is a value, typically a hash, that represents the version of a particular web resource. The resource is cached within the browser along with the ETag value and that value is used when determining if the particular cached resource has changed remotely.
We’re going to explore how to simulate the requests that the browser makes when working with ETag identifiers, but using simple cURL requests instead.
To get started, we’re going to make a request for a resource:
$ curl -I https://www.thepolyglotdeveloper.com/css/custom.min.css
HTTP/2 200
accept-ranges: bytes
cache-control: public, max-age=0, must-revalidate
content-length: 7328
content-type: text/css; charset=UTF-8
date: Wed, 04 Sep 2019 00:41:04 GMT
strict-transport-security: max-age=31536000
etag: "018b8b0ecb632aab770af328f043b119-ssl"
age: 0
server: Netlify
x-nf-request-id: 65a8e1aa-03a0-4b6c-9f46-51aba795ad83-921013
In the above request I’ve only requested the header information from the response. For this tutorial, the body of the response isn’t important to us.
Take note of the cache-control and the etag headers as well as the response code.
In the scenario of Netlify, the cache-control header tells the browser that it may cache the resource, but that the cached copy is immediately stale (max-age=0) and must be revalidated with the server before it is reused. This is done so the client always attempts to get the latest resource. The etag header represents the version of the resource, and the browser sends it back in an If-None-Match header on future requests. If the server says the etag hasn’t changed between requests then the response will have a 304 code and the cached resource will be used instead.
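To make that rule concrete, here is a minimal shell sketch of the comparison a server might perform. The function name revalidate is hypothetical, and the sketch assumes a single validator compared by exact match; it ignores the other forms If-None-Match can take, such as lists, the * wildcard, and weak validators.

```shell
# Toy model of the server-side check; not Netlify's actual implementation.
#   $1 = the ETag the server currently holds for the resource
#   $2 = the If-None-Match value sent by the client (empty if none)
revalidate() {
  if [ -n "$2" ] && [ "$2" = "$1" ]; then
    echo 304   # the client's copy is current, so no body is sent
  else
    echo 200   # no validator or a mismatch, so the full resource is sent
  fi
}

current='"018b8b0ecb632aab770af328f043b119-ssl"'
revalidate "$current" ''           # first visit, nothing cached
revalidate "$current" "$current"   # repeat visit with the cached ETag
```

Run as written, this prints 200 and then 304, which mirrors the two kinds of response a conditional request can produce.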
So let’s check to see if the resource has changed with cURL:
$ curl -I -H 'If-None-Match: "018b8b0ecb632aab770af328f043b119-ssl"' https://www.thepolyglotdeveloper.com/css/custom.min.css
HTTP/2 304
date: Wed, 04 Sep 2019 00:53:24 GMT
etag: "018b8b0ecb632aab770af328f043b119-ssl"
cache-control: public, max-age=0, must-revalidate
server: Netlify
x-nf-request-id: eca29310-c9bf-4742-87e1-3412e8852381-2165939
With the new request to the same resource, the If-None-Match header is included and the value is the etag hash from the previous request.
Notice that this time around the response status code was 304 as anticipated. Had the resource changed on the server, the etag we sent would no longer match and the server would have returned a 200 response with the new content and a new etag hash.
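Copying the etag value by hand gets tedious, so the two steps above could be scripted. The following is only a sketch: extract_etag and sample_headers are hypothetical helper names, and the sample headers are copied from the first response in this post so the snippet runs without touching the network.

```shell
# Print the value of the first ETag header found on standard input.
# Header names are case-insensitive and curl's header lines end in CRLF,
# so strip the carriage returns and compare in lowercase.
extract_etag() {
  tr -d '\r' | awk 'tolower($1) == "etag:" && !seen++ { print $2 }'
}

# Sample header lines copied from the first response in this post.
sample_headers() {
  printf 'HTTP/2 200\r\n'
  printf 'content-type: text/css; charset=UTF-8\r\n'
  printf 'etag: "018b8b0ecb632aab770af328f043b119-ssl"\r\n'
  printf 'age: 0\r\n'
}

etag=$(sample_headers | extract_etag)
echo "If-None-Match: $etag"

# Against a live site the same helper could feed the second request:
#   etag=$(curl -sI "$url" | extract_etag)
#   curl -I -H "If-None-Match: $etag" "$url"
```

With the sample data this prints If-None-Match: "018b8b0ecb632aab770af328f043b119-ssl", which is the header used in the request above. Newer cURL releases also ship --etag-save and --etag-compare options that may cover the same ground without a helper.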
If you look at your browser’s network inspector you might notice that etag hashes for resources have a -df value appended to them. For example, for the same resource, my browser is showing the following:
018b8b0ecb632aab770af328f043b119-ssl-df
While similar, it isn’t totally the same as the etag hash that came back with the previous cURL requests. Try to run a cURL request with the above etag value:
$ curl -I -H 'If-None-Match: "018b8b0ecb632aab770af328f043b119-ssl-df"' https://www.thepolyglotdeveloper.com/css/custom.min.css
HTTP/2 200
accept-ranges: bytes
cache-control: public, max-age=0, must-revalidate
content-length: 7328
content-type: text/css; charset=UTF-8
date: Wed, 04 Sep 2019 01:03:13 GMT
strict-transport-security: max-age=31536000
etag: "018b8b0ecb632aab770af328f043b119-ssl"
age: 0
server: Netlify
x-nf-request-id: 2734ffab-c611-4fc9-841e-460f172aa3b4-1604468
The response was not a 304 code because the -df suffix identifies the compressed version of the resource. As it stands, our cURL requests have been for the uncompressed version, so the two etag values don’t match.
A Support Engineer at Netlify pointed this difference out to me in this forum thread.
In most circumstances the web browser will automatically include an Accept-Encoding header so it can work with compressed resources. cURL does not send that header by default, so we have to ask for it explicitly.
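One way to picture what is going on is as two variants of the same resource, each with its own etag. The sketch below is a toy model inferred only from the responses shown in this post; the function name variant_etag is hypothetical, and the real rules Netlify applies are an assumption here rather than something I have confirmed.

```shell
# Toy model inferred only from the responses shown in this post.
#   $1 = ETag value without the surrounding quotes
#   $2 = Accept-Encoding request header value (empty if the header is absent)
# The substring match is deliberately crude and ignores q-values.
variant_etag() {
  case "$2" in
    *gzip*|*deflate*|*br*) printf '"%s-df"\n' "$1" ;;   # compressed variant
    *)                     printf '"%s"\n' "$1" ;;      # uncompressed variant
  esac
}

base='018b8b0ecb632aab770af328f043b119-ssl'
variant_etag "$base" ''                    # like a plain curl -I request
variant_etag "$base" 'gzip, deflate, br'   # like a browser request
```

The first call prints the etag without the suffix and the second prints it with -df appended, which matches the difference between the cURL output and the browser's network inspector.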
To get beyond this with cURL, the following request would work:
$ curl --compressed -I -H 'If-None-Match: "018b8b0ecb632aab770af328f043b119-ssl-df"' https://www.thepolyglotdeveloper.com/css/custom.min.css
HTTP/2 304
date: Wed, 04 Sep 2019 01:07:36 GMT
etag: "018b8b0ecb632aab770af328f043b119-ssl-df"
cache-control: public, max-age=0, must-revalidate
server: Netlify
vary: Accept-Encoding
x-nf-request-id: 65a8e1aa-03a0-4b6c-9f46-51aba795ad83-1301670
Notice in the above request that we’re now using the --compressed flag, which makes cURL ask the server for a compressed response. As a result, we get a 304 response indicating that the resource hasn’t changed and we should use the locally cached copy.
Alternatively, we could execute the following cURL request:
$ curl -I -H 'If-None-Match: "018b8b0ecb632aab770af328f043b119-ssl-df"' -H 'Accept-Encoding: gzip, deflate, br' https://www.thepolyglotdeveloper.com/css/custom.min.css
HTTP/2 304
date: Wed, 04 Sep 2019 01:12:34 GMT
etag: "018b8b0ecb632aab770af328f043b119-ssl-df"
cache-control: public, max-age=0, must-revalidate
server: Netlify
vary: Accept-Encoding
x-nf-request-id: eca29310-c9bf-4742-87e1-3412e8852381-2432816
Instead of using the --compressed flag, we are including an accept-encoding header.
Again, the information about compressed versions was provided to me by Luke Lawson from Netlify in this forum thread.
You just saw how to simulate the same caching that happens in the web browser using cURL instead. Since I’m new to content delivery networks (CDNs) and how they handle caching, this was very useful to me when it came to testing how caching worked with the etag hash for any given resource. A 304 response carries no body, so it is typically received quicker and with a smaller payload than a 200 response, which saves bandwidth and improves performance without sacrificing the freshness of content.
In theory, the CDN will maintain versioning information for a given resource and as a result will be able to validate etag values for freshness. It is not up to the browser to determine if the etag is stale.