I have finalized a small PHP application that can serve many documents. These documents must be cacheable by clients and proxies.

Since proxies can cache my results I must be extra careful because the documents I serve can have different MIMEs types (content negotiation based on $_SERVER['HTTP_ACCEPT']) and different languages (based in this order: $_POST value / $_GET value / URL / PHP session value / $_COOKIE value / $_SERVER['HTTP_ACCEPT_LANGUAGE'] / default script value).

To shortly sum up, a page can be served with many MIME type and many languages with the same URL (question changed: see edit below).

To help cache on proxies I use the "Vary: Accept" header in combination with the ETag header. The ETags is a MD5 of the current language and the last modified timestamp.

I always:

  • Send an Expires header
  • Send a Cache-Control header
  • Send a Last-Modified header
  • Send a Content-Type header
  • Send an ETag header (based on current language and Last-Modified timestamp)
  • Send a Content-Language
  • Send a "Vary: Accept" header if the document is XHTML

Now with my question: is this enough to help cache on proxies and clients? Did I miss a thing/header?

To help you, here’s the HTTP response header for a test page (on my local environment):

"
Date             Wed, 30 Dec 2009 18:56:26 GMT
Server           Apache/2.0.63 (Win32) PHP/5.1.0
X-Powered-By     PHP/5.1.0
Set-Cookie       Tests=697daqbmple2e1daq2dg74ur96; path=/
Expires          Wed, 30 Dec 2009 21:56:26 GMT
Cache-Control    public, max-age=10800
Last-Modified    Mon, 28 Dec 2009 15:11:49 GMT
Etag             "44fa50be4638161a596e4b75d6ab7a94"
Vary             Accept
Content-Language en-us
Content-Length   3043
Keep-Alive       timeout=15, max=100
Connection       Keep-Alive
Content-Type     application/xhtml+xml; charset=UTF-8
"

EDIT: OK I understand that in this case serving a document with many MIMEs and having different languages (that can come from so many sources - see above) is just plain bad design. If you want to do this just use "private" cache (no cache on proxies)... Am I correct?

If each language have it's own URL (but each URL can be served with many MIME still) is my current implementation is OK for a "public" cache (cache on clients + proxies)?

Accepted Answer

Since your output also depends on things a proxy cannot know like session data, won't it be easier to send a (non-cachable) redirect to the actual content, which would be fixed for a given URL (with parameters) and therefore much easier to cache. I know this involves an extra round-trip, but it's probably much less error-prone and would also cause less problems with proxies that don't completely understand/support all your header combinations.

Also, I'm guessing that, if you have two clients going through the same proxy but with different language cookies, your current method would return two different ETags for the same URL, which would make the proxy update its copy each time it sees the other client.

Written by Wim
This page was build to provide you fast access to the question and the direct accepted answer.
The content is written by members of the stackoverflow.com community.
It is licensed under cc-wiki