I have recently run into several Cache-Control issues on some web pages here at work. I wanted to document how the issues presented themselves and how they were resolved, so that anyone encountering them elsewhere can resolve them quickly.
Caching is a well-known concept in computer science: when a program repeatedly accesses the same data or instructions, a massive performance benefit can be realized by keeping them in RAM. The program can then retrieve them quickly from memory instead of going to the disk thousands or even millions of times during execution. Caching on the web is similar: it avoids a round trip to the origin web server each time a resource is requested, and instead retrieves the file from the browser cache on the user’s computer or from a proxy cache closer to the user.
The caching issue usually presents itself with an end user stating “the page is not taking my updates” or “after my update this popup continues to appear even after I have left the page.” These statements don’t directly indicate that a caching issue exists; further investigation is needed. If the page in question really did save the update, then you might have a caching issue. Additionally, some programmers set up popup dialogs based on events. Another way the issue has presented itself to me was an HTML popup window that would not close. My theory is that after a popup occurs, say the modal popup dialog (“check has been successfully deleted”), JavaScript must have incremented a variable that is tested to turn off the popup. The problem is that the page posts back to itself, and the end user’s computer receives the “cached” version from the hard drive and NOT from the web server. The JavaScript variable therefore still has its previous setting, which does not turn off the popup.
Without doing anything extra, a web page is generally allowed to be cached by most browsers. Generally speaking, you actually have to do extra work to prevent a page from being cached. I’ll get to this extra work in a second, but first I want to mention that a server can be set up so that the inverse is true. On an IIS server you can change the default IIS settings to add a custom Cache-Control header to a hierarchy of sites or folders. For example, at the root level of your web site you could add the Cache-Control header, and all folders below the root and all of their web pages will then include a “Cache-Control: no-cache” header (picture below).
For a very long time our servers had a custom header placed at the root to prevent caching of any web page, and we specifically set some web pages to allow caching where it was needed. We simply removed the header from certain pages after the more global root setting was made. The reason some pages cannot carry a no-cache setting is that they actually require the ability to be cached. For example, you cannot prevent caching on an “application/octet-stream” download on an SSL-encrypted site. Therefore a couple of pages that produced secure Adobe files had to be left without any Cache-Control header, by visiting that node or file in the hierarchy and specifically deleting the header. This is easily done; however, IIS 5.0 appears to reset to the root settings from time to time. I think some IIS server updates cause this reset. At any rate, our company now encrypts the files written to laptop hard drives, and by default I allow caching of mostly anything from our IIS servers. In real geek speak, most every page browsed today comes back as a “304” response, which indicates that the file in the cache is still valid. It really depends on the user’s browser settings whether the cached version is used (follow along).
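To make the 304 response mentioned above more concrete, here is a minimal Python sketch of how a server might answer a revalidation request. It is only an illustration, not how IIS is implemented: the function name respond and the sample ETag values are made up for the example. It assumes the browser sends back the ETag it stored in an If-None-Match request header.

```python
def respond(request_headers, current_etag, body):
    """Answer a GET roughly the way a server using ETag validation might.

    request_headers: dict of headers sent by the browser (hypothetical input).
    current_etag:    the validator for the resource as it exists right now.
    """
    if request_headers.get("If-None-Match") == current_etag:
        # The browser's stored copy matches: no body, just "still valid".
        return 304, {"ETag": current_etag}, b""
    # First visit, or the resource changed: send the full page.
    return 200, {"ETag": current_etag}, body


first = respond({}, '"v1"', b"<html>version 1</html>")
repeat = respond({"If-None-Match": '"v1"'}, '"v1"', b"<html>version 1</html>")
changed = respond({"If-None-Match": '"v1"'}, '"v2"', b"<html>version 2</html>")
print(first[0], repeat[0], changed[0])  # prints: 200 304 200
```

The second request is the case described above: the server sends no page at all, and the browser shows whatever it already has on disk.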
In most browsers the end user has some control over whether the computer uses a cached version of a web page. In Internet Explorer this option is under Tools / Internet Options / General / Settings (Internet Explorer 8.0 shown below). Deleting the user’s cached files could resolve the immediate issue; however, a far better alternative is to have the server not allow the page to be cached in the first place.
Where and why would you actually want a page to never be cached? I’m glad you finally asked.
A page whose method is “get” or “post” back to itself with a saved set of changes should, in general, be cache-controlled with a no-cache option. By this I mean the page should never be cached in the first place! The obvious symptom when a page lacks this no-cache option is that the page did not take your changes, or at least it appears that way to the end user. On the post-back the cached page is displayed by the browser instead of waiting for the round trip to complete. This might be especially true of compressed pages. So the end user winds up seeing the page as it was originally displayed on the first visit, without the updated fields. You may ask why the programmer would not have seen the same thing in testing. Testing might have been done on a separate box with different settings; the programmer might have different browser settings for the temporary storage of files (cache); or the server admin may simply have turned on IIS compression recently, so the page now caches differently.
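The stale post-back described above can be imitated with a small Python simulation. This is a deliberately naive sketch: real browsers apply more elaborate freshness rules, and strictly speaking no-cache means “revalidate before reuse” rather than “never store”. The class names Origin and NaiveBrowser and the function edit_and_reload are invented for the illustration.

```python
class Origin:
    """Stands in for the web server holding the saved data."""

    def __init__(self, no_cache):
        self.no_cache = no_cache
        self.value = "old value"

    def get(self):
        headers = {"Cache-Control": "no-cache"} if self.no_cache else {}
        return headers, "<p>" + self.value + "</p>"

    def post(self, new_value):
        self.value = new_value


class NaiveBrowser:
    """Reuses a stored page forever unless the response said no-cache."""

    def __init__(self, origin):
        self.origin = origin
        self.cached_page = None

    def view(self):
        if self.cached_page is not None:
            return self.cached_page  # served from disk, no round trip
        headers, page = self.origin.get()
        if "no-cache" not in headers.get("Cache-Control", ""):
            self.cached_page = page
        return page


def edit_and_reload(no_cache):
    origin = Origin(no_cache)
    browser = NaiveBrowser(origin)
    browser.view()             # first visit
    origin.post("new value")   # the user saves a change
    return browser.view()      # the page shown after the post-back


print(edit_and_reload(no_cache=False))  # prints: <p>old value</p>
print(edit_and_reload(no_cache=True))   # prints: <p>new value</p>
```

Without the header the user sees the page from the first visit; with it, the browser goes back to the server and shows the saved change.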
How might you actually make the changes to a web page that has a Cache-Control issue?
If the page in question is an .aspx page with a head section, I would add the three META tags shown in the head section below (Cache-Control, Pragma, and Expires).
<%@ Page Language="VB" AutoEventWireup="false" CodeFile="Default.aspx.vb"
Inherits="_Default" %>
<%@ Register Assembly="HtmlEditor" Namespace="HtmlEditor" TagPrefix="cc1" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
<title>HR Self-Serv Contenet Manager</title>
<meta http-equiv="Cache-Control" content="no-cache" />
<meta http-equiv="Pragma" content="no-cache" />
<meta http-equiv="Expires" content="0" />
<link href="StyleSheet.css" rel="stylesheet" type="text/css" />
</head>
....
If it could not be done in the .aspx page but could be done in the .vb code-behind, or otherwise needed to be done in a compiled DLL, I would simply include the .NET SetCacheability call as shown in the code below:
If myReader.Read Then
Response.Clear()
Response.BufferOutput = False
Response.ContentType = "text/html"
Response.Cache.SetCacheability(HttpCacheability.NoCache)
Response.Write("<link href='StyleSheet.css' rel='stylesheet ….
…
You could go nuts with all the different headers that potentially modify the page’s cacheability such as these:
Response.ClearHeaders();
Response.AppendHeader("Cache-Control", "no-cache"); //HTTP 1.1
Response.AppendHeader("Cache-Control", "private"); // HTTP 1.1
Response.AppendHeader("Cache-Control", "no-store"); // HTTP 1.1
Response.AppendHeader("Cache-Control", "must-revalidate"); // HTTP 1.1
Response.AppendHeader("Cache-Control", "max-stale=0"); // HTTP 1.1 (defined as a request directive)
Response.AppendHeader("Cache-Control", "post-check=0"); // Internet Explorer extension
Response.AppendHeader("Cache-Control", "pre-check=0"); // Internet Explorer extension
Response.AppendHeader("Pragma", "no-cache"); // HTTP 1.0
Response.AppendHeader("Keep-Alive", "timeout=3, max=993"); // connection handling, not caching
Response.AppendHeader("Expires", "Mon, 26 Jul 1997 05:00:00 GMT"); // HTTP 1.0 and later, a date in the past
However, I would stick to the two examples above. Also, the program Fiddler can be used to verify that a 304 Not Modified response is not being returned. A 304 indicates that the file in the cache is still valid and can be used again, as opposed to a 200 response, where the server sends the full page.
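Once you have the status code and response headers in hand (copied out of Fiddler, for example), the same check can be written down as a few lines of Python. This is a rough sketch under some assumptions: the function names cache_directives and check_response are made up, the headers are given as a plain dict keyed exactly as “Cache-Control”, and multiple Cache-Control headers have already been joined into one comma-separated value.

```python
def cache_directives(headers):
    """Split a Cache-Control value into a set of lowercase directives."""
    value = headers.get("Cache-Control", "")
    return {part.strip().lower() for part in value.split(",") if part.strip()}


def check_response(status, headers):
    """Return a list of reasons the page may still be coming from cache."""
    problems = []
    if status == 304:
        problems.append("304 Not Modified: the browser reused its cached copy")
    if not cache_directives(headers) & {"no-cache", "no-store"}:
        problems.append("Cache-Control has neither no-cache nor no-store")
    return problems


print(check_response(200, {"Cache-Control": "no-cache, no-store"}))  # prints: []
print(len(check_response(304, {})))  # prints: 2
```

An empty list means the response looks the way the fixes above intend: a full 200 response carrying a no-cache or no-store directive.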