Read Using ETags to Reduce Bandwidth & Workload with Spring & Hibernate.
Gavin created a sample app based on Spring's "petclinic", bundled with his ETag caching framework. He uses a Spring MVC HTTP request interceptor to apply ETag comparison logic: if the data used to build a page has not changed, further processing is skipped. For the purposes of the article, he created a simple ModifiedObjectTracker that keeps track of insert, update and delete operations via Hibernate event listeners. The tracker keeps a unique number for each view in the application, and a map of which Hibernate entities affect each view. Whenever a POJO is changed, a counter is incremented for each view that the entity is used in. Gavin uses the count as the ETag, so when the client sends it back the server knows whether one of the objects behind the page has been modified.
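The per-view counter idea can be illustrated with a minimal sketch. This is not the article's ModifiedObjectTracker: the class name `ViewChangeTracker`, its methods, and the string view and entity names are all hypothetical, and the Hibernate listener wiring is left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a per-view change counter; not the article's actual class. */
class ViewChangeTracker {
    // Which views each entity type feeds into (assumed to be registered up front).
    private final Map<String, List<String>> viewsByEntity = new HashMap<>();
    // Current change count per view; the count doubles as the ETag value.
    private final Map<String, Integer> countByView = new HashMap<>();

    synchronized void register(String entityName, List<String> views) {
        viewsByEntity.put(entityName, views);
        for (String view : views) {
            countByView.putIfAbsent(view, 0);
        }
    }

    /** Would be called from an insert/update/delete event listener. */
    synchronized void entityChanged(String entityName) {
        for (String view : viewsByEntity.getOrDefault(entityName, List.of())) {
            countByView.merge(view, 1, Integer::sum);
        }
    }

    /** ETag values are quoted strings, so the count is wrapped in double quotes. */
    synchronized String etagFor(String view) {
        return "\"" + countByView.getOrDefault(view, 0) + "\"";
    }

    /** True when the client's If-None-Match value still matches, so a 304 can be sent. */
    synchronized boolean isUnchanged(String view, String ifNoneMatch) {
        return etagFor(view).equals(ifNoneMatch);
    }
}
```

An interceptor would call `isUnchanged` before running the controller and return 304 Not Modified on a match.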
Have you tried using ETags to implement a content page caching framework?
Community comments
cool, this should be put into a framework
by Floyd Marinescu,
It would be cool to see a generalized ETag caching framework added to some of today's modern Java web frameworks. Thanks Gavin!
Why not use version fields?
by Jason Carreira,
Your synchronized block that every update has to run through is going to be a bottleneck quickly. Why not use a version column on your domain models as your etag? Turn on optimistic concurrency control in Hibernate and it will update the version for you. Sure, you have to load your object up before you check the token, but if you're using a good cache, it shouldn't be that big a deal, right?
I'm not clear, though, on what benefit this gives over just using the If-Modified-Since field?
Questionable Design
by Subbu Allamaraju,
It is interesting to attempt a generic approach to caching dynamic apps, but I think the design used to achieve it is questionable. With the approach suggested here, every request would make the app execute the request just to determine whether the ETag value is still relevant, and in most cases this step itself may consume a significant number of CPU cycles.
Re: Why not use version fields?
by Gavin Terrill,
Hi Jason,
Thanks for your comments. I'll check the synchronized performance issue.
Re using a version column: Yes, that would be a fine approach. I chose to use a count per view in the example because I have found that the view often uses multiple domain objects (meaning you need to take into account multiple version numbers). I suspect that in a full-blown implementation this would be a good place to apply the Strategy pattern, so that the most appropriate mechanism is used depending on the data in the model.
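As a rough illustration of that Strategy idea, the sketch below makes the ETag source pluggable. The interface and class names are hypothetical, and joining version numbers with a dash is just one assumed way to combine them.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical strategy interface: how a view derives its ETag is pluggable. */
interface ETagStrategy {
    String etag();
}

/** Derives the ETag from the version numbers of every domain object the view uses. */
class VersionListETagStrategy implements ETagStrategy {
    private final List<Long> versions;

    VersionListETagStrategy(List<Long> versions) {
        this.versions = versions;
    }

    @Override
    public String etag() {
        // Any change to any object's version changes the joined string.
        String joined = versions.stream()
                .map(String::valueOf)
                .collect(Collectors.joining("-"));
        return "\"" + joined + "\"";
    }
}

/** Derives the ETag from a single per-view change counter. */
class CounterETagStrategy implements ETagStrategy {
    private final long count;

    CounterETagStrategy(long count) {
        this.count = count;
    }

    @Override
    public String etag() {
        return "\"" + count + "\"";
    }
}
```

A view backed by one entity could use its version column directly, while a view built from several entities could use either the joined versions or a counter.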
Re If-Modified-Since: My concern here would be around problems in a clustered environment - you would need to ensure the time is synchronized accurately.
Re: cool, this should be put into a framework
by Jerome Louvel,
Java already has Restlet, a complete and dedicated REST framework, which can be used either as a standalone API or inside a Servlet container.
Restlet already has advanced support for ETags. You just need to expose the tag of your representations/variants and the Restlet engine will take care of setting the status for conditional requests.
See the Variant and Tag classes:
www.restlet.org/documentation/1.0/api/org/restl...
www.restlet.org/documentation/1.0/api/org/restl...
Restlet home:
www.restlet.org
Re: Why not use version fields?
by Jason Carreira,
For replacing the synchronized block, if you want to maintain that structure, I'd look at AtomicInteger and its incrementAndGet() method. It should allow your data structure to be lock-free.
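A minimal sketch of that suggestion, assuming one counter per view name. The class `LockFreeViewCounter` is hypothetical and is not taken from the article's code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-view counter that avoids a shared synchronized block. */
class LockFreeViewCounter {
    private final ConcurrentMap<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    /** Atomically bumps the counter for a view and returns the new value. */
    int increment(String view) {
        return counts.computeIfAbsent(view, v -> new AtomicInteger()).incrementAndGet();
    }

    /** Returns the current count, or 0 for a view that has never changed. */
    int current(String view) {
        AtomicInteger counter = counts.get(view);
        return counter == null ? 0 : counter.get();
    }
}
```

The increment itself uses a compare-and-swap rather than a lock. The map may still lock briefly the first time a view's counter is created, but updates to different views no longer contend on one monitor.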
Re: cool, this should be put into a framework
by Kishore Senji,
Nice article.
Why can't the ETagInterceptor approach be done in the ETagFilter as well, since a Filter is nothing but an Interceptor? The Filter should be a OncePerRequestFilter, though, as you would not want it to be processed for every include (Servlet spec 2.3 has no <dispatcher> rules, so containers apply filters differently). With OncePerRequestFilter, shouldNotFilter plays the same role as your preHandle method, so the second approach works with a Filter too, without even computing an MD5 hash of the content.
Please note that /*.htm is not a valid url-pattern. (Some containers might accept this, WebSphere I know does not accept that pattern). Either /* (a path mapped) or *.htm (extension mapped) is supported and not both.
This approach is good not only for serving html content but is also often used for images, js, css and rss content.
There is a bug in Internet Explorer (at least in 6.0) where it does not send ETag headers for gzipped content. So using If-Modified-Since may be a better approach for serving gzipped html/js/css/rss etc. You raised a concern about synchronizing it across clusters, but I would think an ETag number has the same issue, probably solved by using a distributed cache like ehcache.
Re: cool, this should be put into a framework
by Kishore Senji,
Also, isn't setStatus() preferred over sendError() for sending a 304 status? I think sendError() would show the error page configured for the web-app.
Re: Questionable Design
by Gavin Terrill,
Hi Subbu.
If you look at this (the interceptor approach) from the perspective of a single request there is definitely overhead, and you are right: if it takes longer to validate the ETag than to regenerate the content, it would be simpler not to bother with ETags in the first place. You also need to consider the nature of the application. I think the sweet spot for this approach is applications where typical usage is 80/20 reading versus writing, with users returning to the same pages. Rather than speculate though, the key is to measure before and after you implement. I'm planning on posting some numbers over the weekend that show the impact of adding these filters and interceptors to petclinic.
Thanks for commenting!
Re: cool, this should be put into a framework
by Gavin Terrill,
Hi Kishore,
I considered using a filter, but didn't understand what the benefits would be over an interceptor. OTOH, I can inject beans into the interceptor.
Thanks for the tip on the url-pattern.
Gavin.
Re: cool, this should be put into a framework
by hank jmatt,
I chose to use a count per view in the example because I have found that the view often uses multiple domain objects. I suspect that in a full-blown implementation this would be a good place to apply the Strategy pattern, so that the most appropriate mechanism is used depending on the data in the model.
If-Modified-Since and If-None-Match are overlapping
by Carl-Erik Kopseng,
The use cases for If-Modified-Since and If-None-Match are overlapping. If you are using one, then there is no point in using the other.
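For context on the overlap, the HTTP specification has the server ignore If-Modified-Since whenever If-None-Match is present in the request. A simplified sketch of that evaluation order follows. The class and method names are hypothetical, and it assumes a single strong ETag, not a list of tags or `*`.

```java
import java.time.Instant;

/** Hypothetical helper showing the order in which the two validators are checked. */
class ConditionalRequest {
    /**
     * Returns 304 if the client's copy is still current, otherwise 200.
     * ifNoneMatch and ifModifiedSince are null when the header is absent.
     */
    static int status(String ifNoneMatch, Instant ifModifiedSince,
                      String currentEtag, Instant lastModified) {
        if (ifNoneMatch != null) {
            // If-None-Match takes precedence; If-Modified-Since is not consulted.
            return ifNoneMatch.equals(currentEtag) ? 304 : 200;
        }
        if (ifModifiedSince != null) {
            // Fall back to the date check only when no ETag was sent.
            return lastModified.isAfter(ifModifiedSince) ? 200 : 304;
        }
        // No validators at all: always send the full response.
        return 200;
    }
}
```

Because the ETag check wins whenever both headers arrive, a server that already emits reliable ETags gains little from also tracking modification dates.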