O/R Mapping, Caching, and Performance

According to Frans Bouma, one of the common misconceptions about Object/Relational Mapping (O/R Mapping) frameworks is that they give developers caching for free and that caching improves performance. While O/R Mapping frameworks do rely on caching, improved performance isn't in the cards.

Essentially, caching improves performance by eliminating calls to the database. Database calls tend to be orders of magnitude more expensive than retrieving the information locally.

Caching is very important for most high-performance applications. Without caching, an application under heavy load can grind to a halt. However, caching doesn't automatically improve performance. The right information has to be cached.

The problem with O/R Mapping is that the full set of records is almost never in the cache. So if multiple records are requested based on some search criteria, there is no way to know if they are all in the cache. Frans Bouma continues...

This thus causes a roundtrip and a query execution on the database. As roundtrips and query executions are a big bottleneck of the complete entity fetch pipeline, the efficiency the myth talks about is nowhere in sight. But it gets worse. With a cache, there's actually more overhead. This is caused by the uniquing feature of a cache. So every entity fetched from the database matching the query for the customers has to be checked with the cache: is there already an instance available? If so, update the field values and return that instance, if not, create a new instance (but that's to be done anyway) and store it in the cache.
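To make the point concrete, here is a minimal sketch in Python. The `FakeDatabase` and `CachingMapper` classes are hypothetical stand-ins invented for this illustration, not part of any real O/R mapper. It assumes a cache keyed by primary key, and shows that repeated lookups by key can be served locally while a query by search criteria still costs a roundtrip every time, because the cache cannot tell whether it holds every matching row.

```python
class FakeDatabase:
    """Hypothetical stand-in for a real database; counts roundtrips."""

    def __init__(self, rows):
        self.rows = rows          # primary key -> row dict
        self.roundtrips = 0

    def fetch_by_id(self, pk):
        self.roundtrips += 1
        return dict(self.rows[pk])

    def query(self, predicate):
        self.roundtrips += 1
        return [dict(row) for row in self.rows.values() if predicate(row)]


class CachingMapper:
    """Hypothetical mapper with a cache keyed by primary key."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def get(self, pk):
        # A lookup by key can be answered from the cache alone.
        if pk not in self.cache:
            self.cache[pk] = self.db.fetch_by_id(pk)
        return self.cache[pk]

    def find(self, predicate):
        # The cache may hold only some of the matching rows, so the
        # database has to run the query regardless of what is cached.
        rows = self.db.query(predicate)
        for row in rows:
            self.cache[row["id"]] = row
        return rows


db = FakeDatabase({
    1: {"id": 1, "country": "NL"},
    2: {"id": 2, "country": "NL"},
    3: {"id": 3, "country": "US"},
})
mapper = CachingMapper(db)

mapper.get(1)
mapper.get(1)                       # served from the cache
after_key_lookups = db.roundtrips   # one roundtrip so far

dutch = mapper.find(lambda row: row["country"] == "NL")
mapper.find(lambda row: row["country"] == "NL")   # hits the database again
after_queries = db.roundtrips       # two more roundtrips
```

In this sketch the second `get(1)` costs nothing, but the second `find` costs exactly as much as the first, which is the situation the quote describes.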

Caching does serve a purpose for O/R Mapping frameworks: it solves the problem of uniqueness. Normally, multiple calls to the database for the same information result in multiple objects holding the same data. Usually this is acceptable.

However sometimes it can be a problem or an inconvenience. When that happens, it's good that there's a way to have unique objects per entity loaded. Most O/R mappers use a cache for this: when an entity is loaded from the database, the cache is consulted if there's already an entity object with the entity data of the same entity fetched. If that's the case, that instance is updated with the data read from the database, and that instance is returned as the object holding the data. If there's no object already containing the same entity, a new instance is created, the entity data fetched is stored in that instance, that instance is stored in the cache and the instance is returned. This leads to unique objects per entity.
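The uniquing behavior described above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the implementation of any particular O/R mapper: `Customer`, `UniquingCache`, and `materialize` are hypothetical names, and rows are assumed to arrive as dictionaries with an `id` key.

```python
class Customer:
    """Hypothetical entity class used only for this illustration."""

    def __init__(self, pk, name):
        self.pk = pk
        self.name = name


class UniquingCache:
    """Hands out at most one Customer instance per primary key."""

    def __init__(self):
        self._instances = {}

    def materialize(self, row):
        existing = self._instances.get(row["id"])
        if existing is not None:
            # Already loaded: refresh the field values and return the
            # same instance instead of creating a duplicate object.
            existing.name = row["name"]
            return existing
        # Not seen before: create the instance and remember it.
        created = Customer(row["id"], row["name"])
        self._instances[row["id"]] = created
        return created


cache = UniquingCache()
first = cache.materialize({"id": 7, "name": "Acme"})
second = cache.materialize({"id": 7, "name": "Acme Corp"})  # same entity, newer data
other = cache.materialize({"id": 8, "name": "Globex"})
```

Here `first` and `second` refer to the same object, which now carries the refreshed name, while `other` is a separate instance. Note that the lookup in `materialize` runs for every row fetched, which is the extra overhead mentioned earlier.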
