Pieter van Zyl on Benchmarking ORM Tools and Object Databases
OO7J is a Java version of the original OO7 benchmark (written in C++) from Mike Carey, David DeWitt and Jeff Naughton at the University of Wisconsin-Madison. The original benchmark tested the performance of object databases (ODBMS). This project also benchmarks Object Relational Mapping (ORM) tools. Currently there are implementations for Hibernate on PostgreSQL and MySQL, and for the db4o and Versant object databases.
InfoQ & R. Zicari: Please give us a summary of the OO7J research project.
Pieter van Zyl: The study investigated the performance of object persistence and compared ORM tools to object databases. ORM tools provide an extra layer between the business logic layer and the data layer. The study began with the hypothesis that this extra layer, and the mapping that happens in it, slows down object persistence. The aim was to compare the influence of this extra layer against object databases, which remove the need for it. The study also investigated the impact of certain optimisation techniques on performance.
A benchmark was used to compare ORM tools to object databases; it provided the criteria for the comparison. The benchmark chosen for this study was OO7, which is widely used to comprehensively test object persistence performance. Part of the study was to investigate OO7 in greater detail to get a clearer understanding of its code and inner workings.
Because of Java's general popularity, reflected by the fact that most of the large persistence providers provide persistence for Java objects, it was decided to use Java objects and focus on Java persistence. A consequence of this decision was that the OO7 benchmark, currently available in C++, had to be re-implemented in Java as part of this study.
Included in this study was a comparison of the performance of an open source object database, db4o, against a proprietary object database, Versant. These representatives of object databases were compared against one another as well as against Hibernate, a popular open source representative of the ORM stable. It is important to note that these applications were initially used in their default modes (out of the box). Later some optimisation techniques were incorporated into the study, based on feedback obtained from the application developers. My dissertation can be found here.
InfoQ & R. Zicari: Please give us a summary of the recommendations of the research project.
Pieter: The study found that:
- The use of an index does help for queries. This was expected. Hibernate's indexes seem to function better than db4o's indexes during queries.
- Lazy and eager loading were also investigated and their influence stood out in the study. Eager loading improves traversal times if the tree being traversed can be cached between iterations. Although the first run can be slow, caching the whole tree reduces calls to the server in follow-up runs. Lazy loading improves query times because objects that are not needed are not loaded into memory.
- Caching was investigated. It was found that caching helps to improve most of the operations. By using eager loading and a cache, the speed of repeated traversals of the same objects is increased. Hibernate with its first level cache in some cases performed better than db4o during traversals with repeated access to the same objects.
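The lazy versus eager trade-off in the findings above can be illustrated with a minimal Java sketch. It uses no persistence library: the names `LoadCounter`, `LazyRef` and `LoadingDemo` are hypothetical, and a counter stands in for real fetches from a store. It shows only the access pattern, not how Hibernate, db4o or Versant behave internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Counts simulated fetches from the store (an assumption standing in for real I/O).
class LoadCounter {
    static int loads = 0;

    static String fetchPart(int id) {
        loads++;
        return "part-" + id;
    }
}

// Lazy reference: fetches on the first get(), then reuses the loaded value.
class LazyRef<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded = false;

    LazyRef(Supplier<T> loader) { this.loader = loader; }

    T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}

class LoadingDemo {
    // Eager: every child is fetched as soon as the parent is built.
    static List<String> loadEager(int children) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < children; i++) {
            parts.add(LoadCounter.fetchPart(i));
        }
        return parts;
    }

    // Lazy: only references are built; nothing is fetched until get() is called.
    static List<LazyRef<String>> loadLazy(int children) {
        List<LazyRef<String>> parts = new ArrayList<>();
        for (int i = 0; i < children; i++) {
            final int id = i;
            parts.add(new LazyRef<>(() -> LoadCounter.fetchPart(id)));
        }
        return parts;
    }
}
```

In this sketch, a query that touches one child out of many pays for one fetch with `loadLazy`, while `loadEager` pays for all of them up front, which only pays off if the whole tree is traversed again later.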
When creating and using benchmarks it is important to clearly state what settings and environment are being used. In this study it was found that:
- Running in client-server mode or embedded mode in db4o has different performance results. It was shown that some queries are faster when db4o is run in embedded mode and also that some traversals are faster when db4o is run in client-server mode.
- It is important to state when a cache is being used and when it is cleared. It was shown that clearing the Versant cache with every transaction commit influenced the hot traversal times. It is also important, for example, to state whether a first or second level cache is being used in Hibernate, as this could influence traversal times.
- Having a cache does not always improve random inserts, deletes and queries. It was shown that the cache assisted traversals more than inserts, deletes and queries. Because the inserts, deletes and queries are random, not all objects accessed will be in the cache. Also, if query caches are used, this must be stated clearly; otherwise it could create false results.
- It is important to note that it is quite difficult to generalise. One mechanism is faster with updates to a smaller number of objects, others with a larger number of objects; some perform better with changes to indexed fields, others with changes to non-indexed fields; some perform better if traversals are repeated with the same objects. Others perform better in first time traversals, which might be what the application under development needs. The application needs to be profiled to see if there is repeated access to the same objects in a client session.
- In most cases the cache helps to improve access times. But if an application does not access the same objects repeatedly or accesses scattered objects then the cache will not help as much. In these cases it is very important to look at the OO7 cold traversal times which access the disk resident objects. For cold traversals, or differently stated, first time access to objects, in most cases Versant is the fastest of all the mechanisms tested by OO7.
- If the benchmark and persistence mechanism settings are not stated clearly, it is very easy to cheat and make false statements. For example, with Versant it is important to state what type of commit is being used: a normal commit or a checkpoint commit. The types of queries used can also impact benchmark results, and there is a slight performance difference between using a find() and running a query in Hibernate.
- It is important to make sure that the performance techniques that are being used actually improve the speed of the application. It has been shown that adding subselects to a MySQL database does not automatically improve the speed.
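The distinction between cold and hot traversals, and the effect of clearing a cache, can be sketched in a few lines of Java. This is a simulation under stated assumptions, not OO7J code: `CachedStore` and `TraversalTimer` are hypothetical names, and a counter (`diskReads`) stands in for real disk access.

```java
import java.util.HashMap;
import java.util.Map;

// Simulated store with a simple object cache (hypothetical; not a real product API).
class CachedStore {
    private final Map<Integer, String> cache = new HashMap<>();
    int diskReads = 0;

    String load(int id) {
        String cached = cache.get(id);
        if (cached != null) {
            return cached;              // hot path: served from the cache
        }
        diskReads++;                    // simulated disk access on a cache miss
        String obj = "object-" + id;
        cache.put(id, obj);
        return obj;
    }

    void clearCache() { cache.clear(); }
}

class TraversalTimer {
    // Visits ids 0..n-1 and returns the elapsed time in nanoseconds.
    static long timeTraversal(CachedStore store, int n) {
        long start = System.nanoTime();
        for (int id = 0; id < n; id++) {
            store.load(id);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        CachedStore store = new CachedStore();
        long cold = timeTraversal(store, 10_000);  // first access: every load misses
        long hot = timeTraversal(store, 10_000);   // repeat: every load hits the cache
        System.out.println("cold ns=" + cold + ", hot ns=" + hot
                + ", disk reads=" + store.diskReads);

        store.clearCache();                         // e.g. a cache cleared on commit
        timeTraversal(store, 10_000);               // behaves like a cold run again
        System.out.println("disk reads after clear=" + store.diskReads);
    }
}
```

A published result would need to state which of these runs was measured and whether the cache was cleared between them, since the same traversal produces very different numbers in each case.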
This work formed part of my MSc. While the findings are not always surprising or new, the work showed that you can still use the OO7 benchmark to test today's persistence frameworks. It really brought out performance differences between ORM tools and object databases. This work is also the first OO7 implementation that tested ORM tools and compared open source against commercial object databases.
InfoQ & R. Zicari: What is the current state of the project?
Pieter: The project has implementations for db4o, Hibernate with PostgreSQL and MySQL, and the Versant database. The project currently works with settings files and Ant scripts to run different configurations. The project is a complete implementation of the original OO7 C++ implementation. More implementations will be added in the future. I also believe that all results must be audited. I will keep submitting benchmark results to vendors.
InfoQ & R. Zicari: What are the best practices and lessons learned in the research project?
Pieter: What is interesting today is that benchmarkers are still not allowed to publish benchmark results of commercial products. Their licenses prohibit it. We felt that academics must be allowed to investigate and publish their results freely. In the end we did comply with the licenses and submitted the work to the vendors.
InfoQ & R. Zicari: Do you see a chance that your benchmark will be used by the industry? Why?
Pieter: Yes, but I suspect they are using benchmarks already. These benchmarks are probably home grown. Also, there are no de facto benchmarks for object database and ORM tool vendors, whereas a TPC benchmark exists for relational database vendors. While some vendors did use the OO7 benchmark in the late 90s, they seem not to use it any more, or maybe they have adjusted it for in-house use.
OO7J could be used to test improvements from one version to the next. I have used it to benchmark differences between different db4o releases. We also tested the embedded version of db4o against the client-server version of db4o; this gave us valuable information and we could discern the differences in performance.
Currently OO7J has its own interface to the persistence store being benchmarked. This means that it can be extended to test most persistence tools. We wanted to use the JPA or JDO interfaces but not all vendors support these standards.
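To make the idea of a benchmark-owned store interface concrete, here is a minimal Java sketch. The interface `BenchmarkStore`, its method names and the `InMemoryStore` stand-in are all assumptions made for illustration; OO7J's actual interface may look quite different. A real adapter would delegate these calls to Hibernate, db4o, Versant or another product.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical store abstraction that benchmark operations are written against.
interface BenchmarkStore {
    void begin();
    void commit();
    void save(int id, Object obj);
    Object findById(int id);
    List<Object> query(Predicate<Object> filter);
}

// In-memory stand-in used only to show the shape of an adapter.
class InMemoryStore implements BenchmarkStore {
    private final Map<Integer, Object> committed = new HashMap<>();
    private final Map<Integer, Object> pending = new HashMap<>();

    public void begin() { pending.clear(); }

    public void commit() {
        committed.putAll(pending);   // pending writes become visible on commit
        pending.clear();
    }

    public void save(int id, Object obj) { pending.put(id, obj); }

    public Object findById(int id) { return committed.get(id); }

    public List<Object> query(Predicate<Object> filter) {
        List<Object> out = new ArrayList<>();
        for (Object o : committed.values()) {
            if (filter.test(o)) {
                out.add(o);
            }
        }
        return out;
    }
}
```

With this kind of design, the traversals and queries are written once against the interface, and adding a new persistence tool means writing one more adapter class.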
InfoQ & R. Zicari: What feedback have you received so far?
Pieter: The dissertation was well received. I got a distinction for the work. I submitted the benchmark to the vendors to get their input on the benchmark and how to optimize their products. The feedback was good and no bugs were found. It is important that a benchmark is accurate and used consistently for all vendor implementations. I don't think there are any funnies or inconsistencies in the benchmark code.
Jeffrey C. Mogul states that it is important that benchmarks should be repeatable, relevant, use realistic metrics, be comparable and widely used. I think OO7 complies with those requirements and I stayed as close as possible to OO7 with OO7J.
OO7J has also been used by students at the ETH Zurich Department of Computer Science. Another object database vendor in America also contacted me about my work and wanted to use it for their benchmarking. I am not sure how far they progressed.
InfoQ & R. Zicari: What are the main related works? How does OO7J research project compare with other persistence benchmarking approaches and what are the limitations of the OO7J project?
Pieter: There were related attempts to create a Java implementation of OO7 in the late 90s by a few researchers. Sun also created a Java version. These versions are not available any more and weren't open sourced. See my dissertation for more details.
More recent work includes:
- The ETH Zurich CS department created a Mavenized and GUI version of my work, but it includes changes to the database and they need to sort out some small issues.
- The effort of Prof. William R. Cook's students can be found here and here.
Other benchmarking work in the Java object space:
These benchmarks are not entirely vendor independent. But they are open source and one can look at the code and challenge their coding.
I think OO7 has one thing going for it that the others don't have: I still think it is more widely used. Especially in the academic world. It has a lot of vendor independence behind it historically. It has had more reviews and documentation on how it works internally.
But I have seen some implementations of OO7 that are not complete: for example, they build half the model and then don't disclose these changes when publishing the results, or they only have some of the queries or traversals working.
That is why I like to stay close to the original well known OO7. I document any changes clearly.
If you run Query 8 of OO7, I expect it to function 100% like the original. If anyone modifies it, they should see this as an extension and rename the operation.
I have also included asserts/checkpoints to make sure the correct number of objects are returned for every operation.
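A check of this kind might look like the following Java sketch. The class `ResultCheck` and the method `expectCount` are hypothetical names chosen for illustration and are not taken from the OO7J code base; the expected counts would come from the benchmark configuration.

```java
import java.util.List;

class ResultCheck {
    // Returns the results unchanged, or throws if the operation returned
    // a different number of objects than the benchmark configuration expects.
    static <T> List<T> expectCount(String operation, List<T> results, int expected) {
        if (results.size() != expected) {
            throw new IllegalStateException(operation + ": expected " + expected
                    + " objects but got " + results.size());
        }
        return results;
    }
}
```

Throwing an exception instead of using the `assert` keyword is a deliberate choice in this sketch: Java assertions are disabled unless the JVM is started with `-ea`, so a benchmark run could otherwise skip the check silently.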
Limitations of OO7J:
- It needs to be upgraded to run in client/server mode. More concurrent clients must be created to run operations on the databases at the same time.
- Its configurations need to be updated to create terabyte and petabyte database models.
InfoQ & R. Zicari: What still needs to be done?
- Currently there are 3 separate implementations, one for each of the products being tested. Although they use the same model and operations, I found that parent objects must be saved before child objects in the Hibernate version. The original OO7 also had an implementation per product benchmarked. I want to create one code base that can switch between different products using configuration. The ETH students have attempted this already but I am thinking of a different approach.
- Configurations for larger datasets
- More concurrent clients
- Investigate if I could use OO7 in a MapReduce world.
- Investigate if OO7 can be used to benchmark column-oriented databases
- Include Hibernate+Oracle combination.
InfoQ & R. Zicari: NoSQL/NRDBMS solutions are getting a lot of attention these days. Are there any plans to do a persistence performance comparison of NoSQL persistence frameworks in the future?
Pieter: Yes, they will be incorporated. I still believe object databases are well suited to this environment. Still not sure that people are using them in the correct situations. I sometimes suspect people jump on to a hot technology without really benchmarking or understanding their application needs.
InfoQ & R. Zicari: What is the future road map of your research?
Pieter: Investigate clustering, caches, MapReduce, column-oriented databases and investigate how to incorporate these into my benchmarking effort.
I would also love to get more implementation experience either with a vendor or building my own database.
Final note to the interview:
"Too often I've seen designs used or rejected because of performance considerations, which turn out to be bogus once somebody actually does some measurements on the real setup used for the application"
- Martin Fowler, Patterns of Enterprise Application Architecture
I believe that one should benchmark before making any technology decisions. People have a lot of opinions about what performs better but there is usually not enough proof. There is a lot of noise in the market. Cut through it and benchmark and investigate for yourself.
About the Author:
Pieter van Zyl is a researcher at the Meraka Institute of South Africa's Council for Scientific and Industrial Research (CSIR). His research focuses on object persistence mechanisms (ORM tools and object databases), with a specific focus on performance benchmarks. He is part of the Espresso research group at the University of Pretoria and a maintainer of the open source performance benchmark project PolePosition on Sourceforge.
For Additional Reading
- Performance investigation into selected object persistence stores, Dissertation, Pieter van Zyl, University of Pretoria.
- OO7J benchmark, Pieter van Zyl and Espresso Research Group.