
You Are Using the ORM the Wrong Way

by Jan Stenberg on Sep 16, 2014. Estimated reading time: 2 minutes

When teams abandon an Object-Relational Mapper (ORM) because they think it doesn’t perform well or adds too much magic, it is often due to bad usage, Jimmy Bogard states in a recent presentation highlighting what he sees as correct and incorrect ways of using an ORM, including mapping and querying problems.

Jimmy, creator of AutoMapper and a Microsoft MVP, describes an ORM as a tool for getting information from a database to an application and back again, which may look like an easy problem but turns out to be quite complicated.

One mapping problem Jimmy describes is database-generated mapping code, where a tool creates the code from an existing database. This may look attractive, but he typically finds that it produces too many relationships and extraneous navigations, which cause performance problems. Instead, Jimmy uses code-first, adding mappings and relations as they are needed, even for an existing database.

In Jimmy’s experience, one of the first things developers complain about when using an ORM is excessive lazy loading and the resulting Select N+1 problem. Lazy loading makes an ORM delay loading all the data in a complex model and instead read it as needed. The problem is that this may result in many calls to the database, e.g. each time a property is read or when looping over a collection. Instead, Jimmy prefers to use eager fetching as much as possible, reading all the data needed in one request.
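The difference in database round trips can be illustrated with a minimal sketch. It does not use an ORM or anything from the presentation; it uses Python's built-in sqlite3 module, and the `CountingDb` wrapper, the `customer` and `orders` tables, and both functions are hypothetical names invented for this illustration.

```python
import sqlite3


class CountingDb:
    """Hypothetical helper: an in-memory database that counts queries sent to it."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.queries = 0
        self.conn.executescript("""
            CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE orders (id INTEGER PRIMARY KEY,
                                 customer_id INTEGER, total REAL);
            INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
            INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
        """)

    def query(self, sql, params=()):
        self.queries += 1  # one database round trip per call
        return self.conn.execute(sql, params).fetchall()


def totals_lazy(db):
    """Select N+1 style: one query for customers, then one more per customer."""
    result = {}
    for customer_id, name in db.query("SELECT id, name FROM customer ORDER BY id"):
        rows = db.query(
            "SELECT total FROM orders WHERE customer_id = ?", (customer_id,))
        result[name] = sum(total for (total,) in rows)
    return result


def totals_eager(db):
    """Eager style: a single joined query reads everything that is needed."""
    result = {}
    rows = db.query("""
        SELECT c.name, o.total
        FROM customer c LEFT JOIN orders o ON o.customer_id = c.id
        ORDER BY c.id
    """)
    for name, total in rows:
        # Customers without orders come back with total = NULL (None).
        result[name] = result.get(name, 0) + (total or 0)
    return result


lazy_db, eager_db = CountingDb(), CountingDb()
print(totals_lazy(lazy_db), "queries:", lazy_db.queries)
print(totals_eager(eager_db), "queries:", eager_db.queries)
```

With three customers the lazy variant issues four queries and the eager variant one; the gap grows linearly with the number of customers.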

Repository is a pattern that Jimmy thinks is a bad idea. In the original DDD book, repositories were intended to be an interface that looks like a collection on top of a data store, but Jimmy believes the pattern has morphed into a façade over an ORM that hides many ORM-specific features. This makes the repository just a dumb interface whose implementation merely delegates to the actual ORM. For the ORM to be used effectively, these hidden features still need to be used, which makes implementation details leak up into the application and leaves repositories as nothing more than fancy wrappers on top of ORMs.

Instead of repositories, Jimmy prefers to model each data request as a command or a query, moving all code for dealing with a specific request into one class. A change, e.g. a new data access strategy, is thus encapsulated in one specific class.
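A rough sketch of what one class per request might look like follows. It is not taken from the presentation; the `OrdersForCustomer` query, its handler, the `orders` table, and `demo_connection` are all hypothetical, and plain sqlite3 stands in for the ORM.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class OrdersForCustomer:
    """Query object: carries the parameters of one specific data request."""
    customer_id: int
    min_total: float = 0.0


class OrdersForCustomerHandler:
    """All data access code for this one request lives in this class."""

    def __init__(self, conn):
        self.conn = conn

    def handle(self, query):
        rows = self.conn.execute(
            "SELECT id, total FROM orders "
            "WHERE customer_id = ? AND total >= ? ORDER BY id",
            (query.customer_id, query.min_total),
        )
        return [{"id": order_id, "total": total} for order_id, total in rows]


def demo_connection():
    """Hypothetical sample data for the illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER, total REAL);
        INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
    """)
    return conn


handler = OrdersForCustomerHandler(demo_connection())
print(handler.handle(OrdersForCustomer(customer_id=1, min_total=6.0)))
```

If this request later needs a different fetching strategy or hand-written SQL, only `OrdersForCustomerHandler` changes; callers keep constructing the same query object.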

One of the final bad ideas Jimmy mentions is SQL ignorance. An ORM is not a way to avoid knowing SQL; for business-critical systems running on top of an ORM, it’s crucial to know what SQL is being run and how it performs.
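As one possible way to look at both questions, the sketch below records the statements sent to the database and asks the database for its query plan. It assumes SQLite through Python's sqlite3 module rather than any particular ORM; the `orders` table, the index name, and the `query_plan` helper are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER, total REAL);
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")

# Record every statement the connection executes from here on.
executed = []
conn.set_trace_callback(executed.append)

SQL = "SELECT total FROM orders WHERE customer_id = ?"


def query_plan(connection, sql, params):
    """Return the planner's description of how the statement would be run."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(str(row[-1]) for row in rows)


totals = [total for (total,) in conn.execute(SQL, (1,))]

plan_before = query_plan(conn, SQL, (1,))  # no index yet: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, SQL, (1,))   # the planner can now use the index

conn.set_trace_callback(None)

print("statements executed:", executed)
print("plan before index:", plan_before)
print("plan after index: ", plan_after)
```

Most ORMs offer a comparable switch for logging generated SQL, and most databases have an equivalent of the query plan command; the point is only that both are worth checking for queries that matter.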


Agreed... by Richard Richter

However, it is not easy to do it the right way (like with many things after all). Some people believe JPA will save them, their time, protect them from SQL (geez), etc. That is not so. I've been using ORM/JPA for years and still have to learn something new. Every time I annotate a class I have to review things I probably forgot already. There are great books (for JPA I can recommend "Pro JPA 2..."), but JPA is not the only technology you're using in your stack and there are tons of good books for other areas. Hence - we often use JPA without knowing it all. Many developers just slap some annotation on a class and they are happy, never checking generated SQLs, etc. But even those who try sometimes suffer, studying a problem for a couple of hours and then adding tiny attribute "orphanRemoval" that fixes it in a second. ORM is hard stuff, let's not be mistaken here.

Good one! by Oliver Gierke

I think Jimmy has really good points here. I liked the section on why generating data access code from the database is a bad idea and, more generally, that all his arguments effectively point to the fundamental DDD aspect of making domain concepts explicit. I totally disagree with the section on repositories, but that's probably not too surprising :).

A lot of people misuse ORMs because they think they don't need to think. This is universally wrong with any technology. Interestingly, a colleague of mine brought up a quote yesterday whose original source we couldn't easily find (if you happen to know it, please let me know):

"As a developer, you need to understand at least one level of abstraction beneath the one at which you're working. Otherwise it's all magic and you're being irresponsible."

So if you happen to work with an ORM, you still need to know SQL. If you happen to work with JPA, you still need to understand Hibernate. The abstractions don't exist to make you think less; they exist to make the easy things easier, or to let you defer digging into some area of complexity until it is needed. If you need very simple queries only, working with a query language abstraction is fine. If the SQL spat out by the ORM becomes a bottleneck (either in terms of what you want to get or in performance), getting your hands dirty with SQL is completely the right thing to do.

I think the biggest issue most people run into is not clearly defining aggregates - natural boundaries that define atomicity of changes, the scopes for loading data etc. If you get that right, an ORM can be a useful tool to map these aggregates to a relational database. But as both the OO side of things and the relational world offer much more flexibility, people tend to assume that they can bridge these two worlds while retaining flexibility on both sides. So they start to create endlessly connected type structures or tremendously normalized data models.

In that sense document oriented stores (MongoDB, Couchbase etc.) actually make it easier to map data structures to types as they already expose the concept of an aggregate in the stores. So by definition you have to differentiate between embedding a document or pointing to it. They make the concept of an aggregate explicit.

Isn't this already known? by peter lin

Anyone who has built big, complex applications with an ORM already knows this first hand. You absolutely can't map based on the relational schema in the RDB. On the other hand, you shouldn't ignore how the data is stored when you model your classes.

Far too many people don't think about how they model their schema and as a result the impedance mismatch is made worse. Over time, that mismatch becomes more painful. Having used ORMs on java and .Net platform, these problems still persist and many developers who claim to have experience with ORM really have no clue. Even little things like pros/cons of polymorphic queries aren't well understood by the developer community. Having built ORMs and studied several ORM's, it really does take an expert to use it properly for complex schemas/projects.

Re: Good one! by Olmo del Corral

I also think that not having defined aggregates is the biggest issue, but the problem is that most ORMs make a pair of properties for each FK.

In Signum Framework, an Order has a property of type Customer, but a Customer has an expression (extension method) of type IQueryable<Order>. This way it is clear that Orders are not part of the customer, but related to the customer.

It is explained at signumsoftware.com/en/Framework under Modularity.

But I agree with Jimmy that Repositories and LINQ don't work well together. And if I have to choose, I have no doubt: LINQ wins. What do you do?

Be gone, repositories! by Jacek Gorgoń

I agree with the whole article but would like to stress the repository part. Introducing that layer on top of a decent, modern ORM leads to wasting time on either inefficient queries (because you need all ORM features in the layer above to make it efficient) or large wrappers exposing complete ORM functionality without any added benefit (try switching ORMs if you already rely on all the features of a given one).

Re: Good one! by Oliver Gierke

I'd argue, you shouldn't even let the Customer point back to the Order as this creates a cyclic dependency. The Customer is an abstraction useful on its own, the Order has to depend on the customer (whose order is this?). So essentially both of them are aggregates that have independent lifecycle and must not have a cyclic dependency by definition. To find a Customer's Orders you can still use the repository.

Re repositories: I am the project lead of a Java based library exposing a programming model for repositories [0]. I think the examples Jimmy gives are written in a way that they expose the wrongdoing in the first place. But you don't have to expose store specific API, query methods are *one* means to access a subset of entities (think of pre-filtered collections), but being able to just hand predicates into the repo is an option, too. So the repository is what it's supposed to be: an abstraction over a collection of aggregate roots. If you need one-off sub-collections, use a query method, if you need flexible predicate combination, use that. it's not either or, it's use the right tool for the problem at hand.

I totally lost it at the argument of testability as - of course - it's hard to mock that stuff, once you expose store specific API. But that's a fundamental flaw in the first place. It's much easier to mock a List<Order> findByCustomer(Customer customer) rather than a plethora of calls on a Queryable (or EntityManager in the Java world). Plus, repositories allow you to hand more advanced concepts like pagination to users as well.

Anyway, it's really good that the talk is making its rounds currently, as there's a lot of stuff in here that more people should be aware of and, as is most often the case, awareness is the first step to improvement :).

[0] github.com/spring-projects/spring-data-jpa#quic...

Don't throw the ORM (and Repository) baby by Guillaume L

I agree on most of this stuff except Repositories.

Just because the original pattern (which Jimmy himself seems to recognize was good) is misused doesn't mean that "Repository is a bad idea".

Things like returning IQueryables or a Save() method should never end up in a Repository, but this isn't the fault of the ORM pattern that they do, it's because of people's unawareness of the original concept. And, popularization of various Repository misimplementations by so-called industry "experts" and ORM vendors "best practices" papers over the years.

ORM is just about mapping back and forth between object model and relational model (which might or not include change tracking), it's a very simple idea. The whole paraphernalia that tool vendors added around that is superfluous and harmful, because programmers have started thinking this is the one way to do ORM, and let these extra features leak into their core data access abstractions and dictate their modelling processes.

Nothing in ORM states that this is the one way to do data access, yet ORM tools went ubiquitous and their many flaws and popularized misuses together with them. Which now leads to throwing the baby with the bathwater (not in this presentation, but there's a looming trend). Which is a bit sad.

Re: Good one! by Olmo del Corral

About cyclic dependency: don't worry, there's no direct reference from Customer to IQueryable<Order>, just an extension method. It looks like an instance method but is really a static method. customer.GetOrders() is really GetOrders(customer).

Repositories: I think the whole idea comes to .Net from Javaland. But it doesn't make sense in .Net. It is not about stringly-typed predicates, it's about having all the power of SQL queries (Select, Where, Join, GroupBy, Skip, Take, SelectMany, ...) with a syntax that is more expressive and OO than SQL, with strongly-typed support, refactoring, autocompletion etc., and then losing all this by encapsulating it in some IRepository. It's like people who buy an 8.000€ sofa and then put a plastic cover on top.

Properly mocking IQueryable<T> is just as hard as mocking an RDBMS. Why do you mock IRepository and not the java.sql.Connection? Because the first one is really simple and the second one really complex.

In Java it is not a big problem because the ORM is not that expressive anyway, and all the interesting things happen in stringly-typed SQL variants (HQL, JPQL, ...) that are error-prone, so it is better to hide them behind some repository class.

With LINQ there are no misspellings and no SQL injection, and queries can be combined in a way that is really hard to achieve by concatenating strings.

So please, Javaland, stop convincing young .Net devs of the value of IRepository; it's like Java devs taking advice from C++ devs about header files.

This completely misses the point of why ORMs are bad. by Adam Tankanow

Repositories are different from and more valuable than ORMs because they provide no implementation details. The purpose of a Repository is simply to tell the rest of the object model that the objects contained in the given Repository are persistent. This is good because it frees the rest of the object model from worrying about HOW the Objects are persisted.

The primary problem with ORMs is that you are explicitly connecting your object model and your persistence implementation. What do you do if you need to switch SQL vendors? What do you do if you find you need to change from relational to another persistence model? You should optimize your object model and your persistence implementation based on different values, e.g. code readability and reuse vs performance, so why are you connecting them?

Re: Don't throw the ORM (and Repository) baby by Jacek Gorgoń

ORM started as a tool for just mapping back and forth. Nowadays, at least in .NET, which is the presenter's context, it's a complete data access layer. It comes with built-in, fully generic repositories (DbSets) and other goodies. And it's efficient both when developing and when running, if used properly (which is really what the presentation is all about).

The concept you're referring to is often called a Micro ORM. Such tools do have their own place where super high performance and micro optimizations are worth the added developer effort (both in developing and maintaining), but they don't seem to be among mainstream enterprise stack choices.

I wish more developers realised what SQL really is about... by Lukas Eder

In many cases, when your application is relational-model centric, the main reason to still use an ORM is writing CRUD productively, as single-row INSERTs / UPDATEs / DELETEs are tedious work.

SQL (and RDBMS), however, are not about single-row operations. They're about sets and set theory. This is only poorly reflected in most ORMs, which tend to operate on a more object-oriented domain, rather than on sets of tuples.

Once you start working with sets, you also start realising how powerful set transformation can be when using SQL (and possibly also functional features of your general-purpose language). If you think of data transformation as being stateless, functional, and declarative, you probably won't need an ORM any longer.

Re: I wish more developers realised what SQL really is about... by Olmo del Corral

LINQ is set based, functional and declarative, and it is still an ORM.

Re: Good one! by Andy Hochstetler

Completely agree here in all respects! We use the Repository pattern to allow us to abstract both ORM and Service Agent calls behind a consistent interface. This allows the logic for compensating transactions across platforms and technologies to be centralized.

Most importantly, I think your comment about aggregates is probably the most important thing that I've found people getting wrong or not understanding completely, which leads to an impedance mismatch with the ORM.

Re: I wish more developers realised what SQL really is about... by Lukas Eder

LINQ has not yet offered any answers to how to implement bulk insert / update / merge statements, which are essential to set based (as opposed to row based) data transformation.

But yes, I agree that it offers some paradigm synthesis. I wonder if blending client and server logic in one statement is going to be effective in the long run (in a LINQ-to-SQL sense). I personally prefer a clean separation of SQL producing some stream of tuples to the client, and then possibly the client further transforming that stream.

Re: I wish more developers realised what SQL really is about... by Hermann Schmidt

+1 on that

Easy CRUD is the only real advantage of an ORM (JPA for instance) I see today. Storing a tree of objects with merge() in JPA is powerful and convenient.

When it comes to complex queries though, they don't add much value. I used constructor queries all over the place, which are nothing but a pimped version of a JDBC call. The whole machinery of the ORM is basically useless there.

I told my developers not to fall into the object graph navigation trap and to think in set operations, because that is how it is meant to be on top of an RDBMS.

Just use MongoDB by Jean-Jacques Dubray

If an ORM is a great fit for your application, MongoDB is generally a better fit.

Re: Just use MongoDB by Jonathan Allen

I have to agree. While I'm no fan of MongoDB, if you want that style of data interaction then you might as well use a database that natively supports it. Using an ORM to pretend that a relational database is really an OO database is just asking for trouble.

But then again, relational databases are so crazy fast that most developers won't actually be impacted by the performance disasters that they are creating.

Re: Just use MongoDB by Jean-Jacques Dubray

>> Using an ORM to pretend that a relational database is really an OO database is just asking for trouble.

My point exactly!

I perfectly agree by Mauro Molinari

Nice to see someone else whose thoughts are exactly mine! :-) Especially on the Repository pattern and SQL parts.
