
Spring Data – One API To Rule Them All?

Posted by Tobias Trelle on Aug 16, 2012

Spring Data is a high level SpringSource project whose purpose is to unify and ease the access to different kinds of persistence stores, both relational database systems and NoSQL data stores.

With every kind of persistence store, your repositories (a.k.a. DAOs, or Data Access Objects) typically offer CRUD (Create-Read-Update-Delete) operations on single domain objects, finder methods, sorting and pagination. Spring Data provides generic interfaces for these aspects (CrudRepository, PagingAndSortingRepository) as well as persistence store specific implementations.

You may have already used one of the Spring template objects (like JdbcTemplate) to write your custom repository implementations. While the template objects are powerful, we can do better. With Spring Data’s repositories, you need only to write an interface with finder methods defined according to a given set of conventions (which may vary depending on the kind of persistence store you are using). Spring Data will provide an appropriate implementation of that interface at runtime. As an example:

public interface UserRepository extends MongoRepository<User, String> { 
        @Query("{ fullName: ?0 }")
        List<User> findByTheUsersFullName(String fullName);

        List<User> findByFullNameLike(String fullName, Sort sort);
}
...

@Autowired UserRepository repo;
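To make the idea of "an implementation provided at runtime" more concrete, here is a minimal sketch using a JDK dynamic proxy. It is not Spring Data's actual code; PersonFinder, FinderProxyDemo and the in-memory rows are hypothetical names chosen for illustration, and the sketch only understands single-argument methods named findBy<Property>.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical repository interface: only the method name carries the query.
interface PersonFinder {
    List<Map<String, Object>> findByName(String name);
}

final class FinderProxyDemo {

    // Builds an implementation of the given interface at runtime.
    // A method named findBy<Property> filters the rows on that property.
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> type, List<Map<String, Object>> rows) {
        InvocationHandler handler = (proxy, method, args) -> {
            String methodName = method.getName();
            boolean isFinder = methodName.startsWith("findBy") && methodName.length() > 6;
            if (!isFinder || args == null || args.length != 1) {
                // Anything else (including toString/equals) is out of scope for this sketch.
                throw new UnsupportedOperationException(methodName);
            }
            // "findByName" -> "name"
            String property = Character.toLowerCase(methodName.charAt(6)) + methodName.substring(7);
            List<Map<String, Object>> result = new ArrayList<>();
            for (Map<String, Object> row : rows) {
                if (args[0].equals(row.get(property))) {
                    result.add(row);
                }
            }
            return result;
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] { type }, handler);
    }
}
```

Calling FinderProxyDemo.create(PersonFinder.class, rows).findByName("alice") returns the rows whose "name" entry equals "alice", although no class implementing PersonFinder was ever written by hand.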

 

Throughout this article we will compare three of the sub projects for JPA, MongoDB and Neo4j. JPA is part of the JEE stack and defines a unified API for accessing relational databases and performing O/R mapping. MongoDB is a scalable, high-performance, open source, document-oriented database. Neo4j is a graph database, a fully transactional database that stores data structured as graphs.

All these Spring Data projects support the following aspects:

  • Templating
  • Object/Datastore mapping
  • Repository support

Other Spring Data projects like Spring Data Redis or Spring Data Riak essentially provide only templates, because the corresponding datastores persist unstructured data that cannot be mapped or queried.

Let’s have a detailed look at templates, object mapping and repository support.

Templates

The main purpose of a Spring Data template (and all other Spring templates) is resource allocation and exception translation.

In our case a resource is a datastore, which is often accessed remotely over a TCP/IP connection. The following example shows how to configure a MongoDB template:

<!-- Connection to MongoDB server -->
<mongo:db-factory host="localhost" port="27017" dbname="test" /> 

<!-- MongoDB Template -->
<bean id="mongoTemplate"
class="org.springframework.data.mongodb.core.MongoTemplate">
  <constructor-arg name="mongoDbFactory" ref="mongoDbFactory"/> 
</bean>

First we define some kind of connection factory, which is then referenced by the MongoTemplate. In the case of MongoDB, the Spring Data project depends on the low level MongoDB Java driver.

In general, such a low level datastore API will have its own exception handling strategy. The Spring way of handling exceptions is to use unchecked exceptions that the developer can catch, but does not have to. To bridge that gap, a template implementation catches the low level exceptions and rethrows a corresponding unchecked Spring exception, which is a subclass of Spring’s DataAccessException.
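The translation step can be sketched in plain Java. This is only an illustration of the pattern, not Spring's implementation: StoreAccessException is a hypothetical stand-in for DataAccessException, and TranslatingTemplate is an invented name. The sketch assumes the low level API throws checked exceptions.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for Spring's DataAccessException: unchecked,
// so callers can catch it but are not forced to.
class StoreAccessException extends RuntimeException {
    StoreAccessException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class TranslatingTemplate {

    // Runs a low level action and rethrows any checked exception
    // as an unchecked StoreAccessException, keeping the original as cause.
    static <T> T execute(Callable<T> action) {
        try {
            return action.call();
        } catch (RuntimeException e) {
            throw e; // already unchecked, pass through unchanged
        } catch (Exception e) {
            throw new StoreAccessException("Datastore access failed: " + e.getMessage(), e);
        }
    }
}
```

A caller can then write TranslatingTemplate.execute(() -> driverCall()) without a try/catch block, where driverCall() stands for any low level operation.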

A template offers store specific operations such as saving, updating and deleting a single record, executing queries, or running map/reduce jobs. But all these methods work only for the corresponding underlying datastore.

Spring Data JPA does not offer a template, since the JPA implementation itself is already an abstraction layer on top of the JDBC API. JPA’s EntityManager is the counterpart of a template. Exception translation is handled by the repository implementation.

Object/Datastore Mapping

JPA introduced a standard for O/R mapping (i.e. mapping object graphs to relational database tables). Hibernate is probably the most common O/R mapper that implements the JPA specification.

With Spring Data, this support is extended to NoSQL datastores with object-like data structures. But these data structures can be quite different from each other, so it would be difficult to make up a common API for object/datastore mapping. Each type of datastore comes with its own set of annotations to provide the needed meta information for the mapping. Let’s see how a simple domain object User may be mapped to our datastores:

JPA

@Entity
@Table(name="TUSR")
public class User {

  @Id
  private String id;

  @Column(name="fn")
  private String name;

  private Date lastLogin;

...
}

MongoDB

@Document(collection="usr")
public class User {

  @Id
  private String id;

  @Field("fn")
  private String name;

  private Date lastLogin;

 ...
}

Neo4j

@NodeEntity
public class User {

  @GraphId
  Long id;

  private String name;

  private Date lastLogin;

...
}
 

If you are familiar with JPA entities you will recognize the standard JPA annotations. Spring Data reuses them; no other annotations are introduced. The mapping itself is done by the JPA implementation you are using. MongoDB and Neo4j require a similar set of annotations. You have one at the class level to map the class to a collection (in MongoDB a collection is similar to a table in a relational database) or to a node type (nodes and edges are the main data types of a graph database like Neo4j).

Each JPA entity has to have a unique identifier. The same is true for MongoDB documents and Neo4j nodes.

MongoDB uses an @Id annotation (not the JPA one; it lives in the package org.springframework.data.annotation), while Neo4j uses @GraphId. The values of these attributes are filled after persisting the domain object. For other persistent attributes, you use the @Field annotation if the attribute name in the MongoDB document does not match the Java attribute.
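How an annotation can steer such a mapping is sketched below with plain reflection. The annotation @StoredAs, the class Account and the mapper SimpleMapper are all hypothetical; the real mappers in Spring Data are considerably more involved (type conversion, nested objects, identifiers).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical annotation that overrides the stored attribute name.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface StoredAs {
    String value();
}

// Hypothetical domain class: "name" is stored under the key "fn".
class Account {
    @StoredAs("fn")
    private final String name;
    private final int logins;

    Account(String name, int logins) {
        this.name = name;
        this.logins = logins;
    }
}

final class SimpleMapper {

    // Converts an object into a key/value document, using the field name
    // as key unless a @StoredAs annotation provides a different one.
    static Map<String, Object> toDocument(Object source) {
        Map<String, Object> document = new LinkedHashMap<>();
        for (Field field : source.getClass().getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue; // skip compiler-generated and static fields
            }
            field.setAccessible(true);
            StoredAs alias = field.getAnnotation(StoredAs.class);
            String key = (alias != null) ? alias.value() : field.getName();
            try {
                document.put(key, field.get(source));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return document;
    }
}
```

For new Account("Ann", 3) the resulting document contains the keys "fn" and "logins", which mirrors the effect of @Field("fn") in the MongoDB mapping above.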

References to other objects are also supported. The roles of our user can be persisted this way:

JPA

@OneToMany
private List<Role> roles;

MongoDB

private List<Role> roles;

Neo4j

@RelatedTo(type = "has", direction = Direction.OUTGOING)
private List<Role> roles;

With JPA you use a @OneToMany relationship; the n side is stored in another table and usually queried with a join. MongoDB does not support joins between collections, so referenced objects are stored in-place within the same document by default. You can have references to other documents as well, which result in a client-side join. In Neo4j relationships are called edges, which are one of the basic data types.
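What a client-side join amounts to can be shown with a small sketch: the user document only holds role ids, and the client resolves them with extra lookups against a second collection. ClientSideJoin and the map-based "collection" are hypothetical simplifications and do not use any MongoDB API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ClientSideJoin {

    // Resolves the role ids stored in a user document against a separate
    // roles collection, here modelled as a map from id to role name.
    static List<String> resolveRoleNames(List<String> roleIds, Map<String, String> rolesById) {
        List<String> names = new ArrayList<>();
        for (String id : roleIds) {
            String name = rolesById.get(id); // one extra lookup per reference
            if (name != null) {
                names.add(name);             // dangling references are skipped
            }
        }
        return names;
    }
}
```

With embedded roles no such second step is needed, which is the trade-off behind storing referenced objects in-place by default.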

To sum things up: MongoDB and Neo4j use an object mapping that is similar to the well known JPA O/R mapping, but it is not quite the same because of the different data structures. The concept behind all of the mappings is the same though: map Java objects to the data structures of your persistence store.

Repository Support

If you have ever persisted data in your business application, you probably wrote some kind of DAO. Usually you implement the CRUD operations for single records and a bunch of finder methods for each persistent class. Finder methods take parameters that are put into your query before you execute them.

With the advent of JPA, at least the CRUD operations are available through the EntityManager interface. Writing custom finders is still boring though: create a named query, set each parameter, execute that query. For example:

@Entity
@NamedQuery( name="myQuery", query = "SELECT u FROM User u where u.name = :name" )
public class User { 
...
} 

@Repository 
public class ClassicUserRepository { 

   @PersistenceContext EntityManager em; 

   public List<User> findByName(String name) { 
      TypedQuery<User> q = em.createNamedQuery("myQuery", User.class); 

      q.setParameter("name", name);

      return q.getResultList();
   } 
   ...

This can be slightly reduced by using the fluent interface of a TypedQuery ...

@Repository
public class ClassicUserRepository { 

   @PersistenceContext EntityManager em; 

   public List<User> findByName(String name) {
      return em.createNamedQuery("myQuery", User.class)
         .setParameter("name", name)
         .getResultList(); 
   } 
   ...

... but still you are implementing a method that calls setters and executes the query for each and every query. With Spring Data JPA the same query comes down to the following piece of code:

package repositories; 

public interface UserRepository extends JpaRepository<User, String> {

   List<User> findByName(String name); 
}

 

With Spring Data JPA, JPQL queries don’t have to be declared as @NamedQuerys in the class file of the corresponding JPA entity. Instead a query is an annotation of the repository method(!):

@Transactional(timeout = 2, propagation = Propagation.REQUIRED)
@Query("SELECT u FROM User u WHERE u.name = 'User 3'")
List<User> findByGivenQuery();

All the above also holds true for Spring Data MongoDB and Spring Data Neo4j. The following example queries a Neo4j database with the Cypher query language:

public interface UserRepository extends GraphRepository<User> {

  User findByLogin(String login); 

  @Query("START root=node:User(login = 'root') MATCH root-[:knows]->friends RETURN friends")
  List<User> findFriendsOfRoot(); 
}

Of course, the naming conventions of the finder methods differ from persistence store to persistence store. For example, MongoDB supports geospatial queries and thus you can write queries like this:

public interface LocationRepository extends MongoRepository<Location, String> {

        List<Location> findByPositionWithin(Circle c);

        List<Location> findByPositionWithin(Box b);

        List<Location> findByPositionNear(Point p, Distance d);
}

There is also generic support for paging and sorting, by providing special parameters to the finder methods for all persistence stores.
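What such paging and sorting parameters mean can be illustrated independently of any store. The following sketch sorts an in-memory list and cuts out one page; PagingSketch is a hypothetical helper, page numbers are assumed to start at 0, and a real repository would push sorting and limiting down to the datastore instead of doing it in memory.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class PagingSketch {

    // Returns one page of the sorted input; page numbers start at 0.
    // Pages beyond the end of the data are returned as empty lists.
    static <T> List<T> page(List<T> items, Comparator<? super T> order, int page, int size) {
        if (page < 0 || size <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and size must be > 0");
        }
        List<T> sorted = new ArrayList<>(items);
        sorted.sort(order);
        long start = (long) page * size; // long avoids int overflow
        int from = (int) Math.min(start, sorted.size());
        int to = (int) Math.min(start + size, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```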

The main advantages of repository support are:

  • The developer writes a lot less boilerplate code
  • Queries can be defined alongside the finder method and its documentation
  • As a bonus, the JPQL queries are compiled as soon as the Spring context is assembled, not the first time you use the query, which makes it easier to detect syntax errors

Summary

This article gives an overview of a lot of complex technologies and tries to discover similarities and differences. For a more detailed view of the Spring Data projects, I recommend taking a look at the projects’ homepages.

To answer the question given in the headline: no, there is no general API to all the persistence stores. The differences are too fundamental. But the Spring Data project does provide a common programming model to access your data:

The repository support is an abstraction from the physical data layer and also makes it very easy to write a lot of finder methods. With the object mapping, your domain object can be transformed into the data types of your persistence store. The templates provide low level access to store specific capabilities.

About the Author

Tobias Trelle is a senior IT consultant at codecentric AG, with over 15 years of experience in the IT business. His main interests are software architectures, EAI and cloud computing. A regular blogger, he also provides training and gives talks at conferences.

Difference between template CRUD/find and Repository CRUD/find operations by Abdul Azeez Shaik

Hi Tobias,
That's a great article. It helped us get an overview of Spring Data and how to move forward. However, I do have a few questions that are still unresolved.
There are CRUD operations available through templates as well as repositories, for example Neo4jTemplate and GraphRepositories.
I do understand that the code for find operations is pretty simple in the case of repositories. But a few operations offer in fact the same functionality through both of them. What would the performance effect be in each case? Is there any standard or suggestion that I should use the repository for particular operations and the template for others?

Awaiting your reply.

Thanks,
Abdul

Re: Difference between template CRUD/find and Repository CRUD/find operatio by Tobias Trelle

Hi Abdul,

the proxy implementation of a repository is created once the application context is assembled. This happens only once so do not expect performance issues here.

Some things can only be done by using the "low level" template itself, e.g. creating collections in MongoDB. In that case, you have to use the template. With JPA you don't even have a corresponding Spring Data template, so everything has to be done by using repositories. So the answer is: it depends.

Also keep in mind that the repository abstraction is useful for simple to normal queries. We had some JPA queries with about 30 parameters where repository support doesn't fit at all.

Cheers
Tobias

Spring Data has a long way to go by Young Gary

Spring Data is not even reference auto wiring yet:

I have an object type Branch which has one parent that is also of type Branch, and children, which is a list of Branch. I saved a tree into MongoDB and then read it back out; any container running Spring Data will die, because it causes an endless loop.

-Gary

InfoQ.com and all content copyright © 2006-2013 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.