Architecting with Java Persistence: Patterns and Strategies

Jan 04, 2024 21 min read

Key Takeaways

Java persistence patterns like Driver, Mapper, DAO, Active Record, and Repository are crucial for robust database interaction and application architecture, with each offering distinct data management approaches.
Balancing layers is essential for managing complexity and optimizing data flow, with each pattern having its own set of advantages and disadvantages to consider for informed architectural decisions.
The choice between object-oriented and data-oriented programming shapes software design, emphasizing the need to address impedance mismatches and maintain a balance between the two.
While these patterns offer robust solutions for Java persistence, careful consideration of trade-offs such as data synchronization and the potential for over-engineering is essential for effective implementation.
Command and Query Responsibility Segregation (CQRS) offers notable performance, scalability, and security advantages when distinctly separating read and update operations. However, the trade-off can be increased complexity within an application.

In the ever-evolving world of software architecture, senior engineers, architects, and CTOs face a perennial challenge: designing a robust and efficient persistence layer for Java applications. A well-crafted persistence layer is not merely a technical detail; it's the backbone that supports the application's functionality, scalability, and long-term sustainability. This article takes a deep dive into design patterns for the Java persistence layer, focusing on the distinction between object-oriented and data-oriented approaches.

The persistence layer in a Java application bridges the gap between an application's intricate business logic and the underlying data store, which is often a relational database. The choices made in this layer reverberate throughout the software's lifespan, influencing its performance, maintainability, and adaptability. To address this challenge, we must navigate the two primary paradigms in Java persistence.

Object-oriented and Data-oriented: Dealing with Impedance Mismatches

Every time we venture into the realm of handling database persistence engines and Java applications, we're faced with a fundamental challenge: bridging the gap between the paradigms of the application and the database itself. This transformation process often introduces an impedance mismatch that can significantly impact the application's performance and maintainability. It's a critical task because we deal with two entirely different principles and concepts when we compare Java to any database engine.

On one side of the spectrum, we have Java, a language that boasts inheritance, polymorphism, encapsulation, and a rich type system. These object-oriented concepts shape the way we design and build our applications. They provide a high level of abstraction and structure that helps us manage complexity and maintain code effectively.

Conversely, when we look at the database, we encounter a world dominated by concepts like normalization, denormalization, indexing, and query optimization. Databases focus on efficiently storing and retrieving data, often with performance as the paramount concern. The database doesn't inherently understand or support Java's object-oriented features, which can lead to an impedance mismatch when trying to synchronize these two distinct worlds.

Figure 1: The mismatch between a database and the Java programming language

We rely on various design patterns and architectural approaches to bridge this gap and create a seamless connection between Java applications and databases. These patterns act as translators, helping to reduce the impact of the impedance mismatch and make the two worlds work together harmoniously.

These design patterns do not reinvent the wheel. They are well-established solutions that have proven effective in mitigating the impedance mismatch between application and database paradigms. They include the Driver, Data Mapper, DAO, Active Record, and Repository patterns.

Navigating Data Patterns in Java Persistence

This section of the article will delve into data patterns in Java persistence, focusing on the nuanced differences between object-oriented programming and data-oriented programming in application development and database management. The distance between these two programming paradigms plays a critical role in shaping software design choices, and we’ll explore the trade-offs that come with each approach.

At one end of the spectrum, we have the classic Object-Oriented Programming (OOP) paradigm. Inspired by principles outlined in books like "Clean Code" by Robert Martin, OOP places a strong emphasis on the following key aspects:

Hiding Data to Expose Behavior: OOP encourages encapsulation, which involves hiding the internal data structures and exposing well-defined interfaces for behavior. This approach fosters modularity and maintainability by limiting data manipulation to controlled methods.
Polymorphism: Diverse objects can be treated as if they share common traits. In Java, it’s achieved through method overriding and overloading, allowing dynamic and adaptable method calls for different object types.
Abstraction: Complex concepts in the software are simplified by modeling classes based on real-world objects. In Java, it’s implemented using abstract classes and interfaces, ensuring consistent behavior while permitting various implementations.
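
The three aspects above can be sketched in a few lines of plain Java. This is only an illustrative sketch: the Shape, Circle, Rectangle, and OopSketch names are hypothetical and do not come from any framework discussed in this article.

```java
import java.util.List;

// Abstraction: callers depend on this contract, not on a concrete class.
interface Shape {
    double area();
}

// Encapsulation: the radius is private and only reachable through behavior.
final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}

final class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    public double area() {
        return width * height;
    }
}

class OopSketch {
    // Polymorphism: the same call dispatches to each object's own area().
    static double totalArea(List<Shape> shapes) {
        return shapes.stream().mapToDouble(Shape::area).sum();
    }

    public static void main(String[] args) {
        System.out.println(totalArea(List.of(new Rectangle(2, 3), new Circle(1))));
    }
}
```

Note that totalArea never inspects the internal fields of a shape; it only relies on the behavior each object exposes.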

On the other end, we embrace Data-Oriented Programming (DOP) principles, as defined by Yehonathan Sharvit, an experienced software engineer with over twenty years of expertise. These principles are particularly relevant when dealing with databases and data-intensive operations. DOP encourages the following practices:

Separating Code (Behavior) from Data: DOP promotes decoupling data manipulation logic from the data itself. This separation allows for greater flexibility and efficiency in data processing.
Representing Data with Generic Data Structures: Instead of relying on complex object hierarchies, DOP recommends using generic data structures for storage, enabling efficient data manipulation and processing.
Treating Data as Immutable: Immutability is a key DOP concept. Immutable data ensures that data changes are controlled and predictable, making it suitable for concurrent processing.
Separating Data Schema from Data Representation: DOP encourages separating the structure of the data (schema) from how it is represented. It enables flexibility and adaptability in data management.
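
As a rough illustration of these practices in Java, the sketch below keeps data in an immutable record and puts behavior in separate stateless functions. The PersonData, PersonFunctions, and DopSketch names are assumptions made for this example, and records require Java 16 or later.

```java
import java.time.LocalDate;
import java.time.Period;
import java.util.Map;

// Data: an immutable record with no business behavior attached.
record PersonData(long id, String name, LocalDate birthday) {}

// Code: stateless functions that take data in and return new data.
final class PersonFunctions {
    private PersonFunctions() {}

    static int ageOn(PersonData person, LocalDate date) {
        return Period.between(person.birthday(), date).getYears();
    }

    // "Changing" a record means creating a new value; the original is untouched.
    static PersonData rename(PersonData person, String newName) {
        return new PersonData(person.id(), newName, person.birthday());
    }
}

class DopSketch {
    public static void main(String[] args) {
        PersonData ada = new PersonData(1L, "Ada", LocalDate.of(1990, 5, 17));
        PersonData renamed = PersonFunctions.rename(ada, "Ada L.");

        // Generic data structure: similar information held in an immutable map.
        Map<String, Object> row = Map.of("id", 1L, "name", "Ada");

        System.out.println(ada.name() + " -> " + renamed.name());
        System.out.println(PersonFunctions.ageOn(ada, LocalDate.of(2024, 1, 4)));
        System.out.println(row.get("name"));
    }
}
```

Because rename returns a new record instead of mutating its argument, both values can be shared across threads without synchronization.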

As we venture further into this section, we must understand that the choice of data patterns in Java persistence is not one-size-fits-all. The selection of an appropriate pattern depends on the application’s specific requirements, the domain’s complexity, and the performance considerations. Just as the CAP theorem forces distributed data systems to trade off consistency, availability, and partition tolerance, we must recognize that no single pattern is perfect in all situations.

Finding the right balance between object-oriented and data-oriented programming is an ongoing endeavor. We'll explore various design patterns, including the Driver Pattern, Data Mapper, DAO, Active Record, and Repository, to understand how they fit into these paradigms and how they help bridge the gap between application logic and database interactions. Through this exploration, we aim to optimize our software architecture's performance, maintainability, and scalability, acknowledging that the process is continually evolving.

Figure 2: The data design patterns with the distance of programming style

To deepen your understanding of data-oriented programming in Java, you can also explore the article "Data-Oriented Programming in Java" by Brian Goetz.

Bridging the Gap: Patterns from Database to Objects

In exploring Java persistence design patterns, we’ll begin our journey at the core, close to the database itself, and gradually transition towards the object-oriented programming side. This approach allows us to first delve into patterns that interact directly with the database, emphasizing data-oriented principles and handling raw information. As we progress, we’ll focus on object-oriented programming, where data is transformed into application-specific entities. Moving from patterns close to the database to those aligned with object-oriented paradigms, we understand how to bridge the gap between data management and application logic, creating robust and efficient Java applications. Let’s uncover the patterns that seamlessly connect these two fundamental aspects of software development.

Driver Pattern

First, we discuss the Driver Pattern and its role in handling database communication. This pattern, closer to the database, offers a unique perspective on data-oriented programming, showcasing the flexibility it provides.

The Driver Pattern is primarily responsible for establishing the connection and communicating with the database. In many scenarios, this pattern sits closest to the database layer, and you can see it in JDBC drivers for relational databases or in the communication layers of NoSQL databases like MongoDB and Cassandra.

The following code snippet is a straightforward example of using the Driver Pattern with Java and JDBC for database communication. It extracts data from a database table and illustrates the read-only access to data often associated with data-oriented programming:

try (Connection conn = DriverManager.getConnection(DB_URL, USER, PASS);
     Statement stmt = conn.createStatement();
     ResultSet rs = stmt.executeQuery(QUERY);) {
    // Extract data from result set
    while (rs.next()) {
        // Retrieve data by column name
        System.out.print("ID: " + rs.getInt("id"));
        System.out.print(", name: " + rs.getString("name"));
        System.out.print(", birthday: " + rs.getString("birthday"));
        System.out.print(", city: " + rs.getString("city"));
        System.out.println(", street: " + rs.getString("street"));
        // Handle and process data as needed...
    }
}

The ResultSet behaves like a read-only map in this code, offering getter methods to access data from the database query result. This approach aligns with the principles of data-oriented programming, emphasizing data immutability.

On the one hand, the Driver Pattern and this data-oriented approach provide flexibility in dealing with data, allowing you to handle it as a first-class entity from the data perspective. However, this flexibility also introduces the need for additional code when converting data to application-specific entities, potentially leading to increased complexity and the possibility of introducing bugs.

The Driver Pattern shows that the closer we get to the database in our application architecture, the more we interact with data as raw information, which can benefit specific scenarios. It also highlights the importance of thoughtful design and abstraction when moving data from the database to the application layer, since a well-defined conversion step reduces complexity and potential errors.

With the Driver Pattern, we get a glimpse of data-oriented programming, which offers significant flexibility in handling raw data. However, this raw data often has to be converted into meaningful representations for our business domain, especially when applying Domain-Driven Design (DDD). The "Data Mapper" pattern, described in Martin Fowler's Patterns of Enterprise Application Architecture, facilitates this conversion.

Data Mapper Pattern

The Data Mapper is a crucial layer mediating between objects and a database while ensuring their independence from each other and the mapper itself. It provides a centralized approach to bridge data-oriented and object-oriented programming paradigms. However, it’s important to note that while this pattern simplifies the data-to-object conversion process, it also introduces the potential for impedance mismatches.

The implementation of the Data Mapper pattern can be observed in several frameworks, such as Jakarta Persistence, formerly known as JPA. Jakarta Persistence allows for the mapping of entities using annotations to create a seamless connection between the database and objects. The following code snippet demonstrates how to map a "Person" entity using Jakarta Persistence annotations:

@Entity
public class Person {
    @Id @GeneratedValue(strategy = GenerationType.AUTO)
    Long id;
    String name;
    LocalDate birthday;
    // A collection of addresses is a to-many association, so it is mapped with @OneToMany
    @OneToMany List<Address> addresses;
    // ...
}

Furthermore, alternative methods can be employed where annotations are not preferred. For instance, Spring's JdbcTemplate provides a flexible approach. You can create a custom "PersonRowMapper" class to map database rows to the "Person" entity, as demonstrated below:

public class PersonRowMapper implements RowMapper<Person> {
    @Override
    public Person mapRow(ResultSet rs, int rowNum) throws SQLException {
        Person person = new Person();
        // The id field is a Long, so read the column with getLong
        person.setId(rs.getLong("ID"));
        // Populate other fields as needed
        return person;
    }
}

The Data Mapper pattern isn’t confined to relational databases. You can also witness its implementation in NoSQL databases through annotations or manual data-to-object conversion. This versatility makes the Data Mapper pattern a valuable asset in handling data across various database technologies while maintaining a clear separation between the data and the domain model.
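
To make the manual data-to-object conversion concrete, here is a minimal hand-written mapper sketch that needs no framework. It assumes a database row or document arrives as a Map<String, Object>, and the simplified Person record, PersonMapper, and MapperSketch names are hypothetical stand-ins chosen for this example.

```java
import java.time.LocalDate;
import java.util.Map;

// Domain object: knows nothing about column names or storage.
record Person(long id, String name, LocalDate birthday) {}

// Mapper: the only place that knows both the row layout and the domain type.
final class PersonMapper {
    private PersonMapper() {}

    static Person toEntity(Map<String, Object> row) {
        return new Person(
                ((Number) row.get("id")).longValue(),
                (String) row.get("name"),
                LocalDate.parse((String) row.get("birthday")));
    }

    static Map<String, Object> toRow(Person person) {
        return Map.of(
                "id", person.id(),
                "name", person.name(),
                "birthday", person.birthday().toString());
    }
}

class MapperSketch {
    public static void main(String[] args) {
        Map<String, Object> row = Map.of("id", 7, "name", "Ada", "birthday", "1990-05-17");
        Person person = PersonMapper.toEntity(row);
        System.out.println(person);
        System.out.println(PersonMapper.toRow(person));
    }
}
```

Keeping both directions of the conversion in one class means a renamed column touches a single file, which is the main maintenance benefit the pattern aims for.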

Data Access Object (DAO)

Indeed, the Mapper pattern provides an effective way to centralize the conversion between the database and entity representations, offering substantial benefits in terms of code testing and maintenance. Additionally, it allows for the consolidation of database operations within a dedicated layer. One of the prominent patterns in this category is the Data Access Object (DAO) pattern, which specializes in providing data operations while shielding the application from intricate database details.

The DAO serves as a critical component that abstracts and encapsulates all interactions with the data source. It effectively manages the connection with the data source to retrieve and store data, while maintaining a clear separation between the database and the application’s business logic. This clear separation of concerns enables a robust and maintainable architecture.

One of the distinct advantages of the DAO pattern is the strict separation it enforces between two parts of an application that have no need to be aware of each other. This separation allows them to evolve independently and frequently. When the business logic changes, it can rely on a consistent DAO interface, while modifications to the persistence logic do not impact DAO clients.

The following code snippet shows an example of a DAO interface for a "Person" entity. This interface abstracts the data access operations, making it a powerful tool for managing data within Java applications:

public interface PersonDAO {
   Optional<Person> findById(Long id);
   List<Person> findAll();
   Person update(Person person);
   void delete(Person person);
   Person insert(Person person);
}

It’s important to note that even though the DAO abstracts and encapsulates data access, it implicitly relies on the Mapper pattern to handle the conversion between the database and entity representations. Consequently, this interplay between patterns may introduce an impedance mismatch, which is a challenge to address in database operations.

The DAO pattern is versatile and can be implemented with various database frameworks, including both SQL and NoSQL technologies. For instance, when working with Jakarta Persistence, you can create an implementation of the "PersonDAO" interface to facilitate database operations for the "Person" entity. This implementation would effectively leverage the Mapper pattern to bridge the gap between the database and the application’s domain entities.

This code snippet represents a simplified Jakarta Persistence implementation of the PersonDAO. It demonstrates how you can use the Jakarta Persistence API to interact with a database and perform common data access operations. The findById method retrieves a Person entity by its unique identifier, while the findAll method fetches a list of all Person entities in the database. These methods provide a foundation for seamless integration between your Java application and the underlying database, exemplifying the power and simplicity of Jakarta Persistence for data access.

public class JakartaPersistencePersonDAO implements PersonDAO {

    // Typically injected by the container (for example with @PersistenceContext)
    private EntityManager entityManager;

    @Override
    public Optional<Person> findById(Long id) {
        return Optional.ofNullable(entityManager.find(Person.class, id));
    }

    @Override
    public List<Person> findAll() {
        // A typed query avoids an unchecked conversion of the result list
        TypedQuery<Person> query = entityManager.createQuery("SELECT p FROM Person p", Person.class);
        return query.getResultList();
    }
    // ...more
}
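
Because clients depend only on the DAO interface, the persistence technology behind it can be swapped. As a hedged illustration, the sketch below backs the same PersonDAO contract with an in-memory map, which can be handy in unit tests. The simplified Person record and the InMemoryPersonDAO and DaoSketch names are assumptions made for this example, and the class is not thread-safe.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Simplified stand-in for the entity; a null id means "not yet stored".
record Person(Long id, String name) {}

interface PersonDAO {
    Optional<Person> findById(Long id);
    List<Person> findAll();
    Person update(Person person);
    void delete(Person person);
    Person insert(Person person);
}

// In-memory implementation: same contract, no database involved.
final class InMemoryPersonDAO implements PersonDAO {
    private final Map<Long, Person> store = new HashMap<>();
    private long sequence = 0;

    @Override
    public Optional<Person> findById(Long id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public List<Person> findAll() {
        return List.copyOf(store.values());
    }

    @Override
    public Person update(Person person) {
        if (!store.containsKey(person.id())) {
            throw new IllegalArgumentException("Unknown person id: " + person.id());
        }
        store.put(person.id(), person);
        return person;
    }

    @Override
    public void delete(Person person) {
        store.remove(person.id());
    }

    @Override
    public Person insert(Person person) {
        // Assign the next id, much as a database sequence would.
        Person saved = new Person(++sequence, person.name());
        store.put(saved.id(), saved);
        return saved;
    }
}

class DaoSketch {
    public static void main(String[] args) {
        PersonDAO dao = new InMemoryPersonDAO();
        Person saved = dao.insert(new Person(null, "Ada"));
        System.out.println(dao.findById(saved.id()));
    }
}
```

Business logic written against PersonDAO would run unchanged with either this class or a Jakarta Persistence implementation.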

Active Record Pattern

Next, we encounter the Active Record pattern. This pattern lets an entity integrate directly with the database through inheritance, effectively granting it the "superpower" of managing its own database operations. This approach simplifies database integration by consolidating operations into the entity itself. However, it comes with trade-offs, including tight coupling and a potential violation of the single responsibility principle.

The Active Record pattern gained popularity in the Ruby community and made its way into Java, mainly through the Quarkus framework with the Panache project. Panache simplifies database integration in Java by implementing the Active Record pattern, enabling entities to perform database operations without needing a separate data access layer.

Here’s an example of an Active Record implementation using Quarkus Panache and a Person entity:

@Entity
public class Person extends PanacheEntity {
    public String name;
    public LocalDate birthday;
    // Entity collections need an explicit association mapping
    @OneToMany public List<Address> addresses;
}

// Create a new Person and persist it to the database
Person person = ...;
person.persist();

// Retrieve a list of all Person records from the database
List<Person> people = Person.listAll();

// Find a specific Person by ID
person = Person.findById(personId);

In this code, the Person entity extends PanacheEntity, part of the Quarkus Panache project. As a result, the Person entity inherits methods like persist(), listAll(), and findById() for database operations. It means that the Person entity can self-manage its interactions with the database.

While the Active Record pattern simplifies database operations and reduces the need for a separate data access layer, it’s essential to consider the trade-offs. Tight coupling between the entity and the database and the potential violation of the single responsibility principle are factors to weigh when deciding to adopt this pattern. Depending on the specific requirements of your application and the trade-offs you are willing to accept, the Active Record pattern can be a powerful tool for streamlining database integration in Java.
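
To show where the coupling comes from, here is a framework-free sketch of the idea behind Active Record. The ActiveRecord base class below is a hypothetical stand-in written for this example, not the Panache API, and a static map plays the role of the database table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical base class: every subclass inherits its own persistence operations.
abstract class ActiveRecord {
    // A static map stands in for the database table in this sketch.
    private static final Map<Long, ActiveRecord> TABLE = new LinkedHashMap<>();
    private static long nextId = 1;

    Long id;

    void persist() {
        if (id == null) {
            id = nextId++;
        }
        TABLE.put(id, this);
    }

    void delete() {
        TABLE.remove(id);
    }

    static <T extends ActiveRecord> List<T> listAll(Class<T> type) {
        List<T> result = new ArrayList<>();
        for (ActiveRecord record : TABLE.values()) {
            if (type.isInstance(record)) {
                result.add(type.cast(record));
            }
        }
        return result;
    }

    static <T extends ActiveRecord> Optional<T> findById(Class<T> type, Long id) {
        ActiveRecord record = TABLE.get(id);
        return type.isInstance(record) ? Optional.of(type.cast(record)) : Optional.empty();
    }
}

// The entity carries both domain state and, through inheritance, storage behavior.
class Person extends ActiveRecord {
    String name;

    Person(String name) {
        this.name = name;
    }
}

class ActiveRecordSketch {
    public static void main(String[] args) {
        Person ada = new Person("Ada");
        ada.persist();
        System.out.println(ActiveRecord.listAll(Person.class).size());
    }
}
```

The Person class cannot be instantiated or tested without dragging the storage mechanism along, which is the tight coupling the trade-off discussion refers to.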

Repository Pattern

As we continue to explore Java persistence design patterns, we encounter the Repository pattern, which represents a significant shift towards a more domain-centric approach. The Repository mediates between the domain and data mapping layers, introducing an in-memory domain object collection that aligns with the application’s ubiquitous language. This pattern emphasizes a domain-specific semantic, fostering a more natural and expressive interaction with the data.

The main differentiator between the Repository and the previously discussed DAO is the focus on the domain. While DAOs concentrate on database integration with operations like insert and update, Repositories introduce more declarative methods that closely adhere to the domain’s language. This abstraction allows for a semantic alignment with the domain while still managing the distance between the application and the database.

In the Java ecosystem, several frameworks and specifications support the Repository pattern. Prominent examples include Spring Data, Micronaut Data, and the Jakarta Data specification.

Let’s take a look at the implementation of the Person class using Jakarta Persistence annotations and explore how the Repository pattern can be leveraged with Jakarta Data:

@Entity
public class Person {
    private @Id Long id;
    private @Column String name;
    private @Column LocalDate birthday;
    private @OneToMany List<Address> addresses;
}

@Repository
public interface People extends CrudRepository<Person, Long> {}

Person person = ...;

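// Persist a person using the repository (an injected People instance)
repository.save(person);

// Retrieve a list of all persons (in Jakarta Data 1.0, findAll() returns a Stream)
List<Person> people = repository.findAll().toList();

// Find a specific person by ID (the result is an Optional)
Optional<Person> found = repository.findById(personId);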

In this code, the Person entity is annotated with Jakarta Persistence annotations, and we introduce a People interface that extends the CrudRepository provided by Jakarta Data. This repository interface leverages the Repository pattern, offering declarative methods like save, findAll, and findById. These methods provide a more domain-oriented, expressive, and semantically aligned way to interact with the database, contributing to the clarity and maintainability of the codebase.

Continuing with our exploration of domain-centric repositories and the evolving Jakarta Data specification, let’s consider a practical example involving a Car and a Garage repository. In this scenario, we aim to create a custom repository that aligns closely with the domain and leverages action annotations to express parking and unparking a car within the Repository.

Here’s the code illustrating this concept:

@Repository
public interface Garage {

    @Save
    Car park(Car car);

    @Delete
    void unpark(Car car);
}

This approach provides a highly expressive and domain-centric way to interact with the Garage repository. It aligns the Repository’s methods closely with the domain language and actions, making the code more intuitive and self-descriptive. The use of annotations, such as @Save and @Delete, clarifies the intention behind these methods, facilitating the development of domain-driven data access layers in Java applications.
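
For readers who want to try the domain-language idea without a Jakarta Data provider, the sketch below implements a similar Garage contract by hand over an in-memory map. The annotations are omitted, and the Car record, the findByPlate method, and the InMemoryGarage class are assumptions added for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

record Car(String plate, String model) {}

// Domain-language contract: callers park and unpark cars; they never "insert rows".
interface Garage {
    Car park(Car car);

    void unpark(Car car);

    // Hypothetical query method, added so the sketch has something to read back.
    Optional<Car> findByPlate(String plate);
}

final class InMemoryGarage implements Garage {
    private final Map<String, Car> spots = new LinkedHashMap<>();

    @Override
    public Car park(Car car) {
        spots.put(car.plate(), car);
        return car;
    }

    @Override
    public void unpark(Car car) {
        spots.remove(car.plate());
    }

    @Override
    public Optional<Car> findByPlate(String plate) {
        return Optional.ofNullable(spots.get(plate));
    }
}

class GarageSketch {
    public static void main(String[] args) {
        Garage garage = new InMemoryGarage();
        Car beetle = garage.park(new Car("ABC-1234", "Beetle"));
        System.out.println(garage.findByPlate("ABC-1234").isPresent());
        garage.unpark(beetle);
        System.out.println(garage.findByPlate("ABC-1234").isPresent());
    }
}
```

Calling code reads in the vocabulary of the domain, while the storage details stay behind the interface.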

While the Repository pattern introduces a valuable domain-centric perspective, balancing semantic clarity and the potential challenges of managing the distance between the domain and the database is essential. Depending on your project’s requirements, the Repository pattern can be a powerful tool for creating more expressive and domain-driven data access layers in Java applications.

Overview of Key Data-oriented Patterns

Having walked through the Driver, Mapper, DAO, Active Record, and Repository patterns, we can now summarize them side by side. The following table outlines their strengths and weaknesses, providing a foundation for the more advanced concepts in the rest of the article.

Pattern	Advantages	Disadvantages
Driver	- Provides a direct approach to database communication. - Allows low-level control over database interactions. - Useful for specific database optimizations and non-standard use cases. - Allows fine-tuning of queries for performance.	- Often tightly coupled with a specific database system, reducing portability. - Can result in verbose and low-level code, impacting maintainability.
Data Mapper	- Separates database access from domain logic, promoting clean architecture. - Supports custom mapping and transformation of data. - Enhances testability with a clear separation of concerns.	- Can require boilerplate code for mapping between database and domain objects. - May introduce complexity in mapping logic, impacting development time.
DAO	- Separates data access code from domain logic, promoting modularity. - Supports multiple data sources and complex queries. - Enhances testability by isolating data access logic. - Can be reused across different parts of an application.	- May require more code compared to Active Record, potentially impacting development speed. - May introduce an additional layer, which can increase project complexity.
Active Record	- Simplifies database operations, allowing domain entities to manage database integration. - Provides a concise and expressive API for data manipulation. - Reduces the need for a separate data access layer.	- May lead to tight coupling between domain logic and database concerns. - May violate the single responsibility principle, potentially affecting maintainability.
Repository	- Aligns domain language with data access, making code more expressive. - Encourages a domain-driven design by introducing domain-specific methods. - Provides a clear separation between the domain and data access layers.	- Requires an additional layer, potentially introducing complexity. - The implementation of domain-specific methods may vary and be complex.

As we transition from data-oriented programming to a more domain-centric approach, we often find the need to introduce layers to mediate the communication between different parts of the application. Each pattern occupies a distinct layer, and this layered architecture introduces a level of abstraction that separates the application's core logic from the database operations.

Figure 3: The number of layers differs with data-oriented programming vs a domain-centric approach

Data Transfer Object (DTO)

Moving forward, we encounter a widely used and versatile pattern called the Data Transfer Object (DTO). This pattern serves various purposes, including seamless data movement across different layers or tiers, such as when extracting data for JSON representation in a RESTful API. Additionally, DTOs can isolate the entity from the database schema, allowing for a transparent relationship between the entity and various database models.

This adaptability permits the application to work with multiple databases as potential targets without affecting the core entity structure. These are just two of the many use cases for DTOs that demonstrate their flexibility.

Figure 4: Two samples of using the DTO in Java applications

However, it's essential to remember that while DTOs offer numerous benefits, they require diligent management of data conversion to ensure the correct isolation between layers. The use of DTOs brings about the challenge of maintaining consistency and coherence across different application parts, which is a crucial aspect of their successful implementation.
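
The conversion step mentioned above can be as small as one function. The sketch below assumes a simplified Person entity with a hypothetical internalNotes field that should never leave the service, and a PersonDTO record shaped for a JSON response; all names are illustrative.

```java
import java.time.LocalDate;

// Internal entity: carries a field the API should not expose.
record Person(long id, String name, LocalDate birthday, String internalNotes) {}

// DTO: only the fields the REST layer is meant to serialize.
record PersonDTO(long id, String name, int birthYear) {}

final class PersonDTOs {
    private PersonDTOs() {}

    // One explicit conversion point keeps the isolation between layers visible.
    static PersonDTO fromEntity(Person person) {
        return new PersonDTO(person.id(), person.name(), person.birthday().getYear());
    }
}

class DtoSketch {
    public static void main(String[] args) {
        Person ada = new Person(1L, "Ada", LocalDate.of(1990, 5, 17), "VIP customer");
        System.out.println(PersonDTOs.fromEntity(ada));
    }
}
```

If the entity gains or renames a field, the DTO and its consumers stay unchanged until the conversion function is deliberately updated.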

Command Query Responsibility Segregation (CQRS)

As we've explored the significance of layers and Data Transfer Objects (DTOs) on this journey, we now arrive at the Command and Query Responsibility Segregation (CQRS) pattern. CQRS is a powerful architectural strategy separating read and update operations within a data store. It's important to note that CQRS's application can significantly complement the use of DTOs in your architecture.

The implementation of CQRS in your application can bring a multitude of benefits, including the maximization of performance, scalability, and security. You can effectively manage the data transfer between the read and write sides of your CQRS architecture using DTOs. This ensures that data is appropriately formatted and transformed between these segregated responsibilities.

For those well-versed in NoSQL databases, the concept of CQRS may already feel quite familiar. NoSQL databases often follow a query-driven modeling approach, where data is optimized for retrieval rather than updates. In this context, CQRS's separation of read and write operations aligns seamlessly with the database's native behavior.

However, it's essential to approach CQRS with a nuanced understanding. While it can offer advantages, it introduces complexities, and its adoption should be carefully weighed against the specific requirements of your application. Some potential disadvantages include:

Increased Complexity: Implementing CQRS introduces additional layers and separation of concerns, which can lead to increased complexity in the overall system architecture. This complexity may impact development time, debugging, and the learning curve for the development team.
Synchronization Challenges: Maintaining consistency between the read and write sides of the system can be challenging. As updates are separated from reads, ensuring synchronized and up-to-date views for users may require careful consideration and additional mechanisms.
Potential for Over-Engineering: In more straightforward applications, introducing CQRS may be unnecessary and could lead to over-engineering. It's crucial to assess whether the benefits justify the added complexity, especially in projects with straightforward data access requirements.
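
A compact way to see both the separation and the synchronization concern is the sketch below. It is a simplified, single-process illustration under stated assumptions: the RegisterPerson command, PersonRegistered event, CitySummary read DTO, and the two handler classes are hypothetical names, and the event is delivered synchronously, whereas real systems often deliver it asynchronously and therefore see stale reads for a while.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Command: an intent to change state on the write side.
record RegisterPerson(long id, String name, String city) {}

// Event published by the write side after a successful change.
record PersonRegistered(long id, String name, String city) {}

// Read-side DTO shaped for exactly one query.
record CitySummary(String city, int residents) {}

// Write side: validates commands and owns the authoritative store.
final class PersonCommandHandler {
    private final Map<Long, RegisterPerson> writeStore = new HashMap<>();
    private final Consumer<PersonRegistered> publisher;

    PersonCommandHandler(Consumer<PersonRegistered> publisher) {
        this.publisher = publisher;
    }

    void handle(RegisterPerson command) {
        if (writeStore.putIfAbsent(command.id(), command) != null) {
            throw new IllegalStateException("Person already registered: " + command.id());
        }
        publisher.accept(new PersonRegistered(command.id(), command.name(), command.city()));
    }
}

// Read side: a denormalized view that is updated only through events.
final class CityQueryService {
    private final Map<String, Integer> residentsByCity = new HashMap<>();

    // Projection step: this is where read and write sides are kept in sync.
    void on(PersonRegistered event) {
        residentsByCity.merge(event.city(), 1, Integer::sum);
    }

    CitySummary summaryFor(String city) {
        return new CitySummary(city, residentsByCity.getOrDefault(city, 0));
    }
}

class CqrsSketch {
    public static void main(String[] args) {
        CityQueryService queries = new CityQueryService();
        PersonCommandHandler commands = new PersonCommandHandler(queries::on);
        commands.handle(new RegisterPerson(1L, "Ada", "London"));
        System.out.println(queries.summaryFor("London"));
    }
}
```

Even in this small form, the example needs three extra types and a projection step, which hints at the added complexity listed above.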

The synergy between DTOs and CQRS can empower efficient data transfer within your application's architecture. By maintaining a clear separation between read and write operations and using DTOs as intermediaries, you can enjoy the performance, scalability, and security benefits that CQRS offers while adapting to query-driven NoSQL environments. Still, these benefits come with challenges, so a thoughtful evaluation of the overall impact on your system's complexity, maintainability, and development velocity is necessary.

Conclusion

In exploring Java persistence patterns, we've uncovered various strategies designed to address specific application needs and architectural goals. Patterns like Driver, Mapper, DAO, Active Record, and Repository provide essential building blocks for data management in Java applications. They highlight the significance of striking the right balance between layers and facilitating a structured approach while being mindful of potential performance implications.

Data Transfer Objects (DTOs) emerged as a versatile tool for seamless data transfer between layers and adaptability to various data models. Their use, however, requires careful data conversion management to ensure uniformity across application components.

Finally, we ventured into Command and Query Responsibility Segregation (CQRS), a pattern that separates read and update operations. CQRS's implementation promises powerful performance, scalability, and security benefits, especially in environments where query-driven modeling, as found in NoSQL databases, prevails.

These patterns are the foundation for designing Java applications that align precisely with unique requirements and business objectives. As developers, understanding their strengths and limitations empowers us to make well-informed architectural decisions, ensuring that our applications are not only efficient but also robust and responsive.

About the Author

Otavio Santana
