Service Orientation Requires Data Orientation
Most of today’s SOA literature and implementations concentrate on defining business-aligned services and rarely discuss the role and impact of enterprise data in the context of SOA. According to David Linthicum:
Those moving toward SOA seem a bit confused by the use of data within a SOA. While most consider data as... well, data, those in the know understand that data needs to be a strategic part of the SOA for SOA to succeed as a project, or as an overall architectural strategy. The trouble comes in when attention is centered on the "S" in SOA, which stands for services. Those charged with building architectures and systems, who focus on the notion of a service as delivering functional behavior, neglect the need to manage the underlying data. In many cases, data quality and consistency issues quickly arise, and the agility that SOA should provide is limited by the need to alter services directly after the underlying data has changed.
David’s view of data integration as a foundation for SOA is elaborated further by Ash Parikh in a recent blog post:
It is becoming increasingly clear that any effort to service-orient an infrastructure needs to start with a hard look at data integration... "data integration" needs to be top of mind for anyone architecting their infrastructure for speed and agility. So, if you are looking to service-orient your infrastructure, I would suggest that you data-orient first by doing a 5-point check to make sure that the foundation has the following capabilities:
- Easy access of all relevant data, including new or rapidly changing data sources.
- Processing of data as batch or real-time, including handling large volumes of large data sets.
- Proactive identification and resolution of data inaccuracies and inconsistencies.
- Application of complex data transformations on the data.
- Delivery of data, exactly when it is needed, as a standards-based data service.
In his follow-up post, Ash discusses practical approaches to data-orienting a service-oriented infrastructure. He outlines several prescriptive recommendations that together provide a holistic solution to an enterprise’s data integration problem:
- Start with a data integration platform that enables "standardized" access to all enterprise data sources regardless of their organization (structured, unstructured, etc.) and access mechanisms (SQL, APIs, Web Services, etc.). Make sure that the platform is extensible, so that new data sources can be added, or existing ones modified, quickly.
- Make sure that the chosen platform can effectively support any latency of data processing, be it batch, near real-time or change data capture and real-time.
- Understand the different data access patterns, including large volumes of relatively small data sets, large volumes of large data sets, huge (mega) data sets, etc., and make sure that all of them are supported by the chosen data platform. Complement it with additional technologies as required.
- Make sure that data consumed and produced by services is consistent across services. Many data platforms can provide integrated data profiling to proactively identify issues and also fix these issues, regardless of the complexity.
- In addition to data access, a data platform typically provides a centralized place for data transformation. When it comes to a simple format conversion, any modern data integration platform can support it. If, however, there is a need for more complex transformations such as aggregation, joining, lookup, structure conversions, etc., you probably need something more sophisticated.
- A data integration platform typically needs to support multiple access mechanisms required by service implementations, including SQL-based access, Web and REST services, etc.
- A data integration platform has to insulate service implementations from the underlying data sources. It has to, in effect, introduce a layer of abstraction that allows changes in the data layer with no or minimal impact on service implementations.
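The last recommendation, insulating services from the underlying data sources, can be illustrated with a minimal Python sketch. It is not taken from either blog post: the names `Customer`, `CustomerSource`, `SqlCustomerSource`, `InMemoryCustomerSource` and `greeting_service`, as well as the table layout, are hypothetical and chosen only for illustration. The service depends on an abstract contract, so the data source behind it can be replaced without touching the service code.

```python
import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Customer:
    """Canonical shape of the data that services consume (hypothetical)."""
    customer_id: int
    name: str


class CustomerSource(ABC):
    """The abstraction layer: a stable contract, not a concrete data store."""

    @abstractmethod
    def get(self, customer_id: int) -> Optional[Customer]:
        ...


class SqlCustomerSource(CustomerSource):
    """Adapter for a SQL source; knows the physical column names."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def get(self, customer_id: int) -> Optional[Customer]:
        row = self._conn.execute(
            "SELECT id, full_name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(row[0], row[1]) if row else None


class InMemoryCustomerSource(CustomerSource):
    """Stands in for a non-SQL source such as an API response or a file feed."""

    def __init__(self, records: dict) -> None:
        self._records = records

    def get(self, customer_id: int) -> Optional[Customer]:
        rec = self._records.get(customer_id)
        return Customer(customer_id, rec["name"]) if rec else None


def greeting_service(source: CustomerSource, customer_id: int) -> str:
    """A service implementation that sees only the abstract contract."""
    customer = source.get(customer_id)
    return f"Hello, {customer.name}" if customer else "Unknown customer"


def make_sql_source() -> SqlCustomerSource:
    """Build a throwaway in-memory SQLite database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, full_name TEXT)")
    conn.execute("INSERT INTO customers VALUES (1, 'Ada Lovelace')")
    return SqlCustomerSource(conn)


if __name__ == "__main__":
    # The same service call works against either source.
    print(greeting_service(make_sql_source(), 1))
    print(greeting_service(InMemoryCustomerSource({1: {"name": "Ada Lovelace"}}), 1))
```

In this sketch, renaming the `full_name` column or moving the customer data to a different store would change only the adapter class; `greeting_service` stays as it is. A real data integration platform would provide this kind of insulation at a much larger scale, but the dependency direction is the same.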
In summary, a data platform
... needs to be a single, integrated platform that can deliver all the capabilities outlined. This makes sense, as all these recommendations were made with time and cost savings in mind. If a separate technology were to be employed for each of these capabilities, the very basis for employing a service-oriented approach would be compromised, which would be to enable agility through simplicity and flexibility... [this] platform must support and drive the reuse of data integration logic.
As the scope of SOA implementations expands from limited departmental solutions to enterprise-wide undertakings, enterprise data access is quickly becoming one of the most important implementation issues. If it is not architected correctly from the very beginning, it can become a major problem down the road.