How Difficult Can It Be to Integrate Software Development Tools? The Hard Truth
Development teams often have a bad case of the "cobbler's children". In that story, the children of the village shoemaker had no shoes. In a cruel twist of fate, enterprise IT teams tasked with creating the organization's custom software and integrating off-the-shelf applications often must use fractured, unintegrated tools. Their business partners would never stand for the inefficiencies and lack of managerial visibility suffered if their business systems weren't integrated. Yet, for some reason, we find software development and delivery teams with fragmented tool stacks.
So why is it that these organizations continue to suffer the inefficiencies, productivity drain, mind-numbing status meetings, inhibitors to collaboration and lack of visibility that are natural manifestations of unintegrated systems? Because integrating the various tools used in software development and delivery is actually very hard. Yet software development teams seem to think they should be able to do it in-house fairly easily rather than using a third party.
I recently met with an organization that previously had built its own integrations but found them impossible to evolve and support. This client noted that its integration between only two ALM tools involved more than two million lines of code and had become unmanageable. It had started small, but as it expanded, the integration logic needed to handle every new use case had exploded its size. At two million lines of code, it was starting to cost millions of dollars to maintain.
What appears to be a fairly straightforward task is actually quite a bit more difficult than it looks. You see, getting these endpoints to interoperate is not a purely technical challenge. In fact, it's more of a business problem. While there are a couple of choices one can make in selecting the technical integration infrastructure (integration via APIs or at the database layer), the real challenges have more to do with the friction caused by the dissimilarities among these tools and how to overcome them in order to effect a flawless integration.
Connecting two systems
Experience has shown that integrating two applications using database integration techniques is brittle and using the endpoint tools' APIs is the only sensible way to go. So conduct some learning experiments by using these APIs to connect two endpoints. First try a simple case of integrating two systems with the "same" artifact type, say defects. Then explore some of the complications that occur when you want to expand beyond that.
Synchronizing "like" data
The first thing you might try is to mirror an artifact type that exists in both tools, e.g. a defect, from one tool into another. This should be relatively straightforward since both systems have the concept of a "defect" artifact type.
To do this, you'll need to:
- Get a fundamental understanding of what the two systems do. One might be a defect tracking tool and the other an agile planning tool.
- Learn the object models of both systems, identifying the attributes (defect name, reporter, status, description, attachments, etc.) that exist on defects in both systems.
- Understand how to use their (REST) APIs to search, create and update the objects (defects) in both systems.
- Detect when an artifact has been created or modified.
- Be careful: one of the most common and dangerous mistakes made while creating integrations is to use an excessive number of API calls to the endpoint systems. For example, an inefficient polling and update mechanism will bring your production systems to their knees! At the very least, you will want to be sure to create a solution that avoids point-to-point (pairwise) integration because that kind of architecture multiplies the number of API calls to each endpoint.
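The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: InMemoryTool is a hypothetical stand-in for an endpoint's REST API, the field names in FIELD_MAP are invented, and the "updated" value is a made-up modification counter. The point it tries to show is that asking each endpoint only for what changed since the last pass keeps the number of API calls low.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from the defect tracker's field names to the
# agile planning tool's field names (illustrative, not from any real tool).
FIELD_MAP = {"title": "name", "reported_by": "reporter", "state": "status"}


@dataclass
class InMemoryTool:
    """Stand-in for an endpoint's REST API; counts calls to show API cost."""
    records: dict = field(default_factory=dict)  # id -> {"updated": int, "fields": dict}
    calls: int = 0

    def changed_since(self, cursor: int) -> list:
        """One call returns every record modified after the cursor."""
        self.calls += 1
        return [(rid, rec) for rid, rec in sorted(self.records.items())
                if rec["updated"] > cursor]

    def upsert(self, rid: str, fields: dict, updated: int) -> None:
        """Create or overwrite one record (one API call)."""
        self.calls += 1
        self.records[rid] = {"updated": updated, "fields": fields}


def translate(fields: dict) -> dict:
    """Rename mapped attributes; unmapped attributes are dropped."""
    return {FIELD_MAP[k]: v for k, v in fields.items() if k in FIELD_MAP}


def sync_once(source: InMemoryTool, target: InMemoryTool, cursor: int) -> int:
    """Mirror records changed since the cursor; return the new cursor."""
    for rid, rec in source.changed_since(cursor):
        target.upsert(rid, translate(rec["fields"]), rec["updated"])
        cursor = max(cursor, rec["updated"])
    return cursor
```

Calling sync_once a second time with the returned cursor costs one call to the source and none to the target when nothing has changed, which is the behavior you want from a polling loop.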
Is it really "like" data?
Immediately you'll find that while both systems have the notion of the same object (defects), there are several inconsistencies between the systems:
- They have different attributes and therefore different object models. After all, each system manages objects that are relevant to its domain; a tester (and test management tool) has a different view of defects than a developer colleague (and the agile planning tool).
- Even when they have similar attributes, their values and formats will differ.
- Conflicts will invariably occur as artifacts, comments and attachments are simultaneously changed by the users of the endpoint tools.
- Of course, each of the artifacts in these endpoint systems is continually extended by the end user with custom attributes.
The "low-water bar" for integrations is discovering and accommodating the differences between "like objects" (such as defects) in multiple systems, so that their ongoing synchronizations are flawless. Although a few of these conflicts are noted, the bad news is that we have not even begun to address some of the more challenging nuances in integration.
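To make two of these inconsistencies concrete, here is a small Python sketch of value mapping and conflict detection. The status vocabularies are invented for illustration, and the rule that side A wins a true conflict is an arbitrary assumption; a real integration would need a policy agreed with the people who use both tools. The merge compares each side against the value recorded at the last successful synchronization, after both sides have been translated into one vocabulary.

```python
# Hypothetical status vocabularies for two tools (illustrative values only).
STATUS_A_TO_B = {"New": "To Do", "In Progress": "Doing", "Resolved": "Done"}


def map_status(value: str) -> str:
    """Translate a status value, failing loudly on unmapped (e.g. custom) values."""
    try:
        return STATUS_A_TO_B[value]
    except KeyError:
        raise ValueError(f"no mapping for status {value!r}") from None


def merge_field(base, a_value, b_value):
    """Three-way merge of one field against the last synchronized value.

    Returns (value, conflict). When both sides changed to different values,
    conflict is True and side A wins; that policy is an assumption.
    """
    if a_value == b_value:
        return a_value, False   # both sides agree
    if a_value == base:
        return b_value, False   # only side B changed
    if b_value == base:
        return a_value, False   # only side A changed
    return a_value, True        # both changed differently: a real conflict
```

Raising an error on an unmapped value is a deliberate choice in this sketch: a custom status added by an end user should stop the synchronization visibly rather than be copied across silently.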
It's not really "flat data"
Artifacts do not live independent of one another. The relationships among artifacts provide the context of the work. For example:
- Epics are broken down into the numerous user stories that implement them.
- There are containers used for organization (folders) and those used for scheduling.
- Relationships form the basis of important traceability and context, such as the relationship between a requirement, the tests that cover that requirement and the defects that result from those tests.
Across the tools used in the software development and delivery process, there are various forms of relationships among artifacts. Mirroring these relationships from one tool to the others is crucial for maintaining the context of the work. For a developer, attempting to fix a defect without the context of how it was uncovered is likely to be a frustrating waste of time.
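One way to picture the mirroring of relationships is a two-pass copy: first create every artifact in the target tool, then recreate the links by translating identifiers. The Python below is a sketch under assumptions; the "T-n" identifier scheme, the single "parent" link and the dictionary layout are all hypothetical, and real tools support many more kinds of relationships.

```python
def mirror_with_links(source_artifacts: list) -> dict:
    """Copy artifacts into a target store, then recreate their parent links.

    source_artifacts: dicts with "id", "name" and optional "parent" (a source id).
    Returns the target store keyed by target id.
    """
    id_map = {}   # source id -> target id
    target = {}

    # Pass 1: create every artifact so each one has a target id.
    for n, art in enumerate(source_artifacts, start=1):
        target_id = f"T-{n}"  # hypothetical id scheme of the target tool
        id_map[art["id"]] = target_id
        target[target_id] = {"name": art["name"], "parent": None}

    # Pass 2: translate links through the id map; skip parents not in scope.
    for art in source_artifacts:
        parent = art.get("parent")
        if parent in id_map:
            target[id_map[art["id"]]]["parent"] = id_map[parent]
    return target
```

Doing the work in two passes means a user story can be mirrored even when it arrives before the epic it belongs to, because no link is written until every artifact has an identifier in the target.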
Managing conflict requires expert understanding
A recap: There are important initial steps for understanding how to create integrations among tools. Using a tool's API is far superior to attempting to access the information stored in the tool's underlying database. And the difficulty in creating a robust integration has a lot to do with resolving differences among the artifacts and systems that manage them.
With that in mind, uncovering these differences and determining how to resolve them requires expertise in the tools and how they're used. The first step is always to do an exhaustive technical analysis of the tool:
- How is the tool used in practice?
- What are the object models that represent the artifacts, projects, users and workflows inherent in the tool?
- What are the standard artifacts and attributes, and how do we (quickly and easily) handle continual customizations such as additions and changes to the standard objects?
- Are there restrictions on the transitions of the status of an artifact?
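The last question matters more than it might seem. If one tool lets a defect jump straight from open to closed and the other insists on intermediate steps, the integration has to walk the artifact through those steps. A minimal sketch in Python, assuming an invented workflow table (ALLOWED) rather than any real tool's rules:

```python
from collections import deque

# Hypothetical workflow: which status changes the target tool permits.
ALLOWED = {
    "Open": ["In Progress"],
    "In Progress": ["Open", "In Review"],
    "In Review": ["In Progress", "Closed"],
    "Closed": ["Open"],
}


def transition_path(current: str, desired: str, allowed: dict = ALLOWED):
    """Return the shortest list of statuses to step through, or None."""
    queue = deque([[current]])
    seen = {current}
    while queue:
        path = queue.popleft()
        if path[-1] == desired:
            return path[1:]  # exclude the starting status
        for nxt in allowed.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the desired status cannot be reached
```

A breadth-first search is used here because it finds the fewest intermediate updates, and each update is typically one more API call against the endpoint.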
APIs are only part of the answer
Certainly, we can leverage APIs to gain access to the endpoint's capabilities and the artifacts they manage. We can also leverage them to prevent us from engaging in "illegal" activity (since the API enforces this kind of business logic). But here's a little secret: Many of these APIs were actually created for the vendor's convenience in building a tiered architecture -- endpoint vendors do not necessarily build their APIs to enable third-party integration!
As a result, these APIs are often poorly documented and incomplete:
- Data structures, how and when calls can be made (for stateful systems), and the side effects of operations are all often excluded from the documentation.
- Poor error handling and error messages are common.
- Edge cases and bugs are rarely documented.
Because they're not documented, figuring out how to handle these issues requires a great deal of trial and error. And sadly, often the vendor's customer support staff is unaware of many of these issues and how to use their API, so finding resolution often requires access to the endpoint vendor's development teams.
And then a tool gets upgraded
What's worse, these APIs can change as the endpoint vendors upgrade their tools. Depending on how thoroughly the vendor tests, documents and notifies users of API changes, these changes can break the carefully crafted integrations. For on-demand or SaaS applications, these upgrades happen frequently and sometimes fairly silently.
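One modest defense is to check each response against the shape your integration depends on, so that a silent upgrade shows up as an explicit report rather than as corrupted data. The Python below is a sketch only; EXPECTED_FIELDS is a hypothetical contract, not the schema of any real tool.

```python
# Hypothetical contract: the fields and types this integration relies on.
EXPECTED_FIELDS = {"id": str, "name": str, "status": str}


def contract_violations(payload: dict, expected: dict = EXPECTED_FIELDS) -> list:
    """List the ways a response payload departs from the expected contract."""
    problems = []
    for key, typ in expected.items():
        if key not in payload:
            problems.append(f"missing field: {key}")
        elif not isinstance(payload[key], typ):
            problems.append(f"wrong type for {key}: {type(payload[key]).__name__}")
    return problems
```

Running a check like this against a test instance after every vendor upgrade will not catch changes in behavior, but it can catch renamed, removed or retyped fields before they reach production.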
Connecting more than two endpoints
Up to this point, this article has only addressed the issues that arise when creating integrations between two endpoints. It would seem natural to assume that once you've figured out how to deal with these issues, the learning curve for establishing an integration to a third endpoint would be less steep than the one you've just endured. By successfully creating a single integration, you've made some decisions and learned about some of the fundamental tenets of integration. You've established that you'll use APIs to do the integration, you've created a development environment for your work, you've gotten your feet wet in learning about the APIs of your endpoints and you've created a plan for how to deal with the continual changes that will come up as the endpoints are upgraded. Presumably, when you add a third endpoint to this scenario, you won't have the overhead of those tasks and the integration will be smoother.
Unfortunately, while the learning curve is not as steep as the first integration, it does not tend to flatten as significantly as one would hope. Some of the issues that reared their ugly heads while performing the first integration will reappear. You will have to perform a technical analysis of how the tool operates, how its artifacts are represented and how to reconcile the differences with the other tools in your integration landscape. Once again, the APIs will not be as well documented as you'd like and you'll have to uncover what you don't know through more trial and error. And you'll encounter other unforeseen circumstances that are a direct result of the inherent conflicts that arise when trying to integrate tools that weren't necessarily designed for integration.
What's worse, however, is that as you add a fourth, fifth and more endpoints, the complexity will actually increase. The conflicts among tools increase as you add tools from completely different domains, with artifacts that are not as similar as the artifacts you initially mirrored from system to system.
Architecturally speaking, creating point-to-point integrations among three or more endpoints creates both a performance and maintenance nightmare. From a performance perspective, making all those API calls to the endpoints will slow the responsiveness of the tools to their users. Too many integration projects have been scuttled when the end users complained that their systems no longer performed as well as before the integrations were turned on!
From a code maintenance perspective, the testing and ongoing maintenance of an integration landscape with three or more endpoints becomes exponentially more difficult as more endpoints are added. As is the case with all software development projects, it's as important to test the software as it is to build it. You'll want to set up a robust test infrastructure that is separate from your production systems, but still tests against live versions of those products. The time and expense of setting up this test infrastructure is non-trivial and must be factored into your plans and budgets.
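A little arithmetic shows why pairwise integration gets out of hand. The Python below counts the integrations to build and test when every tool is connected directly to every other tool, and compares that with a hub-style design in which each tool connects once to a shared integration layer. The hub is an assumption introduced here for comparison; it is one common alternative to point-to-point, not the only one.

```python
def point_to_point_links(endpoints: int) -> int:
    """Pairwise integrations needed to connect every tool to every other."""
    return endpoints * (endpoints - 1) // 2


def hub_links(endpoints: int) -> int:
    """Connections needed when each tool connects once to a shared hub."""
    return endpoints


for n in (2, 3, 5, 10):
    print(n, point_to_point_links(n), hub_links(n))
# prints:
# 2 1 2
# 3 3 3
# 5 10 5
# 10 45 10
```

With two tools the hub is extra overhead, at three the counts are equal, and beyond that the pairwise count pulls away quickly. Each of those pairwise integrations is code that must be retested whenever either endpoint is upgraded.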
I've seen firsthand the struggles that organizations go through as they attempt to integrate the applications they depend on to run their business and it's really not possible to articulate all of the issues in one article. But, I'm hopeful that you now have a good sense of some challenges involved in software lifecycle integration, including the friction caused by the dissimilarities among these tools. The rewards of having your tools work together are numerous, so undertaking an integration project can be well worth the effort. But the inescapable fact is that integration is hard. So consider the hidden "gotchas" fully before signing off on your next in-house integration project -- and avoid the inefficiencies and lack of visibility that result from lengthy and costly failed projects.
About the Author
Betty Zakheim is VP of Industry Strategy at Tasktop. Betty's role is at the vertex between Tasktop's customers, the company's product team and marketing team. She has an extensive background in software development, software integration technologies and software development tools. As a software development manager, she was an early adopter of “iterative development,” the precursor to Agile. As the VP of product management and marketing at InConcert (acquired by TIBCO), she pioneered the use of Business Process Management (then called “workflow”) as the semantic framework for enterprise application integration. Betty holds undergraduate degrees in Psychology, Advertising (from the S.I. Newhouse College of Public Communications) and is a Tau Beta Pi scholar in Computer Engineering. She also received a Master of Computer Science degree from Boston University.