Datawire Release Fault Tolerant Microservice Communication Framework ‘Datawire Connect’

by Daniel Bryant on Apr 26, 2016

Datawire have released their open source Datawire Connect framework, which allows developers to ‘resiliently connect microservices’ using automatically generated RPC-style client libraries for Java, Python or NodeJS services. The client libraries generated provide service registration and discovery, dynamic load balancing and routing, automated timeouts, and circuit breakers.

Datawire Connect is built on Quark, a language designed for expressing the contract between services. Similar to a traditional IDL, such as Thrift, Avro and gRPC, Quark enables the definition of service APIs and how data is serialized. Quark further extends the notion of a traditional IDL and allows protocol behaviors to be expressed as part of a service contract. For example, it is possible to define how clients of a service should behave if the target service is running slowly by adding circuit breakers, retry semantics, or backpressure to improve performance.
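
To make the idea of a behavioral contract more concrete, the following is a minimal Python sketch of a circuit breaker, one of the client behaviors such a contract might prescribe. It is not Quark and it is not the Datawire Connect API: the names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold` and `reset_timeout` are hypothetical and chosen purely for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical client-side circuit breaker (not the Datawire Connect API)."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self.clock = clock                          # injectable so the sketch is testable
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of piling more load onto a struggling service.
                raise CircuitOpenError("circuit open; failing fast")
            # The timeout has elapsed, so let a trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote invocation, for example `breaker.call(fetch_profile, user_id)` where `fetch_profile` stands in for any remote call, and treat `CircuitOpenError` as a signal to degrade rather than to retry immediately.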

InfoQ sat down with Rafael Schloming, CTO and Chief Architect of Datawire, and asked about the difficulties of building distributed systems, current microservice tooling, what makes a ‘good’ service contract or API, and what the migration from traditional architectures to microservices might look like.

InfoQ: Hi Rafael, thanks for talking to InfoQ today. You mention in your recent presentation that systems are becoming more ‘distributed’ not just because of the nature of the problem. Could you explain a little more about this please?

Schloming: Software as a service has brought classic distributed systems to the mainstream in order to cope with both continuous uptime and the operational scale required. But SaaS businesses are now competing on feature depth, breadth, and rate of development/innovation. This is what causes the “other” kind of distributed system to emerge.

In order to build features faster, you need more developers, and in order to have more and more devs working in the same problem domain and not step on each other’s toes, you need a way for them to all do their work independently. This is how you get 10s or 100s of teams each updating small services that together form a larger application, i.e. distributed development of a single continuous uptime distributed application.

When this happens, the way that the application is distributed has as much to do with human factors like the org structure as it does with the problem domain. Think Conway’s Law meets continuous uptime distributed systems.

The kind of tooling you need to build to support this kind of distributed system is substantially different from what you need for traditional distributed systems. This is what I refer to as “the other kind of distributed system.”

InfoQ: How challenging is distributed development? Do you think the emergence of cross-functional teams is a solution? And is co-location a must for solving certain categories of problem?

Schloming: Distributed development is hard, but it’s also unavoidable, so the interesting question for me is “how do we make distributed development work?”

Co-location isn’t really an answer. A large enough org is doing distributed development even if they all work in the same office. A couple hundred engineers in the same room will have to do distributed development because they can’t all understand what everyone else is working on.

Distributed development seems to work best when a problem can be broken up into smaller pieces that only interact with each other across loosely-coupled interfaces. In a distributed team (or indeed in any team), APIs form the boundary across which different teams communicate, and so the more likely it is that changes in one component cascade to changes in the next, the more teams will need to communicate with each other, and the less well distributed development will work.

I view the emergence of cross-functional teams not as a solution to, but as a consequence of, successful distributed development. Distributing responsibility for a self-contained part with a bounded interface to the whole is an extremely effective way to scale a development organization, and doing this in the context of a SaaS organization inherently involves delegating responsibility for delivery of a service all the way to the end user. Once you do this, the team responsible will inevitably include enough cross-functional skills to achieve that goal.

InfoQ: Can you comment on the current state of microservice tooling? Where do you see this going in the next 12 months?

Schloming: The current state of microservices tooling is very, very early. By some measures you could probably say it’s non-existent. The reason I say this is because there are so many ducks you need to line up in order to successfully do microservices. You need service discovery and service routing, you need some kind of deployment pipeline, you need some kind of new testing methodology for each of the traditional categories of testing (unit, functional, and integration), and you need better introspection tools to see the state of your thousands of services and instances. You need some kind of logging and monitoring solution.

While lots of people have built many of these pieces, there really is no ‘out of the box’ solution here. An end-to-end microservices platform is still an ambitious DIY exercise at this point, even for large well-funded organizations. Even if you try to adopt the stacks coming out of the early microservices organizations, your engineers will still have to wire everything together because the state of the tooling across all these categories is very fragmented.

At Datawire we are looking at the fragmentation of the industry and building what I consider to be the only holistic set of tooling for microservices. The goal of our open source product, Datawire Connect, is to provide all the tooling an organization needs to easily get started with microservices without having a 25 person team dedicated entirely to building microservices infrastructure from scratch.

InfoQ: There has been a lot of talk over the previous year about the importance of defining service boundaries and contracts. Some people advocate for 'schema up front', while others believe evolutionary consumer-based contracts could be best. Can you share your thoughts?

Schloming: I don’t really think of this as an either/or. The important thing to recognize is that where independent software components connect, there are generally two different teams. This is true whether one team is a business and one is a customer, or if it’s two teams working within the same business. This dynamic makes changes to the surface area where the teams connect very expensive and slow, because you need to wait for cross-team communication to happen. If enough teams are using the surface area of a given component, it can literally take years to roll out a change.

Because of this, it is valuable to have a solid understanding of what the surface area of your service is. That doesn’t necessarily mean you need to use any specific kind of tooling, and it doesn’t mean you need to spend months with a committee designing the perfect boundary upfront. Starting with a minimal consumer-based contract, and having an explicit way to represent it (whether via formal schemas or more ad-hoc techniques), are both valuable and very complementary practices.

InfoQ: What makes a good service contract? And how best to implement this in code?

Schloming: This is a really hard question in general. When you adopt a microservices development methodology, the service becomes a unit of abstraction, almost akin to a class or a function in traditional programming. Imagine how tough it is to answer the question, “what makes a good function” or “what makes a good class.”

Normally people answer this question apophatically, in the negative, saying what a good class or function isn’t. It isn’t too large, it isn’t too complex, etc. Those same answers also apply to a service contract. It’s worth calling out that it shouldn’t be too tightly coupled. While this is true for well-designed APIs in general, this is even more important for services. It should also be explicit. If your contract is too ambiguous you won’t realize when you break things, and you may well end up having to maintain bug compatibility as a result.

As for how best to implement a service contract, I think this is still very much an open area. Most tooling around contracts focuses primarily on structural contracts, not behavioral contracts, and behavioral contracts are an increasingly important factor in building distributed applications. As far as I know, Datawire Connect is the only project out there building tooling that gives you a first-class way to represent behavioral contracts. However, I’m always keen to learn about others!

I would say the most important factor in building a good service contract is to have *some* way to capture it explicitly that is *outside* your implementation. Whether you do this with a schema tool, a test suite, or just a document, the danger of not doing this is that the contract ends up being implicitly defined by the details of your implementation, and that kind of contract is much more expensive to maintain.
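
As an illustration of capturing a contract outside the implementation, here is a minimal Python sketch of a consumer-side check that could live in a test suite. Everything in it is an assumption made for the example: the `ORDER_CONTRACT` fields describe an imaginary "order" response, and `contract_violations` is a hypothetical helper, not part of any Datawire tooling. It covers only the structural side of a contract (field presence and types), not behavior.

```python
# Hypothetical contract for an "order" response: field name -> required type.
# It lives apart from the service implementation, so a change to it is a
# deliberate, visible decision rather than a side effect of refactoring.
ORDER_CONTRACT = {
    "id": str,
    "total_cents": int,
    "items": list,
}


def contract_violations(payload, contract):
    """Return human-readable violations; an empty list means the payload conforms."""
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    # Extra fields are deliberately ignored, so the provider can add data
    # without breaking existing consumers.
    return problems
```

Running a check like this against real responses in a provider's build is one way to notice a breaking change before consumers do.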

InfoQ: How difficult will developers working in an enterprise (perhaps with classical SOA) find the transition to writing microservices and the associated ecosystems?

Schloming: There are two sources of difficulty here, and it helps to think about them separately. One is that the relatively early state of tooling in general provides a pretty steep learning curve. This is perhaps a more shallow obstacle, both because the state of tooling will improve over time, and because not all developers actually need to understand the details of all the tooling in order to be productive in a microservices environment.

The bigger obstacle is that you need a different mental model of the world. If you wire together even a handful of microservices the same way you would use libraries or the larger grain services common to enterprise SOA then you’ll create an incredibly fragile system.

This is because the underlying assumption with traditional programming or even enterprise SOA is that the way you keep a composite application functioning is by making sure all of its components are functioning. This is kind of like old-style Christmas tree lights. If each microservice is a light bulb and you wire them together in series, then when one light bulb goes out, not only do you lose the whole strand, but unless you were watching really carefully at just the right moment, it’s really hard to figure out which bulb is at fault and get the whole thing operating again.

With microservices, all the components you are building on will be frequently updated in a way that is entirely outside of your control and they *will* break. Because of this, you need to wire up your light bulbs in parallel as much as possible so when one goes out it’s obvious what is wrong and it is quick and easy to fix.
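
The cost of wiring services "in series" can be put in rough numbers. The sketch below assumes that failures are independent and that every service has the same availability; both are simplifications, and the 99.9% figure and the count of 30 services are illustrative rather than taken from the interview. Under those assumptions, a request that needs all of its dependencies to be up succeeds only as often as the product of their availabilities.

```python
def series_availability(per_service: float, count: int) -> float:
    """Availability of a request that needs all `count` services to be up,
    assuming independent failures and identical per-service availability."""
    return per_service ** count


def hours_down_per_year(availability: float) -> float:
    """Expected unavailable hours in a 365-day year at the given availability."""
    return (1.0 - availability) * 365 * 24


single = 0.999                               # one service at "three nines"
chained = series_availability(single, 30)    # 30 such services wired in series

print(f"single service: {single:.4f}, {hours_down_per_year(single):.1f} h/year down")
print(f"30 in series:   {chained:.4f}, {hours_down_per_year(chained):.1f} h/year down")
```

With these inputs the chained availability comes out at roughly 97%, which corresponds to hundreds of hours of downtime a year rather than fewer than ten. Isolating failures so that an outage only costs the feature backed by the failing service avoids this multiplication.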

This requires a very fundamental change in your mental model of how to construct systems. This starts with things like timeouts, client side load balancing, and circuit breakers, but it doesn’t stop there. It’s great if the client you use to access other services doesn’t hang when those services are down, but if it returns a null or default value when there is an outage and a few lines later your code barfs on this value, then you will still have a fragile system.
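
The point about null or default values can be sketched in a few lines of Python. The example assumes a hypothetical `fetch_recommendations` client call that either raises `TimeoutError` or `ConnectionError` during an outage or returns `None` as its fallback value; neither the function nor the page structure comes from Datawire Connect. The caller decides up front what the degraded result looks like instead of letting a later line trip over the missing value.

```python
def render_product_page(product, fetch_recommendations):
    """Build a page dict that still renders when recommendations are unavailable.

    `fetch_recommendations` is a hypothetical client call: it may raise on an
    outage, or return None when the client library falls back to a default.
    """
    # Start from the degraded page, then enrich it only if the call succeeds.
    page = {"title": product["name"], "recommendations": []}
    try:
        recs = fetch_recommendations(product["id"])
    except (TimeoutError, ConnectionError):
        recs = None  # treat an outage the same way as a missing value
    if recs:  # covers both None and an empty list
        page["recommendations"] = [r["name"] for r in recs]
    return page
```

The design choice here is that the core page never depends on the optional service: the recommendations section is simply empty during an outage.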

As a developer writing a microservice you need to think about what it takes to make your service as isolated from outside failures as possible, and I think this is one of the biggest challenges for developers that are new to this kind of system.

Additional information on the open source Datawire Connect framework can be found in the project’s GitHub repository and in the announcement on the Datawire blog.
