Cloud Native Architectures - a Conversation with Matt Stine
Rags Srinivas caught up with Matt Stine at the O'Reilly Software Architecture Conference in Boston, MA. Matt talks about cloud-native architectures and some of the cultural and technological challenges involved. He discusses some of the Netflix OSS services and how Spring is wrapping them up so that developers can architect and build microservices on the platform. He also talks about SOA and what it probably missed.
InfoQ: Hello and welcome. With me I have Matt Stine from Pivotal and we are at the O'Reilly Architect Conference in Boston, Massachusetts. So Matt, if you can take a moment to introduce yourself to the InfoQ audience, that will be great.
Matt: Sure. Yes, as you said, I'm Matt Stine. I've been with Pivotal and working in some form with the Cloud Foundry team since Pivotal’s inception a couple of years ago. I've been doing a lot of different things: engineering, tech evangelism, a little bit of tech marketing, and blogging. Recently I’ve started doing some product management work around a set of services that we're building for Cloud Foundry that involve Spring Cloud and Netflix OSS.
InfoQ: I know you recently wrote a book about cloud-native architectures, published by O'Reilly, which was given away at the conference. What is the exact takeaway of cloud-native from an architect or developer viewpoint, for instance?
Matt: So the thing that I'm calling “cloud-native” is the convergence of a lot of different ideas that also happen to coincide with a lot of companies trying to move to the cloud. The ideas include DevOps, continuous delivery, microservices, agile infrastructure, Conway’s Law, and reorganizing companies around business capabilities. There are a lot of things going on, and cloud-native happens to be a common name that I and some others are applying to it. If there's one takeaway from the book, it is that the vast majority of what I talk about is around people, organizational structure, and culture change. And there's a little bit of technology sort of mashed in there with it.
It's kind of strange to be talking about cloud-native architecture and spending all of your time talking about people, but it really is a fact that this type of architecture really only works if you change your company to facilitate it working. And when you do so, then the architecture is going to give you a lot of benefits to go along with sort of that reorganization. So I think the main takeaway is that this trend is not solely or even principally a technology trend. It's primarily a cultural and organizational change trend.
InfoQ: Okay. I think that's an excellent segue into my next point: are microservices fundamentally new? Is it SOA all over again, old wine in a new bottle, as some characterize it?
Matt: I've gotten involved in these conversations about how microservices relate to SOA a little bit. And to a certain extent, I think it's not a terribly useful conversation. With that said, there are a lot of aspects of what we're now calling microservices that sound very similar to SOA when compared to the first several paragraphs of SOA’s Wikipedia page. I think the real difference is in how SOA was monetized by vendors. Their focus was normally on putting everything into this new piece of middleware called an Enterprise Service Bus that was replacing all of the other large pieces of middleware that were no longer in vogue to sell. Not to say that ESB technology is bad; it was the way that we were using it, replacing one big monolithic thing with another monolithic thing; taking all the complexity from here and shoving it into there. None of that was actually required to make a move to a more service-oriented architecture.
If you look at just the principle of services that have a clear focus, a clear contract, and a clear API – in other words, what the principles behind SOA originally were intended to be – it matches up quite well with what the technical principles behind microservices are also intended to be. The thing that you really didn’t see though, with the SOA trend, was the emphasis on culture, organization, and even operational concerns.
I had this idea of operational concerns in my mind that really crystalized for me when I was listening to Neal Ford talk about microservices. He made the point that when he looked at these architecture diagrams that involve the ESB, they really looked nice, and they were easy to understand, but they didn’t really do anything from an operational perspective.
So we went and tried to do continuous delivery with that. We ran up against all of these roadblocks that were unexpected, because nobody really considered what it would take to run this thing in production. You look at microservices, and it's one of the first architectures that emerged after the DevOps and continuous delivery conversations happened. Now you have a bunch of teams doing work with that conversation in the backs of their minds, and you see these specific types of architectures popping out. These are the architectures to which we've now started applying the label "microservices."
So I think principle-wise there's not a lot of difference between the two, but implementation-wise there's a significant difference between the two. I think that's sort of where the conversation is coming from. From the perspective of the principle and the history behind SOA and microservices, they aren’t that different. But when you actually get into what are people doing as a job day in and day out to implement microservices versus SOA, they look entirely different.
InfoQ: Now, a classic example of a thing that failed on the SOA side is this monolithic UDDI. I don't know if you remember that, Universal Description, Discovery, and Integration? Service discovery is still very critical even from the microservices perspective, right?
InfoQ: So how important is it and what is happening in the microservices world?
Matt: Yes, so thanks to you sending me the questions in advance, I was able to go look UDDI up to remind myself of what it was. I never actually used it myself when it was popular. Of course I don’t know that it was ever popular, but it was a thing that people were talking about. I think the service discovery that we're talking about now is a great deal different. As I understand UDDI, it was almost like this phonebook and I, as a developer of some service, was going to go to this phonebook looking for something I needed such as a credit card processing service.
So my first question would be, “well, what credit card processing services are available out there?” Based on the UDDI’s answer, I'm going to select one. And in selecting one, I'm also going to discover the contract definition for that service in terms of some WSDL document. I am going to take that WSDL, and I’m probably going to use it to generate some code that's going to allow me to interact with the credit card processing service.
What you end up with is this illusion of loose coupling to these services in a directory, but you're actually tightly coupled to a specific contract in terms of that WSDL. If the WSDL changes, now all of the sudden the code that you have generated is no longer valid, and you have to constantly go through this loop.
So when I look at what we're talking about today in terms of service discovery, the directory idea is still there, but not so much in terms of discovering what services are “out there.” More so we know that we need something, and we know it's present in the architecture, and that it has a logical name associated with it. I just need to know how that logical name translates into an actual network address that I can start to communicate with.
So here's a typical example of code that I've worked through in this space: I'm building a store service and a customer service, and I know that the customer service needs to use the store service. It just doesn't know the store service's location. I don't want to hardwire that, because I'll tightly couple these services together. But by having a service registry out there, I can decouple the fact that I know I need this component from where it actually resides. And with that I gain the ability to do things with the store service: I can scale it out; I can scale it in; I can move it around (as I sometimes like to do in a cloud architecture). But then, I never break the customer service in the process.
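To make the decoupling concrete, here is a toy sketch in plain Java. This is not Spring Cloud's or Eureka's actual API; the registry class, the "store-service" name, and the addresses are all hypothetical, made up for illustration. The point it demonstrates is the one above: the customer service only knows a logical name, and instances can register, deregister, or move without the caller's code changing.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Toy service registry: maps a logical name to its currently registered addresses.
class ServiceRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    void register(String name, String address) {
        instances.computeIfAbsent(name, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    void deregister(String name, String address) {
        List<String> addrs = instances.get(name);
        if (addrs != null) {
            addrs.remove(address);
        }
    }

    // Resolve a logical name to one concrete address (naive round-robin).
    String resolve(String name) {
        List<String> addrs = instances.get(name);
        if (addrs == null || addrs.isEmpty()) {
            throw new IllegalStateException("no instances of " + name);
        }
        return addrs.get(Math.floorMod(counter.getAndIncrement(), addrs.size()));
    }
}

public class DiscoveryDemo {
    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();

        // The store service scales out: two instances register themselves.
        registry.register("store-service", "10.0.0.5:8080");
        registry.register("store-service", "10.0.0.6:8080");

        // The customer service only ever asks for the logical name.
        System.out.println(registry.resolve("store-service"));

        // Scale in / move: the registry changes; the caller's code does not.
        registry.deregister("store-service", "10.0.0.5:8080");
        System.out.println(registry.resolve("store-service"));
    }
}
```

In a real system the registry is a separate, replicated service (Eureka, for example) and registrations carry heartbeats and leases, but the shape of the decoupling is the same.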
The other thing is that we've completely decoupled the discovery of the contract for the service from the service registry itself. If you look at Netflix's Eureka, you're not going to find any WSDL or specification for what an API looks like. Now you might get the address of a service and connect to it, and it might tell you "here's what my API is," or you might do that through some other means. But those are two separate concerns, and we've pulled them apart; whereas with UDDI, they were much more coupled together.
So yeah, again, if you go to a very high level of thought, just as with SOA and microservices, there are some similarities between service discovery and UDDI. However, the implementations are quite different.
InfoQ: I think you have alluded to some of the microservices that Netflix is working on. I know that you're working on Spring Cloud as well, right? So can you talk about some of the synergy between Netflix OSS and Spring Cloud in general and some of the components that you're looking at in particular?
Matt: There's an increasingly close and growing relationship between Pivotal and Netflix, especially between the Spring team and Netflix, but it's continuing to percolate across other parts of Pivotal. Netflix has long been a user of Spring technologies. They've become a huge proponent of and user of Spring Boot, and Netflix is writing a significant number of applications using Spring Boot today. They were also big users of Grails before Spring Boot was born.
We have, in building Spring Cloud, had numerous conversations between the Spring and Netflix engineering teams – understanding how these things work and how they're properly consumed. It's not like we just went and picked up the open source, wrapped it up ourselves, and then threw it out there. There was a lot of collaboration to make sure that these things were done well. And there continue to be even deeper talks around what are ways that we can take the next generation of what we're doing with Spring and the next generation of what Netflix is doing with their cloud technologies and bring those even closer together. So I would expect to see even more exciting things coming out of cross-pollination between these two groups in the coming months and years.
InfoQ: Okay. There are so many things to get right from a developer perspective, and what I mean by that is that there is service discovery and then there is of course the usual -- I have to scale, I have to connect to logs, and all that, right?
InfoQ: Obviously, a PaaS like Cloud Foundry is going to help you. But can you elaborate a little bit more on how the microservices approach and the PaaS approach, or at least the Cloud Foundry approach, are similar? Or are they not?
Matt: If you rewind even just nine months, we really weren't talking about this at all. There was a microservices conversation that was sort of out there and nascent, and there was also this Cloud Foundry conversation – what does it look like in terms of developer productivity, operational effectiveness, and generally doing things well. We were at this point where we were talking about forklifting – taking legacy workloads and moving them to the cloud. This was problematic in many ways, hard, and in some cases impossible. Often you didn't want to do it.
And these questions would come up: “Okay, well what does an application that is built to run on Cloud Foundry look like?” And I really started to think about that question. How do you answer it? And for a while, the answer was Twelve Factor. Twelve Factor came out of Heroku. We started using their buildpack model in Cloud Foundry. Twelve Factor was a good fit for Cloud Foundry. You can almost say that Cloud Foundry is tuned to run Twelve Factor applications. But even that tells you some things about maybe the implementation details of how the app works. But when you say, "I'm going to design an app that will run well," you could build a Twelve Factor app that's quite large. Is that what we're trying to get to? Probably not.
So I started looking at microservices. There's this synergy, almost a symbiosis, between the two. Microservices have a great deal of operational challenges associated with them that you cannot ignore. As it turns out, the exact same operational challenges are the things that Cloud Foundry was intended to address. And then you look at Cloud Foundry: some apps don't run well there, some apps do. You look at microservices and how they're typically constructed, and from a Cloud Foundry perspective it's almost like hand in glove.
You don't have to have one to have the other, but when you bring them together there's a powerful synergy that results. And so I took it as a challenge to go out and start this conversation. At the last Cloud Foundry Summit, I proposed a talk: "Cloud Foundry and Microservices: A Symbiotic Relationship." It was accepted, and I discussed exactly this topic. It was very well received and bootstrapped the next several months, eventually causing us to really dive head first into this conversation.
Simultaneously, the Spring Cloud project started happening quite independently of that, and eventually we brought these two groups together. Now it's full speed ahead with Cloud Foundry, Microservices, Spring Cloud, and Netflix. Our goal is to build a cloud-native application platform and development frameworks to create a full stack solution for building these architectures, harnessing (at least from a technology perspective) everything that you need to get this done. So what's left is what we can't solve with technology: the cultural and organizational challenges.
InfoQ: The industry is kind of going wild on containers right now, right? Would you characterize Spring Boot -- I would call Spring a lightweight container, but I don't know if Spring Boot is a lightweight container. What is your perspective on containers in general?
Matt: When we're talking about containers, we're usually talking about Linux containers. I assume you’re talking about Java containers. Unfortunately container is a very overloaded term. At any rate, everybody's talking about Docker. So what's the relationship between Docker and Spring Boot?
So it's very easy for me to create a Spring application that's a Boot application. Boot does me the favor of embedding the servlet container that I need to run my web app, and I can choose from Tomcat or Jetty or Undertow.
So now I have a JAR file that's self-contained and can do everything that I need. It's very easy to take that and a Docker container image that has Java and then create a new layer that has my Spring Boot application. And so now I have this portable container image that has my Spring Boot app that I can ship. I think there's a nice synergy there. If you couple that with the coming ability to run Docker or Rocket images (or whatever) on Diego via Cloud Foundry or Lattice, you have some very powerful tools.
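The layering Matt describes might look something like the sketch below. This Dockerfile is purely illustrative: the base image, JAR path, and port are hypothetical, not taken from the interview. The key idea is that the Spring Boot fat JAR (with its embedded Tomcat, Jetty, or Undertow) becomes one thin layer on top of a shared Java base image.

```
# Base layer: a shared image that already contains a JRE
# (image name is an assumption for illustration).
FROM openjdk:8-jre

# New layer: just the self-contained Spring Boot fat JAR
# (path and name are hypothetical).
COPY target/store-service.jar /app/store-service.jar

# The embedded servlet container starts with the app itself.
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/app/store-service.jar"]
```

Building this with `docker build` yields the portable image Matt mentions: the same artifact can run on a laptop, in the CI pipeline, and in production.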
From a continuous delivery perspective, it's wonderful to be able to say that the thing that I ran on my laptop, the thing that I shipped through my continuous integration pipeline, and the thing that I'm now running in production on Cloud Foundry are all the same thing. One of the core tenets of continuous delivery is that differences matter – even minute differences can be the difference between "it works here" and "it doesn't work there." And so knowing that these things -- what I developed, what I tested, and what I'm running in production -- are all identical down to the root file system gives me greatly increased confidence that things will actually work.
The key difference is that obviously there was a kernel here and a kernel there; presumably those are the same or very close. But that's the only thing that should be different, and that's not something we were able to achieve easily only a couple of years ago. It was very difficult to get even close to that. We could approximate it with virtual machines, but those were relatively expensive and unfortunately still had fairly long turnaround times. Now I can do this with containers in seconds.
Now there are still a lot of problems that you have to solve. How do you scale? How do you manage the creation of and updates to these images? If you're not good at container image hygiene -- making sure that your images descend from a common root where it makes sense -- you can create a mass of images that don't share much at all between them. You can fill up disk space and create a lot of inefficiencies there. It all looks deceptively easy. I can get a container image created and running in five minutes. It's a getting-started exercise. As you move beyond that, there's a whole lot of work that needs to be done as a team to create a disciplined process and a set of tools around creating and managing these images in a consistent and repeatable way, so that as different teams are shipping apps, we don't all have our own bespoke root file system, and we actually benefit from the promised efficiencies that container images can give us.
I'd say that there's no free lunch anywhere. These things are all powerful tools, but none of them are free. You have to learn some things and change the way you do some things in order to use them effectively.
InfoQ: The InfoQ audience consists of practitioners and architects who heavily practice Agile. I think we started with Agile, so here's the last question we'll end with. Does Agile make sense from an architecture perspective? Sometimes people say to just keep improvising and do Agile, right? How do architecture and Agile fit together?
Matt: I think there is a sense in which Agile has ceremonies, and sometimes the ceremonies obscure the heart of Agile, which is iteratively responding to feedback. We don't do big planning. We don't do big design. We don't do big architecture. We see that, and if we’re not careful, we can go from one end of the spectrum, that of heavy, up front planning and design, or waterfall, or whatever you want to call it, and we can fly all the way to the other end of the spectrum and say we don't think about anything. We don't design anything. We don't brainstorm anything. We get behind the wheel and we just hit the gas and go, go, go, code, code, code, test, test, test. And we believe that, eventually, the right thing's going to pop out. Neither extreme is a good one.
Rich Hickey gave a talk a few years ago that got a lot of people fired up (link to the talk on InfoQ here: http://www.infoq.com/presentations/Simple-Made-Easy). In one portion of the talk he likened test-driven development to something he called “guardrail-driven development.” He said, "I don't keep my car on the road by constantly banging it against the guardrails and expecting them to keep me on the road." A lot of people thought he was saying that test-driven development was bad or stupid. You shouldn't work this way. You shouldn't do TDD. I'm not going to speak for Rich, but I don't think that was his point at all. I think the point was that there's still room for thinking. In fact, thinking is essential -- thinking about architecture, thinking about design, having an idea of where you're trying to go, and then using practices like TDD as a tool to help you get there.
So it's okay to pause for a moment from writing tests and code and start to think about: okay, where are we going? How do we want this module or API to look? What are the concerns that we're going to have to deal with that are either hard for us to write a test for or that we don't even know how we might write a test for? So maybe we need to step back, get in a room, get around a whiteboard, and draw everything out. But then we don't turn that drawing into heavyweight formal UML documents. It's entirely possible to do lightweight architecture and design. And it starts by simply realizing that the work begins in your head before it flows out of your hands onto the keyboard.
I think you find success when you find the appropriate place on that spectrum for your team. That location is largely based on the thing you're trying to build. What is its inherent complexity? If you're building a relatively small application, you might not need that much architecture. But for building a massive distributed system composed of microservices that are running at "web scale" and trying to support millions of users, well, you probably can't just TDD your way there.
So I think that, just like with everything in software engineering, there's no binary answer to this question. It's not this or that. There's a right answer for your team between these two extremes. You have to find that answer in your context, for the problem that you're trying to solve: just how much architectural practice do you need, and just how much documentation do you need to actually get where you want to go? And I think that if you just back up and think of Agile as keeping the feedback loop going – getting feedback, responding to it, course correcting, and building the right thing – then the rest of this stuff tends to work itself out.
As long as continuous improvement, continuous feedback, and continuous response to that feedback are at the heart of what you're doing, you can do architecture, and you can do pairing and TDD. You can do any of these things, and you'll find the right balance for your particular situation.
About the Interviewee
Matt Stine is a technical product manager at Pivotal. He is a 15-year veteran of the enterprise IT industry, with experience spanning numerous business domains. He is the author of the recently published Migrating to Cloud-Native Application Architectures from O'Reilly, available as a free e-book download.