Inside the Complexity of Delivering Cloud Computing
Computing is still more of an art form than a science, even in this early part of the 21st century. However, the reason for this seems to elude even the most knowledgeable of practitioners. The most likely cause is that IT looks at the final product, the application and the data, through many lenses. The analogy of the blind men and the elephant is very apt for describing this phenomenon. In that analogy, three blind men each touch a different part of the elephant and, therefore, each concludes that the elephant is something different.
In IT, we require many different skills and resources to come together, in the form of a system, to meet a business requirement. Infrastructure operations manages the facilities, hardware and software required to run the application. Engineering is required to create the fabric of components needed to provide an end user with a set of capabilities. Security is required to analyze the environment to make sure that risks regarding data loss, breaches and availability are mitigated. The production environment must be monitored, and users need a support infrastructure to contact when something is not working as planned. All of these pieces need to come together in a cohesive manner to deliver just one application environment, let alone the thirty or forty that most businesses require.
Needless to say, deploying and operating applications and computing infrastructure for an enterprise is a daunting task. Now, add to this the ability to manage a shared pool of resources that can scale to meet multiple workloads' demands and the problem becomes exponentially more complex. Here's a basic summary of all the touch points required to answer the mail for delivering a big data solution. Big data was selected because it offers such a rich solution environment, one that includes massive parallelism, large volumes of storage, significant network bandwidth, multiple layers of information security and scalable compute infrastructure.
It would be easy to approach the big data problem domain with the same methodology that one would use for building a data warehouse:
- Review the data domain and deliver an information architecture
- Estimate the size of the data that will need to be stored and increase that estimate by 25%
- Design the Hadoop deployment environment to have enough nodes to process the anticipated maximum load job within the expected time frame
- Acquire the necessary hardware and install into the data center
- Deploy Hadoop environment
- Load data into the allocated storage
- Cross fingers and hope business doesn’t ask for more data to be processed or loaded into the environment
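As a rough illustration of the sizing steps above, a back-of-the-envelope estimate might look like the following sketch. The 25% growth pad comes from the list; the per-node usable capacity is a hypothetical figure, and the replication factor of 3 is the HDFS default.

```python
# Back-of-the-envelope Hadoop cluster sizing -- illustrative only.
# usable_tb_per_node is an invented assumption; replication=3 is the
# HDFS default; growth=0.25 mirrors the "increase by 25%" step above.

def estimate_cluster(raw_data_tb, growth=0.25, replication=3,
                     usable_tb_per_node=24.0):
    """Pad raw data by a growth factor, multiply by the replication
    factor, then divide by usable disk per node (rounding up)."""
    padded_tb = raw_data_tb * (1 + growth)       # add 25% headroom
    stored_tb = padded_tb * replication          # copies kept by HDFS
    nodes = -(-stored_tb // usable_tb_per_node)  # ceiling division
    return int(nodes), stored_tb

nodes, stored = estimate_cluster(500)  # 500 TB of raw data
```

The point of the exercise is the last line of the traditional approach: the estimate is fixed at acquisition time, so any growth beyond the pad forces a new procurement cycle.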
This traditional approach produces a one-off application stack dedicated to meeting some business intelligence requirement leveraging big data tools. It is also not extensible or easily reused, and it allocates capacity to a fixed function that may or may not be used enough to justify the capital expenditure.
Of course, one could also consider using an external cloud service provider for this purpose, but expect that moving large volumes of data, e.g. petabytes, into a public cloud service provider will require considerable logistics and perhaps incur unexpected costs.
The Cloud Way
So, what does a cloud-based design offer above and beyond what can be achieved through the traditional IT approach?
- Elasticity, so that when customers want to add new data elements into the analysis, or more data to incorporate into the results, there is no race to provision infrastructure and respond to the demand.
- Greater leverage of existing resources through virtualization, so that resources are not sitting idle when not in use.
- The ability to deploy the runtime Hadoop environment in an automated manner so that it does not need to remain running when not in use.
- A common set of available services, such as identity management and network security, so that these do not need to be redundantly acquired, configured and managed, thus reducing costly operational overhead.
- Self-service provisioning so that the data scientists and business users can take advantage of this environment on-demand without long lead times.
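The self-service and elasticity points above can be sketched as a toy model of a shared pool: requests are granted on demand when capacity exists, and environments are released when a job finishes rather than sitting idle. This is not any real cloud API; the class, tenant names and pool size are all invented for illustration.

```python
# Hypothetical self-service provisioning against a shared pool -- a
# sketch of the idea, not a real cloud API. All names are invented.

class ResourcePool:
    def __init__(self, total_nodes):
        self.total_nodes = total_nodes
        self.allocations = {}          # tenant -> node count

    def free_nodes(self):
        return self.total_nodes - sum(self.allocations.values())

    def provision(self, tenant, nodes):
        """Grant the request on demand if capacity allows; no ticket queue."""
        if nodes > self.free_nodes():
            return False               # caller can retry or ask for less
        self.allocations[tenant] = self.allocations.get(tenant, 0) + nodes
        return True

    def release(self, tenant):
        """Tear the environment down when not in use, freeing capacity."""
        self.allocations.pop(tenant, None)

pool = ResourcePool(total_nodes=100)
pool.provision("data-science", 40)   # granted immediately, no lead time
pool.provision("bi-team", 70)        # refused: only 60 nodes remain
```

The design choice worth noting is `release`: because the runtime environment can be deployed automatically, it does not need to hold capacity between runs, which is what makes the shared pool economical.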
Achieving these benefits requires appropriate planning and architecture to enable an infrastructure that can meet these goals. The figure below illustrates all the major touch points that must be incorporated in order to answer the needs of a cloud environment that can support big data.
While these touch points are largely self-explanatory, it's important to discuss the effect each of these branches has on reaching the desired outcome.
Network Architecture – Big data can often be limited by insufficient bandwidth and also has the potential to impact other applications running on the same network backbone. Implementing virtual networking and quality-of-service is important for ensuring that mission-critical applications don't get underserviced in support of the big data initiatives. Likewise, the network must also support movement of the data into the processing environment so that it is available for processing.
Storage Architecture – There are many options for supporting your big data initiatives, including multiple tiers of storage, e.g. high-speed, flash, commodity, etc. With many of these environments operating on petabytes or more of data, achieving scalable storage in an affordable manner is one of the more complex tasks being tackled today. Moreover, many of today's networked storage architectures are also dependent upon your network architecture, so even if you've selected a high-speed storage environment, the data may not be able to move in and out quickly enough if you have not implemented a suitable network architecture.
Compute Architecture – Big data environments are highly dependent upon the number of nodes available to process tasks. The more nodes, the faster the processing can occur. This means that the compute architecture must respond to scaling needs without starving other applications running in the same cloud environment. This is perhaps one of the more complex areas of cloud computing. As good as we might be at planning, we all eventually suffer from finite resources. Hence, the compute architecture must provide the appropriate instrumentation to allow IT to balance the resource pool to meet the demands of the consumer audience.
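One simple balancing policy of the kind the compute architecture might enforce is to cap how much of the currently free pool any single big data request may claim, so other tenants are never starved. The function and the 50% threshold are hypothetical illustrations, not a prescribed policy.

```python
# Illustrative scaling guard: never grant a single big data request more
# than a fixed share of currently free capacity. The 50% cap is an
# invented assumption for the sketch.

def nodes_to_grant(requested, free_nodes, max_share=0.5):
    """Return how many nodes to grant: the request, capped at
    max_share of the free pool so other workloads keep headroom."""
    ceiling = int(free_nodes * max_share)
    return min(requested, ceiling)

nodes_to_grant(requested=80, free_nodes=100)   # capped at 50
```

In practice this is where the instrumentation mentioned above matters: the policy is only as good as the pool-utilization data feeding it.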
Security – Today’s NoSQL and big data environments often don’t offer the fine-grained security found in traditional relational database environments. This means that alternate security approaches must be implemented within the cloud environment to ensure that data is not compromised.
Service Management – This is a major component in delivering the goals described above. Service management covers the processes and methods for capacity management, configuration management, monitoring, automation, orchestration, provisioning, operations and support. Without appropriate service management, the environment would most likely suffer performance degradation over time, resulting in unhappy users and consumers. Additionally, service management entails the development of service catalogs, which define tiers of service and make services available to the provisioning processes.
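A service catalog of the kind just described can be as simple as structured data that the provisioning process resolves against. The tier names, node counts and network labels below are hypothetical examples, not figures from the article.

```python
# Hypothetical service catalog -- tier names, quotas and network
# labels are invented for illustration only.

SERVICE_CATALOG = {
    "hadoop-small":  {"nodes": 8,  "storage_tb": 50,   "network": "standard"},
    "hadoop-medium": {"nodes": 32, "storage_tb": 250,  "network": "10GbE"},
    "hadoop-large":  {"nodes": 96, "storage_tb": 1000, "network": "10GbE+QoS"},
}

def lookup_tier(name):
    """A provisioning process resolves a requested tier from the catalog
    rather than negotiating capacity ad hoc with each consumer."""
    return SERVICE_CATALOG[name]
```

The value of the catalog is that capacity management, security configuration and monitoring can all be defined once per tier instead of once per request.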
As you can see from this list, cloud computing entails a lot more than simply installing some virtualization software and being able to provision virtual machines. Moreover, this is just one application of cloud computing. Consider how this is made even more complex by the addition of different types of workloads, such as virtual desktops, enterprise applications and development environments.
So, What Is The Secret To Cloud?
At the risk of sounding like a broken record, traditional IT organizations will have a very difficult time building and supporting a cloud environment. Stratified IT, with diverging reporting paths, varying agendas and differing reward models, spells trouble for answering the mail on delivering cloud computing; the challenges here are both technical and cultural. For example, in a traditional IT department, if a particular application is consuming storage at an above-average rate, the answer is for Infrastructure & Operations to add more storage, rather than exploring the root cause to see whether the observed effects are appropriate and whether they could be corrected in software or configuration.
The most successful firms have adopted a DevOps approach toward delivering IT-as-a-Service (ITaaS). DevOps is a collaborative approach between engineering and operations that shares the responsibility for producing high-quality, sustainable services. As noted above, the infrastructure impacts the application and vice versa, so it's critical that there is solid collaboration between these teams in exploring these dependencies and then taking the appropriate steps to mitigate risks.
To quote my InfoQ article entitled "DevOps: Evolving to Handle Disruption":
“A DevOps culture is represented by dissolution of the internal hierarchy, a move to meritocracy, breaking down of the stove-pipes, service orientation, service lifecycle management and, most importantly, agility. DevOps cultures thrive in the face of disruption versus bottleneck and stall. They produce the highest quality output with limited human resources. There are no political barriers to sharing information and responding to the inevitable system failures. There is team, where once it was us and them.”
Cloud computing is becoming important to business because of the need to deliver more value with fewer resources in the face of shrinking budgets. When one looks empirically at the cost of operating IT for most businesses, there is a lot of waste built into the existing environments, due both to technology limitations and to a lack of focus on strong IT service management. Squeezing every drop of value out of existing IT investments is now forcing a change in how we approach delivering applications and data. Cloud computing is merely a vehicle that enables this to be done in a very cost-effective manner, with reduced waste, while still enabling innovation and agility.
About the Author
JP Morgenthal is one of the world's foremost experts in IT strategy and cloud computing. He has over twenty-five years of expertise applying technology solutions to complex business problems. JP has strong business acumen complemented by technical depth and breadth. He is a respected author on topics of integration, software development and cloud computing, a contributor to the forthcoming "Cloud Computing: Assessing the Risks" and the Lead Cloud Computing editor for InfoQ.
Not so sure...
I guess it would be easy, but would it be right, or even traditional? I'm less convinced. I've worked with a number of professional outfits that might be considered traditional that wouldn't do it as you describe. Further, they employ a bunch of the techniques you seemingly classify as "cloud".
People seem to have forgotten that virtualization was around long before cloud became a fashion, and it was used heavily for better, more effective use of rackspace and multi-core machines. Automation has been possible, and done, by smart people for years; it hasn't taken cloud to make that happen. There are plenty of books that provide advice such as "if you're doing something more than once, automate it". Lastly, decent architects (those that span infrastructure and software) long ago set about creating shared, common services as you describe (it dates back to before the likes of SOA).
Moving on to DevOps: there are now so many definitions of that term that I think it's largely useless, but if you look at organisations that have adopted post-mortems and a mature approach to product delivery, you'll find operations and development teams working together, and, again, those disciplines have been around since long before cloud.
I think it would be fair to say that it's easier to do the things you describe these days, which is good for everyone, but I feel it is in no way particular to cloud, and there are plenty of "traditional" IT departments that have been doing these things for some time. That's not to say I'm against cloud as such. Clearly there are benefits, such as easier access to datacentre-type resources online for startups etc.