Facilitating the Spread of Knowledge and Innovation in Professional Software Development



Combining DataOps and DevOps: Scale at Speed

Key Takeaways

  • DataOps is all about streamlining the processes that are involved in processing, analyzing and deriving value from big data.
  • Development teams need to learn how to look past the data delivery mechanics and instead concentrate on the policies and limitations that control data in their organization.
  • Uber and Netflix have both been very open about the way in which they use DataOps within their business models.
  • Many forward-thinking companies have found themselves in the midst of the transition from a product-based economy to a service-based economy.

Over the past decade, hundreds of organizations have adopted the cloud to gain access to automated, scalable, on-demand infrastructure. The shift has cut the time it takes to meet software development requirements from weeks to mere minutes.

Around the same time, the cloud's scalability has also encouraged organizations to look at new development models. DevOps and the cloud have, together, broken down the walls between people and technology.

DevOps and continuous delivery processes have become widespread in most industries, enabling enterprises to radically increase the integrity, consistency, and output of new software.

Organizations are rushing to adopt the latest and best technological advancements. New strategies are being implemented through data-driven decision making, and the infrastructure needed to integrate new breakthroughs - from artificial intelligence (AI) to machine learning and automation - is easily accessible.

But even in a world where software has become lightweight, scalable, and automated, there's one thing that prevents organizations from truly shining - and that is how readily their development teams can actually access their data.

In order to move quickly, development teams need consistent access to high-quality data.

If it takes days to refresh the data in a test environment, teams are caught in a difficult position: move more slowly, or make concessions on quality to the detriment of their customers, subscribers, or users.

DataOps and DevOps - A Better Understanding of How It Works

DataOps is essentially an extension of DevOps standards and processes into the data analytics world. The DevOps philosophy emphasizes continuous, seamless collaboration between developers, quality assurance teams and IT Ops administrators. DataOps does the same for the administrators and engineers who store, analyze, archive and deliver data.

To put it another way, DataOps is about streamlining the processes involved in processing, analyzing and deriving value from big data. This aims to break down the silos in the data storage and analytics fields which have historically isolated different teams from each other. Improved communication and cooperation between different teams will lead to faster outcomes and better time-to-value. DataOps is a way to automate the data processing and storage workflows in the same way that DevOps does when creating applications.
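As a rough illustration of automating a data workflow the way DevOps automates application builds, the sketch below runs a few quality checks against a batch of records and reports a single pass/fail result that a pipeline could act on. The column names (customer_id, email) and the function names are hypothetical and chosen only for this example; a real pipeline would define its own checks.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def check_no_missing(rows, column):
    """Fail if any row has no value (None or empty string) in the column."""
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    return CheckResult(f"no_missing:{column}", missing == 0,
                       f"{missing} missing value(s)")


def check_unique(rows, column):
    """Fail if the column contains duplicate values."""
    values = [row.get(column) for row in rows]
    duplicates = len(values) - len(set(values))
    return CheckResult(f"unique:{column}", duplicates == 0,
                       f"{duplicates} duplicate value(s)")


def run_quality_gate(rows):
    """Run all checks; the batch passes only if every check passes."""
    # Hypothetical column names, assumed for this sketch.
    results = [
        check_no_missing(rows, "customer_id"),
        check_unique(rows, "customer_id"),
        check_no_missing(rows, "email"),
    ]
    return all(result.passed for result in results), results


if __name__ == "__main__":
    batch = [
        {"customer_id": "c1", "email": "a@example.com"},
        {"customer_id": "c1", "email": ""},
    ]
    ok, results = run_quality_gate(batch)
    for result in results:
        status = "PASS" if result.passed else "FAIL"
        print(f"{status} {result.name}: {result.detail}")
    print("batch accepted" if ok else "batch rejected")
```

Treating the checks as a gate, in the same way a failing unit test blocks a build, is one way to keep bad data from reaching downstream teams without manual review.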


DevOps combines IT / Ops and developers to work closely together in order to deliver software of a higher quality.

DevOps works in a simulated environment, and thanks to rapid advances in cloud-based development, organizations are now moving DevOps to their cloud environments. By adding continuous integration and automating testing and delivery, DevOps breaks complicated tasks into much simpler ones.


Adopting DevOps will require multiple changes to your infrastructure. To make the most of DevOps, you'll want to move to a microservice-based workflow that benefits from containers and other progressive technologies - hence the massive rise in Software-as-a-Service (SaaS) offerings specifically due to the prior rise of DataOps. SaaS appeals to a massive entrepreneurial demographic, since almost anyone with the right knowledge or skills can help build a SaaS company.

DataOps also requires administrators and engineers to make use of next-generation data technology to develop their data storage and analytics infrastructure. They need solutions for data processing that are scalable and readily available - think cluster-based, robust storage.

The DataOps architecture also needs to be able to handle a number of workloads in order to achieve the same versatility as the DevOps implementation pipeline. Creating a data management tool set of diverse solutions, from log aggregators such as Splunk and Sumo Logic to big data analytics applications such as Hadoop and Spark, is crucial to achieving this agility.

Embracing the Changes

We need to step away from organizing our teams and technologies around the tools we use to manage data, such as application creation, information management, identity access management and analytics and data science. Instead, we need to realize that data is a vital commodity, and bring together everyone who uses or handles data to take a data-centric view of the enterprise.

If, when building applications or data-rich systems, development teams learn to look past the data delivery mechanics and instead concentrate on the policies and limitations that control data in their organization, they can align their infrastructure more closely to enable data to flow across the organization to those who need it.

To make the shift, DataOps needs teams to recognize the challenges of today's technology environment and to think creatively about specific approaches to data challenges in their organization. For example, you might have information about individual users and their functions, data attributes and what needs to be protected for individual audiences, as well as knowledge of the assets needed to deliver the data where it is required.

Bringing together teams that have different ideas helps the company to evolve faster. Instead of waiting hours, days or even weeks for data, environments need to be created in minutes and at the pace required to allow the rapid creation and delivery of applications and solutions. At the same time, companies do not have to choose between access and security; they can operate with the assurance that their data is adequately protected for all environments and users without lengthy manual checks and authorisations.
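One way to picture access and security working together is a policy that states which fields each audience may see in clear text, with everything else masked automatically before the data is delivered. The sketch below is only an illustration under assumed names: the audiences, the field names and the POLICY table are hypothetical, and an unsalted hash is used purely to keep the example short.

```python
import hashlib

# Hypothetical policy: the fields each audience may see in clear text.
POLICY = {
    "production_support": {"user_id", "email", "plan"},
    "development": {"user_id", "plan"},
    "analytics": {"plan"},
}


def mask_value(value):
    """Replace a value with a short, stable token derived from SHA-256.

    The same input always yields the same token, so joins on masked
    fields still work. A real system would normally add a secret salt
    or use tokenization; this is a simplification for the sketch.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return "masked_" + digest[:12]


def apply_policy(record, audience):
    """Return a copy of the record with disallowed fields masked."""
    allowed = POLICY[audience]
    return {
        field: (value if field in allowed else mask_value(value))
        for field, value in record.items()
    }


if __name__ == "__main__":
    record = {"user_id": 42, "email": "ada@example.com", "plan": "pro"}
    for audience in POLICY:
        print(audience, apply_policy(record, audience))
```

Because the policy is data rather than a manual approval step, a test environment can be refreshed on demand while each audience still receives only what it is permitted to see.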

When done correctly, DataOps provides a cultural transformation that promotes communication between all data stakeholders. Data management becomes the collective responsibility of operations staff, database managers, and developers, as well as security and compliance officers. And although Chief Data Officers track data governance and efficiency, they seldom take any interest in non-production needs.

Innovation fails when no one takes charge of cross-functional data management. With powerful collaborative data systems, however, companies can ensure that confidential data is secure and that the right data is made accessible to the right people, whenever and wherever they need it: from the engineers who supply the data, to the data scientists who analyze it, to the developers who check it.

The next ten years are set to reshape the face of computing as IoT devices, machine learning, augmented or virtual reality, voice computing, and more become common across all industries. With this change, more data, more privacy and security concerns and much more regulation will come into play.

This will put extraordinary pressure on organizations, and whoever comes up with solutions first will reap the benefits. With DataOps, IT can overcome the cost, complexity and risk that come with the management of data and become a business enabler, while users can get the data they need to unlock their innovative capacity. If DevOps and cloud have been the key enablers of today's digital economy, DataOps is set to be the engine of our future data economy.

Uber and Netflix Show Us the Way Forward

While the way in which DataOps is implemented will be different in every organization, it can be instructive to look at the way in which the concept has been applied in real-world contexts. Two of the most pioneering companies in this regard have been Uber and Netflix, both of which have been very open about the way in which they use DataOps within their business models.

Uber, for instance, uses a machine learning (ML) platform known as Michelangelo to process the huge amounts of data that the firm collects, and to share this across the organization. Michelangelo helps manage DataOps in a way similar to DevOps by encouraging iterative development of models and democratizing access to data and tools across teams. The system also makes use of a number of bespoke tools, including Horovod, which coordinates parallel data processing across hundreds of GPUs, and Manifold, a visualization tool that is used to assess the efficacy of ML models.

Netflix is also a company that processes huge amounts of data every day, and one in which these data need to be accessible to thousands of individual clients. The core of the Netflix user experience is its recommendation engine, which currently runs on Spark. However, the company is continually testing new models in order to improve data availability and the accuracy of the recommendations that its ML algorithms provide.

Unlike many companies, however, Netflix runs these tests offline, and only deploys new models in consumer-facing systems once they have been proven to be effective. In other words, the company is careful to maintain the balance between stability and flexibility that characterises effective DataOps approaches.
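The idea of testing offline and promoting a model only once it has proven effective can be sketched as a simple gate. The example below is not Netflix's actual process: the metric (precision at k), the holdout structure, the threshold and all function names are assumptions made for illustration.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that the user found relevant."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(top_k)


def mean_precision_at_k(model, holdout, k):
    """Average precision@k over every user in the offline holdout set.

    `model` is any callable that maps a user id to a ranked list of items.
    `holdout` maps a user id to the set of items that user actually liked.
    """
    scores = [
        precision_at_k(model(user), relevant, k)
        for user, relevant in holdout.items()
    ]
    return sum(scores) / len(scores)


def should_promote(candidate, baseline, holdout, k=3, min_gain=0.01):
    """Promote the candidate only if it beats the baseline by min_gain."""
    candidate_score = mean_precision_at_k(candidate, holdout, k)
    baseline_score = mean_precision_at_k(baseline, holdout, k)
    return candidate_score >= baseline_score + min_gain


if __name__ == "__main__":
    # Toy holdout data and toy models, invented for this sketch.
    holdout = {"u1": {"a", "b"}, "u2": {"c"}}
    baseline = lambda user: ["x", "a", "y"]
    candidate = lambda user: {"u1": ["a", "b", "x"],
                              "u2": ["c", "y", "z"]}[user]
    print("promote candidate:", should_promote(candidate, baseline, holdout))
```

Requiring a minimum gain rather than any improvement at all is one way to favor stability: a candidate that is merely on par with the current model stays offline.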

Why DataOps and DevOps Are a Match Made in Heaven

Today's millennial consumers are more aware of their brand engagement and not only want great products but also want great customized experiences when using these products. Many forward-thinking companies are therefore in the midst of the transition from a product economy to a service economy.

For example:

  • Android and iPhone integrate customer support in their product bundle
  • When you buy a new vehicle, BMW includes routine car maintenance in the purchase price
  • Smartphones now bundle food delivery, maps, GPS and even online banking as services delivered alongside the product

This shift from product to service as a priority is also reflected in the delivery of software, enabling companies to provide innovation, speed, reliability, frequency and operation on the customer's behalf.

With cloud automation, companies are now able to shift their focus and assimilate user experience seamlessly from machine-based functions to IaaS (infrastructure-as-a-service), PaaS (platform-as-a-service), and SaaS. DevOps helps this by removing the discrepancy between development and support.

We Can Shift from Stability to Agility

With an increase in production speed, the industry has been challenged to adjust its go-to-market strategy and, above all, to shift its focus from stability and efficiency to innovation and flexibility. Faster technology innovations result in shorter production stages, creative designs and higher delivery rates.

The emergence of social media marketing and future technologies is shifting control away from production and placing customers or users at the core. Branding and marketing mechanisms now react to consumer preferences rather than shaping them. From SMEs to start-ups, companies need to encourage and support creative responsiveness and focus on waste reduction.

It's time for IT organizations to enable software as a service with the aid of DevOps methodologies and cloud automation. DevOps combined with cloud helps to assess the quality of the customer's experience. This cross-department and cross-functional cooperation strengthens an organization's operations and helps it gain an advantage in its market.

One thing digital transformation has taught us is that software and hardware have to work in unison. Each corporation must adapt to the combination of digital applications with physical systems or components. While DevOps offers its users advancements in software development and ongoing efficiency, the cloud offers ease of use and improves product quality by optimizing operational performance. As a result, DevOps in conjunction with cloud fulfills user expectations with the help of sophisticated execution.

About the Author

Sam Bocetta is a former security analyst, having spent the bulk of his career as a network engineer for the Navy. He is now semi-retired, and educates the public about security and privacy technology. Much of Sam’s work involved penetration testing ballistic systems. He analyzed Navy networks looking for entry points, then created security-vulnerability assessments based on his findings. Further, he helped plan, manage, and execute sophisticated "ethical" hacking exercises to identify vulnerabilities and reduce the risk posture of enterprise systems used by the Navy (both on land and at sea). The bulk of his work focused on identifying and preventing application and network threats, reducing the attack surface, removing vulnerabilities and general reporting. He was able to identify weak points and create new strategies which bolstered the Navy's networks against a range of cyber threats. Sam worked in close partnership with architects and developers to identify mitigating controls for vulnerabilities identified across applications and performed security assessments to emulate the tactics, techniques, and procedures of a variety of threats.
