Operations management Content on InfoQ
-
Improving Incident Management through Role Assignments and Game Days
John Arundel, principal consultant at Bitfield Consulting, shared his thoughts on how to ensure incidents are handled smoothly and quickly. He suggests assigning specific roles to each team member responding to the incident. Red team versus blue team exercises can also be leveraged to ensure the team is prepared to respond accurately and quickly.
-
Failure Modes and Building Resilient Systems: Adrian Cockcroft at QCon SF
Adrian Cockcroft recently shared his thoughts on how to produce resilient systems that operate successfully despite failures. At the recent QCon San Francisco event, he also shared what he considers to be good cloud resilience patterns for building with a continuous resilience mindset.
-
DataOps and Operations-Centric Data Architecture
Eric Estabrooks from DataKitchen spoke at the Data Architecture Summit 2019 conference about how DevOps tasks should be managed for data architecture. DataOps is a collaborative data management practice and is emerging as an area of interest in the industry.
-
OpsRamp Releases Improved Alert Correlation and Better Insights into Event Management Models
OpsRamp, a SaaS platform for datacenter operations management, announced its Fall 2019 release, which includes a number of enhancements to its machine learning models for intelligent event management and correlation. This release also includes multi-cloud infrastructure monitoring capabilities, synthetic monitoring, and a custom integration framework.
-
Bringing VMware Environments to Azure, Microsoft and VMware Establish Partnership
At the recent Dell Technologies World conference, Microsoft and VMware announced an expanded partnership that enables certified VMware cloud infrastructure to run in Microsoft Azure. The Microsoft first-party capability is made possible through a solution provided by CloudSimple, a VMware-certified partner, and is officially called Azure VMware Solution by CloudSimple.
-
Infrastructure Automation Company Chef Commits to Open Source
Chef, an infrastructure automation company, has committed to developing all of its software as open source under the Apache 2.0 license.
-
OpsRamp Announces Improved Service Centricity, AIOps and Cloud Monitoring
OpsRamp, a service-centric AIOps software-as-a-service (SaaS) platform for the hybrid enterprise, has announced new topology maps, enhanced artificial intelligence for IT operations (AIOps) features and new monitoring capabilities for cloud native workloads.
-
Amazon Introduces CloudFormation Drift Detection
In a recent blog post, Amazon announced CloudFormation Drift Detection, which organizations can leverage to automate configuration consistency across AWS cloud resources. The feature allows organizations that have templated their configurations and deployments, known as stacks, to detect when configuration drift occurs from out-of-band changes.
-
Amazon Releases a New Session Manager in AWS Systems Manager
Amazon released a new Session Manager in AWS Systems Manager. Session Manager provides a new way to get shell-level access to EC2 instances. IT administrators can now use a browser-based interactive shell and a command-line interface (CLI) to manage their Windows and Linux instances.
-
IT Operations Is the Most Predictable DevOps Differentiator Says Damon Edwards at DOES18 London
InfoQ spoke to Damon Edwards, co-founder and chief product officer at Rundeck, at DevOps Enterprise Summit London about his talk ‘Operations - The Last Mile Problem for DevOps in the Enterprise’ and the sneak preview of the new version of Rundeck, V3.0.
-
Avoiding Alerts Overload from Microservices: Sarah Wells at QCon London
At QCon London, Sarah Wells presented “Avoiding Alerts Overload from Microservices”, and cautioned that developers and operators must fundamentally change the way they think about monitoring when building a microservice system. Key takeaways included: build a system that can be supported; focus on ‘stuff that matters’ when creating monitoring and alerts; and cultivate and improve alerts.
-
Operational Data Stream and Batch Processing at Netflix with Mantis
-
New Security Capabilities Available in Azure Operations Management Suite
On February 25th, 2016, Microsoft announced updates to its Operations Management Suite (OMS). The updates in this iteration of the service focus on the security and audit portions of the suite and target the user experience, additional capabilities and features.
-
Blameless Post-Mortems
Blameless post-mortems of production incidents are increasingly seen as an essential fixture of any organisation's procedures. Mathias Meyer, from Travis CI, shared how blameless post-mortems had a profound effect on him. InfoQ took this opportunity to look at the post-mortem practices of organisations such as Etsy, GitHub and Chef.
-
IDC Study: How Many Software Developers Are Out There?
IDC has published the “2014 Worldwide Software Developer and ICT-Skilled Worker Estimates” document, a study estimating the number of professional software developers, hobbyist developers and Information and Communications Technology (ICT)-skilled workers in the world at the start of 2014. The 90 countries covered in the study represent 97% of the world’s GDP.