Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Fault Tolerance Content on InfoQ

Articles

  • Implementing Microservicilities with Quarkus and MicroProfile

    Microservicilities is a list of cross-cutting concerns that a service must implement apart from the business logic. These concerns include invocation, elasticity and resiliency, among others. This article describes how Quarkus and MicroProfile may be used to implement these concerns.

  • Designing Chaos Experiments, Running Game Days, and Building a Learning Organization: Chaos Conf Q&A

    The second Chaos Conf event is taking place in San Francisco over 25-26 September. In preparation for the conference, InfoQ sat down with a number of the presenters and discussed topics such as the evolution and adoption of chaos engineering, key people and process lessons learned from running chaos experiments, and the biggest blockers to mainstream adoption.

  • Resilient Systems in Banking

    Resilience is about tolerating failure, not eliminating it. To build a resilient system, you must build a system that absorbs shocks, and continues or recovers. Following best practices for resilient architecture, including established cloud patterns, allowed Starling Bank to build a bank, from scratch, in a year, against a backdrop of highly public outages amongst incumbent banks.

  • Service Mesh: Promise or Peril?

    Service meshes such as Istio, Linkerd, and Cilium are gaining increased visibility as companies adopt microservice architectures. The arguments for a service mesh are compelling: full-stack observability, transparent security, systems resilience, and more. But is a service mesh really the right solution for you? This article examines when a service mesh makes sense and when it might not.

  • Six Tips for Running Scalable Workloads on Kubernetes

    Tips to ensure Kubernetes knows what is happening with your deployment: where best to schedule it, when it is ready to serve requests, and how to spread work across as many nodes as possible.

  • A Comparison between Rust and Erlang

    This article compares Erlang and Rust, detailing their similarities and differences. It may be interesting to both Erlang developers looking into Rust and Rust developers looking into Erlang. A final section details each language's capabilities and shortcomings and argues for the possibility of leveraging both languages' strengths in the same project.

  • When Streams Fail: Implementing a Resilient Apache Kafka Cluster at Goldman Sachs

    At QCon New York, Anton Gorshkov presented “When Streams Fail: Kafka Off the Shore”. The talk shared insight into how a platform team at a large financial institution designs and operates shared internal messaging clusters like Apache Kafka, and how the team plans for, and resolves, the failures that inevitably occur.

  • But is it Safe?

    While it is rare to hear the question, "Is this software safe?", the safety aspects of software are becoming increasingly important. The proliferation of IoT devices increases the widespread impact a small problem can cause. Several techniques exist to help developers analyze and improve the safety of software they create.

  • Storm Applied Review and Q&A with the Authors

    Storm is a distributed, fault-tolerant, real-time computation system that was originally developed at BackType and later open sourced by Twitter. Storm Applied is a new book from Manning that aims to provide a practical guide on using Storm, both in a development and in a production setting. InfoQ has spoken with two of the book’s authors, Sean T. Allen and Matthew Jankowski.
