Tap Compare Testing with Diferencia and Java Microservices


Key Takeaways

  • Within a microservices architecture a lot of services might be evolving in (relative) isolation at the same time, and often very rapidly. To get the full value of this architectural style, services must be capable of being released independently.
  • It is often difficult to verify that a new service (or a new version of a service) does not break anything in the current application, i.e. cause a regression through a change in API, payload, or response performance.
  • “Tap compare” is a testing technique that allows you to test the behavior and performance of the new service by comparing its results against the old service. This article provides an example that mirrors production traffic to both the old and new services and compares the results.
  • Diferencia is an open source tool (released under Apache License v2) written in Go and tightly integrated with Java test frameworks like JUnit 4, JUnit 5 or AssertJ, which allows you to use the tap compare testing technique for validating that two implementations of a service are syntactically compatible.

DevOps has grown greatly in popularity over the past several years, particularly in (software) companies that want to reduce their lead time to be measured in days/weeks instead of months/years, without compromising quality. This has led, among other patterns and technologies, to the adoption of the microservice-based architecture.

In a microservices architecture, a lot of services might be evolving at the same time, and often very rapidly. However, more importantly, they must be releasable independently in an isolated way, effectively meaning that the release is not orchestrated between services.

So, you can release several times per day if you embrace microservices architecture (with all the implications it has), but this raises another problem: it is difficult to verify that a new service (or a new version of the service) does not break anything in the current application.

Let’s see an example where you might break a service because of an update of another service.

The Challenges with Orchestrating Microservice Releases

Suppose we have a Service A (v1), also known as the consumer, and Service B (v1) also known as the provider. Service B (v1) provides as output a JSON document with one field called name, which is consumed and used by Service A (v1).

Now, you create a Service B (v2) which changes the field from name to fullname. Then you fix all tests of Service B (v2) so they are not failing because of this modification. Because, in theory, any service can be released independently, you deploy this new version to production, and of course, Service B (v2) will behave correctly, but Service A (v1) will start failing immediately because it is not getting the data it expects (e.g. Service A is expecting a field name but receiving a field fullname).

So, as you can see, unit tests (of Service B, in this case), and tests in general, give us confidence that we are building the service correctly, but they do not cover the aggregate logic of the whole system (i.e. we unintentionally broke the dependent Service A).
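
To make the failure concrete, here is a minimal sketch in plain Java. It is only an illustration: JSON documents are modelled as Map<String, String> to avoid a JSON library, and ConsumerA, readName and the two response maps are hypothetical names, not code from either service.

```java
import java.util.Map;

public class ConsumerA {

    // Service A (v1) expects the provider's document to contain a "name" field.
    static String readName(Map<String, String> providerResponse) {
        final String name = providerResponse.get("name");
        if (name == null) {
            throw new IllegalStateException("expected field 'name' is missing");
        }
        return name;
    }

    public static void main(String[] args) {
        // Hypothetical payloads standing in for the two versions of Service B.
        final Map<String, String> responseV1 = Map.of("name", "Alex");
        final Map<String, String> responseV2 = Map.of("fullname", "Alex");

        System.out.println(readName(responseV1)); // Service B (v1): works

        try {
            readName(responseV2); // Service B (v2): the field was renamed
        } catch (IllegalStateException e) {
            System.out.println("Service A failed: " + e.getMessage());
        }
    }
}
```

Every test of Service B (v2) can pass while this consumer still breaks, which is exactly the gap that the rest of the article addresses.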

A Potential Solution: Introducing Tap Compare

“Tap compare” is a testing technique that allows you to test the behavior/performance of the new service by comparing its results against the old service.

It is used to detect different kinds of regressions: for example, request/response format regressions (a new service breaks backward compatibility with a consumer), performance regressions (a new service behaves slower than the old one), or simply code bugs, by comparing the responses of both services.

The tap compare approach does not require that a developer create complex test scripts, as is often the case with other kinds of tests, such as integration tests or end-to-end tests. In tap compare you can either use the mirroring traffic technique or capture (shadow) a portion of public traffic and replay it against the new version of the service. These techniques are out of the scope of this post, and for the sake of simplicity and as a getting started guide to the tap compare technique, we are going to “simulate” a mirroring traffic approach with a test.

Why Tap Compare?

Tap compare does not attempt to act as a direct substitute for any other testing technique; you will still need to write other kinds of tests such as unit tests, component tests or contract tests. However, it can help you detect regressions so that you can feel more confident about the quality of the new version of the developed service.

But one important thing about tap compare is that it provides a new layer of quality around your service. With unit tests, integration tests, and contract tests, the tests verify functionality based on your understanding (as a developer) of the system, so the inputs and outputs are provided by you during test development. In the case of tap compare, this is something totally different. Here, the validation of the service occurs with production requests, either by capturing a group of them from the production environment and replaying them against the new service, or by using the mirroring traffic technique where you shift (clone) production traffic to be sent to both the old version (production version) and to the new version, and you compare the results. In both cases, you as a developer do not need to write the test script (providing inputs or outputs) for validating the service -- it is the real traffic used for validation purposes.

Tap compare works within the “production environment”: you are using production traffic and production instances to validate the new service, which is also deployed to the production environment. You are therefore adding a quality gate within the production environment, whereas other testing techniques (e.g. unit or component tests) are focused on verifying the correctness of software before it is deployed.


What is Diferencia?

Diferencia is an open source tool (released under Apache License v2) written in Go and tightly integrated with Java frameworks like JUnit 4, JUnit 5 or AssertJ, that allows you to use the tap compare testing technique for validating that two implementations of a service are compatible (e.g. that a service does not break backward compatibility regarding the interaction protocol), and to increase confidence that the changes are regression-free.

The idea behind Diferencia is to act as a proxy, with each request that is received being multicast to multiple versions of running services. When the response from each of the services is returned, it compares the responses and checks if they are “similar”. If, after repeating this operation with a representative number of different requests, all (or most) of them are “similar”, then your new service can be considered regression-free.

You are going to see in the next section why I am using the term “similar” and not equal.

Diferencia is also delivered as a Docker image (lordofthejars/diferencia) based on the Alpine image and ready to be used in Kubernetes or OpenShift clusters.

The version of Diferencia is 0.6.0, at the time of writing this post.

How it Works

Diferencia acts as a proxy between a request and the two versions of a service you are validating. By default, Diferencia uses two different service instances: 

  • Existing version (the one in production) known as primary.
  • New version (the one under release process) known as candidate.

Each request is broadcast to both of them, and the responses from both instances are compared. If the responses are equal, the Diferencia proxy returns an HTTP status code 200 OK to the caller. On the other hand, if the responses are not equal, then an HTTP status code 412 “Precondition failed” is sent back to the caller. The premise is that the same request with the same parameters should produce the same response. Internally, Diferencia also stores the result of each request so it can be queried later.

It is important to note that Diferencia does not behave like a standard proxy: the original content of the service response is not returned unless Diferencia is explicitly configured to do so. Diferencia can be started with the mirroring traffic option, which makes it send back the response coming from the primary instance.

However, this is just the simplest case. What happens when there are some values in the JSON document that are intrinsically different (or nondeterministic), for example, a counter, a date or a random number? Although both responses might be perfectly valid, since the only difference is in the value of a field, the two documents are not equal, and hence the comparison alone cannot tell whether the difference is caused by a regression.

To avoid this problem (also known as “noise”), an automatic noise detection function identifies the fields whose values are noise and removes them from the responses. In this way, noise values are excluded from the comparison logic, and the responses can be compared as if there were no noise.
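
The comparison step can be pictured with a small sketch in plain Java. This is not Diferencia’s implementation (which is written in Go); the two backends are modelled as functions from a request path to a response body, and TapCompareSketch and compare are hypothetical names used only for illustration.

```java
import java.util.function.Function;

public class TapCompareSketch {

    static final int OK = 200;
    static final int PRECONDITION_FAILED = 412;

    // Sends the same request to both backends and compares the raw response bodies.
    static int compare(String request,
                       Function<String, String> primary,
                       Function<String, String> candidate) {
        final String primaryBody = primary.apply(request);
        final String candidateBody = candidate.apply(request);
        return primaryBody.equals(candidateBody) ? OK : PRECONDITION_FAILED;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for the old and new versions of the service.
        final Function<String, String> v1 = path -> "{\"name\":\"Alex\"}";
        final Function<String, String> v2 = path -> "{\"fullname\":\"Alex\"}";

        System.out.println(compare("/user", v1, v1)); // prints 200
        System.out.println(compare("/user", v1, v2)); // prints 412
    }
}
```

Note that the caller only learns whether the two responses match; the sketch, like the default mode described above, never returns the service’s own payload.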

To have automatic noise detection you need three running instances of the service:

  • Existing version (the one in production) known as primary.
  • Existing version (the one in production) that is another instance of primary, known as secondary.
  • New version (the one under release process) known as candidate.

First of all, the primary and candidate responses are compared as if noise detection were disabled. Then the responses from primary and secondary are compared too. Since both instances run the same version, their responses should be identical, and any difference between them is considered noise. Finally, the noise is removed from the comparison between primary and candidate, and Diferencia validates that the remaining content of both responses is equal.
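
The three-way comparison can be sketched in plain Java as follows. This is a simplified illustration under stated assumptions, not Diferencia’s actual algorithm: responses are modelled as flat Map<String, Object> documents (no nesting), and NoiseDetectionSketch, detectNoise, equalIgnoringNoise and the sequence counter field are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class NoiseDetectionSketch {

    // Fields whose values differ between two instances of the SAME version are noise.
    static Set<String> detectNoise(Map<String, Object> primary, Map<String, Object> secondary) {
        final Set<String> fields = new HashSet<>(primary.keySet());
        fields.addAll(secondary.keySet());
        final Set<String> noise = new HashSet<>();
        for (String field : fields) {
            if (!Objects.equals(primary.get(field), secondary.get(field))) {
                noise.add(field);
            }
        }
        return noise;
    }

    // Compares primary and candidate after dropping the noisy fields from both.
    static boolean equalIgnoringNoise(Map<String, Object> primary,
                                      Map<String, Object> secondary,
                                      Map<String, Object> candidate) {
        final Set<String> noise = detectNoise(primary, secondary);
        final Map<String, Object> cleanPrimary = new HashMap<>(primary);
        final Map<String, Object> cleanCandidate = new HashMap<>(candidate);
        cleanPrimary.keySet().removeAll(noise);
        cleanCandidate.keySet().removeAll(noise);
        return cleanPrimary.equals(cleanCandidate);
    }

    public static void main(String[] args) {
        // Hypothetical responses; "sequence" changes on every call.
        final Map<String, Object> primary = Map.of("name", "Alex", "sequence", 17);
        final Map<String, Object> secondary = Map.of("name", "Alex", "sequence", 42);
        final Map<String, Object> candidate = Map.of("name", "Alex", "sequence", 99);
        final Map<String, Object> renamed = Map.of("fullname", "Alex", "sequence", 99);

        System.out.println(equalIgnoringNoise(primary, secondary, candidate)); // true
        System.out.println(equalIgnoringNoise(primary, secondary, renamed));   // false
    }
}
```

The second call still fails because the renamed field is a real difference, not noise: primary and secondary agree on name, so it stays in the comparison.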

It is important to note that by default Diferencia ignores any non-safe operation such as POST, PUT, PATCH, etc., because of possible side effects on the services. You can disable this behaviour by using the --unsafe flag.
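
As a rough sketch of this kind of filter, the following plain Java uses the HTTP specification’s definition of safe methods (RFC 7231, section 4.2.1). The exact set of methods Diferencia forwards is an assumption here, and SafeMethodFilter, shouldForward and allowUnsafe are hypothetical names; only the --unsafe flag itself comes from Diferencia.

```java
import java.util.Locale;
import java.util.Set;

public class SafeMethodFilter {

    // Methods defined as "safe" (read-only) by RFC 7231, section 4.2.1.
    private static final Set<String> SAFE_METHODS = Set.of("GET", "HEAD", "OPTIONS", "TRACE");

    // Decides whether a request may be forwarded to the services under comparison.
    // allowUnsafe models the effect of starting the proxy with the --unsafe flag.
    static boolean shouldForward(String method, boolean allowUnsafe) {
        return allowUnsafe || SAFE_METHODS.contains(method.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(shouldForward("GET", false));  // true
        System.out.println(shouldForward("POST", false)); // false
        System.out.println(shouldForward("POST", true));  // true
    }
}
```

Sending a POST to both primary and candidate would, for example, create the same record twice, which is why replaying unsafe requests is opt-in.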

Diffy or Diferencia

Diferencia is based on the idea of another tap compare framework called OpenDiffy, but there are some differences between them. Diferencia:

  • Is written in Go to offer a lightweight experience in containers.
  • Is ready to be used in Kubernetes and OpenShift clusters.
  • Can be used to mirror traffic.
  • Exposes results through a REST API and also in Prometheus format.
  • Integrates with Istio.
  • Supports Postel’s law (more about this later).

Diferencia Java

Diferencia-Java is a wrapper around Diferencia that gives you a Java API for managing it, without having to care that it is implemented in Go. Diferencia-Java provides the following features:

  • Installs Diferencia automatically; you don’t need to install anything manually.
  • Starts and stops Diferencia without dealing directly with the CLI.
  • Provides a specific HttpClient that connects to the Diferencia REST API to configure it or get results.
  • Can be used from plain Java.
  • Integrates with JUnit 4 and JUnit 5.
  • Integrates with the AssertJ library to make tests readable.

Java example

For this example, a simple REST API is used to show all the features of Diferencia in an easy way.

The service is developed using the MicroProfile spec, and it looks like this:

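@Path("/user") // path assumed from the /user request used in the test below
public class HelloWorldEndpoint {

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public Response getUserInformation() {
       final JsonObject doc = Json.createObjectBuilder()
           .add("name", "Alex")
           .build();
       return Response.ok(doc.toString()).build();
    }
}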

Let’s see how Diferencia can be used while this service evolves through different versions. For the sake of simplicity, the following assumptions are made:

  • Service runs on localhost.
  • Primary service runs on port 9090.
  • Secondary service runs on port 9091.
  • Candidate service runs on port 9092.

Java Test

For this example, JUnit 5 is used for developing the test code, running Diferencia and detecting regressions. Basically, this test replays a list of URLs specified in a file against Diferencia. Finally, it asserts whether there are regressions or not.

The following dependencies must be on your classpath and registered in your build tool:


And write a JUnit test that reads URLs from a file:

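@ExtendWith(DiferenciaExtension.class) // extension class name assumed
@DiferenciaCore(primary = "http://localhost:9090", candidate = "http://localhost:9092")
public class DiferenciaTest {

   private final OkHttpClient client = new OkHttpClient();

   @Test
   public void should_detect_any_possible_regression(Diferencia diferencia) throws IOException {
       // Given
       final String diferenciaUrl = diferencia.getDiferenciaUrl();

       // When: replay every path listed in links.txt (file location assumed) through the proxy
       Files.lines(Paths.get("src/test/resources/links.txt"))
               .forEach((path) -> sendRequest(diferenciaUrl, path));

       // Then: the proxy must not have recorded any difference
       // (assertion assumed to come from the Diferencia AssertJ integration)
       assertThat(diferencia).hasNoErrors();
   }

   private void sendRequest(String diferenciaUrl, String path) {
       final Request request = new Request.Builder()
           .addHeader("Content-Type", "application/json")
           .url(diferenciaUrl + path)
           .build();
       try {
           // Execute the call and close the response to release the connection
           client.newCall(request).execute().close();
       } catch (IOException e) {
           throw new IllegalArgumentException(e);
       }
   }
}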

When you run this test, a request to /user is sent to the Diferencia proxy, which is started automatically by the JUnit extension. When all requests defined in the links.txt file have been processed, the test asserts that there are no errors in the Diferencia proxy, which means that there are no regressions in the new service.

Since both service instances are currently exactly the same, just running on different ports, everything is fine.

In a more complicated case, this file should be generated as a result of capturing public traffic or by simply redirecting the public traffic to the Diferencia proxy using a mirroring traffic technique. As said previously, this is out of the scope of this post.

Now, let’s try to add a change that breaks backwards compatibility on the new service by changing the name field to fullname.

final JsonObject doc = Json.createObjectBuilder()
           .add("fullname", "Alex")
           .build();

Then deploy this new version and run the test again: it reports a regression on the path /user.

It is time to see noise detection in action. Let’s modify both the existing and new service to contain a random number, and deploy them again.

final JsonObject doc = Json.createObjectBuilder()
           .add("name", "Alex")
           .add("sequence", new Random().nextInt())
           .build();

Run the test again. Obviously, you’ll get a failure because the sequence field contains a randomly generated value.

This is a perfect use-case for automatic noise detection, so you need to deploy a secondary service at port 9091 and enable Diferencia to use noise detection.

@DiferenciaCore(primary = "http://localhost:9090", candidate = "http://localhost:9092",
   config = @DiferenciaConfig(secondary = "http://localhost:9091", noiseDetection = true))

Run the test again, and you will see the green bar. Automatic noise detection identifies that the value of the sequence field is noise, and it is dropped from comparison logic.

So far, you’ve seen that Diferencia can be used for detecting regressions, but there is still an important use case to cover, and this is how to correctly implement a rename of a field in a new version of service without triggering a regression.

Subset Mode

To rename a field in a response, both consumer and provider should follow Postel’s law when serializing and deserializing messages. Postel’s law says (paraphrasing), “Be conservative in what you send, be liberal in what you accept”.

If you want to rename the field name to fullname, you first need to provide both fields so that you do not break any consumer.

In the previous example, the new version of the service should look like:

final JsonObject doc = Json.createObjectBuilder()
           .add("name", "Alex")
           .add("fullname", "Alex")
           .add("sequence", new Random().nextInt())
           .build();

Now the consumer is still compatible with the new version, so no regression is introduced. Or is it? Deploy the new service and run the Diferencia test, and you get a failure. The reason is that primary and candidate are not equal; the new version has one field that the old version does not have. To fix this false positive, Diferencia has the subset mode. In this mode Diferencia does not fail when the old version’s response is a subset of the new version’s response.

Change the test to configure Diferencia to get started in subset mode.

@DiferenciaCore(primary = "http://localhost:9090", candidate = "http://localhost:9092",
   config = @DiferenciaConfig(secondary = "http://localhost:9091", noiseDetection = true, differenceMode = DiferenciaMode.SUBSET))

Run the test again, and you get a green bar again. Diferencia can therefore still be used to detect regressions in cases like this, where fields are added in a backward-compatible way.
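
The subset check can be sketched in plain Java as follows. This is an illustration of the idea only, not Diferencia’s code: responses are modelled as flat Map<String, Object> documents, noise handling is left out, and SubsetSketch and isSubset are hypothetical names.

```java
import java.util.Map;
import java.util.Objects;

public class SubsetSketch {

    // True when every field of the primary response appears, with an equal value,
    // in the candidate response. Extra fields in the candidate are tolerated.
    static boolean isSubset(Map<String, Object> primary, Map<String, Object> candidate) {
        for (Map.Entry<String, Object> field : primary.entrySet()) {
            if (!candidate.containsKey(field.getKey())
                    || !Objects.equals(field.getValue(), candidate.get(field.getKey()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical responses from the old service and two possible new versions.
        final Map<String, Object> oldResponse = Map.of("name", "Alex");
        final Map<String, Object> bothFields = Map.of("name", "Alex", "fullname", "Alex");
        final Map<String, Object> renamedOnly = Map.of("fullname", "Alex");

        System.out.println(isSubset(oldResponse, bothFields));  // true: field added, nothing removed
        System.out.println(isSubset(oldResponse, renamedOnly)); // false: "name" disappeared
    }
}
```

The check is deliberately asymmetric: the new version may add fields, but removing or changing a field that the old version returned still counts as a regression.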

More Features

In this post, you’ve learned how to use Diferencia with Java, but keep in mind that Diferencia is written in Go, which means that it can be used as a standalone tool with services written in any language.

Also, Diferencia gives you the following features:

  • HTTPS support.
  • Exposing results to be consumed by a REST API and/or Prometheus.
  • Visual dashboard.
  • Average time elapsed in primary and candidate calls.

Contract Tests

Tap compare tests are not a substitute for contract tests, but they act as a “guardian”, so that anything not covered by a contract verification test (i.e. an operation not specified in the contract) is not able to introduce a regression when the new service is released to production.

It is important to note that contract testing requires a substantial amount of knowledge to implement effectively (especially in the case of consumer-driven contract development), and all teams in the project must be highly committed to the technique.

Within contract testing there is a step that involves the generation of the contract, hence we also need a process to automate this, to keep it up-to-date, or to guard against any possible error being introduced during this (potentially) manual step.


Tap compare is a good test technique to add to your toolbox in order to validate that there are no regressions in a new version of a service, without having to curate and maintain a test script. You can either capture existing production traffic and replay it later, or use the mirroring traffic technique, which clones each request and sends it to both your old and new service versions.

In this post, I have focused on Diferencia and its integration with Java, but it can be used as a standalone service and does not require the use of Java (or any JVM language). 

If you want to increase the quality of your application and add a guard to prevent regressions in new releases, then tap compare is a technique that can help you.

About the Author

Alex Soto is a software engineer at Red Hat in the Developers group. He is passionate about the Java world and software automation, and believes in the open source software model. Alex Soto is the creator of the NoSQLUnit and Diferencia projects, a member of the JSR374 (Java API for JSON Processing) Expert Group, the co-author of the book Testing Java Microservices by Manning, and a contributor to several open source projects. A Java Champion since 2017 and international speaker, he has talked about new testing techniques for microservices and continuous delivery in the 21st century. You can find him on Twitter @alexsotob.

Community comments

  • next errors found

    by Michael Martinsson,

    I love the concept behind Diferencia. Have been thinking myself of how to use production data to test. This is very interesting. But...

    I get an error ('next errors found...') when running the first test and can't find the source code for the article or any means of support, discussion etc.

  • Re: next errors found

    by Alex Soto,

    Yes I have the source code is at but feel free to ping me on twitter if you find any problem.
