# Artfully Benchmarking Java 8 Streams and Lambdas

by Monica Beckwith on Dec 10, 2015. Estimated reading time: 2 minutes

The first version of Takipi’s recent blog post, 'How Misusing Streams Can Make Your Code 5 Times Slower', drew criticism of how it benchmarked the performance degradation of Java 8 Streams. Takipi quickly responded with an optimized version of the original benchmark and corrected result attributions.

With the help of Streams and Lambdas in Java 8, Java developers can write in a functional programming style, which is different from programming with traditional For-Loops and Iterators. (InfoQ has previously pitted Java 8’s functional style against imperative style programming in Java.)

Takipi’s blog post tested Java 8’s functional programming features using an example of finding a maximum value in an ArrayList. The test compared functional vs. imperative programming style by implementing Iterators, For-loops, For-Each loops, Stream, parallel Stream, and Lambdas (with and without For-Each). The first iteration of the benchmark was written in a way that kept a few basic JIT compiler optimizations from being applied, and in some cases optimizations applied to only some of the test cases while the others did not reap their benefits.
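To make the compared styles concrete, here is a minimal sketch of how each variant might look when finding the maximum of a list. The class name `MaxStyles` and its method names are hypothetical and are not taken from Takipi's benchmark; the sketch only illustrates the styles and does not measure performance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical names for illustration; not the code from Takipi's benchmark.
class MaxStyles {

    // Imperative: explicit Iterator.
    static int iteratorMax(List<Integer> integers) {
        int max = Integer.MIN_VALUE;
        for (Iterator<Integer> it = integers.iterator(); it.hasNext(); ) {
            max = Math.max(max, it.next());
        }
        return max;
    }

    // Imperative: indexed for loop.
    static int forLoopMax(List<Integer> integers) {
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < integers.size(); i++) {
            max = Math.max(max, integers.get(i));
        }
        return max;
    }

    // Imperative: for-each loop.
    static int forEachLoopMax(List<Integer> integers) {
        int max = Integer.MIN_VALUE;
        for (Integer n : integers) {
            max = Math.max(max, n);
        }
        return max;
    }

    // Functional: sequential stream reduced with a method reference.
    static int streamMax(List<Integer> integers) {
        return integers.stream().reduce(Integer.MIN_VALUE, Integer::max);
    }

    // Functional: the same reduction on a parallel stream.
    static int parallelStreamMax(List<Integer> integers) {
        return integers.parallelStream().reduce(Integer.MIN_VALUE, Integer::max);
    }

    // Functional: reduction with an explicit lambda.
    static int lambdaMax(List<Integer> integers) {
        return integers.stream().reduce(Integer.MIN_VALUE, (a, b) -> Integer.max(a, b));
    }

    // forEach with a lambda: a one-element array holds the running maximum,
    // because a lambda can only capture effectively final local variables.
    static int forEachLambdaMax(List<Integer> integers) {
        int[] max = { Integer.MIN_VALUE };
        integers.forEach(n -> max[0] = Math.max(max[0], n));
        return max[0];
    }

    public static void main(String[] args) {
        List<Integer> integers = new ArrayList<>(Arrays.asList(3, 9, 1, 7));
        System.out.println(iteratorMax(integers));
        System.out.println(forLoopMax(integers));
        System.out.println(forEachLoopMax(integers));
        System.out.println(streamMax(integers));
        System.out.println(parallelStreamMax(integers));
        System.out.println(lambdaMax(integers));
        System.out.println(forEachLambdaMax(integers));
    }
}
```

All seven methods return the same value for the same input; they differ only in how the iteration is expressed, which is what the benchmark set out to compare.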

Following the community critiques, Takipi quickly revised their benchmark (built on JMH, the Java Microbenchmark Harness). The first major change in the revised benchmark was the removal of the volatile keyword for the ‘integers’ field as shown below:

-    volatile List<Integer> integers = null;

+    List<Integer> integers = null;

As Oracle's Java performance engineer Sergey Kuksenko pointed out, the above change helped the JIT compiler to employ the ‘range check elimination’ optimization.
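The following sketch shows where the field is read in an indexed loop, which is why the keyword matters. The class `FieldAccess` and its members are hypothetical. The sketch assumes, in line with the point attributed to Kuksenko above, that a volatile field has to be re-read on every access, so the JIT generally cannot treat the list reference as loop-invariant. The code only demonstrates the two shapes; measuring the difference would require a harness such as JMH.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class for illustration; not the code from Takipi's benchmark.
class FieldAccess {
    // As in the first benchmark version: each access is a volatile read.
    volatile List<Integer> volatileIntegers;

    // As in the revised version: a plain field.
    List<Integer> plainIntegers;

    FieldAccess(List<Integer> data) {
        this.volatileIntegers = data;
        this.plainIntegers = data;
    }

    int maxViaVolatileField() {
        int max = Integer.MIN_VALUE;
        // size() and get(i) each go through a fresh volatile read of the field
        // on every iteration.
        for (int i = 0; i < volatileIntegers.size(); i++) {
            max = Math.max(max, volatileIntegers.get(i));
        }
        return max;
    }

    int maxViaPlainField() {
        int max = Integer.MIN_VALUE;
        // With a plain field the JIT is free to read the reference once and
        // reason about the loop bounds (assumption: whether it does so depends
        // on the JVM and on inlining decisions).
        for (int i = 0; i < plainIntegers.size(); i++) {
            max = Math.max(max, plainIntegers.get(i));
        }
        return max;
    }

    public static void main(String[] args) {
        FieldAccess fa = new FieldAccess(new ArrayList<>(Arrays.asList(3, 9, 1, 7)));
        System.out.println(fa.maxViaVolatileField());
        System.out.println(fa.maxViaPlainField());
    }
}
```

Both methods compute the same result; the only difference is the memory semantics of the field read inside the loop.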

The second major change was the elimination of auto-boxing, since the benchmark would run into auto-boxing issues for Streams. The changes were made at multiple places in the benchmark code. One such change is shown in the code snippet below:

-    Optional<Integer> max = integers.stream().reduce(Integer::max);

-    return max.get();

+    return integers.stream().mapToInt(Integer::intValue).reduce(Integer.MIN_VALUE, Integer::max);
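As a self-contained sketch of the change above, the two reductions can be placed side by side. The class `BoxingDemo` and its method names are hypothetical. The third method uses `IntStream.max()`, which is part of the standard library but not part of the benchmark change shown above; it is included only to show how the primitive variants differ on an empty list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.OptionalInt;

// Hypothetical class for illustration; not the code from Takipi's benchmark.
class BoxingDemo {

    // Original form: the reduction runs on a Stream<Integer>, so each step
    // unboxes its two arguments and may box the result again.
    static int boxedMax(List<Integer> integers) {
        Optional<Integer> max = integers.stream().reduce(Integer::max);
        return max.get(); // throws NoSuchElementException for an empty list
    }

    // Revised form: mapToInt unboxes each element once, and the reduction
    // then runs on primitive ints in an IntStream.
    static int primitiveMax(List<Integer> integers) {
        return integers.stream()
                .mapToInt(Integer::intValue)
                .reduce(Integer.MIN_VALUE, Integer::max);
    }

    // Alternative on the same IntStream: max() reports "no elements"
    // explicitly instead of returning the identity value.
    static OptionalInt primitiveMaxOptional(List<Integer> integers) {
        return integers.stream().mapToInt(Integer::intValue).max();
    }

    public static void main(String[] args) {
        List<Integer> integers = Arrays.asList(3, 9, 1, 7);
        System.out.println(boxedMax(integers));
        System.out.println(primitiveMax(integers));
        System.out.println(primitiveMaxOptional(integers).getAsInt());

        // The variants differ on an empty list.
        List<Integer> empty = Collections.emptyList();
        System.out.println(primitiveMax(empty) == Integer.MIN_VALUE);
        System.out.println(primitiveMaxOptional(empty).isPresent());
    }
}
```

Note that the variants are not interchangeable for empty input: the original form throws, the revised form returns `Integer.MIN_VALUE`, and `max()` returns an empty `OptionalInt`.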

The revised benchmarking highlighted that parallel Stream had a slight edge over the other test cases; but overall the imperative programming style group of tests wins over the functional programming style group when the use case is as narrow as iterating through an ArrayList to find the max value.

InfoQ contacted the blog author, Alex Zhitnitsky, who said:

There's a lot of excitement around the new Java 8 features, which is great, BUT many developers still misuse it. When benchmarking the use cases, we wanted to go for a non-optimized benchmark, since in real life day to day usage, many developers use these features out of the box.

The post shows a specific use case that favors loops, compared with a sloppy yet short/intuitive/quick implementation with streams. So for example:

integers.stream().reduce(Integer::max);

versus

integers.stream().mapToInt(Integer::intValue).reduce(Integer.MIN_VALUE, Integer::max);

The second implementation would be the right way to go by the optimized benchmark, although it's easy to let mapToInt slip and create an auto-boxing issue with the shorter implementation. Both of the benchmarks are correct, since they both measure legitimate implementations, even though the first version doesn't contain the optimizations (which can be non-intuitive, and longer, like in this mapToInt example).

The takeaway from this benchmarking exercise is that it’s really important to know what you are benchmarking and to profile and compare the generated code especially when employing micro-benchmarks.


Java8 Ducktape

The 2nd incantation is the optimal solution, but I agree with the comments of the original author about developers just picking the "non-optimal" version. No one pressured for time to complete a project will just pick the optimal solution. At some stage the Oracle engineers should stop slapping more libraries onto the base language, and actually optimize the base language. IMO, the compiler should do a performant boxing/unboxing operation. From the developer perspective, the first version is the optimal solution, dammit; why else would anyone use a compiled language, if not for the expectation that the compiler would do the optimization?

the example is Map-Reduce without Map

The real problem with the example in the benchmark is that all the work is done in the reduce part, while a map part does not exist at all. The main benefit of Java streams as parallel operations is to parallelize the map operation in a thread-safe way without thinking about locking and shared mutable data. This is very much like a map-reduce framework. The reduce part cannot be 100% parallelized, because this is where the data is merged together. An example with most of the work being in the map operation would be at least 5 times faster in a parallel stream than in a sequential for-loop.

Unbelievable... by Ant hony

I can't believe InfoQ is paying any attention at all to this clickbait post (which, by the way, was first titled "How Using ..." instead of "How Misusing ...").

integers.stream().reduce(Integer::max);
versus
integers.stream().mapToInt(Integer::intValue).reduce(Integer.MIN_VALUE, Integer::max);

While obviously the proper comparison is:
integers.stream().reduce(Integer::max);
versus
integers.stream().mapToInt(i -> i).max();

in which case the second is clearly more intuitive, more readable, more performant, ...

Re: the example is Map-Reduce without Map

Hi Pascal,

Thanks for the comment! I agree that there are better use cases for parallel streams where they can show a bigger performance benefit (but that's not always the case btw: zeroturnaround.com/rebellabs/java-parallel-stre...).

In the Takipi post, they did provide a benefit after all, although not that big. The purpose was to show that sometimes just using loops can be more efficient, even though it might be tempting to use the new Java 8 features. Either way, benchmarking parallel streams is hard to do in isolation since they also depend on how many threads are in play from other parts of the system.