Brian Goetz Speaks to InfoQ on Pattern Matching for Java

| by Michael Redlich on Sep 27, 2017. Estimated reading time: 9 minutes |

On their continuing quest to make Java more terse, the language architects at Oracle are exploring pattern matching as a new language feature. Brian Goetz, Java language architect at Oracle, and Gavin Bierman, programming language researcher at Oracle, spoke to InfoQ about the prospect of incorporating pattern matching into the Java programming language.

Motivation

The motivation for this research is to improve upon some common Java programming idioms. Consider the following:

    
if (obj instanceof Integer) {
    int intValue = ((Integer) obj).intValue();
    // use intValue
}
    

There are three operations at work:

  • A test to determine if obj is of type Integer
  • A conversion that casts obj to type Integer
  • A destructuring operation that extracts an int from Integer

Now consider testing against other data types within an if...else construct:

    
String formatted = "unknown";
if (obj instanceof Integer) {
    int i = (Integer) obj;
    formatted = String.format("int %d", i);
} else if (obj instanceof Byte) {
    byte b = (Byte) obj;
    formatted = String.format("byte %d", b);
} else if (obj instanceof Long) {
    long l = (Long) obj;
    formatted = String.format("long %d", l);
} else if (obj instanceof Double) {
    double d = (Double) obj;
    formatted = String.format("double %f", d);
} else if (obj instanceof String) {
    String s = (String) obj;
    formatted = String.format("String %s", s);
}
...
    

While the above code is commonly used and easily understood, it is tedious because the same boilerplate repeats in every branch, and that repetition gives bugs places to hide. The boilerplate also tends to obscure the business logic: the cast, for example, is redundant once instanceof has already confirmed the type of the instance.

Goetz and Bierman explain the overall aim of their proposed improvements:

Rather than reach for ad-hoc solutions, we believe it is time for Java to embrace pattern matching. Pattern matching is a technique that has been adapted to many different styles of programming languages going back to the 1960s, including text-oriented languages like SNOBOL4 and AWK, functional languages like Haskell and ML, and more recently extended to object-oriented languages like Scala (and most recently, C#).

A pattern is a combination of a predicate that can be applied to a target, along with a set of binding variables that are extracted from the target if the predicate applies to it.
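
One way to picture this definition in today's Java is as a single function that either fails or returns the extracted binding. The sketch below is only an illustration of that idea, not the proposed language feature; MatchPattern, INTEGER, and describe are hypothetical names.

```java
import java.util.Optional;

// Hypothetical model of a pattern: one method that applies the predicate
// and, when it holds, returns the extracted binding.
interface MatchPattern<T, B> {
    // An empty result means the predicate did not apply to the target.
    Optional<B> match(T target);
}

class PatternSketch {
    // A type-test pattern: applies when the target is an Integer and binds it.
    static final MatchPattern<Object, Integer> INTEGER =
        target -> target instanceof Integer
            ? Optional.of((Integer) target)
            : Optional.empty();

    static String describe(Object obj) {
        return INTEGER.match(obj)
            .map(i -> "int " + i)
            .orElse("unknown");
    }

    public static void main(String[] args) {
        System.out.println(describe(42));      // prints "int 42"
        System.out.println(describe("hello")); // prints "unknown"
    }
}
```

The test and the extraction cannot be separated here: the binding is only reachable through a successful match.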

Goetz and Bierman have been experimenting with several kinds of patterns, along with new keywords such as matches and exprswitch.

Operator matches

A proposed matches operator would fold the instanceof test, the cast, and the variable binding into a single operation. For example:

    
if (x matches Integer i) {
    // use i here
}
    

The binding variable, i, is bound only if x matches the Integer type pattern. Rewriting the earlier if...else chain for the other data types in the same way removes the explicit casts.
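
The matches keyword was only a proposal when this article was written. The form that later shipped, finalized in Java 16, reuses instanceof with a type pattern. Assuming Java 16 or later, the earlier chain can be sketched roughly as follows; the class and method names are illustrative.

```java
// Requires Java 16 or later (type patterns for instanceof).
class InstanceofPatternSketch {
    static String format(Object obj) {
        String formatted = "unknown";
        // Each branch tests the type and binds a typed variable in one step,
        // so no explicit cast is needed.
        if (obj instanceof Integer i) {
            formatted = String.format("int %d", i);
        } else if (obj instanceof Byte b) {
            formatted = String.format("byte %d", b);
        } else if (obj instanceof Long l) {
            formatted = String.format("long %d", l);
        } else if (obj instanceof Double d) {
            formatted = String.format("double %f", d);
        } else if (obj instanceof String s) {
            formatted = String.format("String %s", s);
        }
        return formatted;
    }

    public static void main(String[] args) {
        System.out.println(format(42));           // prints "int 42"
        System.out.println(format("hi"));         // prints "String hi"
        System.out.println(format(new Object())); // prints "unknown"
    }
}
```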

Improvement on switch

Goetz and Bierman explain that “the switch statement is a perfect ‘match’ for pattern matching.” For example:

    
String formatted;
switch (obj) {
    case Integer i: formatted = String.format("int %d", i); break;
    case Byte b:    formatted = String.format("byte %d", b); break;
    case Long l:    formatted = String.format("long %d", l); break;
    case Double d:  formatted = String.format("double %f", d); break;
    case String s:  formatted = String.format("String %s", s); break;
    default:        formatted = "unknown";
}
...
    

The above code is easier to read and far less cluttered. However, Goetz and Bierman point out a limitation of switch: “It is a statement, and therefore the case arms must be statements, too. We’d like an expression form that is a generalization of the ternary conditional operator, where we’re guaranteed that exactly one of N expressions are evaluated.”

Their solution is to propose a new expression form, exprswitch.

    
String formatted =
    exprswitch (obj) {
        case Integer i -> String.format("int %d", i);
        case Byte b    -> String.format("byte %d", b);
        case Long l    -> String.format("long %d", l);
        case Double d  -> String.format("double %f", d);
        case String s  -> String.format("String %s", s);
        default        -> "unknown";
    };
...
    

The patterns proposed by Goetz and Bierman include:

  • Type-test patterns (bind the cast target to a binding variable)
  • Destructuring patterns (destructure the target and recursively match its components against subpatterns)
  • Constant patterns (match on equality)
  • Var patterns (match anything and bind the target)
  • The _ pattern (matches anything)

Goetz spoke to InfoQ about pattern matching.

InfoQ: What kind of community response have you received since publishing your paper?

Goetz: We've received very positive responses; people who have used pattern matching in other languages really like it, and are happy to see it coming to Java. For folks who've not seen it before, we expect there will be more of an education effort as to why we think this is an important thing to add to the language.

InfoQ: How heavily has the design of Scala's match operator influenced the design so far? Are there any specific things that Scala's match can do which Java pattern matches will not be able to do?

Goetz: Scala is just one of the many languages that has informed our idea of what pattern matching in Java should be. Adding a feature to a language is rarely a matter of "porting" it from some other language; if we did that, it would look like a bag nailed on the side. We will likely end up in a place where we don't do all the things Scala's pattern matching does -- and also doing some things that Scala's does not.

We think that there's an opportunity to integrate pattern matching much more deeply into the object model than Scala has. Scala's patterns are effectively static; they can't be easily overloaded or overridden. While that's still very useful, we think we can do better.

Deconstruction is the dual of construction; just as OO languages give you choices for how to construct objects (constructors, factories, builders), we think having the same choices in deconstruction will lead to richer APIs. While pattern matching has historically been associated with functional languages, we think it has a rightful place in OO too -- one that just has been historically overlooked.

It is easy to get excited about language features, but we think language features should primarily be enablers for better libraries -- because there's so much leverage in having a rich ecosystem of libraries. And pattern matching will definitely enable us to write simpler, safer libraries.

As an example of an OO library that never realized it needed pattern matching, consider the pair of methods in java.lang.Class:

        
    public boolean isArray() { ... }
    public Class<?> getComponentType() { ... }
        
    

The second method has a precondition -- that the first method return true. Having the logic of an operation spread out over multiple API points means complexity for the API writer (who has to specify more, and test more) and complexity for the user (who can more easily get things wrong.) Logically, these two methods are one pattern, which fuses the applicability test of "does this class represent an array class" and the conditional extraction of the component type. If it were actually expressed as such, it would be easier to write, and impossible to use incorrectly:

        
    if (aClass matches Class.arrayClass(var componentType)) { ... }
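
For comparison, the same fusion of test and extraction can be approximated in existing Java by returning an Optional. This is a rough sketch under that assumption, not the pattern Goetz describes; arrayComponentType is a hypothetical helper and is not a method of java.lang.Class.

```java
import java.util.Optional;

class ArrayClassSketch {
    // Hypothetical helper: fuses the isArray() test with the extraction of
    // the component type, so the result cannot be read without the check.
    static Optional<Class<?>> arrayComponentType(Class<?> aClass) {
        if (aClass.isArray()) {
            return Optional.of(aClass.getComponentType());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        arrayComponentType(int[].class)
            .ifPresent(c -> System.out.println("array of " + c.getName())); // prints "array of int"
        System.out.println(arrayComponentType(String.class).isPresent());  // prints "false"
    }
}
```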
        
    

InfoQ: Is it a goal or a non-goal to provide implementation technology that would allow Scala to rebase its match on top of the Java version under development (in a similar way to how we saw Scala 2.12 rebase traits on top of interfaces)?

Goetz: Just as with Lambda, we expect that, in the course of designing this feature, we will identify sensible building blocks that can go into the underlying platform, that multiple languages can benefit from -- and provide greater interoperability between multiple languages.

InfoQ: In the Scala implementation, significant amounts of additional synthetic bytecode are generated to support features like destructuring of case classes. Does adding a similar amount of VM and bytecode machinery here have any drawbacks?

Goetz: The risk is that the compiler is stepping outside its normal role, and assigning semantics to class members that are usually under the control of the developer. While this is convenient, it may not always be what the user wants -- which often leads to requests for dials and knobs for tweaking the generation (e.g., comparing arrays with Arrays.equals() rather than Object.equals()).
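
The array example in that answer can be made concrete. Arrays inherit the identity-based equals() of Object, so a generated equals() that simply delegates to it would treat two arrays with the same contents as different. In this small sketch, the Samples class and its two comparison methods are hypothetical and exist only to show the two choices side by side.

```java
import java.util.Arrays;

// Hypothetical data-carrying class with an array component.
final class Samples {
    private final int[] values;

    Samples(int... values) {
        this.values = values.clone();
    }

    // What a naive generated equals() would do: compare array identity.
    boolean sameByObjectEquals(Samples other) {
        return values.equals(other.values);
    }

    // What many users would want instead: compare element by element.
    boolean sameByArraysEquals(Samples other) {
        return Arrays.equals(values, other.values);
    }

    public static void main(String[] args) {
        Samples x = new Samples(1, 2, 3);
        Samples y = new Samples(1, 2, 3);
        System.out.println(x.sameByObjectEquals(y)); // prints "false"
        System.out.println(x.sameByArraysEquals(y)); // prints "true"
    }
}
```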

InfoQ: Will destructuring be limited to data classes?

Goetz: We plan to roll out pattern matching over multiple releases, starting with simple patterns like type tests, then destructuring patterns on data classes, and eventually arbitrary user-written destructuring patterns. So while we definitely do not intend to limit destructuring to data classes, there may be some window of time when that is true.

InfoQ: Can you speak to the relationship between data classes and value types?

Goetz: The two are almost completely orthogonal. Values are about aggregates that have no identity; by explicitly disavowing identity, the runtime can optimize in-memory layout, flattening away indirections and object headers, and more freely cache value components across synchronization points. Data classes are about disavowing complex, indirect relationships between a class's representation and its API contract; by doing so, the compiler can fill in common class members like constructors, pattern matchers, equals, hashCode, and toString. A class might be suitable for being a value class, or a data class, or neither, or both; all combinations make sense.
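
For reference, the data-class idea later reached the language as records, finalized in Java 16. The following is a minimal sketch assuming Java 16 or later; Point and RecordSketch are illustrative names. It shows the compiler filling in the constructor, accessors, equals, hashCode, and toString from the declared state.

```java
// Requires Java 16 or later (records).
// The header declares the state; the compiler derives the common members.
record Point(int x, int y) {}

class RecordSketch {
    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2);
        System.out.println(p.equals(q));                  // prints "true"
        System.out.println(p.hashCode() == q.hashCode()); // prints "true"
        System.out.println(p);                            // prints "Point[x=1, y=2]"
    }
}
```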

InfoQ: Sealing requires some support from the source code compiler.

Goetz: Sealing (like finality) not only requires support from the compiler, but ideally would get support from the JVM too -- so that language-level constraints like "X can't extend Y" are enforced by the JVM.

InfoQ: Is the intention for sealing to mean something like "may not be subclassed outside this module"?

Goetz: There's a range of what sealing could mean. The simplest form would mean "can only be extended by other classes declared in the same source file" -- not only is this a common interpretation of sealing, but it is also extremely simple because things cannot get out of sync via separate compilation. One could also define sealing to mean "in the same package" or "in the same module"; one could get even crazier and allow lists of "friends," or complex runtime predicates. We will almost certainly land on the simple end of this spectrum; the return-on-complexity beyond the simplest interpretation drops off pretty quickly.
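
For reference, sealed types were finalized in Java 17, and omitting the permits clause gives the "same source file" meaning described in that answer. The sketch below assumes Java 17 or later; Shape, Circle, Square, and SealedSketch are hypothetical names.

```java
// Requires Java 17 or later (sealed types).
// With no permits clause, the permitted subtypes are exactly those
// declared in this same source file.
sealed interface Shape {}

final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
}

class SealedSketch {
    static double area(Shape shape) {
        if (shape instanceof Circle c) {
            return Math.PI * c.radius * c.radius;
        }
        if (shape instanceof Square s) {
            return s.side * s.side;
        }
        // Unreachable while Shape stays sealed, but an if chain is not
        // checked for exhaustiveness by the compiler.
        throw new IllegalArgumentException("unexpected shape: " + shape);
    }

    public static void main(String[] args) {
        System.out.println(Shape.class.isSealed());  // prints "true"
        System.out.println(area(new Square(2.0)));   // prints "4.0"
    }
}
```

A class in another source file that tried to implement Shape would be rejected at compile time.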

InfoQ: Will the new six-month Java release cycle better facilitate integrating pattern matching into the language?

Goetz: We hope so! We already have identified sensible "chunks" into which pattern matching can be divided, so that we can roll out simple pattern support relatively quickly, and keep building on that.

InfoQ: Will a prototype be available for pioneer users at some point?

Goetz: It already is -- for early adopters who are willing to compile a JDK from source. There's a branch in the "Amber" forest with support for type test patterns in switch and a "matches" predicate.

InfoQ: What’s on the horizon for your pattern matching research?

Goetz: We're still exploring how we want to surface matchers as class members, and the questions surrounding how they play into overloading and inheritance. There's lots to figure out there.


"There's lots to figure out there" by David Smith

"There's lots to figure out there" -- that last statement speaks volumes. Taking an idea from other languages that is semantically simple and has been around for decades, and turning it into a research project should raise some flags.

Why do we continue to expand a platform that is as out-dated and bloated as Java? Is it the inertia of the tech industry, or is it the lack of an obvious successor? Probably a bit of both.

Sure, there are a lot of languages out there; the most I've seen in my 25+ years in software development. Which might be a big part of the problem -- it's hard to get to consensus.

Groovy does this already by Matt F

First off, I love Java and the associated ecosystems.

My 2 cents:

Groovy already does this via the instanceof operator; I don't particularly see why "matches" is needed on top of this. I personally favour Kotlin and/or Groovy over Java when suitable. Those languages already contain many of these types of syntax and seem to be driving the adoption of this sort of thing in Java proper. Competition is great for everyone.

In my experience, most often but not always, if you have code that needs to match on types then you probably have some design problem in your code.

Re: Groovy does this already by Michael Redlich

Hi Matt:

Thanks for your thoughts on this. I totally understand what you're saying. I believe that DSL languages (Kotlin, Groovy, etc.) that compile to the JVM were designed for more ease of use.

But Java has been around for over 22 years and it appears as if it's not going away any time soon. As a matter of fact, this is indeed a very exciting time for Java with the release of Java 9 (and all its new features) along with Java EE going open source, among other things.

If Brian Goetz and his colleagues at Oracle can improve Java with new features such as pattern matching, then I believe this is a good thing.
