
Under the Hood with a Prototype of Enhanced Generics for Java

by Ben Evans on Dec 29, 2014

Java 8 was only released this year, and Java 9 is scheduled for mid-2016. Despite this, some interesting details of features planned for a future version of Java (hopefully Java 10) have emerged.

Specifically, two big features have begun prototyping: Enhanced generics and value types. Enhanced generics are the feature that should allow future Java developers to write code involving, e.g. List<int> without the pain of boxing of primitive types. However, the design of the proposed new form of generics contains some subtleties that have to be approached carefully, as Brian Goetz explains in his recent design paper.

Java has always had a focus on backwards compatibility, and under Oracle's stewardship, that viewpoint has been reaffirmed. For this reason, Oracle are pursuing a tactic similar to that used for the introduction of generics in Java 5 - an approach they refer to as "gradual migration compatibility".

The basic design problem that needs to be overcome is that Java's type system does not have a unified root. There is no type in Java that is a supertype of both Object and int. This can be seen in the structure of JVM bytecode - not least in the fact that the bytecodes for returning an int versus an object from a method are different opcodes - ireturn is different from areturn.
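
The split shows up in everyday code. The following is a minimal sketch in current Java (the class name `NoCommonRoot` and its methods are made up for illustration): an `int` can only reach an `Object`-typed variable by being boxed into an `Integer` first.

```java
class NoCommonRoot {
    // Returning an int as Object forces the compiler to box it (autoboxing).
    static Object asObject(int value) {
        return value; // compiled as Integer.valueOf(value) followed by areturn
    }

    // Returning the primitive directly uses the ireturn opcode instead.
    static int asInt(int value) {
        return value;
    }

    public static void main(String[] args) {
        Object boxed = asObject(42);
        System.out.println(boxed.getClass().getName()); // java.lang.Integer
        System.out.println(asInt(42));                  // 42
    }
}
```

Running `javap -c NoCommonRoot` on the compiled class should show the two different return opcodes.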

The current prototype uses an approach called "any" type variables, to indicate that the type variable can range over both reference types and primitives (and also over the proposed new value types). This is currently written as Container<any T> but the syntax is still a work in progress, so may well change before the feature ships.

The current thinking is that while List<Integer> and List<String> will continue to be represented at runtime by List.class (so type erasure still occurs for reference types), List<int> will be represented by a different runtime type (and potentially by a different classfile). This approach is called "generic specialization" for the primitive types. It also helps with another design issue - upgrading existing collection classes to use enhanced generics. A key design goal is to allow developers to have List<int>, so there must be a migration path for an existing generic type to support any type variables in future versions.
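
The erasure half of that statement can already be observed today. This is a small sketch in current Java (the class and method names are illustrative only); the `List<int>` line is commented out because it does not compile in Java 8.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // True if both lists share one runtime class, i.e. the type argument was erased.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        System.out.println(sameRuntimeClass(ints, strings)); // true

        // List<int> primitives = new ArrayList<>(); // rejected by the Java 8 compiler
    }
}
```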

There are also some surprises in terms of how enhanced generics fit into the type system. In particular, List<int> is not a subtype of the raw List type (if it was, then that would imply that List<int> was storing instances of Object). However, List<?> is a subtype of List, so this implies that List<int> is not a subtype of List<?>, and that wildcards do not work for enhanced generics.

The current prototype is a long way from being production ready, and there is much design and implementation work to do. In particular, the implementation of specialization is being actively worked on. Automatic generation of specialization code is desirable (as it reduces hand-written boilerplate), but this may require additional support in the bytecode and classloading subsystem. One intriguing possibility is the introduction of a metaprogramming facility at virtual machine level (but no direct Java language support). This approach is being referred to as "classdynamic" by analogy with invokedynamic and is described here.
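
To give a feel for the boilerplate that automatic specialization would remove, here is a hand-written sketch in current Java. The names `Box` and `IntBox` are hypothetical, and this is not how the prototype generates code; it only shows the kind of duplicate class a developer has to maintain by hand today.

```java
class SpecializationSketch {
    // Generic version: after erasure the field holds an Object reference.
    static final class Box<T> {
        private final T value;
        Box(T value) { this.value = value; }
        T get() { return value; }
    }

    // Hand-written int specialization: same shape, but the field is a plain int.
    static final class IntBox {
        private final int value;
        IntBox(int value) { this.value = value; }
        int get() { return value; }
    }

    public static void main(String[] args) {
        Box<Integer> boxed = new Box<>(7); // 7 is autoboxed to an Integer
        IntBox unboxed = new IntBox(7);    // no boxing involved
        System.out.println(boxed.get() + unboxed.get()); // 14
    }
}
```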

The development of enhanced generics and value types is being conducted through Project Valhalla, and more details can be found there.


Lack of Reified Generics by Luke deGruchy

I'm really looking forward to value types, and obviously generic types that can contain value types, but I believe choosing not to introduce reified generics is a mistake. The Ceylon guys have shown that it can be done: ceylon-lang.org/blog/2013/02/21/reification-fin...

OMG... by N T

This has been part of Scala for years, and we're still talking about how in (*number* *of* *years*) we will have this in Java...! Well, to be honest, I feel totally fine with this, as I've left Java behind already.

Not good by Gavin King

So this proposal, as it stands today, would make Java worse, and I hope that the Java community gets together to send a clear message to Oracle that it is unacceptable.

Java's type system today suffers from two major problems:


  • primitive types are a special case in the type system that can't easily be abstracted over, and

  • the system of generic types, hacked into Java 5 with insufficient review, is much too complex, with weird things like erasure, raw types, a system of variance that few developers understand well, and nastiness related to wildcard capture.


This proposal purports to address the first problem, but in practice arguably makes the situation worse, and definitely makes the second problem worse.

In the guise of making it easier to abstract over primitives, it makes it impossible to abstract over instantiations of any generic type. We now have things like `List<int>` which aren't `List<?>`s. This is obviously broken, and means we're simply digging ourselves into a deeper hole.

If this is a problem that Java needs to address, then it should do it the right way, by introducing an `Any` type that abstracts over all types in the language. If the Java SE team decides that this is too hard, for whatever reason, then it should simply drop this proposal, which would do more harm than it is worth.

Re: Not good by Luke deGruchy

From the InfoQ article:

"The current prototype uses an approach called "any" type variables, to indicate that the type variable can range over both reference types and primitives (and also over the proposed new value types)."


Re: Not good by Gavin King

The current prototype uses an approach called "any" type variables, to indicate that the type variable can range over both reference types and primitives (and also over the proposed new value types).


Right, I should have been clearer that what I'm talking about here is the ability to abstract using subtype polymorphism, not parametric polymorphism. To recap: Java today supports two sorts of polymorphism: subtype polymorphism and parametric polymorphism. At the intersection of these two facilities is variance, provided by wildcards. This proposal adds parametric polymorphism for primitive types, without addressing the "original sin" of Java's type system which is the lack of subtype polymorphism for primitives. Which means that we get a completely broken behavior for wildcards. Wildcards are already problematic enough in Java, and this makes them worse, not better.

OTOH, one could arrive at a much simpler proposal by simply saying that there is a type that is a supertype of both primitives and reference types, Any, or whatever you want to call it, and then parametric polymorphism for primitives simply follows from that much much more naturally. That's the way all other languages solve this problem. It's much simpler, much more elegant, and much less confusingly WTFy for everyone. Sure, this approach might require some enhancements to the VM, but I don't think that's too much to ask.

Re: Not good by Ant hony

How would your much simpler, much more elegant proposal satisfy the "gradual migration compatibility" requirement as explained in the paper?

Re: Not good by Gavin King

Since the Any type does not currently exist, it's introduction would by need to change the semantics of any existing code.

Re: Not good by Gavin King

Great: two typos in one sentence, partly thanks to autocomplete, and InfoQ doesn't let me edit comments on my phone. It's -> its, by -> not.

Re: Not good by Ben Evans

I don't believe it's possible to introduce an Any type without changing the semantics of existing code - and that is a red line for Oracle. Without a major policy shift from the top there will never be a backwards-incompatible change.

Also, to me, this issue goes deeper than I think I made clear in the article. Consider that Java 8 lambdas are still nominally typed. Without some additions to the type system we aren't going to be able to get away from Java's very nominative typing. Personally, I'd be happy to have wildcards restricted in the way Brian outlines if it means we get closer to proper function types.

Re: Not good by Gavin King

I don't believe it's possible to introduce an Any type without changing the semantics of existing code


Well, that's an assertion for which you provide no evidence. And I don't see any reason why it should be true.

Personally, I'd be happy to have wildcards restricted in the way Brian outlines if it means we get closer to proper function types.


I don't see how these two problems are connected. Ceylon has proper function types, with abstraction over arity, and a single root to its type system. So it's clearly possible to have both at the same time.

Re: Not good by Gavin King

P.S. Ben, you can very easily prove your above assertion by providing a code example whose semantics would change after introduction of Any. I can't think of one, but of course it's possible that I'm missing something.

Re: OMG... by Pierre Carrier

No it hasn't been.
Scala collections other than Array (which maps to JVM primitive arrays) box primitive types.

It can actually be a lot harder to avoid boxing in Scala than it is in Java.

Re: Not good by Ben Evans

Gavin,

How would you envisage the semantics of:

List<?> list = new ArrayList<>();
list.add(null);

working with the addition of an Any type? I can't see a good (or even reasonable) choice, but maybe you can.
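
For reference, this is how that snippet behaves in current Java. It is a minimal sketch with made-up class and method names: `null` is the only value the compiler lets you add through a `List<?>`.

```java
import java.util.ArrayList;
import java.util.List;

class WildcardAddDemo {
    // Adds null through the wildcard view and reports the resulting size.
    static int addNull(List<?> list) {
        list.add(null);      // allowed: null belongs to every reference type
        // list.add("text"); // rejected by the compiler: the element type is unknown
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(addNull(new ArrayList<String>())); // 1
    }
}
```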

Re: Not good by Gavin King

Ben, there would be a distinction between List<? extends Object> and List<? extends Any>. The add() method of List<? extends Any> would not accept null (or any other value). The add() method of List<? extends Object> would continue to accept null as it does today.

But that's indeed a great point: with null being a hole in Java's type system, we need to be careful about how it's handled w.r.t. variance.

Re: Not good by Ant hony

How would you envision that distinction in the add() methods to be made, and by whom (developer/compiler/VM)?

Also, I don't see how the Any type would be useful at all at a language level (i.e. for developers to use). Why would I want to write something like List<? extends Any> or declare a field of type Any? (and if I would, what would the default value of such a field be?)

PS: what I find striking, is that you pose your idea as "the" solution to this problem:
If this is a problem that Java needs to address, then it should do it the right way, by introducing an `Any` type that abstracts over all types in the language.

The people working on project Valhalla have been pondering over this issue for months. Heck, even "normal" Java developers like me would be able to come up with "introducing a common type" as a solution to generics over primitives. So to me it's obvious that the Valhalla people have thought of this long ago already, and are currently not viewing it as a viable solution.

Re: Not good by Gavin King

How would you envision that distinction in the add() methods to be made, and by whom (developer/compiler/VM)?


By the type system. I'm just describing what is essentially the usual behavior for a contravariant location in a covariant instantiation of a generic type.

Why would I want to write something like List<? extends Any>


For the exact same reason you would use the type List<?> - namely, to abstract over different instantiations of the type List. For example, you could form a heterogeneous collection of different instantiations of List: Collection<List<?>>.

Heck, even "normal" Java developers like me would be able to come up with "introducing a common type" as a solution to generics over primitives.


Which is precisely why it's the right way to solve the problem. It's intuitive instead of unintuitive. In general, it's always better to do the more intuitive thing unless there is some really compelling reason not to. So far, I don't see any reason not to, in this case.

It happens, occasionally, that the obvious solution is the wrong solution. But it also happens, very often, that the most obvious solution is actually the best one.

Note that other languages that descend from Java, including Scala and Ceylon have single-rooted type systems, precisely because that's perceived to be the more correct thing to do. And therefore a secondary reason for preferring Any is that, at least in principle, it could result in much better interop between these three languages, and others. As it is, what's being proposed for Java is not only unnatural for Java, but, I speculate, also much more difficult for other JVM languages to interoperate with or reuse.

So to me it's obvious that the Valhalla people have thought of this long ago already, and are currently not viewing it as a viable solution


Well sure, it's possible that I'm wrong, and that there's a really compelling reason why the "obvious way" can't work. But if that's the case then surely someone can explain why it can't possibly work. Waving your hands and just asserting that something isn't viable doesn’t convince someone who thinks it is viable. FTR, I have basically begged for an explanation in the valhalla ML, and the response was that it’s too much trouble to explain the reasoning. That doesn’t inspire confidence that the reasoning is very sound.

Re: Not good by Ben Evans

So the semantics of List<?> is that it remains really List<? extends Object> and that we have new syntax, say List<_?_> which is actually shorthand for List<_?_ extends Any> and all existing code continues to be understood as implementing the original type bound that we had?

If that's the case, then we still have serious problems with raw types:

List<?> is a subtype of List
List<int> is a subtype of List<_?_>
List<Integer> is a subtype of List<?> and List<_?_>

but

List<int> is NOT a subtype of List<?>
List<_?_> is NOT a subtype of List (for the reasons Brian outlines)

so the type system for generic types is no longer single-rooted.

This problem really isn't very easy to solve, and I don't think that the horrible mess that wildcards & the need to respect existing semantics (especially with respect to raw types) cause when trying to introduce an Any type is better than what's been proposed by Brian and co.

Re: Not good by Gavin King

Ben, as I've written elsewhere, I wouldn't try to make List<int> or List<Any> a subtype of the raw type List.

Raw types are a broken thing that exist in Java purely for the purpose of backward compatibility with pre-Java-5 code, and should never be used in new code. (Indeed, it would be best to deprecate them.) They're totally unsound, even in their current incarnation. (I can assign a List<String> to List, and then put an Integer in it.)
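
The unsoundness mentioned in parentheses can be reproduced in a few lines of current Java. This is a minimal sketch; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class RawTypeHole {
    // Returns true if reading the polluted list fails with a ClassCastException.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean failsOnRead() {
        List<String> strings = new ArrayList<>();
        List raw = strings; // allowed for backward compatibility
        raw.add(42);        // only an unchecked warning; an Integer is now in a List<String>
        try {
            String s = strings.get(0); // the compiler-inserted cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsOnRead()); // true
    }
}
```

The failure surfaces at the read, far from the line that caused it, which is what makes this kind of heap pollution hard to track down.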

So, sure, raw types are totally broken and can't really be incorporated into any generic type system in a sensible way. I don't think that should be any sort of influence on the design of the non-broken bits of generics. We can pretty clearly distinguish a subset of generics (i.e. generics minus raw types) that isn't broken, and try to keep it that way.

Re: Not good by Ben Evans

Gavin,

Well, so now I think we've reached the point where we're approaching the same position from slightly different sides.

List<?> is a subtype of List and that cannot be changed without breaking existing semantics. Raw types are also used more extensively than we might like, and they aren't about to go away (at least, not in Java code).

If Any is a supertype of Object, then that implies that the typing relationship between List<?> and List<Any> is going to be problematic at best.

Java's type system has problems and everyone knows that. But I disagree that it is possible to distinguish clearly between "broken" and "non-broken" generics. If nothing else, the holes caused by covariant arrays are going to remain regardless, along with the issues caused by null values.
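
The covariant array hole looks like this in current Java (a minimal sketch; the class and method names are made up for illustration):

```java
class CovariantArrayHole {
    // Returns true if storing an Integer into a String[] fails at runtime.
    static boolean storeFails() {
        Object[] objects = new String[1]; // legal: String[] is a subtype of Object[]
        try {
            objects[0] = Integer.valueOf(1); // compiles, but every array store is checked at runtime
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeFails()); // true
    }
}
```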

Re: OMG... by N T

I urge you to look up the annotation @specialized, which allows you to create custom primitive collections. The way it works under the hood is like C++ templates (it creates a new class when the type is used). This is similar to what this clone proposal looks like ;)

Re: Not good by N T

+1 this type of class hierarchy would indeed make very little sense and I would be opposed to it.

Re: Not good by Gavin King

If Any is a supertype of Object, then that implies that the typing relationship between List<?> and List<Any> is going to be problematic at best.


No, not really. List<Any> and List<int> are subtypes of List<? extends Any>, but not of List<? extends Object>. That's completely sound.

Re: Not good by Ben Evans

But in that case you don't have a single rooted type system for generic types - because List<?> and List<Any> have no typing relationship to each other, so I don't see the value of the Any type as compared to Brian's proposal.
