Neal Gafter Discusses Closures, Language Features and Optional Typing

Interview with Neal Gafter by Ryan Slobojan on Aug 11, 2008

Bio Neal Gafter is a software engineer and Java evangelist at Google. He was previously a senior staff engineer at Sun Microsystems, where he designed and implemented the Java language features in releases 1.4 through 5.0. Neal is coauthor of "Java Puzzlers: Traps, Pitfalls, and Corner Cases" (Addison Wesley, 2005).


1. Hi, my name is Ryan Slobojan and I am here with Neal Gafter. What's coming in Java 7?

Unfortunately it's not my decision; Sun Microsystems is going to be making that decision. I have my own ideas of what I think should be coming in Java 7 but it depends on a couple of big unknowns. One of them is the schedule for Java 7 -- how much time remains between now and the release for development. I am mainly interested in the language features, so I don't have a lot of insight into the rest of the platform. I am pretty sure that there will be a lot more support in the VM for dynamic languages, but as far as the rest of the libraries, I don't really know. For the language features, there are a couple of categories of changes, mostly for Java 7.

I think we are likely to have super packages, the modularity support. JSR308 I'm not really sure about, mainly because the expert group doesn't seem to be that active recently. I don't know whether that is something that Sun will want to include in JDK 7 or possibly target for JDK 8. There are a bunch of small language features that are sort of language clean-ups, that are worth doing to make things more regular in the language, in the spirit of everything that is already there, just areas where things could be made better, and I think those are likely to be included in JDK 7 no matter what the schedule is.

There are a couple of examples where generics can do better type inference, where programmers are currently forced to give type parameters where the compiler could figure them out for you. One when you are invoking a constructor to create an object, and another one when you are calling a generic method, but you are passing it as an argument to another method call or a constructor. In both of those cases, the type arguments for the class that you are constructing and the type arguments to the generic method are not currently inferred by the compiler or, if they are inferred, they are inferred in a not very clever way.
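The two inference cases described above can be sketched as follows. This is only an illustration: the class `InferenceSketch` and the helper `count` are hypothetical names, and the shorter forms use syntax that later Java releases accept (the `<>` constructor form from Java 7, the argument-position inference from Java 8), which may differ from what was being considered at the time of the interview.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InferenceSketch {
    // Hypothetical helper: takes a list of strings and returns its size.
    static int count(List<String> names) {
        return names.size();
    }

    // Case 1, explicit: the constructor's type arguments are written out again.
    static Map<String, List<Integer>> explicitConstructor() {
        return new HashMap<String, List<Integer>>();
    }

    // Case 1, inferred: the compiler works out the constructor's type arguments.
    static Map<String, List<Integer>> inferredConstructor() {
        return new HashMap<>();
    }

    // Case 2, explicit: a generic method call passed as an argument,
    // with its type argument spelled out.
    static int explicitMethodArgument() {
        return count(Collections.<String>emptyList());
    }

    // Case 2, inferred: the type argument is taken from the parameter type of count.
    static int inferredMethodArgument() {
        return count(Collections.emptyList());
    }

    public static void main(String[] args) {
        System.out.println(explicitConstructor().equals(inferredConstructor()));
        System.out.println(explicitMethodArgument() == inferredMethodArgument());
    }
}
```

In both pairs the two methods behave identically; the only difference is how much of the type information the programmer has to repeat.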

I think the language can be improved to allow the compiler to do a much better job of figuring out what you would have had to put yourself. There are a couple of things about catch clauses that can be improved; the easiest one to explain is one where you can actually catch more than one type at the same time. So rather than having to duplicate a catch clause where the only thing that is different is the type that you are catching, you can write a single catch clause that just says "I am catching this or this or this" and not have to repeat the code. And for enumerations the idea would be to overload the comparison operators: less than, greater than, less than or equal to, greater than or equal to.
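As a rough sketch of the catch-clause idea, the example below contrasts duplicated catch clauses with a single clause that names several types. The `|` syntax is the form Java eventually adopted in Java 7; the class `MultiCatchSketch` and its methods are hypothetical and exist only for illustration.

```java
class MultiCatchSketch {
    // Duplicated form: two catch clauses whose bodies are identical.
    static int parseAtDuplicated(String[] parts, int index) {
        try {
            return Integer.parseInt(parts[index]);
        } catch (NumberFormatException e) {
            return -1;
        } catch (ArrayIndexOutOfBoundsException e) {
            return -1;
        }
    }

    // Combined form: one clause that catches "this or this".
    static int parseAtCombined(String[] parts, int index) {
        try {
            return Integer.parseInt(parts[index]);
        } catch (NumberFormatException | ArrayIndexOutOfBoundsException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        String[] parts = {"7", "x"};
        System.out.println(parseAtCombined(parts, 0)); // prints 7
        System.out.println(parseAtCombined(parts, 1)); // prints -1 (not a number)
        System.out.println(parseAtCombined(parts, 5)); // prints -1 (index out of range)
    }
}
```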

Right now you have to use the compareTo() method on enums, which is kind of awkward, and because it's awkward sometimes people take the unfortunate step of computing the ordinals and comparing the ordinals which is not so great. So if we are able to compare them directly using the comparison operators, that would be much neater and easier to read because it clearly conveys the intent of what you are trying to do. So those kinds of cleanup, I think we are likely to see a bunch of those kinds of cleanup in JDK 7. As for whether or not we are able to get closures in the JDK 7, I think given the resources that seem to be available inside Sun to support work on the specification, I doubt it. Probably something like that, if we are able to do closures, is more likely to be something like JDK 8. We still don't have a JSR for closures, of course we don't have a platform JSR for JDK 7 either and it has been a couple of years.
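To make the enum comparison point concrete, here is a small sketch of the two approaches available today. The enum `Priority` and the class name are hypothetical. The operator form described above (something like `a < b` on enum values) is not valid Java, so it appears only in a comment.

```java
class EnumCompareSketch {
    // Hypothetical enum; declaration order defines the natural ordering.
    enum Priority { LOW, MEDIUM, HIGH }

    // The supported way: compareTo() on the enum values themselves.
    static boolean lessThanWithCompareTo(Priority a, Priority b) {
        return a.compareTo(b) < 0;
    }

    // The workaround people fall back on: comparing ordinals directly.
    static boolean lessThanWithOrdinals(Priority a, Priority b) {
        return a.ordinal() < b.ordinal();
    }

    // The proposed form would read roughly as "return a < b;",
    // which does not compile in Java.

    public static void main(String[] args) {
        System.out.println(lessThanWithCompareTo(Priority.LOW, Priority.HIGH)); // prints true
        System.out.println(lessThanWithOrdinals(Priority.HIGH, Priority.LOW));  // prints false
    }
}
```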

So, who knows how long it will be before JDK 7 is launched as a formal JSR and who knows what Sun will say the desired schedule is for JDK 7. If the schedule permits two or three years of development, or Sun is able to add resources to work on the specifications, or we're able to get an expert group for closures ramped up significantly before the platform JSR is launched -- in any of those cases it might be possible to do closures in JDK 7. But I am not very optimistic at this point.


2. For the sets of smaller language changes that you had mentioned, it sounds very similar to something which Joshua Bloch has mentioned in the past. Is it related to that?

Yes, actually this is a list of language features that I had worked on with Alex Buckley, and we ended up talking to Josh Bloch as the Google JCP representative about whether or not Google would be interested in submitting that as a JSR. And the answer is yes, but it hasn't formally been submitted while the conflicts in the EC over TCK license terms are worked out. But the set of issues that he was talking about is exactly the kind of small language features that I think should be considered for JDK 7.


3. And you had also mentioned super packages. Can you explain for us what super packages are?

I am actually not a very good person to answer that question, but from what I understand, let me describe the kind of problem that I've run into that this would help solve. The product I work on at Google is Google Calendar. Our codebase is big enough that it doesn't make sense to put it all in one package because otherwise we'd have one package with hundreds of source files in them, sort of all as one big blob. You can use naming conventions to group them by function, but packages are much better at that.

The idea is to make subpackages to organize things that are related, and where you need to communicate between those modules, your only choice for making something accessible from one package to another is to make it public. Well, the problem with making something public is it looks like it is part of your public API even though you may have done that just so you can organize your project that way. So one of the things I ran into is, there were some utility methods that I had put in a utility class in the Google Calendar codebase, that do some nice formatting of dates.

And someone in a completely different part of the company needed to format some dates. And he was looking around to see if he could find any code that did the right thing and he said "Oh, what's this, maybe there's something in there I could use". And he found this public class, and he found this public method, and he looked at it and said: "Hey this does exactly what I need". And he started using it, and it worked, and a couple of months went by and we decided to do some refactoring of the Google Calendar codebase and re-organize things. And it broke his product. We didn't really intend it to be a shared utility, it was intended to be a private part of Google Calendar but there is no way of expressing that, there is no way of saying "This is public for you guys, but not for you guys", or "It is public just within this realm". Super packages provide a way of expressing this hierarchical relationship between packages that are there just for implementing some other package, and whose public pieces should not be exported for use outside of some enclosing package. And I think it would have solved this problem for me, or rather it would not have allowed the problem to arise in the first place. He would not have been able to compile against that method. If we had encapsulated Google Calendar in a super package and said this utility is private, it would not have been possible for someone in a completely separate codebase to use those methods.


4. The other major thing that you had mentioned is closures. Can you describe what closures are?

Sure, closures are sort of an umbrella word to talk about a group of related features in the same way that generics is not a thing, but it describes a group of related features. The most important is anonymous functions; it's an expression that's an anonymous function. Computer science theory folks would call it a lambda expression and it is actually not a new idea at all, it dates back to the theoretical work from the 1930s and the lambda calculus and the most important early implementations were in the Scheme programming language and in Smalltalk.

And we have closures or lambda expressions or anonymous functions in many many programming languages today, almost all of the dynamic languages, all of the functional programming languages, pretty much almost every language that has been introduced in the past ten years, for example, has something like closures. So the idea is, it is a function, it is an expression that designates a function, it identifies the parameters, and you say here is the code of the function, it could be statements or it could be just a result expression.

And the reason it's called a closure is it can also use variables from the enclosing scope and those are referenced inside the closure, and you save the closure for later and when you use the closure later it refers to that variable. So it encloses the state of that variable even if they are local variables. Because they are in scope where the closure is written you can use them from the closure even much later. And usually languages that support closures have to have support of a garbage collector. And Java does have support of a garbage collector, so it's a natural thing to do.
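The capture behavior described above can be sketched with the lambda syntax that Java later adopted in Java 8. That syntax postdates this interview and is not the syntax of any proposal discussed here; the class and method names are hypothetical. One assumption worth flagging: Java lambdas can only capture local variables that are effectively final, so the counter below keeps its mutable state in a one-element array.

```java
import java.util.function.IntSupplier;
import java.util.function.IntUnaryOperator;

class CaptureSketch {
    // Returns a function that still refers to `amount`
    // after makeAdder itself has returned.
    static IntUnaryOperator makeAdder(int amount) {
        return x -> x + amount;
    }

    // Returns a function that keeps state between calls. The array is the
    // captured variable; its single element holds the running count.
    static IntSupplier makeCounter() {
        int[] count = {0};
        return () -> ++count[0];
    }

    public static void main(String[] args) {
        IntUnaryOperator addFive = makeAdder(5);
        System.out.println(addFive.applyAsInt(10)); // prints 15

        IntSupplier counter = makeCounter();
        System.out.println(counter.getAsInt()); // prints 1
        System.out.println(counter.getAsInt()); // prints 2
    }
}
```

The captured values stay reachable for as long as the returned function does, which is where the garbage collector comes in.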


5. There has been some debate in the community around the assorted closures proposals that have been put forward. What are the differences between the different proposals?

There are three proposals that people have talked about much recently. One of them is Concise Instance Creation Expressions. It's really a shortening of the syntax of anonymous inner classes. I wouldn't consider it closures, because the thing that you are creating is not a function, and it's not lexically scoped, you can't use variables from the enclosing scope unless they obey a certain set of restrictions, and that's the only part of the scope that they really close over properly. You end up inheriting names from elsewhere.
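The remark about inheriting names can be illustrated with an ordinary anonymous inner class, which is the construct that proposal abbreviates. The sketch below is hypothetical and contrasts it with a Java 8 lambda, used here only as a stand-in for a lexically scoped closure; that syntax is not part of any proposal discussed in this interview.

```java
import java.util.function.Supplier;

class ScopeSketch {
    @Override
    public String toString() {
        return "enclosing";
    }

    // Inside the anonymous class, the unqualified name toString() resolves to
    // the method the anonymous class inherits from Object, not the one above.
    Supplier<String> viaAnonymousClass() {
        return new Supplier<String>() {
            @Override
            public String get() {
                return toString();
            }
        };
    }

    // Inside the lambda, names are resolved in the enclosing scope,
    // so toString() is ScopeSketch.toString().
    Supplier<String> viaLambda() {
        return () -> toString();
    }

    public static void main(String[] args) {
        ScopeSketch sketch = new ScopeSketch();
        System.out.println(sketch.viaLambda().get());         // prints enclosing
        System.out.println(sketch.viaAnonymousClass().get()); // prints the anonymous class's default toString()
    }
}
```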

And this is one of the key features of closures, is that they are lexically scoped, that the names are resolved in the enclosing scope. Another one is… Stephen Colebourne has a proposal called First-Class Methods, and actually over time his proposal and my proposal have gotten very close to each other in a lot of ways. The major differences at this point are the precise syntax that you use to write a closure, the syntax of the proposal I am working on is much closer to the syntax that is used in many other languages. Stephen has a syntax that he made up for Java, and that he feels better matches the syntactic history of Java.

I don't happen to agree, but there is a sense in which the syntax is not that important, I mean it certainly is important in the sense that people will be faced with it every day, it needs to be something that they are able to read, but the semantics are also very important. And what I mean by that is it's important that you have a language feature that allows you to do certain things, whether you write it using square brackets or angle brackets is not as important as having the ability to use the language in that way. But it is important that it is readable. Another difference between Stephen's proposal and the proposal I have been working on is the meaning of return within a closure.

In the proposal I have been working on it is lexically scoped, just like everything else is lexically scoped in a closure. In Stephen's he has most of the things lexically scoped except for the return statement which isn't lexically scoped. And I believe that actually causes a problem of being able to use closures to do certain kinds of code refactoring. There was a vote about it recently, and there are actually a lot of people who think the time is not ripe to do closures immediately, but of the people that voted, more people voted for the closures proposal I have been working on than any of the alternatives. However, voting is not the right way to design a language; of all of the ways that you can do it, it's probably the worst way.
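The refactoring concern about return can be sketched with Java 8 lambdas, which happen to treat return as local to the lambda. This is only an analogy under that assumption: the code is hypothetical, it is not written in the syntax of either proposal, and it may not match the exact semantics of either one.

```java
import java.util.Arrays;
import java.util.List;

class ReturnSketch {
    // Original loop: return leaves the whole method at the first negative number.
    static int sumUntilNegativeLoop(List<Integer> nums) {
        int total = 0;
        for (int n : nums) {
            if (n < 0) {
                return total;
            }
            total += n;
        }
        return total;
    }

    // Naive refactoring of the loop body into a lambda: return now leaves only
    // the lambda, so iteration continues and later elements are still added.
    static int sumUntilNegativeLambda(List<Integer> nums) {
        int[] total = {0};
        nums.forEach(n -> {
            if (n < 0) {
                return;
            }
            total[0] += n;
        });
        return total[0];
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(1, 2, -1, 4);
        System.out.println(sumUntilNegativeLoop(nums));   // prints 3
        System.out.println(sumUntilNegativeLambda(nums)); // prints 7
    }
}
```

Moving the loop body into a function changed what the code computes, because return no longer refers to the enclosing method.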

The collective consensus on the set of things that people want and don't want, doesn't necessarily result in a consistent language. While every person might vote in a way that could result in a consistent language, the collection of votes doesn't necessarily result in something that's consistent. And even worse, very often people are voting based on imperfect information or an imperfect understanding of what the alternatives really mean and what the impact would really be to a programmer. So, I think most programming languages have one person or a small number of people that form a core team that guides the design of the language, and the evolution of the language. Stroustrup, for example, for C++ and Anders for C#, and I think that Alex Buckley and James Gosling could play that kind of role for Java.

But currently, there is so much confusion around what might happen in JDK 7, and people are blogging their language ideas… I mentioned three closure proposals, there are actually at least a dozen, some of them don't really make sense at all and some of them make more sense than others, and many of them are just minor variations on each other, and I think it would be really nice, in particular for syntactic issues, to have an expert group look at the syntactic alternatives and decide which syntax works best. I actually don't think either the syntax I currently have or the syntax that Stephen currently has is likely to be what we end up with for closures. Maybe something like Groovy but I don't know. I'm sure that this is something they will be capable of coming to a consensus about. There is more disagreement on whether or not we should do something in the language than about precisely what characters we should use in the syntax.


6. One of the things that you have discussed in the past is the idea of optional typing. Can you describe that in more detail?

Well, an optional typing system is… This is something you would have as… You start with a dynamically typed language, and what you add is you add annotations, optional annotations, type annotations, to the language. And you have a type checker that produces not error messages but warnings that tell you when it's able to infer that you are doing something that violates the rules. And the reason for having something like this is not because you can use the type information to optimize the program, because these type annotations might not hold at runtime, you are expressing things for the purpose of getting assistance in checking properties that you are declaring about the program. So having them be optional means that you can improve your type system over time.

You can improve the things that you can express in these annotations over time, without really changing the language. You simply add more annotations, or you improve your ability to check the annotations over time and strengthen the type checking, but the behavior of the program has not changed at all. One of the advantages of a type system like this is any static type system has limitations. There are certain things that are correct and provably correct, but there is no way of expressing them in the type system, so you either do something like use Object in Java, and put in lots of casts in your code, or you end up duplicating an API over and over and over again, for different types.

There are a lot of things you can do. In a language with dynamic typing you simply solve the problem once, and if your static type system or if your annotations are not powerful enough to express the properties of that, then you don't and maybe you strive to improve the quality of what you can express in these type annotations. But they don't get in the way of programming, they don't prevent you from expressing something where you know how to solve the problem algorithmically, but there is no way of shoehorning a description of the solution into the type system so that you can prove to the compiler that it is correct. And optionally-typed systems sort of give you a mix of the productivity of a dynamic language, while giving you the additional static checking that you get in a statically-typed language.


7. And are you able to make the same kinds of inferences that you can with a statically-typed language when you have the compiler doing optimization passes? It can make certain assumptions. Is it still possible to do such a thing with an optionally typed language or are you restricted to doing that more at runtime, for instance?

Well, in a dynamic language you want your runtime to be exploiting type information at runtime anyway, whether those type annotations are present in the source or not. If for example you declare that something is a string in a type annotation, your VM should take advantage of the fact that it actually is a string whether or not you wrote that annotation. So, you shouldn't need those annotations to affect the runtime behavior of your program if you have a VM, a good VM. And you can't always… Especially because you can't fit everything into every type system, there will typically be parts of your program that are un-typed, which means there are parts of the system that are not checked, which means that not every type annotation will actually be true all the time.

And if you try and depend on those, of course you would be undermining the correctness of your program because you are lying to the VM. So I think that exploiting the type information at runtime is something the VM should do, ignoring the annotations, and telling you about the correctness of your program based on the annotations, based on type checking, is something that should be done based on the static annotations that you put in your program and there is not a lot of reasons for those two to mix.


8. What are your thoughts on what the next language will look like?

I don't think that there will be a single next language in the next few years. I think that we will see a lot of experimentation in the next few years but I am not sure the time is ripe for a single obvious winner to emerge. Or if there is, I don't see what it is. Personally, as far as a successor to the space that Java fills, I think Scala--. There are a lot of really good ideas in Scala, but I think there are too many ideas in Scala. I don't know exactly what that would look like though, and I am not sure that there is someone working on that right now. And in the dynamic language space I actually don't know. Ruby seems to be doing really well, Groovy is on an upward trend, but it's on an upward trend from a very low position, so it is hard to tell where that will go if anywhere. But I don't think I have any particular insight. But it is nice to be working as a software engineer at a time when so many options are being explored. It certainly makes things interesting.


9. One of the comments was that a language can be successful so long as it resists the temptation to innovate. What are your thoughts on that?

Well… Let me put it this way: when Java was first released upon the world, most of the people who looked at it seriously were C and some C++ programmers. And it was seen as a huge innovation, and I don't necessarily mean that in a positive sense. People were saying: "Are you serious that you expect me to use garbage collection in a production context? And a VM instead of a compiler?" There were a lot of things that were tried-and-true in the sense of having been around for a long time and proven themselves from a software engineering perspective, but simply were not widely deployed, things that have been in the universities, but not out in the production world. In retrospect we can say that the things that Java added to C were not new, were not inventions, were not brand-new ideas that hadn't been around before. They were ideas that had proven themselves already but had not been widely deployed. There are a lot of ideas that have proven themselves but have not been widely deployed, that are not in Java, in fact there are ideas that have proven themselves and been widely deployed and yet are not in Java. We're probably more likely to be successful adding things that are not being added to a language for the first time. So I would largely agree with his comment.
