Scala, Erlang, F# Creators Discuss Functional Languages

Interview with Joe Armstrong, Martin Odersky, Don Syme by Sadek Drobi on Nov 18, 2010 | 32:36

Bio Martin Odersky, creator of the Scala programming language, is a professor of programming methods at the EPFL. Joe Armstrong is one of the inventors of Erlang. While at Ericsson he was part of the team who designed and implemented the first version of Erlang. Don Syme, a principal researcher at Microsoft Research, Cambridge, U.K., is the designer and architect of the F# programming language.

The Erlang Factory is an event that focuses on Erlang - the computer language that was designed to support distributed, fault-tolerant, soft-realtime applications with requirements for high availability and high concurrency. The main part of the Factory is the conference - a two-day collection of focused subject tracks with an enormous opportunity to meet the best minds in Erlang and network with experts in all its uses and applications.

   

1. I was once travelling with Joe Armstrong and he told me that Erlang is the only object-oriented programming language. Can you tell us a little bit more about the conceptual model of it?

Joe Armstrong: Actually it’s a kind of 180 degree turn because I wrote a blog article that said "Why object-oriented programming is silly" or "Why it sucks". I wrote that years ago and I sort of believed that for years. Then, my thesis supervisor, Seif Haridi, stopped me one day and he said "You’re wrong! Erlang is object oriented!" and I said "No, it’s not!" and he said "Yes, it is! It’s more object-oriented than any other programming language." And I thought "Why is he saying that?" He said "What’s object oriented?" Well, we have to have encapsulation, so Erlang has got strong isolation or it tries to have strong isolation, tries to isolate computations and to me that’s the most important thing. If we’re not isolated, I can write a crack program and your program can crash my program, so it doesn’t matter.

You have to protect people from each other. You need strong isolation and you need polymorphism, you need polymorphic messages because otherwise you can’t program. Everybody’s got to have a "print myself" method or something like that. That makes programming easy. The rest, the classes and the methods, is just how you organize your program; that’s abstract data types and things. Alan Kay said that the big thing about object-oriented programming is the messaging, it’s not about the objects, it’s not about the classes, and he said "Unfortunately people picked up on the minor things, which is the objects and classes, and made a religion of it and they forgot all about the messaging."

Don Syme: Above all. And inheritance very much became an absolutely defining feature.

Joe Armstrong: Then, the nice thing about the messaging is, I heard at QCon Dan Ingalls say it decouples the sender from the receiver. So you get these isolation properties, if you send me a message, I don’t really need to know it came from you, it could have come from anybody. So, I don’t inherit all the kind of mess that you’ve got and we can decouple things.
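Editor's sketch: the decoupling described here can be illustrated in a few lines of Python. This is only an analogy and not how Erlang processes are implemented; the class name `CounterProcess` and the message names are hypothetical. The state is private, the only way in is a message, and the receiver never learns who sent it, only where to put a reply.

```python
import queue
import threading


class CounterProcess:
    """A hypothetical isolated 'process': private state plus a mailbox."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._count = 0  # private state, only touched by the worker thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message, reply_to=None):
        # The receiver sees the message and an optional reply queue,
        # never the identity of the sender.
        self._inbox.put((message, reply_to))

    def _run(self):
        while True:
            message, reply_to = self._inbox.get()
            if message == "stop":
                break
            if message == "increment":
                self._count += 1
            elif message == "get" and reply_to is not None:
                reply_to.put(self._count)
            # Unknown messages are dropped instead of crashing the receiver.


def demo():
    counter = CounterProcess()
    replies = queue.Queue()
    counter.send("increment")
    counter.send("increment")
    counter.send("no such message")
    counter.send("get", reply_to=replies)
    result = replies.get(timeout=2)
    counter.send("stop")
    return result
```

Calling `demo()` sends two increments, one unknown message and one query; because the mailbox is a FIFO queue with a single consumer, the reply reflects both increments.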

Don Syme: A 180 degree turnaround reminds me a little bit of turnarounds in other settings, like in Haskell. Sometimes people who talk about Haskell describe Haskell as the world’s best imperative language because it has imperative computations that are sort of first class objects, I/O values, which has now been taken into other languages as well, such as C# and Scala and others. But it’s interesting. You can obviously do imperative programming with Haskell and you don’t think about that necessarily, but it’s got some nice properties for doing it.

Joe Armstrong: I think our languages are different to the extent that the big thing about Erlang is it was designed for doing fault tolerance and the big thing is the messaging and the concurrency. The big thing is not that it happens to be functional, it’s not that it’s got a dynamic type system. It’s black boxes - you go in, you do something, you come out, I don’t care how you do it.

Don Syme: I think that functional programming is an essential base for what you’ve got.

Joe Armstrong: For certain ways of doing non-locking computation, but you can do them other ways as well. It’s the shared state you need to avoid and not that it’s functional or non-functional.

Martin Odersky: One other aspect of objects is in the Simula tradition which led to Beta, another Scandinavian language, which is this aspect of nesting. For me the reason for object oriented programming is that at the end of the day you have to put your stuff somewhere and it’s just impossible to put everything on the top level. If you put this stuff somewhere, then you might as well give these things nice properties and certainly in the line of Simula which led to Beta they did that.

For me that’s it and in that sense, Erlang is certainly object oriented in the sense that the other aspect of encapsulation is that you have an interface and then you have an implementation which is decoupled. It just depends how you want to organize your interfaces. In Erlang your interface is such a very thin wire with messages in, which can be anything and in other languages you essentially define what kind of methods you have.

Joe Armstrong: A consequence of putting things inside things or nested boxes is that the nesting can become incredibly deep. I was looking at Smalltalk and Erlang - the Erlang distribution’s got about 2400 modules and Squeak Smalltalk has got 2600 classes or something like that, the same order of magnitude. If you are doing an Erlang system I could tell you which of those modules are essential. I mean the core of it; there is a handful of 4-5 modules and with those you can just make a complete Erlang system plus the language. The rest you don’t need. In these modules, in the code, there are like 20-30 functions.

Half of them are unnecessary, you don’t need them. I can make that statement about Erlang. If you look at Smalltalk, I just cannot make that statement. I think "2800 classes? - What the heck is this? How can anybody manage that complexity?" So programmers are spending all their time dithering around. Putting things inside things will result in something that ultimately will be extremely complex. I think we need a sort of flat black boxes with well-defined interfaces and a glue language. We’ve got to stop making programming languages that describe machine instructions.

We have to make programming languages that describe protocols and connections, because we have to limit the complexity of what can be in a black box. Because if we don’t, it will explode and it has exploded everywhere and that’s what’s wrong. I want a philosophy that will drive complexity out of black boxes.

Martin Odersky: I’m not quite sure I buy that because I think it confuses me. You could have systems that at the outside have a very simple interface but at the inside they are incredibly complicated and deeply nested and things like that. In fact, that’s essentially often the things that provide you in the end the most value or the thing you can’t copy because somebody really thought deeply and gave them amazing capabilities. I agree that often it’s misused and you have a huge ball of wires hanging out everywhere.

Joe Armstrong: I just don’t think we figured out how to make components connect. I mean the hardware people seem to have done it. You have chips and you buy them and stick them on a printed circuit board and wire them up. And that’s because concurrency works, there is message passing and that’s why it works.

Don Syme: In the history of object oriented programming, implementation inheritance, or at least inheritance of lazy partial implementations or mixing in of partial implementations, has been a fairly dominant thing. Now, Erlang doesn’t have that and you would always delegate. You would always create processes which delegate onto other processes, and F# strongly deemphasizes implementation inheritance. You can do it because it’s there for .NET, but really one of the key things we did with F# object oriented programming was to turn it so that you didn’t encounter implementation inheritance until the very last concept in your learning about object orientation. And people do use it, but relatively infrequently. In Scala I think you’ve taken the opposite direction.

Martin Odersky: Yes. It’s very much to components from the ground up and a very powerful component combinator.

Don Syme: I’m a little uneasy with that decision. The way I try and describe this is that one of the key things about functional programming or delegation (as you are using processes delegating to other processes) is you can take complex components and put them together and build something very simple. As a result, the bigger thing built out of components is actually substantially simpler than what’s lying behind. I’m sure you can do this in Scala as well; it’s more about methodology and orientation.

The key thing that disturbs me about implementation inheritance is you see people taking what is already a complex base type and making something which logically is actually simpler. But because of using implementation inheritance, it actually appears on the outside as being more complex. Maybe it’s a misuse of implementation inheritance, but by focusing on it as a major technique, it’s inevitable that people just create more and more complex things using the inheritance and they are mixing things.

Martin Odersky: I’ll give you a counterexample of that. That’s something I’ve been involved very much in and that’s the Scala collections. One thing that you don’t see in other languages is that really collections are all-encompassing. A string is a collection, a bit set is a collection, a vector is a collection, a list is a collection, a set or map is a collection, everything is a collection and they all have exactly the same operations. Overall you maybe have 20 different implementations, maybe more, that are specialized for different things and of course it would be unmanageable for users to know them all.

But they have exactly the same protocols, so you talk to them exactly the same. They have exactly analogous types, it gets complicated when you talk about static typing. If you said "No, we do not want implementation inheritance. We will rewrite 100 operations and 20 collections and make them look all exactly the same." I challenge you to find a programmer that can do that. You will not be able to do it without implementation inheritance. It’s a crucial ingredient for something like that.
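Editor's sketch: the argument can be made concrete with a small Python analogy. It is not the Scala collections design; the names `Collection`, `Text`, `BitSet` and `build` are hypothetical. The point it illustrates is that operations written once in a base class are inherited by every implementation, and each implementation only supplies how to iterate and how to build itself, so the result of an operation has the same kind as its input.

```python
from abc import ABC, abstractmethod


class Collection(ABC):
    """Hypothetical base class: operations are written once and inherited."""

    @abstractmethod
    def __iter__(self):
        ...

    @classmethod
    @abstractmethod
    def build(cls, items):
        """Construct a collection of this same kind from an iterable."""

    def map(self, f):
        return self.build(f(x) for x in self)

    def filter(self, predicate):
        return self.build(x for x in self if predicate(x))

    def size(self):
        return sum(1 for _ in self)


class Text(Collection):
    """A string viewed as a collection of characters."""

    def __init__(self, value):
        self._value = value

    def __iter__(self):
        return iter(self._value)

    @classmethod
    def build(cls, items):
        return cls("".join(items))

    def __str__(self):
        return self._value


class BitSet(Collection):
    """A set of small non-negative integers stored in one int."""

    def __init__(self, bits=0):
        self._bits = bits

    def __iter__(self):
        index, bits = 0, self._bits
        while bits:
            if bits & 1:
                yield index
            bits >>= 1
            index += 1

    @classmethod
    def build(cls, items):
        bits = 0
        for item in items:
            bits |= 1 << item
        return cls(bits)
```

Here `Text("hello").filter(...)` yields another `Text` and `BitSet.build([1, 3, 5]).map(...)` yields another `BitSet`, although neither subclass defines `map`, `filter` or `size` itself.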

Joe Armstrong: Suppose you have a lot of Scala programmers and they write a lot of Scala applications and they’re living in this Scala world, everything is fine and beautiful. And you got a lot of Erlang programmers and they’re writing their Erlang applications and you got a lot of F# people and then we say "You wrote this wonderful F# stuff, you wrote this great Scala. I want to talk to your application. How should I talk to it? - Oh, we cannot do that. That’s a complete mess. It’s all wrong."

Then we say "How should we talk to each other? Let’s agree on how we talk to each other. We can send nil in a message or we can send enumerated types. We’ll agree on that. We’ll send integers to each other. Do we agree?" I mean Erlang could send integers to Smalltalk because we both understand bignums. But maybe I send Erlang integers to C, and it says "Sorry, we don’t actually do integers. We send 32 bits." And we can’t actually agree on sending integers to each other and then we have a great argument in the standardization committee.

If you start arguing about what’s going on inside my language and that it’s better than your language and we use inheritance and we got that, you’d stop arguing about how we talk to each other and the more important thing is that we talk to each other.
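Editor's sketch: the integer mismatch mentioned above is easy to reproduce. Python integers are arbitrary precision, like Erlang's, while a fixed 32-bit field cannot carry large values. The length-prefixed format below is a hypothetical illustration and not a real wire protocol.

```python
import struct


def encode_int32(n):
    """Fixed-width encoding: raises struct.error outside the signed 32-bit range."""
    return struct.pack(">i", n)


def encode_bignum(n):
    """Hypothetical encoding: 4-byte length prefix, then two's-complement bytes."""
    length = (n.bit_length() + 8) // 8  # enough bytes to hold the sign bit too
    body = n.to_bytes(length, "big", signed=True)
    return struct.pack(">I", length) + body


def decode_bignum(data):
    """Inverse of encode_bignum."""
    (length,) = struct.unpack(">I", data[:4])
    return int.from_bytes(data[4:4 + length], "big", signed=True)
```

A value such as `2**40` round-trips through `encode_bignum` and `decode_bignum`, but `encode_int32(2**40)` raises `struct.error`, which is the "we send 32 bits" answer in code form.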

Don Syme: That’s right.

Joe Armstrong: We don’t have any standards whereby we all agree and we can all talk to each other.

Don Syme: I think by focusing on delegation and by hiding complexity and by creating simple objects on the outside that fits very well in the world of modern standards, on the web service you’re providing or perhaps it’s in the context of .NET that you are providing some other API boundary. Then it is a simple process of taking simple objects, complex things behind and providing a simple overall API to those things. Then you have got some hope of having interoperability between them all.

Joe Armstrong: That means we can’t send higher-order functions or anything. We can only send ground terms and data types. Then it becomes not as nice. We should be talking about how we send higher-order functions that are independent of programming languages and how we send richer data types. We all need a web service description language in XML to talk to each other and everybody implements that in their own language. That’s crazy! It’s really ugly.

Don Syme: It’s not necessarily so crazy. Of course it’s crazy if users are writing that as their interface descriptions. But for instance the process of implementing web services in a language like C# is really straightforward and simple, right? I mean you could just take a class type and say "We want this exposed as various REST APIs. This one is called FooBar Z" and it’s there. It’s exposed. Then I’m sure there are other things in Erlang and Scala. I don’t think we should necessarily be so resistant to these industry standards for the boundaries, because it is actually more important to have standards and interoperability and have our languages implemented.

Joe Armstrong: But we’re not very proactive there. We thought the standards come and we got to implement this stuff. We are proactive in programming languages, not proactive in the protocols.

Don Syme: That’s in the .NET case, just on the local component boundary for .NET components, that obviously came heavily from influence by the Java and object-oriented design standards. We are more proactive in getting things like generics.

Joe Armstrong: The thing that worries me is I go to big conferences like QCon or something like that, JAOO, and my first observation, if you view it from a helicopter view, is the large number of people walking round here talking. I don’t understand what they are talking about and if you listen to what they talk about they are going ".NET, .NET, .NET." And then there is another group of 400 people wandering around there and they go "JVM, JVM, JVM." There are 2-3 people going "Haskell, Erlang" like this. These groups of people just can’t talk to each other and the interoperability between them is lousy.

Martin Odersky: There are common standards of interoperability.

Joe Armstrong: But ask all JVM people if there are interoperability standards, they’ll say "No we can only do it here, .NET we can only do it here".

Don Syme: If you ask other people they’ll be talking about REST APIs, they’ll be talking about service oriented architecture, they’ll be talking about WSDL and API descriptions and these are good standard interface descriptions.

Joe Armstrong: If you were to have a type system to talk about that you’d come up with something much better.

Don Syme: They are sufficient, they are not necessarily everything that you want.

Joe Armstrong: If you look at the XML DTDs, more or less regular expressions, they were ok and then they replaced it by schemas which are completely unreadable.

   

2. For example, there is an implementation of Erlang on the JVM, Erjang, and there are other languages that have been implemented on the VM. Isn’t it one way to do interoperability between languages? The same thing that Scala does with Java and other languages on the JVM? The same thing that F# does?

Joe Armstrong: Yes, if you accept the different non-functional properties that you are going to get. Your garbage collection times would be different and so on.

Don Syme: We have interoperability at the local code level. We get that in spades with Scala and F# but these days we also take that so much for granted that might look different from Erlang context. But by taking so much for granted I’m more concerned about interoperability at the web and data programming standards. I’m not sure about your view of interoperability outside the Java context. I mean you run on .NET as well as in the other contexts.

Martin Odersky: You are right, for the interoperability of big systems. But today we’re really stuck with the protocols, because the problem is even if you have 30 people like us, we couldn’t establish a new protocol. It’s just impossible.

Joe Armstrong: Then you have an even worse problem. When you come to build an operating system, you just fill it full of thousands of things which nobody knows what they are, in various combinations that are just talking to each other, and nobody knows how they work and it’s just a complete mess.

Don Syme: I can see you are in this bind. Do you embrace these protocols as a part of a core Erlang experience or do you resist it and keep a sort of clean distributed? That’s what seems to be more because Erlang does have a definite view on distributed programming, which works extremely well, Erlang to Erlang distributed programming. That’s great, but as people move to the applied distributed programming, they hit boundaries where they actually want to no longer be homogenous, they want to be heterogeneous and that is a big question for Erlang at those boundaries.

Joe Armstrong: The Erlang model says everything from the external world has to pretend it’s Erlang. At this conference we’ve got people from Riak and CouchDB and they are building their Erlang applications when they are having coffee. When we have to talk to each other we happen to be on the same platform and we don’t have to go up through the protocol stack over and back. We can just go directly and we’re interoperable.

Don Syme: It’s a lot easier in homogenous situations.

Martin Odersky: It’s more efficient, too than others.

Joe Armstrong: I think I gave some lectures at QCon. I think there was a big mistake made in something like RFC24 or something. It was the Simple Mail Transfer Protocol, where the transport protocol was defined in text with an ABNF grammar to define it. Then people sort of tended to reproduce earlier things. They cut and pasted an old spec and changed it a bit. If the original RFCs had been defined using s-expressions, then you wouldn’t have something like 4500 different grammars between these components that we use. It is a dreadful mess.

Don Syme: We are sort of talking about the sheer complexity of these s-expressions.

Joe Armstrong: Yes. Write them all with s-expressions or Lisp or whatever. It doesn’t matter, as long as you choose something that’s reasonably expressive.
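Editor's sketch: part of the appeal of s-expressions is that one small reader covers every message shape, so no per-protocol grammar is needed. The Python reader below is a minimal illustration that handles only parentheses, integers and bare symbols; the `mail` message used in the example is made up and is not taken from any RFC.

```python
def tokenize(text):
    """Split an s-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front of the list and return one expression."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise ValueError("missing closing parenthesis")
        tokens.pop(0)  # discard the ")"
        return items
    if token == ")":
        raise ValueError("unexpected closing parenthesis")
    try:
        return int(token)
    except ValueError:
        return token  # anything that is not an integer is kept as a symbol


def read(text):
    """Parse exactly one s-expression from text."""
    tokens = tokenize(text)
    expression = parse(tokens)
    if tokens:
        raise ValueError("trailing input after expression")
    return expression
```

With this reader, `read("(mail (from alice) (to bob) (size 42))")` produces nested Python lists, and a different message type needs no new parsing code, only a different interpretation of the resulting tree.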

Don Syme: I think I must have the good fortune of working in the good context of .NET or I’m sure Java as well, where teams of programmers have made all those connections work. I know, I’ve used web services every day, I do use them all the time and I’ve never had a single bug in the actual implementation at the protocol stack kind of level.

Joe Armstrong: But then the whole world’s not going to use .NET, you see.

Don Syme: No, these are WSDL web services and many are using Java and many apply to other languages. I think there is good interoperability, but it may not get the massive efficiency that you are looking for in a homogenous situation.

Joe Armstrong: It goes into networked components, it doesn’t go into components on one platform. The organization or operating system is organized that way. You can’t view it as a collection of components you send messages to, so it’s the organization mostly.

Don Syme: Some components, even on local machines are going definitely in the direction where there we’ve been moving user interfaces towards just effectively having the same split you have on the web where you have HTML JavaScript as the display language and they call background components providing that they are just using that on the local machine. This is of interest, it’s a trend, so it is occurring more.

Joe Armstrong: I think programming had to stay the same for 25 years, because the architecture hadn’t changed and then the multi-cores and the permanent connection of people to the internet means distributed programming. Distributed programming, when you are permanently connected to the internet is as the social networking, the file sharing networks are a reaction to that and it’s a reaction that’s only four years old, so we don’t know where this is going at the moment.

Don Syme: Absolutely. This is a total game change. We were talking at the Microsoft Research just the other day about what are the big changes ahead. I think it’s the fact that we can assume in the programming context that the internet is there, or in many reasonable programming contexts you can assume it’s there.

Joe Armstrong: Virtual page swapping stops at your hard disk because you weren’t connected to the network, but when you are permanently connected to the network there is no reason why swapping should stop at your hard disk. You just swap into the cloud and then you can just smash your computer to bits. You can go to anybody else’s computer and you just swap back in and you’ve lost the last page that was dirty or something. Why doesn’t swapping go all the way into the cloud? We can’t do reliable storage on that machine because if I smash it you’re broken, but you can do more reliable storage in the cloud. I think that’s the need for programming languages which can express that in a simple way. There was no need for that five years ago.

Martin Odersky: Exactly, I agree.

   

3. Moving to another topic, you start with a language and you try to make it perfect and this is the first release. Then you get more demand for more features. What is it like to grow a language and add features later on, either by syntax or by libraries? You have three different experiences.

Joe Armstrong: Actually it’s quite fun because, fortunately, we didn’t have a lot of people using the language, so we could change it all the time. Erlang must have had 3-4 years of just me and Robert [Virding, co-creator of Erlang] changing stuff and then it had users. We just put stuff in and if they used it, they used it, if they didn’t use it we took it out again. We could roll a new system every four days or so and do that for 3-4 years and then freeze the decisions.

Don Syme: I’m sure we’ll all look back and think that phase is a particularly enjoyable time - growing a language.

Joe Armstrong: Now, in order to change Erlang we have committees talk about it, you can’t just do it.

Martin Odersky: There is no worse curse than having too many users too quickly, because you will be locked into all the wrong decisions. It gets more and more difficult to actually redesign things in ways that are even slightly incompatible. Sometimes we still take the liberty to do it but it becomes more and more difficult, and then you have the problem that it’s always so much easier to grow than to take out things. So, you have to battle that again, that you don’t want to become another huge language where just every five years you have a new feature set in there, because that would push the language over the edge also fairly soon. So, that’s difficult, yes.

Joe Armstrong: You talked about adding a tail call to the JVM and something I was thinking if that would have been in the early days of the JVM, somebody could have added it in a day. Now it’s incredibly difficult.

Don Syme: You have to get things into the base platform early and that’s absolutely important.

Martin Odersky: Because the security models rely on that being there.

   

4. What were your solutions to their problems? If there is a problem, once the language is popular, you can’t change much. But what did you do for example, Martin in Scala?

Martin Odersky: We also grew fairly slowly. We were out in 2003 basically just us as users and like Joe, we had maybe three years before it grew really popular and we couldn’t do much anymore. There was a big redesign from version 1.0 to 2.0 and it was fortunately only after we redesigned that the adoption picked up, so we had essentially some migration. I think there was actually a migration mode in the compiler that would detect incompatibilities and give hints how to rewrite stuff. But since then, we’re very much constrained, yes.

Theoretically, we’re in a better position because we pushed so much into the libraries, but practically it comes down to almost the same thing, because once you have a subset of core libraries they become in essence part of the language and you can’t change them anymore. In the libraries maybe you can get away a bit easier with deprecating things, having a migration pathway for some time, where you have the old and the new solution next to each other and then in some future version you cut off the old ones. But it’s fundamentally almost as difficult as changing a language.

   

5. The actors library has been re-implemented by another open source community, right?

Martin Odersky: Yes, twice. One is Akka in Sweden and the other is Lift.

   

6. Don, the same thing I guess you did with computation expressions.

Don Syme: The whole experience of growing a language first of all feels very similar to the one described. Certainly, we learned a lot from .NET about how to evolve library sets in early versions, to be able to actually communicate to users why something is changing and give them one or two versions, to make adoptions. I mean up to now, up to 2010, to F# 2.0 we have made changes and that was definitely a stabilization point and it’s sort of backwards compatibility from this point on. We had a good landing at that point.

We gave ourselves two years (from 2008 to 2010) to get this thing really sorted and stable and that’s been great. I’m at the moment actually just fantastically enjoying our sense of stability. The fact that all the F# samples we’re writing all work and they always continue to keep working. The feeling that nothing is going to change at the base level is really nice, so we got to a good point. It’s been a very exciting time growing the language.

Joe Armstrong: The sort of early users didn’t mind. Up to even when you had hundreds of users they didn’t mind if you just changed everything the next day, because at that stage they hadn’t written millions of lines of code. I don’t think even the people who are late adopters actually mind if you change the language, but they do mind if they’ve written a million lines of code because they’ve tested it all. It’s the actual testing of the code that’s the problem, it’s not the fact you are making a change. Because if you make a change that’s better, they want to adopt it. But if they’ve tested it all, frozen it into something, they don’t want to retest it.

Don Syme: We will certainly have very strict baseline on backwards compatibility from this point, so we won’t be changing it. I’m happy with that.

Joe Armstrong: When you start a new language you don’t think lots of people are going to use it, so you don’t really design into it the mechanisms for changing it.

Don Syme: We did right from the start.

Joe Armstrong: But you just said it’s backwards compatible, so you don’t have things to change.

Don Syme: We don’t have now. Right at the start we designed in the obsolete attributes in the library, and user code can do this too: as they design their libraries they can start marking things as obsolete with a nice message, so the third party user of the code can say "I shouldn’t use that anymore".
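Editor's sketch: the idea of marking library code as obsolete with a message for callers can be imitated in Python with a decorator and the standard `warnings` module. This is an analogy in another language, not F# or .NET syntax; the decorator name `obsolete` and the functions `area` and `area_v2` are hypothetical.

```python
import functools
import warnings


def obsolete(message):
    """Mark a function as obsolete; callers get a warning with the given message."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is obsolete: {message}",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


def area_v2(width, height):
    """The replacement API."""
    return width * height


@obsolete("use area_v2 instead")
def area(width, height):
    """The old API keeps working while users migrate."""
    return area_v2(width, height)
```

The old function still returns the same result, so existing code keeps running for a version or two, while every call reports where the caller should move to.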

Joe Armstrong: What I’d like to see, and we don’t do, is integrate the language with the revision control system. If you rename a variable and submit it to the revision control system, it says "This is a completely different version of a completely different program", but you just renamed the variable. There is no change in semantics at all; you did it just to make it a little bit more readable. The same happens if you correct a spelling mistake in a comment. The revision control system is sort of generic and applies to any programming language, and that’s really stupid. These systems need to be much smarter and check what is really happening, not this superficial stuff.

Don Syme: So we’ll have Erlang RCS, Erlang specialized?

Joe Armstrong: Yes. Because I think that people first of all write a program, they solve the problem and then they sort of optimize this code, work on it and the code becomes very efficient but unreadable. What I think they should be doing is specifying it with a domain specific language or some higher thing and then writing a compiler and then changing the compiler if it’s not efficient. Because then they would have the benefits of a clear specification and a fast implementation. What they do is they don’t keep these things separated and the language doesn’t support them separating it like that.

Don Syme: Thank you to the Erlang Factory for having us here. I think it is so wonderful to have this track. First of all they invited Martin as a keynote speaker, but they also have this wonderful track which is called "Anything but Erlang" where they invite people. They invited me last year and I came back this year and I think that’s such a good thing to have a conference that is a sign of a healthy intellectual atmosphere.

Joe Armstrong: Thanks for coming and thank you, Martin for coming.

Martin Odersky: I wanted to echo that sentiment. It’s great to be here. I think it’s really the first time that we have three languages in the functional space, but very different languages together. I think it’s a great thing to do. You have such a lot of work to do. There is this huge pool of Java and C# programmers.

Joe Armstrong: I think our programming languages are sitting in the bookshelves in shops together. There are lots of Scala books sitting next to Erlang books and F# books and all these books are talking to each other when the shops are closed.

Don Syme: Honestly, looking back 10 years ago, who would have thought?

Martin Odersky: That’s right. But we have a lot of work to do and anything we can do to learn from each other how to talk to these programmers because we know that taking the initiations into functional programming is very difficult for lots of people. Working together will help and that’s great.

Don Syme: I’ll be running off to get a copy of Martin’s book to learn all his tricks of the trade. I’m looking forward to that. I’ve got a copy of the Ozone on my desk.
