Bio: Guy Steele is a Sun Fellow at Sun Microsystems Laboratories, working on the Programming Language Research project. He received his A.B. in applied mathematics from Harvard College (1975), and his S.M. and Ph.D. in computer science and artificial intelligence from MIT (1977 and 1980). Prior to joining Sun Microsystems, he was an assistant professor of computer science at Carnegie-Mellon University.
I am a Sun Fellow at Sun Microsystems Laboratories right now, and I've worked at a number of companies before that: Thinking Machines Corporation, Tartan Laboratories; I taught at Carnegie-Mellon University before that. My interest has always been in programming languages; I've been interested in programming languages for almost 40 years. I guess my involvement began with FORTRAN in 1968, when I was still a young teenager; I got involved with APL and LISP and other languages, and after that I've just enjoyed learning language after language and trying to understand their relationships and design structure.
One thing you can learn is that there have been different concerns about programming over the past 50 years, and languages have tended to address those themes. For example, pattern matching was a hot topic in the 60s, and a lot of languages were developed just to try to help you solve pattern matching problems: SNOBOL, for example, to do string matching, and there were some artificial intelligence languages. They focused on that, and they thought that if only we could match the right patterns, then we would have the answers at our disposal to solve the problems, and they made some progress that way. Later, other topics such as concurrency became an issue, and concurrency was addressed more beginning in the 70s and 80s, in languages such as Concurrent Pascal.
I think in the beginning programming languages tended to address the problems of scientific computation and business programming, but as we built more and more tools, systems programming became a third topic area; some languages such as C and Pascal were used to write operating systems, so a number of languages were developed to tackle those problems. There are questions of types and what information a compiler can figure out about a program without actually having to run the program, and so there was a period of exploration of more and more elaborate type theories; those got elaborated, I think, during the 80s and 90s. Finally, when we reached the 90s and the explosion of the Internet, we became concerned about programming the Internet and the World Wide Web, worrying about issues of safety and limiting virus transmission, and also about making it easy to download code; platform independence became a real issue once the Internet came into its own, and Java helped address that issue.
So I guess my point is that one reason there isn't a single universal programming language is that our needs have changed over time, and languages tend to change or emerge to address those needs. So it's important for a developer to understand his historical context; by historical context I might mean the last five years, or it might mean the last 20 or 30, but to understand the perceived needs that are driving the language designs. Of course, this is an endless circle; it's not a matter of a beginning and an end, but things are constantly progressing through time.
I think Java is serving a certain need very well, which is commercial processing on the web: transaction processing, supporting Internet commerce, and delivering other kinds of content to the web. It's also being used on servers to serve web content. I would say that Java is currently not addressing high-end scientific computation, for example, and I think that's actually a conscious decision.
There was a movement in the late 90s on the part of people in places like the National Laboratories, people who had been FORTRAN programmers, who wanted to get some of the benefits of Java, such as platform portability. They proposed a bunch of changes and additions to Java that would support their needs, and they were rejected by the Java Community Process because, I think, the Java community as it existed had found a certain central sweet spot and was afraid that adding too much might disrupt that comfortable situation. And I think that is OK. It's OK for a language not to try to support everyone in the world, but then those who aren't supported need to find some other language, some other niche to occupy.
There has been a fascination on the part of … with parallel processing for a long time, and supercomputers have been parallel for the last 20 or 30 years. I worked for a supercomputer company, Thinking Machines Corporation; we were all excited about building a parallel computer with thousands of processors and trying to get the software working on it, and we thought it would be only a matter of a few more years before that parallelism would hit the desktop.
We were off by 20 years, but it has now arrived. We are seeing multicore chips, and it's not so much that we've reached the end of Moore's Law as that the economic tradeoffs have reached the point where we're better off putting more than one core on a chip, even for desktop computers, even for laptops. And now that multicore has become widespread, programming the software for it is becoming an issue, and so over the next 5-10 years we are going to see languages designed to address it.
John Miller, in Computerworld last March, was saying that if your programming language doesn't address parallelism, it is going to wither and die within the next 5-10 years. Java, of course, does support concurrency; it has multiple threads, and I would say it was the first programming language that made platform-independent multithreading mainstream, which was great. But it's a 10-year-old design, and there are some issues with the way Java addressed it. In particular, I have often said that I think one of the major design mistakes in Java, and I wish I had caught this 10 years ago, is that to hand out a pointer to an object for any purpose is to hand out the ability to lock it. Locks are public; it would have been better if locks were private, and that is probably something we can't retract at this point.
So we can add new concurrency mechanisms to Java and sort of abandon the old locks, perhaps even add a new locking mechanism, but the existing mechanisms are going to need some revision for the sake of safe concurrent programming. I guess I should have explained the reason why you don't want to hand out a public pointer to a lock: it allows denial-of-service attacks. If a slightly untrusted piece of code decides to lock something you don't expect to be locked, then that object becomes unavailable to the methods that ought to be using it, and so there are some safety issues involved with exactly the way you go about supporting synchronization for concurrency.
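A minimal sketch of the problem Steele describes, and of the defensive idiom it led to. Any code holding a reference to an object can `synchronized` on it and starve its legitimate users, so the common workaround is to guard state with a private lock object that never escapes the class. The class and method names below are illustrative, not from the interview.

```java
public class PrivateLockDemo {
    private int counter = 0;

    // Vulnerable style: a synchronized method locks on 'this', which every
    // holder of a reference to this object can also lock, indefinitely.
    public synchronized void incrementPublicLock() {
        counter++;
    }

    // Defensive style: the lock object is private and never handed out,
    // so outside code cannot block these methods by holding the monitor.
    private final Object lock = new Object();

    public void incrementPrivateLock() {
        synchronized (lock) {
            counter++;
        }
    }

    public int get() {
        synchronized (lock) {
            return counter;
        }
    }

    public static void main(String[] args) {
        PrivateLockDemo d = new PrivateLockDemo();
        d.incrementPublicLock();
        d.incrementPrivateLock();
        System.out.println(d.get()); // prints 2
    }
}
```

The `java.util.concurrent.locks` package added in Java 5 takes a similar route: a `ReentrantLock` is an ordinary private field rather than a property of every object.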
A couple of things: first, I hesitate to recommend any language as "the best" for a job in such a broad category. Second, Erlang has already earned its place, I think, for a certain kind of concurrency application such as telephony; it grew up in that arena and apparently is very well suited to it. I am not an expert on Erlang: I've read some papers on Erlang, I've read Erlang programs, but I've not written them. It seems to be a very clean language design, from my point of view as a language designer. I think that Java is perhaps a little more general purpose; it has been designed and used more widely throughout the World Wide Web, which is nice.
I am always for interoperability, and I am always for using the JVM or other VMs as platforms for multiple languages. Erlang does have some strengths, and I think it would be great to explore the interactions of Erlang and Java. I speak as one who's learned a lot of programming languages, a few of them in great depth and others in less depth but enough to be moderately fluent in them, and I find that every new programming language I learn increases my breadth of knowledge, my breadth of expressiveness. Even if I don't write any more code in that new language, it changes the way I think about writing code in the languages I already know. So I have found value in learning lots of different programming languages.
My early LISP code was in more of a FORTRAN style; now it is in a more APL style, because learning APL caused me to think more globally about aggregates of data in the way I process arrays, so I tend to think about processing entire lists when I write LISP code. Of course my knowledge of APL and LISP tends to influence my Java style as well, I hope for the better. So yes, I think it would be great if languages such as Erlang could interoperate with Java, and I think it would be good for people on both sides to learn the other style.
I think a practitioner needs to know both Java and either C or C++, just because those are the big working languages today. Then I'd pick two or three other languages for stylistic enrichment; I guess I would recommend perhaps LISP, FORTRAN, and APL if you wanted to pick old-time languages. More recently, you might want to take a look at Ruby, and Haskell to pick a functional language, and COBOL, which has some interesting ideas in it that you won't find in some of the other languages.
8. Back to the parallelism topic: a lot of developers are asking why threading isn't good enough. What's the difference between multicore and multithreading? Why do we need new paradigms and constructs for that?
It's because threads hardly give you anything more than just the knowledge that you have multiple cores, multiple processors working on your job. Having multiple threads is the lowest-level primitive that gives you access to multiple processors, and it's like having just GOTOs without having FOR loops and WHILE-DO and subroutine calls. The problem is not in having access to the processors but in figuring out how to organize the parallelism for your benefit; you need to figure out what the higher-level abstractions are that you really want to work with most of the time. Yes, occasionally you want to dig down there and get at the low-level things, but it's like having arrays instead of just address arithmetic. So we want to figure out how to organize processors into groups and clusters.
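The GOTO analogy can be sketched in Java itself: bare `Thread` objects make you hand-build the start/join/collect-result coordination every time, while a higher-level construct such as `ExecutorService` (added in Java 5) names that pattern the way a FOR loop names iteration. The class and method names here are illustrative, not from the interview.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AbstractionDemo {
    public static int computeSum() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Each task runs on some core; we never touch Thread directly.
            Future<Integer> a = pool.submit(() -> 2 + 2);
            Future<Integer> b = pool.submit(() -> 3 * 3);
            // get() expresses "wait for the result" in one step, instead of
            // manual join() plus a shared variable for the answer.
            return a.get() + b.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(computeSum()); // prints 13
    }
}
```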
I find it very interesting that programming languages, and the people who design programming languages that provide for control parallelism, very often use metaphors out of social science. They'll talk about things like master-slave relationships and producer-consumer relationships; they'll talk about whether threads are dependent or independent. These have very strong overtones of social organization and social relationships, and this may be because a lot of our experience of organizing parallel processes before computers had to do with organizing people to work cooperatively or collaboratively. And so one can talk about hierarchies and various kinds of peer-to-peer and collaborative relationships and try to gain some insight from those metaphors.
Well, one of them is simply the idea of data-parallel programming. Again, I keep mentioning APL, in part because I am here at a conference where APL 2007 is co-located, which I find exciting; I have had an interest in that language ever since it came out, which has been 40 years now (it came out in 1967, I think, as an IBM product). The idea of data parallelism is that you organize your data into collections, typically arrays, and let the structure of the data guide the structure of the parallelism.
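The data-parallel style can be sketched with Java 8 streams, which postdate this interview but fit the description well: the operation is stated once over the whole collection, as in APL, and the runtime uses the structure of the array to split the work across cores. The method name is invented for illustration.

```java
import java.util.Arrays;

public class DataParallelDemo {
    public static double[] squares(double[] xs) {
        // One whole-array expression, in the spirit of APL's  xs × xs ;
        // parallel() lets the runtime partition the array across cores.
        return Arrays.stream(xs).parallel().map(x -> x * x).toArray();
    }

    public static void main(String[] args) {
        double[] result = squares(new double[] {1, 2, 3, 4});
        System.out.println(Arrays.toString(result)); // prints [1.0, 4.0, 9.0, 16.0]
    }
}
```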
Another organization is the so-called "separate address space" languages, where the model is that you write a sequential program but then run many copies of it, and you have some way for the different copies to communicate, typically by message passing or through some kind of restricted shared-memory structure. Yet a different way to organize it is to say there is one big address space or, to put it in high-level language terms, there are many processes that each have access to all the same variables, and there the problem is managing the concurrent access to the variables so they are not stepping on each other's toes all the time.
And that's very much like having a classroom full of students and a single blackboard they can all write on; we have to make sure they are cooperating in the right way. So these are different ways of organizing a parallel computation, and we want to design languages to support one or another of these metaphors in a clear way. There are other, more exotic ways of doing it as well, but those are sort of the big three that I can think of.
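The "separate address space" style can be sketched inside a single Java process: the producer and consumer below share no variables and communicate only through a channel, with a `BlockingQueue` standing in for message passing between address spaces. The names are illustrative.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class MessagePassingDemo {
    public static int sumOfMessages() throws InterruptedException {
        BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(10);
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= 5; i++) {
                try {
                    channel.put(i);                      // send a message
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
        producer.start();
        int sum = 0;
        for (int i = 0; i < 5; i++) {
            sum += channel.take();                       // receive a message
        }
        producer.join();
        return sum;                                      // 1+2+3+4+5
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumOfMessages()); // prints 15
    }
}
```

Erlang, mentioned earlier, builds this model into the language: processes share nothing and exchange messages as the only means of communication.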
I think that is fantastic. In fact, in some sense, every programming effort begins with trying to figure out what it is you really want to talk about, and then you build up a library that supports that. Every time you write a subroutine, you are in effect customizing your base language to be a little bit closer to your domain. I think a domain-specific language uses a fairly comprehensive set of such subroutines or macros or some other kind of facility, so that by the time you are actually thinking about the application, you are thinking in terms closer to the terminology of the domain than to the terminology of programming. You are not thinking so much about the mechanics of subroutines and increasing i by one and things like that; at each step you are thinking about something that makes sense within your problem domain.
Many programming efforts begin by first building a domain-specific language, or at least a domain-specific set of subroutines, and then going from there. So to that extent many programmers end up being language designers, although they don't necessarily think of it that way. Now, there are also people who consider themselves language designers rather than application programmers, who set out to build specific languages to support certain kinds of domains. We mentioned Erlang earlier; in fact, Erlang perhaps started out life as an attempt to build a domain-specific language for making telephone connections, and it turned out to be much more general than that in every way, which is great.
I like to point to something like Make, which is an extremely domain-specific language for describing how to compile and build components to make a complete system of some sort. That is so domain-specific that it is not at all general purpose, and it's probably not Turing-complete, but it handles its specific task very well. Bill Gosper, one of the old-time hackers at MIT, said: "Don't forget, a data structure is merely a stupid programming language." His point is that a data structure, when given to the program that processes it, is in effect guiding a computation, and so it constitutes a dumb little language that turns a general application into a more specific one.
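Gosper's remark can be sketched in a few lines: a data structure handed to an interpreter acts as a dumb little language. Here an array of {opcode, operand} pairs "programs" a tiny calculator; the instruction format is invented purely for illustration.

```java
public class TinyLanguageDemo {
    static final int ADD = 0, MUL = 1;

    public static int run(int start, int[][] program) {
        int acc = start;
        // The data structure drives the computation: each row is an
        // "instruction" in a language only this loop understands.
        for (int[] instr : program) {
            switch (instr[0]) {
                case ADD: acc += instr[1]; break;
                case MUL: acc *= instr[1]; break;
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        int[][] program = { {ADD, 3}, {MUL, 4} };  // "add 3, then multiply by 4"
        System.out.println(run(2, program));       // (2+3)*4 = prints 20
    }
}
```

A Makefile works the same way at a larger scale: the dependency graph it describes is the data structure, and Make is the interpreter that turns it into a computation.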
First of all, it helps to study the literature of languages; if you want to become good at anything, you should read about it first. So if you want to design new languages, you can be better at it if you study existing language designs and try to pull them apart and learn what makes them tick. There is a technical vocabulary you can learn that helps you to talk about the features of a language; we talk about things like types and binding and scope (there is lexical scope and dynamic scope), variable access, inheritance, and things like that.
These are terms that we use, and if you come to a new programming language with these concepts in mind, they can help you more accurately pick apart how a language is put together and why it was put together the way it was, and that can help you design languages. Now, that said, it would be really boring if every language we spoke was designed by grammarians; languages ideally should be living things, and the design of a language should be driven not only by these theoretical considerations but also by the needs of a specific application domain. So my other advice to a would-be language designer is to really understand at least one application domain you hope the language will address; understand that well, and then try to figure out how your language addresses it.
12. What do you see as the tradeoffs, or what guidelines would you offer, when doing domain-driven design: creating new APIs that express the domain concepts versus creating new languages that express the domain concepts?
I guess I would see two differences between those, and they would be convenience and convention. It's certainly convenient for the developer of a domain-specific language to work within an existing language and develop APIs that just involve method calls and function calls, and that also makes it easy to use, because you've got existing tools and a framework for using that existing language, such as Java. The downside is that it may be less convenient to use, in part because Java method calls may not match the conventional notation of the domain. If it were physics, for example, the notation of physics doesn't look anything like the notation of Java, and so you're writing things like ".add" when you want to add, when instead you want a plus sign, that kind of thing.
So an inability to support the conventional notation of a domain may make it less convenient for a user, while on the other hand using an existing framework may make it more convenient in other ways. To the extent that you want to aid the application developer, you let him use the tools he's used to, his notational conventions, and his concepts. You can provide the concepts without the notation, but if you can provide the notation as well, that's a boon. Now, that said, one could choose a language other than Java in which to embed a domain-specific language, and it turns out that some languages are more extensible as base languages and allow a wider variety of notations than others.
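The notation gap is easy to show concretely: embedded in Java, arithmetic in a numeric DSL must be spelled with method calls, since Java has no operator overloading. The values below are arbitrary examples, not from the interview.

```java
import java.math.BigDecimal;

public class NotationDemo {
    public static BigDecimal evaluate() {
        BigDecimal a = new BigDecimal("1.5");
        BigDecimal b = new BigDecimal("2.0");
        BigDecimal c = new BigDecimal("3.0");
        // The physicist writes  a + b * c ; the embedded DSL must write:
        return a.add(b.multiply(c));
    }

    public static void main(String[] args) {
        System.out.println(NotationDemo.evaluate()); // prints 7.50
    }
}
```

The concepts survive the embedding (exact decimal arithmetic works fine), but the conventional notation does not, which is exactly the convenience-versus-convention tradeoff being described.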
LISP, for example, pretty much imposes the requirement that you start with everything being parenthesized. That may seem awkward, but on the other hand it provides an elaborate macro facility that in other ways may allow you to adjust the notation. There have been other extensible languages designed in the past; extensible languages were a big deal in the 1960s, and they kind of fell away in the 70s, but there may be a revival coming as we try to build language structures and frameworks that better support the development of domain-specific languages.
Fortress is one good example of that, the language project that I am working on now at Sun Labs. Fortress started out life as an attempt to build a high-end scientific programming language, but we ended up designing a framework for language experimentation in which our first experiment was the scientific language. To that end we've tried to add a lot of facilities for notational extensibility, and one of the things we want to experiment with is building other, very different domain-specific notations to see whether this idea of flexibility pays off or not.
I would just like to emphasize that it's a mistake to expect any one language to solve all the world's problems, because the tradeoffs keep changing: the tradeoffs of hardware design, the kinds of applications we are facing, changes in the uses to which we put computers. I expect us to need to continue to develop new programming languages for the foreseeable future, just because the needs keep changing.