Dan Ingalls on the History of Smalltalk and the Lively Kernel


1. We're here at QCon London 2010 and I'm sitting here with Dan Ingalls. Dan, why don't you explain to us what you've been doing for the last 40 years?

It's a long story, but if we go back to the period that I talked about in my talk, 40 years ago, it began with my first experience in programming and then becoming associated with Alan Kay in the work on Smalltalk. That kept us busy, productive and entertained for really a couple of decades. There was a period when I left the industry for about 10 years to run a family business, but then I came back and that was the time when I did Squeak with some other people. That grew out of the fact that it seemed to me that when I came back software had not changed much, but the computers were much faster and allowed us to do a lot of things; it was really sort of a better time for Smalltalk. Then I retired for a brief period, then came back to work at Sun, and at that point things were happening on the web that I wanted to play with. I spent a while wondering why that was so complicated and got the idea for the Lively Kernel, and that's kept me busy pretty much since that time.


2. Let's start out with the first period - the Smalltalk period at Xerox PARC. What was Smalltalk-72 like? It's a very different language than the language we find in Squeak. It's based on message passing, is that true?

Smalltalk-72 I think of as the coolest scripting system that ever was. What it is, it's a programming model where in the code you have access to essentially the token stream of the code that called you. If you look at some of the original compiler-compiler work, or at how current compilers work, you are essentially looking at the incoming token stream. If you allow that to happen in a program, it allows the program to parse on the fly, which is a mistake in normal programming, but what it allowed us to do is to play with different syntaxes for what we wanted to do and to actually choose to have message-oriented semantics in the language. We put together the system Smalltalk-72, which could have just been something like Logo, except that it was powerful enough to talk about messages, and we had the opportunity to develop a style for an extensible language, one with classes and instances and a style for message sending as the model for processing. It actually was used productively for 4 years. We did several iterations of a curriculum for a school with children to find out if that was teachable. Then we had pretty creative people in the group working also on graphics and music, which we also used the system to describe. We got a lot of experience with how we wanted the object-oriented style to work.


3. The fact that in Smalltalk-72 you were parsing essentially the token stream, this reminds me a bit of Lisp macros. Do you see any connection to that? Did those exist back then?

We were aware of Lisp. It was an inspiration for Alan and for me and for other people in the group, but I'm not sure there is really a connection there. I think of Smalltalk-72 as being sort of a marriage of Lisp and Meta. The Meta I'm talking about is described in a paper called META-II; it was written by Val Schorre in 1962 and it's a beautiful paper. [Editor's note: Schorre's META II paper was published in 1964.] It ends up with a compiler compiler that's written in itself in about a quarter of a page. It's another gem like McCarthy's Lisp eval. I recommend that paper to anybody who is so interested.


4. It strikes me also that Smalltalk-72 was very focused on the messaging, because there was no concept of methods, as in later Smalltalk. Because you are really just getting a stream of data and you have to interpret it.

Right, we were just in the process of discovering that. If you look at code in Smalltalk-72, you'll see a class description that is just like a function with a whole lot of code, but the style we quickly evolved was one where that code was testing to see if it was this message or this message or this, like keyword testing. Then, the little sub-bodies of code that went with each of those corresponded to methods. We thus got experience with that style, but it's true that the language did not have a syntax for it; that came later.
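Editor's note: the style described above can be illustrated with a minimal Python sketch. This is not Smalltalk-72 code; `TokenStream`, `make_point` and the message names `x`, `y` and `moveby` are hypothetical names chosen for the example. The "class" is a single function that inspects the caller's remaining tokens and tests for one message after another, and each branch plays the role of what later became a method.

```python
class TokenStream:
    """The not-yet-consumed tokens of the calling code (hypothetical helper)."""

    def __init__(self, tokens):
        self.tokens = list(tokens)

    def peek_is(self, word):
        """If the next token is `word`, consume it and return True."""
        if self.tokens and self.tokens[0] == word:
            self.tokens.pop(0)
            return True
        return False

    def next_value(self):
        """Consume the next token and hand it back as an argument."""
        return self.tokens.pop(0)


def make_point(x, y):
    """Return a 'class body': one function that parses its own messages."""
    state = {"x": x, "y": y}

    def receive(stream):
        # One body of code that tests for this message, or this one, or this.
        if stream.peek_is("x"):
            return state["x"]
        if stream.peek_is("y"):
            return state["y"]
        if stream.peek_is("moveby"):
            # The receiver itself pulls its arguments off the token stream.
            state["x"] += stream.next_value()
            state["y"] += stream.next_value()
            return receive
        raise ValueError("message not understood: %r" % stream.tokens)

    return receive


point = make_point(3, 4)
point(TokenStream(["moveby", 10, 20]))
print(point(TokenStream(["x"])), point(TokenStream(["y"])))  # prints: 13 24
```

Because the receiver parses on every call, nothing about the message pattern is known ahead of time, which is the flexibility (and the cost) the interview describes.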


5. That was then Smalltalk-76. What were the big changes, the big feature list for Smalltalk-76?

Smalltalk-76 came to me all as a unit during a weekend at the beach. We had gathered this experience of how we wanted things to be, but the language itself (or at least the programs) ran really slowly because they were having to parse on the fly. I really wanted to get it to where it could be compiled and run by some sort of virtual machine. We had come up with this style of using keywords to make things more readable, especially for the kids, and essentially all of the message testing was testing on token names. That's where I came up with the keyword syntax. You can actually see pieces of it.

The colon that takes an evaluated argument in Smalltalk-72 wound up being a piece of the keyword in Smalltalk-76. Also, infix arithmetic had been dear to my heart ever since I encountered APL, so I preserved that and it's a part of the syntax, but the syntax of Smalltalk-76 and Smalltalk-80 is very simple. I think that's one of its virtues. Anyway, part of that whole gestalt that came to me was not only that way of having a fixed syntax that could flexibly allow for all sorts of message patterns, but also a byte encoding that you could compile to. In that I was motivated by another thing that was happening at the same time at Xerox PARC, which was the development of microcoded computers. In the back of my mind there was some previous work done by Peter Deutsch on a microcoded Lisp or a byte-encoded Lisp. I knew that if we could compile the pieces of our language to small code syllables, we could execute that very fast in these microcoded processors. That was in the back of my mind and, if you take a look at the Smalltalk-80 virtual machine, it all goes back to that sort of whole gestalt.
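Editor's note: to make the idea of compiling message sends to "small code syllables" concrete, here is a minimal Python sketch of a byte-encoded stack machine. The three opcodes, the literal table and the `METHODS` lookup are assumptions invented for this illustration; they are not the Smalltalk-76 or Smalltalk-80 bytecode set.

```python
# Hypothetical opcodes for the sketch (not the real Smalltalk bytecodes).
PUSH_CONST, SEND, RETURN_TOP = 0, 1, 2

# Stand-in for method lookup: selector -> implementation.
METHODS = {
    "+": lambda receiver, arg: receiver + arg,
    "max:": lambda receiver, arg: max(receiver, arg),
}


def run(code, literals):
    """Interpret a byte-encoded method against a literal table."""
    stack = []
    pc = 0
    while True:
        op = code[pc]
        if op == PUSH_CONST:
            # Operand byte: index of the constant in the literal table.
            stack.append(literals[code[pc + 1]])
            pc += 2
        elif op == SEND:
            # Operand bytes: literal index of the selector, argument count.
            selector = literals[code[pc + 1]]
            argc = code[pc + 2]
            args = [stack.pop() for _ in range(argc)][::-1]
            receiver = stack.pop()
            stack.append(METHODS[selector](receiver, *args))
            pc += 3
        elif op == RETURN_TOP:
            return stack.pop()
        else:
            raise ValueError("unknown opcode %d at %d" % (op, pc))


# Source expression: 3 + 4 max: 5
# The binary send binds tighter than the keyword send: (3 + 4) max: 5
literals = [3, 4, "+", 5, "max:"]
code = bytes([
    PUSH_CONST, 0,   # push 3
    PUSH_CONST, 1,   # push 4
    SEND, 2, 1,      # send + with 1 argument
    PUSH_CONST, 3,   # push 5
    SEND, 4, 1,      # send max: with 1 argument
    RETURN_TOP,
])
result = run(code, literals)
print(result)  # prints: 7
```

The parsing has been done once, at compile time, so the interpreter loop only decodes fixed-size syllables; that is the property that made a microcoded implementation attractive.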


6. When did Smalltalk images come along, the image concept? Was that in Smalltalk-72? Did it come later?

It began with Smalltalk-72 and I would trace it back even earlier to one of the things that I had loved about APL - it was the APL workspaces. In APL you could do some work, you could bind some variables, you could define some functions and you could save that workspace, and when you started it up again you could load that workspace and everything would be exactly as you left it. Of course, they didn't have any graphics on the screen. Then, when we did Smalltalk-72 there was that convenience in the back of my mind and also the Alto machine - we had these disk packs that stored less than a floppy disk, but you could load the entire program state from the disk to run. It was like a personal computer, but used by different people. People would come in, they would load the disk pack and then start to work and then leave and take the disk pack with them. We were working with children and we wanted the children to be able to do some project, put some stuff on the screen.

Then, if it was the end of their session, we wanted them to be able to save that and come back exactly the way it had been when they next had a chance to work. It might be on a different computer. The simplest way to do that was the Alto operating system had something that would in just like one second unload the entire state of memory onto the disk and another symmetric operation that would load it back in from the disk. We seized on that and said "Fine, just load the whole session out and load it back in." That became the image model. It stood the test of time for us. There is a downside of working that way, but it worked great for our productivity and you see that later on in Squeak's rampant portability. Squeak runs bit identically across platforms of different endianness and different processors, so you can take a Squeak image that's been stored out, ship it across the world to somebody with a different shaped screen, different color depth, all that and it just works.
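Editor's note: the save-everything, load-everything workflow can be sketched in a few lines of Python. This is only an analogy, under stated assumptions: the Alto wrote out raw memory, whereas this sketch serialises an object graph with the standard `pickle` module, and the `session` contents and the file name `session.image` are made up for the example.

```python
import os
import pickle
import tempfile


def save_image(session, path):
    """Write the whole session state to disk in one step."""
    with open(path, "wb") as f:
        pickle.dump(session, f)


def load_image(path):
    """Read a saved session back, exactly as it was stored."""
    with open(path, "rb") as f:
        return pickle.load(f)


# A hypothetical child's session: some bound variables and a drawing.
session = {
    "variables": {"radius": 5},
    "drawing": [("circle", 10, 10, 5), ("line", 0, 0, 10, 10)],
}

path = os.path.join(tempfile.mkdtemp(), "session.image")
save_image(session, path)

# Later, possibly on another machine: everything is as it was left.
restored = load_image(path)
print(restored == session)  # prints: True
```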


7. You mentioned the downsides of the image. How could you work around these downsides or what are the downsides?

There are a couple of downsides. One is that it makes the system a slightly heavier-weight thing, although nowadays, with the size of current computers, it's nothing. The other is (people talk about it as) the walled garden, where you tend to do everything in your own system, which is a wonderful thing and many of the virtues of the Smalltalk world come about that way. But if you then want to work with other external programs, other systems, it's harder to do that because you may have your own encoding for objects and they have to be translated going in and out and so on. I think of that as being less and less a problem now as we go to thinking more in terms of network computing and we get used to these interfaces to foreign systems. But during that period, it was a barrier for Smalltalk to integrate as well as, say, other systems in Unix or Linux that were set up to flexibly use a bunch of different things at the operating system level.


8. It strikes me that this problem will go away. In a way I think Alan Kay used the idea of cells to compare Smalltalk instances to cells that only communicate via messages on a higher level.

I think the model for the future is going to be more and more the Internet all the way down. It will be easier to mix and match between a system like Smalltalk and the rest of the world.


9. You mentioned the Alto computer at Xerox PARC. A question I always wanted to ask is there is a rumor that Smalltalk was the GUI at Xerox PARC, the only GUI. Is that true? What were the systems at Xerox PARC?

That's not the case. Xerox PARC had a couple of different laboratories. There was the so-called Computer Science Laboratory and we were in what was called The System Science Laboratory. Our group is actually called The Learning Research Group. There was the Smalltalk system, but there was another project that produced Interlisp and they had their own GUI. Then, in the Computer Science Lab, there was another, similar system that was worked on by Chuck Gaskin, Martin Newell and people like that, and they produced other systems like the Bravo text editor, and a lot of that work went on into the later Star product that Xerox actually sold.


10. Those other systems had their own GUI?

Yes, each of them had their own GUI. A couple of things showed up first in our group and it was wonderful at Xerox PARC that there was a lot of sharing of work. I think this has been written about in books. We had this usually weekly meeting called "Dealer" where we would participate along with people from the Computer Science Lab and somebody would say "Wouldn't it be cool if we could do this particular graphic thing?" and we'd come back and program it up and give demos back and we'd share back and forth in that way. Some of the things were done first in Smalltalk and then spread quickly. Pop-up menus are an example of that, and we had overlapping windows on the screen before the other groups, but something like that you just see once and you want to do it. That kind of thing spread virally within Xerox.


11. Which system did Steve Jobs see?

The system he saw was Smalltalk and I was one of the 2 people who gave the demo. I guess Adele was involved in the negotiations and talking about it.


12. Moving on another 4 years to Smalltalk-80. How did that come about? What were the changes to Smalltalk-76?

Smalltalk-80 is very much like Smalltalk-76. The Smalltalk-80 virtual machine is almost identical. All of that was really done with Smalltalk-76. We changed the numbering on it mainly because it was a time when we were getting ready to put Smalltalk out into the world. One of the things you see in Smalltalk-76 if you look into it is we had our own special character set with lots of glyphs that made sense to our language, but did not appear in the ASCII set. One of the first things to do was to make it ASCII compliant and we did that almost completely. The one thing we didn't do is we left the left arrow for assignment, which looked funny on computers that had underbar in that place. That was one of the changes. There were a couple of real improvements: we put in Booleans (Smalltalk-76 was more in the Lisp model of nil equals false), and then I unified blocks in a way that was different from Smalltalk-76. That's pretty much it.


13. Blocks had appeared in Smalltalk-76. What was the change in Smalltalk-80?

One of the problems with Smalltalk-76 is you didn't get an object for a block that was passed in. You got essentially a pointer back to where it came from, and that meant there was no way to pass a block that you were handed on to another procedure at a lower level. If you had something like that that was 3 levels deep, you had to essentially do the overhead of creating blocks 3 times down. There was also some complication in the syntax of Smalltalk-76 in order to allow the effect of unevaluated variables in a block call, so that you could have a keyword that either evaluated its argument or didn't. We got rid of that in Smalltalk-80.


14. Where did blocks come from originally? What lessons did you learn in Smalltalk-72 to get to blocks in Smalltalk-76?

Smalltalk-72, as I say, was this completely flexible thing that parsed on the fly. When I did Smalltalk-76, one of the hardest things to do was to figure out how to put something compilable there that would still pass variables essentially by name and not by value. I just did what it took and if you look at Smalltalk-76, you'll see what the solution was. But it did the job, and in Smalltalk-80 the blocks that we got there were just a more mature version of that, and simpler in a way in that you didn't have this difference between unevaluated and evaluated keywords.


15. There was some controversy about blocks in Smalltalk-76 or Smalltalk-80. Some people didn't like them. What was the case there?

I think one of the aspects of blocks that people object to is that they hand out a capability, an access to within an object, that isn't in its external protocol. If you store that capability somewhere, it's a breach of security. What we did at that time, which put us on a new plateau, was really great, but I think since that time we've learned some lessons and that aspect of blocks is one of the lessons we learned.


16. Has that problem been dealt with, or I suppose it's still a problem that you expose internal information with closures?

One of the things you want to do is not be able to store blocks. I think the approach of closures is a better approach in that sense. There are some other things we've learned, such as the bad aspects of shared mutable state. If you want to look around at Smalltalk-like work that's more forward looking, I think Gilad Bracha's work on Newspeak is probably a good place to look. A lot of that goes back also to the work done by Mark Miller on the language E. Actually I don't think of E as being as much a language as an architecture, but it has informed a lot of current work on more secure systems.


17. The language E brings the concept of capabilities, is that right?

The basic idea is that a piece of code should only get a handle that has only operations that are safe to perform on it. You don't give somebody a handle to the file system, you give them a handle to a specific file if you want it to be written on.
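Editor's note: a minimal Python sketch of the "handle to a specific file" idea follows. It is not E code, and `AppendOnlyFile` and `run_logger` are hypothetical names. Python does not enforce the restriction (the code could still import `os` or reach into `_path`); in a capability-secure language the handle would be the only way to reach the file.

```python
import os
import tempfile


class AppendOnlyFile:
    """A handle whose only operation is appending a line to one file."""

    def __init__(self, path):
        self._path = path

    def append(self, text):
        with open(self._path, "a", encoding="utf-8") as f:
            f.write(text + "\n")


def run_logger(log):
    """Untrusted code: it receives only the handle, not the file system."""
    log.append("job started")
    log.append("job finished")


path = os.path.join(tempfile.mkdtemp(), "job.log")
run_logger(AppendOnlyFile(path))

with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(lines)  # prints: ['job started', 'job finished']
```

The handle offers no `read`, `delete` or `open` operation, so there is nothing left to check at call time; what the code is able to do is defined by what it was handed.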


18. That avoids having to do security checks and things like that.

In fact it constitutes a security check because what else can you do with it?


20. It's kind of philosophical.

It is philosophical.


21. The '80s passed and then you invented Squeak. Who came up with Squeak?

Here is how Squeak came about. I was about to go back and work with the old crew at Apple and we knew we needed a software system that we had domain over and so we figured out that would probably be Smalltalk and we had specifically the old Apple Smalltalk as something we could use because it was in the public domain, but the virtual machine we had for it was a pile of 68K assembly code and we didn't want to have to go deal with that. This was after my time of being away from the computers for 8 years and they had gotten so much faster that it occurred to me we probably could actually run the reference implementation of Smalltalk that was in the Blue Book and actually use it. It would run slowly, but it would be usable. I had this flash - I still remember driving in the rain and thinking this thought that the way that code was written, it could be translated to C. It was specifically written to not use messaging really, although it was in the Smalltalk syntax, so that it could be a guide to other implementers who read the Blue Book. Therefore, by definition, it was translatable to something like C and we ought to be able to just write a translator in Smalltalk from that to C. That was the germ of Squeak.

Other than that, Squeak is just another Smalltalk-80, but the cool thing was that it would then include the code for the Virtual Machine, which would a) mean we had a Virtual Machine that we could easily debug and change, so it would be nice and malleable and b) it would be really easy to give to other people to do ports. Squeak has a very different character, you know - it's full of flexible graphics and stuff that you don't see in the other Smalltalks, and the reason (I think) is having the system, including the primitives, the virtual machine and the graphics, all written in good old malleable, debuggable Smalltalk. It meant that there were lots of things that would be fun to do, that would be creative, that we wouldn't normally have done if the graphics were done in some less malleable way, but we could just try them very simply. The work that I did on WarpBlt, this flexible scaling, rotating version of BitBlt, I don't think I ever would have approached it if I'd had to do that in C, just because it's a big complicated thing. But I wrote a version of it in Smalltalk, which I could sit there and debug. That was also during a trip to New Hampshire and it took 3 days to get the kernel of it all working. Then, because of this wonderful simple push-button translation to C, we had it working at full native code speed a day later.


22. In your talk, you mentioned the concept of a kernel as a basic set of concepts that you can build on. Could you go into that a bit? What does this entail?

A kernel is an abstract term and people use it all different ways. People talk about the Unix kernel and this and that. I think of a kernel as just something that a lot of people talk about as a platform. It's just a set of APIs or definitions that you then can build other stuff on top of. You can have a graphics kernel, and that's all you need to do whatever you want in the world of graphics. There are other kernels that maybe provide all that you need to run an operating system.


24. The word kernel brings us to your current project - the Lively Kernel. The Lively Kernel is written in JavaScript. The first question - why?

JavaScript is here. It's all over. The project came out of my being interested in getting into what people were doing on the world wide web. When I looked at it, it looked really complicated because I was looking at it from the standpoint of personal computing and how do you take this medium which is the web and your browser and turn it into something active. I think of a system like Smalltalk or Squeak and yet what I found was a text markup language that is coming from a server and it just didn't feel like any of the immediacy that I like to feel in personal computing. I took a deep breath and thought "There is good graphics in the browser and there is a dynamic programming language there. Why do you need all this other stuff? If we have that why don't we just take that and make a little kernel out of it and see what you can get from there?"


26. But you didn't like the idea?

No, actually it's an experiment that we're playing with in the Lively Kernel. I didn't do that originally with the Lively Kernel because partly I view this as a new world, and I thought that to go and try and just translate Squeak into this new world would, for one thing, have been politically incorrect, and for another thing I felt that I would be carrying some baggage from the past. The only thing that I wanted to bring from the past was the experience from Squeak that it's really simple to do this stuff. You take a dynamic language, you build a graphical structure like the Morphic system in Squeak and that's all you need. I thought to rewrite it in JavaScript would be a challenge, and it was, and it's a little bit different, maybe better in ways and not as good in ways. But the essential thing that it preserves is that simplicity of a simple graphical structure in a dynamic language. Then, coming around the full circle to where it is self-supporting.


27. Does the Lively Kernel also have an image concept like Squeak or how does that work?

It doesn't really. Except that we're going in that direction. The Lively Kernel runs as you'd expect, it loads like a web page. When you click on its web page what comes into your computer is a pile of JavaScript which then starts running and in some cases some XML that's been stored. But we're going to change that probably. That's just running as a web page. When we get into the notion of persistence in the Lively Kernel that's up to whatever you happen to be writing in the Lively Kernel. What we have in the Lively Kernel is a WebDAV file access protocol, so that if you have an account on a server somewhere, then you can store your world as a new web page and that then is like the image model. We actually did the work so that we store XML for the existing graphics on the screen and there is a way to store also any definitions that you've made that are beyond those that come with the Lively Kernel. In that sense, once you start one of these pages, if you click on it, it comes up and you're pretty much back where you were.

We use the web page as the persistence model that's pretty much like the Smalltalk image. It has the same nice features: you store it that way from a Windows computer and you load it that way on a Macintosh in a different browser and it's all the same - it still works. That's the benefit of working with the web; we get that for free because of these various protocols that have been a part of the Internet standards for quite a while.


28. How do you do graphics in the Lively Kernel? Do you use Canvas or SVG or what's your current approach?

The Kernel approach is the Morphic model that comes from Squeak essentially. Squeak got it from the work done on Self by Randy Smith and John Maloney, and Dave Ungar, I think, worked on it, too. That's basically a model of a scene graph that is in the language itself, so it's programmable and flexible. The first version that we did of the Lively Kernel uses SVG. SVG is Scalable Vector Graphics and it's one of the world wide web graphic standards.

It wasn't a perfect match for Morphic because it presents its own scene graph and Morphic has a scene graph model, so we actually had to do pairing between the components. Pairing is also an invitation to complexity, but we dealt with that. Then, about a year ago or so I did another version that would use just the Canvas protocol. Just for people who don't know about it, SVG is a retained graphics model. In other words, it includes the model for graphic nodes and if you change one, it has all the logic built in that will repaint the screen.

Canvas is a graphic standard that just gives the screen as a canvas, like a bitmap, a pixmap, and if you want to change something on the screen you have to take care of repainting the damaged area. Morphic is ideally set up for dealing with that kind of updating. It just took me a week to make a layer and make a change so that all the Lively Kernel would run on the Canvas model equally well. The answer to your question - to come back to it - is that the Lively Kernel has 2 versions: one runs on SVG and one runs on Canvas and they are pretty much identical.
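Editor's note: the damaged-area bookkeeping that a Canvas-style (immediate mode) surface leaves to the application can be sketched as follows. This is a toy Python model with a character grid standing in for the pixmap; `Rect`, `World` and their methods are hypothetical names, not the Lively Kernel or Morphic API.

```python
class Rect:
    def __init__(self, x, y, w, h):
        self.x, self.y, self.w, self.h = x, y, w, h

    def union(self, other):
        """Smallest rectangle covering both self and other."""
        x0, y0 = min(self.x, other.x), min(self.y, other.y)
        x1 = max(self.x + self.w, other.x + other.w)
        y1 = max(self.y + self.h, other.y + other.h)
        return Rect(x0, y0, x1 - x0, y1 - y0)

    def contains(self, x, y):
        return self.x <= x < self.x + self.w and self.y <= y < self.y + self.h


class World:
    """Scene graph kept by the application; the 'canvas' is just pixels."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.pixels = [["."] * width for _ in range(height)]
        self.shapes = []          # [bounds, fill character], back to front
        self.damage = None        # area that must be repainted
        self.last_repainted = 0   # pixels touched by the last repaint

    def add(self, bounds, ch):
        self.shapes.append([bounds, ch])
        self._invalidate(bounds)
        return len(self.shapes) - 1

    def move(self, index, dx, dy):
        old = self.shapes[index][0]
        new = Rect(old.x + dx, old.y + dy, old.w, old.h)
        self.shapes[index][0] = new
        # Both the vacated area and the newly covered area are damaged.
        self._invalidate(old.union(new))

    def _invalidate(self, rect):
        self.damage = rect if self.damage is None else self.damage.union(rect)

    def repaint(self):
        """Redraw only the damaged area, clipped to the canvas."""
        self.last_repainted = 0
        if self.damage is None:
            return
        d = self.damage
        for y in range(max(d.y, 0), min(d.y + d.h, self.height)):
            for x in range(max(d.x, 0), min(d.x + d.w, self.width)):
                self.pixels[y][x] = "."              # erase to background
                for bounds, ch in self.shapes:       # repaint back to front
                    if bounds.contains(x, y):
                        self.pixels[y][x] = ch
                self.last_repainted += 1
        self.damage = None

    def render(self):
        return "\n".join("".join(row) for row in self.pixels)


world = World(8, 3)
box = world.add(Rect(0, 0, 2, 2), "#")
world.repaint()
world.move(box, 3, 0)
world.repaint()
print(world.render())
print(world.last_repainted)  # prints: 10 (of 24 pixels)
```

In a retained model such as SVG this bookkeeping lives inside the browser; here the application's own scene graph decides what to redraw, which is the job the interview says Morphic was already set up to do.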


29. Is there any difference in speed?

The SVG will render somewhat faster because SVG code for rendering and for dealing with updating damaged areas is all native. But what we're finding is that Canvas code runs just as fast. I think it's actually going to run faster. The reason being that the actual painting on the screen is still native in the Canvas version. We don't have to go through the pairing and updating of SVG nodes at all. You would think that managing the damaged areas in JavaScript would slow it down, but the JavaScript engines are now getting fast enough that that's not a problem.


30. Talking about speed, it strikes me that the Lively Kernel was quite courageous when you started it because it was before the Google V8 watershed. How did that feel back then?

It was slow, but speed wasn't the first figure of merit that I was after. I wanted something that felt like a personal computer. I felt like individual programmers were getting disenfranchised, because in the good old days any computer you bought came with BASIC and you just start it up and you could write a program and have it running in 5 minutes and all very simple.

It seemed to me that nowadays we had come supposedly so much further and you couldn't do anything simply on the computer without getting something that had an inch of documentation that you had to read through, and just no simple access. The figure of merit that I was after with the Lively Kernel was something that would be simple to start up, simple to program, simple to play with, and if it ran slowly, I didn't care. The truth was it ran reasonably well. SVG is performant graphics and even those slow JavaScripts were reasonably performant.


32. Are you going to get an iPad and run Lively Kernel on it?

Absolutely. I'd say something more about the iPad and that kind of world, which is I think there is another role for the Lively Kernel, which has to do with simple authoring. The fact that the Lively Kernel is just a web page and you click on it and you're running this system, and it's a system that can do graphics editing, can produce images, can do simple programs, like Squeak, that you can save a page from, means that basically anywhere you have a network computer, you can have authoring.

That's one ingredient. Another is I noticed that there is a certain challenge you have before you can become a developer for say the iPhone and yet the iPhone has Safari in it and it runs Lively Kernel applications just fine. It seems to me there is an opportunity here for something much more open than the App Store in which everybody can contribute and produce and share active content.


34. To wrap up - where can people go to contribute to the Lively Kernel or try it?

The Lively Kernel is currently in transition, and the easiest thing you can do is go to a search engine, type in “Lively Kernel” and you'll end up with our latest release on what's now the Oracle site. But it's no longer an active project there and you'll see a link that will point to the Hasso Plattner Institute in Potsdam, where we're continuing it as an open source project. [Editor's note: the link is ]

You'll find some other links there to work ongoing and we have a program for the spring where there will be a bunch of student projects being done in the Lively Kernel. There is documentation there and a mailing list you can get on. As part of the work in revitalizing the Lively Kernel community and its mailing list I really hope to get a lot of other people involved. It's something that's quite simple and easy to pick up and I think we've only scratched the surface of the cool projects that can be done in a Lively System like that.

Jun 22, 2010