Stephen Wolfram on Computer Language Design, SMP, Mathematica, and Wolfram Language

Nov 16, 2020

Stephen Wolfram is a British-American computer scientist, theoretical physicist, and businessman. He is also known for his work in mathematics. In 2012, he was named an inaugural fellow of the American Mathematical Society.

In this episode of the InfoQ podcast Charles Humble talks to him about Wolfram Language, its origins and the influences on its creation. In a wide-ranging discussion they also cover the ergonomics of programming languages; Wolfram|Alpha’s integration with Siri, Alexa, and the upcoming integration with Microsoft Excel; how ideas from physics, such as reference frames, may be useful for distributed systems programming; and live streaming language design discussions via Twitch.

Key Takeaways

The first language that Stephen Wolfram worked on, along with Chris Cole, was SMP at Caltech in the late 1970s. It was somewhat influenced by LISP and APL, as well as more general ideas from mathematical logic.
SMP and Wolfram Language are both symbolic languages based on transformation rules for patterns. The core idea is that anything can be represented as a symbolic expression.
Working on SMP provided an opportunity to explore some radical pattern-matching ideas for how to do symbolic computation.
The fundamental operation in the Wolfram Language is evaluation, and it uses an infinite evaluation system which applies to both functions and, perhaps more surprisingly, variables.
Wolfram Language has an interesting interplay between natural language, derived from Wolfram|Alpha, and the programming language.

Transcript

00:04 Introductions

00:04 Charles Humble: Hello, and welcome to The InfoQ Podcast. I'm Charles Humble. And this week, I'm talking to Stephen Wolfram. Stephen is the creator of Mathematica, Wolfram|Alpha and the Wolfram Language. He's the author of a number of books, including “A New Kind of Science”, and is the founder and CEO of Wolfram Research. Stephen, welcome to The InfoQ Podcast.

00:24 Stephen Wolfram: Hi.

00:25 What were your goals for SMP?

00:25 Charles Humble: I was really keen to talk to you about the Wolfram Language, because I think it's rather unusual and fascinating. And I thought perhaps we should start by talking about its origins and, specifically SMP, Symbolic Manipulation Program, which you worked on along with Chris Cole at Caltech in the late 1970s. What was the main goal for that language?

00:43 Stephen Wolfram: Oh gosh, you're asking the story of my life here. Rough picture. I grew up in the UK. I started off being very enthusiastic about physics, and started doing physics when I was an early to mid teenager. One of the bad features of doing physics is that you have to do all these complicated mathematical calculations, which I wasn't very keen on doing as a human. I'd said it's really boring, it's mechanical, it should be automatable. So I started using computers to do that kind of thing. Then around 1979, I had become the world's largest user, I think, of the research systems that had been built to do mathematical computation by computer, and was like, "Well, am I going to get somebody else to build the thing that should be built, or am I going to do it myself?"

01:26 Stephen Wolfram: And so I, after a little bit of, "Oh, can I get other people to do it?" It's like, if you really want it done, you've got to do it yourself. This was late 1979, I said, "Well, okay, if I want to build a general system that can do all the kinds of computation that will be relevant in the difficult application area of math, but also in lots of other areas, what should that system be based on then?" So I was pretty aware of theoretical computer science and mathematical logic and so on, but I was looking back at, "Okay, we're doing this computing thing. We want to do it in as high level way as possible. How should we do that?"

01:57 What other languages did you look to for inspiration?

01:57 Charles Humble: Were there other languages that you looked to for inspiration?

02:01 Stephen Wolfram: The languages that were probably the closest to being inspirations were LISP and APL. But both of those, in different ways, had origins in mathematical logic ideas, not necessarily as academically precisely as they might've done, but they were at least inspired by those ideas. So I got interested in understanding things from that point of view. In fact, it just so happens the history is smaller than you think, that December 7th of 1920 is a key date in the history of symbolic computation, because it's the day that Moses Schönfinkel introduced combinators.

02:33 Stephen Wolfram: And combinators today are things that people think, "Oh, that's this obscure curiosity. Nobody cares," etc, etc, etc. But I've recently been tracing the history, and the history is quite fascinating, and the way that combinators ended up leading to Lambda calculus and to lots of other kinds of things. And although combinators themselves are more pure and extreme than even we have managed to reach today, there's probably five or six different elements of making everything abstract and symbolic that combinators did, and we've done essentially all but one of them now today.

03:03 Stephen Wolfram: And I think that my original inspiration for what I did was the things that have been done in mathematical logic. I did not understand the history actually as clearly then as it happens today because I've been studying things for the centenary. But my other meta idea was, I was used to doing physics. In natural science, the main picture of what you do is, there are certain phenomena in the world, your goal is to drill down and find out what are the primitive elements that lie underneath those phenomena. And that was what I saw myself doing in that language design.

03:33 Charles Humble: And so how does that relate to what you were doing with SMP?

03:37 Stephen Wolfram: A symbolic language that is really based on transformation rules for patterns, that's what it does. So when you define a function in the modern Wolfram Language syntax, which is better than the SMP syntax, it's F of X, blank, X underscore, colon, equals, whatever the function does. What does that actually mean? That means whenever you see an expression that is the form F of anything, the blank, named X, transform it to this right-hand side. So that means that the left-hand side can have anything you want on it. It could be F of the list, X blank, X blank, which means it's just going to match pairs of identical elements.

04:13 Stephen Wolfram: Or we could define that, we could also define F of one comma two, as something different. It's just saying there are these patterns, and you make transformations for these patterns. And so in SMP, I tried out a bunch of, in a sense, radical pattern-matching ideas for how to do symbolic computation. And some of them worked really well, some of them didn't work so well, and it was a great experience in a sense. I mean, it had lots of users and things in a company and all that kind of stuff. But in addition to that, it was my first really big software project.
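The rule-based definitions described above can be illustrated outside Wolfram Language. The following is a minimal Python sketch, not Wolfram Language's actual matcher: expressions are modelled as tuples whose first element is the head, `Blank` is a hypothetical stand-in for a named blank such as `x_`, and the rule list is tried in the order written (an assumption of this sketch; it says nothing about how Wolfram Language orders its rules).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blank:
    """A named pattern variable, loosely analogous to x_ (hypothetical helper)."""
    name: str


def match(pattern, expr, bindings=None):
    """Return a dict of bindings if expr matches pattern, otherwise None."""
    bindings = dict(bindings or {})
    if isinstance(pattern, Blank):
        # A repeated name must match identical subexpressions.
        if pattern.name in bindings and bindings[pattern.name] != expr:
            return None
        bindings[pattern.name] = expr
        return bindings
    if isinstance(pattern, tuple):
        if not isinstance(expr, tuple) or len(pattern) != len(expr):
            return None
        for p, e in zip(pattern, expr):
            bindings = match(p, e, bindings)
            if bindings is None:
                return None
        return bindings
    # Literal heads and atoms must be equal.
    return bindings if pattern == expr else None


# Each rule pairs a left-hand-side pattern with a function building the result.
RULES = [
    (("f", 1, 2), lambda b: "special"),                       # like f[1, 2] := ...
    (("f", ("List", Blank("x"), Blank("x"))),
     lambda b: ("pair", b["x"])),                             # like f[{x_, x_}] := ...
    (("f", Blank("x")), lambda b: ("g", b["x"])),             # like f[x_] := ...
]


def rewrite_once(expr):
    """Apply the first rule whose pattern matches; leave expr alone otherwise."""
    for pattern, build in RULES:
        b = match(pattern, expr)
        if b is not None:
            return build(b)
    return expr


print(rewrite_once(("f", 1, 2)))
print(rewrite_once(("f", ("List", 3, 3))))
print(rewrite_once(("f", ("List", 3, 4))))
```

In this sketch the list pattern only fires when both elements are identical, which is the "pairs of identical elements" behaviour mentioned in the interview; a list of two different elements falls through to the general rule.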

04:44 Stephen Wolfram: The design was a fascinating data point for me because I got to make this design, see what happened five years later or more, and see which things people understood, which things they did not understand. This core idea of transformation rules for symbolic expressions, fantastic idea, worked great. Some details, completely crazy, and got changed. You read the documentation, people write for the language you created five years later, and there are things where it's like, "Nobody understands this." And it's like, "That was a design mistake." And so, in a sense, when I came to start designing Wolfram Language in 1986, I had the tremendous advantage of having tried a bunch of things that were in a sense, even more radical.

05:26 Stephen Wolfram: I mean, it's kind of funny, there were a few things that I did, like things we call associations in Wolfram Language, just mappings, other people call them associative arrays, dictionaries, whatever else. I had those things in SMP in a very generalized way. And they were too generalized in SMP, and that put me off having those things for probably 25 years afterwards. But the conception, the chassis, the framework, it's all about transformation rules for symbolic expressions. The thing that has been a huge surprise over the last 40 years is, all the different kinds of things that you can represent in that form and what's emerged and the story of the modern Wolfram Language is a branch in the world of how you get computers to do things that's really pretty different from what people have traditionally done with programming languages and so on.

06:11 Stephen Wolfram: And it's all built on top of this symbolic computation idea, but it goes off in a very different direction in terms of using that symbolic representation to describe the world, so to speak, and to describe all the kinds of things you might want to compute about or think about computationally, so to speak.

06:27 Did you also look at MATHLAB and Formula ALGOL?

06:27 Charles Humble: So you mentioned LISP and APL as a couple of influences. Did you also look at some of the algebraic systems, things like MATHLAB and Formula ALGOL, for instance?

06:36 Stephen Wolfram: Well, yeah. So in terms of Macsyma, Reduce, Schoonschip, all those kinds of things, I knew them well, I used them. I was a big user of those things. They were very ALGOL like. Well, Schoonschip was different. Schoonschip, it's like, you've got three systems, one of them was written in LISP, one of them was written in... Well, there was another one called ASHMEDAI, it was written in Fortran, and one of them was written in CDC 6000 series assembly language. And so the question is, of the three authors of those things, which ones subsequently won a Nobel prize. And the answer is the guy who wrote in CDC 6000 assembly language.

07:11 How do the ergonomics of early programming languages compare to modern design?

07:11 Charles Humble: One of the things I think is really interesting when you look back at relatively early languages is how different the ergonomics are to modern design.

07:20 Stephen Wolfram: I was a consultant at Bell Labs for a while back in the early 1980s, and so I knew people like Dennis Ritchie and so on. And he was like, "Oh, it's really nice that in SMP you have these short command names." And it turns out that was one of the very bad ideas in SMP was short command names. But at the time, the ergonomics was different because your average user of SMP didn't know how to type. At least not type, what would you say these days? Touch type, it used to be called, I don't know what it's called these days. I think it's just type these days.

07:50 Charles Humble: Yes, I think it probably is.

07:51 Stephen Wolfram: And so it really made a difference that somebody would be able to type just three letters, punch them out. And ideas like command completion didn't exist at the time. And so those aspects of ergonomics were different. It was more at the level above the symbolic computation level, at the level of how to think about mathematical computation, that both systems were probably more relevant. In terms of languages, I don't think they were terribly... That was not my primary source of inspiration. The fact that yes, there is still a Goto in Wolfram Language in 2020, and most people don't use it...

08:25 Stephen Wolfram: We have a lot of celebrity users and occasionally I get to see that code. And there was a chap called John Nash who was very famous in Game Theory and so on. And he was a user of ours, and I happened to see some code he wrote. And he was the only person whose code was absolutely threaded with Gotos.

08:41 Charles Humble: Really? Extraordinary.

08:41 Stephen Wolfram: I've never seen that at any other place.

08:43 How much impact did having mathematical computation as a cornerstone have on SMP and later Wolfram Language?

08:43 Charles Humble: Do you think that having essentially mathematical computation as an absolute cornerstone, something you have to hit in terms of the design, had an impact on SMP and maybe subsequently Wolfram Language as well?

08:55 Stephen Wolfram: That's a very high bar because it's a really complicated area. And things like, you have to be able to represent things symbolically. Well, yes, you have mathematical formulas. There's this whole string of things. First, there was that, then there was, we represent graphics symbolically, then there was, we represent user interfaces symbolically, then there was, you represent just a whole string of other things, whether it's now cloud deployment, symbolically, whether it's now representing entities in the world, like New York City or something symbolically, whether it's representing all these kinds of things.

09:28 Stephen Wolfram: And what's happened over the years is that I've discovered, and I don't know whether it's obvious this should be the case, that this core idea of symbolic expressions and transformation rules for symbolic expressions, it really covers all these things, and it covers them in a very, actually usable way. It's not saying, "Let's have a theory of the world, and that theory of the world is predicate logic," for example, and then saying, okay, everything's got to follow a predicate logic. This is a much more general idea than that. It doesn't work to just use predicate logic. That's a pretty poor model of representing knowledge about the world.

10:01 Why do you think this idea that everything can be represented as a symbolic expression works?

10:01 Charles Humble: Why do you think this idea that everything can be represented as a symbolic expression works?

10:06 Stephen Wolfram: Possibly, that works because of the way that we humans think about things. That is part of what I've spent my life doing, is studying computation in the wild. You take a small program, you see what it does. Big discovery from the early '80s was even these very tiny programs do really complicated things. And the question then is, they do really complicated things, but are they things we care about? And the answer is, it depends whether our technology has made us want to care about that thing. Do we need a random number generator? Do we need a thing that does this compression? Unless we have the idea of random number generation, the idea of compression, we don't care about those things.

10:46 Stephen Wolfram: I see the role of language design as being this bridge between what is computationally possible and what we humans think about.

10:54 Charles Humble: I find that fascinating because you're almost getting into linguistic theory. You're almost getting into linguistic relativity, which is this idea that you can only express ideas that you have words for. And that's very different from how computer language designers typically think about language design.

11:16 Stephen Wolfram: For me, the fact that the symbolic representation of the world works as a way to build up a computational language is probably not unrelated to the fact that we humans think about things in symbolic terms. And I think that there are things... From my point of view, our language, the analogy is probably more like something like the invention of mathematical notation than it is the construction of early programming languages. And the big thing, which is certainly an encouragement to me in terms of spending my life building up this computational language, is if you look at the history of mathematical notation, it's like 400 years old. Before that time, if you wanted to explain math to somebody, you'd be trying to describe things in words, and it wasn't very streamlined, it wasn't very efficient.

12:03 Stephen Wolfram: Then mathematical notation, plus signs, equal signs, things like that got invented, and suddenly math could take off and we got algebra and calculus, and then we got all the mathematical sciences and so on developing. As far as I'm concerned, the goal of what we're trying to do is to enable that same kind of computational language to exist as the mathematical language that came into existence maybe 400 years ago, and to use that to enable computational X for all X. That's the goal. And to provide people a vehicle for thinking about things computationally with the fantastic assist that, "Oh, your computer can do it too," so to speak, which wasn't a thing back in the day of mathematical notation.

12:43 Stephen Wolfram: So I think the goals have been very different, and it's almost like, "Well, what category of thing are you making?" "Well, we're a category of one," which always has problems. It's intellectually interesting, but it's always like, "Well, what is it like? Is it like this programming language or that programming language?" Well, no, not really because its goals are very different.

13:00 Charles Humble: Can you give me some concrete examples of those goals?

13:02 Stephen Wolfram: We have to have a way to represent, I don't know, information about movies and things. Or we have to have a way as well to represent packets going on a network, and things like this. And having these kinds of representations of things in the world, so to speak, and making those representations interoperable, that's what's a lot of work, but that's what we've tried to do, and tried to really incorporate into the language knowledge of the world, so to speak. I think that the concept is just as back when I started using computers, it was just a computer with machine code. Later on, it got primitive languages and then it got operating systems and it got networking, then it got some UI. It doesn't yet have built in computational intelligence.

13:46 Stephen Wolfram: Our goal is to use the language that I've just spent the last 40 years building and allow that to become the source of take-for-granted computational intelligence that people can expect to see when they use computers and expect to have a way of interacting with their computers that is something which is a bridge between their way of thinking and what computers are in principle capable of, and that's the concept. Well, the practicalities of, oh, this big language and it's all coherently designed and it runs in the cloud and it runs on the desktop and it runs on servers, all those kinds of things. Those are the hard work of the engineering around what is ultimately an intellectual idea about building this full-scale computational language that can represent the world computationally and allow people to think about it computationally.

14:37 Stephen Wolfram: And I think you've correctly identified a core issue there is, what is the ultimate representation of things? And the answer is, it's actually exactly the same thing as the core ideas of mathematical logic going back about 100 years that have not tended to be captured in the same way, at least in practical languages and so on. We've been forced to do that because we're trying to represent a much broader range of things in the world. We're not just saying, "Oh, it's an array, it's a structure, it's a this, it's a that." It's like, it's got to represent a chemical, it's got to represent a molecule, it's got to have a way of doing that. It's got to have a way of doing that in a way that it can be computed with.

15:14 How does the infinite evaluation system in Wolfram Language work?

15:14 Charles Humble: And as part of doing that with the Wolfram Language, you have an infinite evaluation system. Can you briefly describe what that means and how that works?

15:23 Stephen Wolfram: You define a function, which is basically setting up a transformation rule for a pattern for symbolic expressions. For example, you say, "Well, how do you deal with object-oriented stuff?" Well, you don't really have to, because what you're doing is you're saying, F of, and then the thing that's inside that F is an arbitrary pattern. So if inside the F it wants to be dealing with G of something or other and H of something or other, that's just what you write. You don't have to say, "Oh, there's this type, and now we're going to get that from somewhere else." It's just right there in the symbolic expression that you're making a transformation for.

16:01 Charles Humble: And then how does the transformation itself work?

16:04 Stephen Wolfram: So what it's doing is, it goes along and it's evaluating things, and it sees F of G of whatever, and it says, "Okay, I know a rule for that. Let me apply that rule." People would expect it to do that infinitely for functions, it also does that infinitely for variables. I am always hoping nobody defines global variables, but they can. You can use Wolfram Language as a session-based thing in notebooks, or you can use it as an API. If it's an API, then whatever global thing you defined has gone by the time the API has finished executing, so that's less dangerous.

16:35 Stephen Wolfram: But yes, you can define the variable X. If you say, X equals X plus one, it will say recursion error, it will loop for a while and then... So one of the things that's interesting about language design is, you might say, "Oh my gosh, the fact that X equals X plus one blows up would kill everything. The fact that you can make circular definitions and with infinite evaluation, they'll blow up." You might say, "Oh my gosh, it'll kill everything. People will be confused." They are not. Those are bugs, basically. They're simple bugs. And it's one of the complicated judgment calls of language design. My theory is, the only evaluator that's absolutely perfect and has no weird behavior in any case is an evaluator that does absolutely nothing.
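The idea of evaluating repeatedly until nothing changes, and of a circular definition such as X equals X plus one hitting a limit, can be sketched in a few lines of Python. This is a toy under stated assumptions, not Wolfram Language's evaluator: symbols are strings, compound expressions are tuples, `definitions` is a plain dict, and the depth limit of 64 and the `EvaluationLimitError` name are arbitrary choices made for this illustration.

```python
class EvaluationLimitError(RuntimeError):
    """Raised when evaluation does not settle within the allowed depth."""


def evaluate(expr, definitions, limit=64, depth=0):
    """Keep replacing symbols by their definitions until nothing applies."""
    if depth > limit:
        raise EvaluationLimitError(f"gave up after {limit} nested evaluations")
    if isinstance(expr, str) and expr in definitions:
        # The replacement is itself evaluated again: the "infinite" part.
        return evaluate(definitions[expr], definitions, limit, depth + 1)
    if isinstance(expr, tuple):
        head, *args = expr
        args = [evaluate(a, definitions, limit, depth + 1) for a in args]
        if head == "Plus" and all(isinstance(a, int) for a in args):
            return sum(args)
        return (head, *args)  # stays symbolic if it cannot be simplified
    return expr


# A chain of variable definitions is followed all the way down.
print(evaluate("z", {"z": "y", "y": ("Plus", "x", 1), "x": 2}))

# A circular definition never settles, so the limit is what stops it.
try:
    evaluate("x", {"x": ("Plus", "x", 1)})
except EvaluationLimitError as err:
    print("stopped:", err)
```

An undefined symbol simply stays as it is, so `("Plus", "a", 1)` with no definitions comes back unchanged rather than raising an error.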

17:16 Stephen Wolfram: In other words, as soon as it does something, there's going to be something that you might consider to be a weird case, so to speak. And what's really interesting in language design is to lead people to not rub their noses against the weird cases, to lead them to do the things which are actually going to work and are actually going to be well-defined and are actually going to be efficient and so on. And I think that's part of the art of language design is to do that. I'll tell you something about evaluation that really brings us right up to the minute. In SMP, I actually tried to define the way that the evaluation front would go.

17:51 Stephen Wolfram: So for example, let's say you define a Fibonacci function. So you're saying F of N is F of N minus one, or in Wolfram Language, it will be F of N, blank, colon, equals F of N minus one, plus F of N minus two. F of one equals F of two equals one. That's the Wolfram Language way of defining that. And now the question is, how does that get evaluated? So, one thing you can do is a depth first recursion through the Fibonacci tree. So in other words, you say F of 10 goes to F of nine, which then goes to F of eight, F of seven, etc, etc, etc.

18:20 Stephen Wolfram: But that's not the only way you could do it. You could say, by the time you've got F of 10 goes to F of nine plus F of eight, then F of nine goes to eight plus seven, F of eight goes to seven plus six. Now, you've got two F of sevens. You could say, "Hey, wait a minute, let me combine those before I go on doing the evaluation." So that's more of a breadth first evaluation strategy than a depth first one. In SMP, I parameterized that. I had these attributes for functions that would parameterize their recursive evaluation. And it was one of the failures of the design of SMP because nobody understood it.
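The two evaluation orders for the Fibonacci definition can be compared with a short Python sketch. It is an illustration only and does not reproduce how SMP parameterized evaluation: `fib_depth_first` walks the tree one branch at a time, while `fib_merged_front` (a hypothetical name) expands the whole front level by level and combines identical pending terms, such as the two copies of F of seven, before continuing. Both count how many times the recursive rule is applied.

```python
from collections import Counter


def fib_depth_first(n, stats):
    """Depth-first recursion; stats['expansions'] counts recursive rule uses."""
    if n <= 2:
        return 1  # f(1) = f(2) = 1
    stats["expansions"] += 1
    return fib_depth_first(n - 1, stats) + fib_depth_first(n - 2, stats)


def fib_merged_front(n):
    """Expand the whole front at once, merging identical pending terms."""
    front = Counter({n: 1})  # front[k] = number of pending copies of f(k)
    expansions = 0
    while any(k > 2 for k in front):
        new_front = Counter()
        for k, count in front.items():
            if k > 2:
                expansions += 1  # one rule application covers all copies of f(k)
                new_front[k - 1] += count
                new_front[k - 2] += count
            else:
                new_front[k] += count
        front = new_front
    # Only f(1) and f(2) remain, and each of them is worth 1.
    return sum(front.values()), expansions


stats = {"expansions": 0}
value = fib_depth_first(10, stats)
merged_value, merged_expansions = fib_merged_front(10)
print(value, stats["expansions"])
print(merged_value, merged_expansions)
```

Both strategies give the same value; the merged front simply applies the recursive rule fewer times because duplicated terms are expanded once.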

18:53 Stephen Wolfram: It was a way of parameterizing, essentially, the evaluation front. So now we come 40 years later, and here I am working on fundamental theory of physics, and turns out that, again, somewhat to my embarrassment, I realized that the core idea that's needed for the fundamental theory of physics is an idea about symbolic expressions. But unlike the ones that we use in Wolfram Language where all the pieces mean something, this means a chemical, that means addition, that means whatever. The ones that occur in our theory of physics are meaningless. They are purely the infrastructure of symbolic expressions.

19:31 Stephen Wolfram: So in a sense, the universe, space consists of a symbolic expression with 10 to the 400 elements. So then what happens, and it's really a very beautiful and amazing thing, is that the transformation rules for the symbolic expressions in this big hypergraph that represents the space and the universe, you can do these transformations, you just define a bunch of transformations. You can do them anywhere assuming that the inputs are available for them. So in other words, there's this causal graph that says, "What is the chain of causality that says, 'Are the inputs ready or not?'"

20:03 Stephen Wolfram: If the inputs are ready, then you can do that evaluation. So you have all these different evaluations happening in parallel. And I'm explaining this in very computery terms. This is not how I got to this nor how you'd see it explained for your average physicist, so to speak. But what it is, is something where you have this giant distributed symbolic computation system. And then the big question is, what's the ordering in which those updates happen? And it turns out, there's a property that's related to the confluence property of term rewriting systems. That property ends up giving one special relativity and general relativity.

20:37 Stephen Wolfram: But that property implies that essentially the reference frames of special relativity correspond to different evaluation fronts in the universe. So in other words, depending on what reference frame you're in, you're making different choices about what order you say that these different possible evaluations that could have happened are in. So it's actually very much back to the ideas that I had in SMP that were this attempt to parameterize the evaluation front now becomes the story of reference frames in physics.
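The confluence property mentioned above can be shown with a very small rewriting system. The Python sketch below is a toy and an assumption of this write-up, not the hypergraph model from the physics work: it uses a single string rule, "BA" becomes "AB", and applies it under three different update orders. Because the rule is confluent, every order ends at the same final string.

```python
import random


def rewrite_positions(s):
    """All indices where the rule BA -> AB could be applied."""
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "BA"]


def run(s, choose):
    """Apply BA -> AB until no match remains; choose() picks the update order."""
    steps = 0
    while True:
        positions = rewrite_positions(s)
        if not positions:
            return s, steps
        i = choose(positions)
        s = s[:i] + "AB" + s[i + 2:]
        steps += 1


rng = random.Random(0)  # seeded so the random order is reproducible

leftmost = run("BABABA", lambda ps: ps[0])
rightmost = run("BABABA", lambda ps: ps[-1])
shuffled = run("BABABA", lambda ps: rng.choice(ps))

print(leftmost, rightmost, shuffled)
```

Each order picks a different sequence of intermediate strings, which is loosely what choosing a different evaluation front amounts to, yet the end state is the same in all three runs.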

21:07 Charles Humble: That's absolutely astonishing.

21:09 Stephen Wolfram: That evaluation front stuff was incomprehensible in SMP, even to me, to be honest. But now, what is really interesting is, by using the last 100 years of physics and by using all the things we know about relativity and general relativity and so on, we actually have a language to talk about this idea of reference frames and things about metrics and all that kind of thing, and event horizons and all those kinds of things. And it looks like we can use those ideas to re-import this notion of evaluation fronts back into language design, and basically give us a new way to think about how to do distributed computing.

21:45 How do reference frames help with the problems of distributed computing?

21:45 Charles Humble: I'm a bit lost there, to be honest. How do reference frames help with the problems of distributed computing?

21:51 Stephen Wolfram: In a sense, you're thinking about, "Oh, I'm programming in such and such a reference frame." That means that I'm expecting the update events to happen in a certain ordering, as opposed to being in some other reference frame that happens in different ordering. So what's going on there, and I don't know if this will completely work, but we'll see whether we're in the right century to do this, but the question is, distributed computing has been difficult to wrap one's head around how to program it. So the question is, can we now use the intellectual development that's happened in physics to give us essentially a language and a way of thinking about how to do that?

22:25 Stephen Wolfram: So this is a typical thing in language design, that is, the things that I could do in SMP, there were lots of notations, particularly in the area of functional programming. There were lots of things which I could see were theoretically doable and which I could implement, but which people wouldn't understand, like the fold operation, for example. That operation, I could put it into the SMP in 1980 and nobody would understand it. Well, we had it in SMP, we've had it from the beginning in Wolfram Language, but today, that's something that because of ambient understanding of functional programming, one is at the point where that's something, "Oh yeah. One can understand that."
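For readers who have not met the fold operation mentioned above, here is a minimal Python sketch using the standard library's `functools.reduce`. The wrapper name `fold` and its argument order are choices made for this illustration; the last example builds a nested symbolic result out of tuples to show the shape of the computation.

```python
from functools import reduce


def fold(f, initial, items):
    """Combine items left to right, threading an accumulator through f."""
    return reduce(f, items, initial)


# Summing a list: the accumulator starts at 0.
total = fold(lambda acc, x: acc + x, 0, [1, 2, 3, 4])

# Building a number from its decimal digits.
number = fold(lambda acc, d: acc * 10 + d, 0, [4, 0, 0])

# Left symbolic, fold nests f around the accumulator one item at a time.
nested = fold(lambda acc, x: ("f", acc, x), "x0", ["a", "b", "c"])

print(total, number, nested)
```

The symbolic case makes the pattern visible: each step wraps the previous result and the next item in one more application of the function.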

22:57 Stephen Wolfram: And what I've seen in functional programming for example, is that we've gradually introduced more and more elaborate kinds of constructs that make use of more and more of these higher order function ideas. And it's an interesting thing because when one is doing language design, what one is trying to do is leverage what people already understand to let them do things that they can then get a computer to power assist them with. So the question is, what do they already understand?

23:23 Stephen Wolfram: And the biggest thing they already understand, which is something that has not been leveraged in programming languages, is natural language. People know 50,000 words, they know a bunch of concepts, they know things about how the world works. Were it not for natural language, there is no way we could build our computational language.

23:43 Charles Humble: Even down to the level of, for instance, naming a function.

23:46 Stephen Wolfram: I consider it the minimal version of poetry. How do you name a function? You have to name a function so that people will use the knowledge of natural language to have the right idea about how the function works. And maybe you get two words, maybe you get three for your micro poem, so to speak, about what the function does. Sometimes these things come easily, sometimes they take a disgustingly long time. There are functions that we haven't had because we don't know a name for them, and it's not useful. There's a lump of functionality that you want to put in, but in a sense, there's no point in putting it in.

24:17 Stephen Wolfram: If they see that name in a piece of computational language, they see the name and it's like, "I don't know what that is." You might as well write out an idiom in terms of things they do understand. There's no point in naming it, but if you do have a name for it, it's very worthwhile to have that be a built-in function in the language, because then when somebody sees that and they see it in a piece of code, they're like, "Oh, I get what that's doing. I have a cognitive picture of what's going on there." So it's this really interesting interplay between using what people already know, of which natural language is probably the biggest piece, knowledge of how the world works is another piece, and maybe 20th century physics is another piece, that we haven't really had a chance to make use of yet.

24:58 Why isn’t Wolfram Language a purely natural language system?

24:58 Charles Humble: I would like to talk a little bit more about the natural language aspects because I think the interplay between natural language and the programming language in the Wolfram Language is one of its more unusual aspects. And I was wondering why it isn't purely a natural language system.

25:14 Stephen Wolfram: I built Wolfram Language for a long time and Mathematica which is basically the same story, but intended for people doing mathematical kinds of things. Then back in the early 2000s, I started building Wolfram|Alpha. And the question was, could we use the same kind of design purity that we had used for the Wolfram Language for Wolfram|Alpha? Wolfram|Alpha was intended to be, anybody walks up to it, you ask it some random question in natural language, it gives you an answer.

25:40 Stephen Wolfram: I very quickly decided that it was an interesting experience for me that I would just throw away everything I knew about language design in designing Wolfram|Alpha, because in language design, everything was about, "Let's make it perfect. Let's make it all unified. Let's make it completely coherent. Let's do all the corner cases correctly." And with Wolfram|Alpha, it was you type in a random piece of natural language, it should do what you mean. If 50 Cent is the name of a rapper, but 51 cent is currency, then so be it.

26:10 Stephen Wolfram: In other words, just do what people expect. Natural language has evolved historically, and it's full of inconsistencies and completely crazy things. Even the pseudo natural language that includes some technical things has those kinds of imperfections. But so in building Wolfram|Alpha is very interesting experience because it was just a completely different design methodology. Before Wolfram|Alpha, I'd always been suspicious of heuristics of, "Oh, let's do this for this case, but that for that case." And it's all a bit fuzzy.

26:39 Stephen Wolfram: But Wolfram|Alpha, its natural language understanding system is a giant, at some level, it's all heuristics all the way down. The thing that was surprising about Wolfram|Alpha natural language understanding system was that people had tried to build natural language understanding systems for a long time and never been very successful at it. And I tried to use methods from computational linguistics, found them pretty useless. What I realized was the killer thing that we had that people hadn't had before is we had a target for the natural language, that is, we were just converting it to our symbolic language, and we had a lot of built-in knowledge about the world.

27:14 Stephen Wolfram: And those two things were what allowed us to actually make a successful, broad, natural language understanding system. I wouldn't have expected that, in the abstract, it was like, it's kind of an AI-ish problem, sort of general AI. It's not something that depends on having already built this elaborate computational language and having knowledge of the world, so to speak. Those wouldn't have been the things that would first come to mind in doing natural language understanding.

27:37 Stephen Wolfram: But anyway, we built up Wolfram|Alpha, and I thought for a long time, "Oh, that's a separate story from Wolfram Language, two different branches." And then I realized, what would happen if we brought these two things together? Some things are really well explained in natural language and really pretty awkward to explain in computational language. Like if you want New York City, you know you can just type in NYC. Now, that's heuristic because if you type NY, is that supposed to be New York, New York State? What is it?

28:03 Stephen Wolfram: So what we realized is, you just type Control-equals and NY, and it will bring up this thing that basically says, I think it probably defaults to New York City, I don't know, I'd have to try it to find out, but then when you press okay, more or less, what it's doing is it's turning that into a symbolic entity that is New York City, New York, United States, so to speak. So in other words, we're taking a fragment of natural language and we're embedding it in this program. Natural language is just the input method, but then it becomes a precise symbolic entity, which we can then deal with. I wasn't expecting that, and it turned out to be really super powerful.
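To make the idea in this answer concrete, here is a minimal Python sketch of a short natural language fragment being resolved into a precise symbolic entity. It is an illustration only, not how Wolfram|Alpha or the Wolfram Language does it: the `Entity` class, the `CANDIDATES` table, and the ranking that puts the city ahead of the state are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A canonical symbolic entity: a kind plus a fully qualified name."""
    kind: str
    canonical_name: tuple


# Hypothetical lookup table. Each free-form string maps to candidate
# interpretations, ordered from most to least likely (the heuristic part).
# The ordering for "ny" is an assumption, not a documented default.
CANDIDATES = {
    "nyc": [Entity("City", ("NewYork", "NewYork", "UnitedStates"))],
    "ny": [
        Entity("City", ("NewYork", "NewYork", "UnitedStates")),
        Entity("AdministrativeDivision", ("NewYork", "UnitedStates")),
    ],
}


def interpret(text):
    """Return every candidate entity for a fragment of natural language."""
    return CANDIDATES.get(text.strip().lower(), [])


def resolve(text, choice=0):
    """Pick one interpretation; by default the top-ranked candidate."""
    candidates = interpret(text)
    if not candidates:
        raise LookupError(f"no interpretation for {text!r}")
    return candidates[choice]


if __name__ == "__main__":
    print(resolve("NYC"))           # the single candidate for "NYC"
    print(resolve("NY"))            # top-ranked guess for the ambiguous input
    print(resolve("NY", choice=1))  # the user overrides the default
```

Once `resolve` has returned, the rest of a program works with the unambiguous `Entity` value and never sees the original string, which is the point being made about natural language acting only as the input method.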

28:40 Charles Humble: So given that, why not go all the way and use natural language as the way to communicate with your computer and tell it what to do?

28:48 Stephen Wolfram: Okay. Here's what we found: when you're dealing with short utterances, that works just fine, but as soon as it gets more complicated, it just falls apart completely. And I saw that very explicitly and rather nicely. I wrote this little book called "An Elementary Introduction to the Wolfram Language", which was originally intended for kids, although it turns out that it seems to be a good thing for lots of adults too. But in writing that book, I did something which almost was against my principles, which is to have exercises in the book.

29:12 Stephen Wolfram: And so the exercises basically say things like, "Write a program that does this." So in the early part of the book, the exercise is written in natural language. It's saying, "Write a program that does this." And in the early part of the book, it was easy to write the exercises. By the time I was getting later in the book, it was like, "This is pretty weird to specify what this program should be in natural language, I'm writing some bizarre piece of legalese basically to say what I want. This is not working."

29:41 Stephen Wolfram: "Oh, that's good. Because that's what I just spent my life building, was a computational language to actually be able to express these kinds of ideas in a way that you could build a giant tower that involves millions of lines of code, so to speak, rather than just the short utterance." So the thing that I found the most powerful, again, I wasn't expecting this, is this mixture of the embedded natural language short utterances that turn into a precise symbolic representation that then get embedded in big pieces of code that are represented in terms of computational language.

30:12 Stephen Wolfram: Now, one of the big things that I also have noticed is what ambient knowledge of computation is there and how can one leverage that in building a computational language. And so one of the things we've been working towards is computational contracts, where because we now have a computational language that can express things about the world, instead of writing legalese, we can write computational language because we can actually express things.

30:34 Stephen Wolfram: If the price of gold is bigger than this and it was raining yesterday in wherever, and this IoT sensor says this, then do this type thing. Or if this machine learning image identification thing says the banana is ripe, then do whatever. And so computational contracts is one of those inexorable things that will eventually be widely used, and as people get familiar with reading those things, there'll be a whole other level of what's possible to do in computational language, because there'll be a whole other set of things that people are routinely exposed to, routinely understand.

31:08 Stephen Wolfram: For me, it's very interesting, this process of what people ambiently understand. If you go back 100 years and you talk about universal computation, nobody would understand it. It wasn't a concept. If you explain it to kids now, it's like, "Oh, you can program it? You don't have to change computers to change what it does?" It's commonplace, everybody ambiently understands that. Same thing with graphs and hyperlinks on the web, there's a certain degree of built-in understanding of what a graph is that you get from having followed a bunch of hyperlinks on the web.
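The kind of computational contract described here can be sketched as a list of clauses, each pairing a condition over facts about the world with an action. The Python below is an illustrative assumption, not a Wolfram Language feature: the `Clause` type, the fact names, the action names, and the gold price threshold of 1800 are all made up for the example, and in a real system the facts would come from data feeds, sensors, or an image classifier.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Clause:
    """One contract clause: a readable description, a test, and an action name."""
    description: str
    condition: Callable[[Dict], bool]
    action: str


def evaluate(clauses: List[Clause], facts: Dict) -> List[str]:
    """Return the actions of every clause whose condition holds for the facts."""
    return [clause.action for clause in clauses if clause.condition(facts)]


# Hypothetical contract; the threshold and names are arbitrary illustrations.
contract = [
    Clause(
        "gold price above threshold and it rained yesterday",
        lambda f: f["gold_price"] > 1800 and f["rained_yesterday"],
        "release_payment",
    ),
    Clause(
        "image classifier says the banana is ripe",
        lambda f: f["banana_label"] == "ripe",
        "ship_bananas",
    ),
]

if __name__ == "__main__":
    facts = {"gold_price": 1900, "rained_yesterday": True, "banana_label": "unripe"}
    print(evaluate(contract, facts))
```

The design choice worth noticing is that every condition is an executable expression over named facts, so the contract can be checked mechanically instead of being interpreted from legalese.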

31:37 Stephen Wolfram: And it's how one leverages these kinds of things that are part of a common experience, that's part of the story of doing language design.

31:44 How do you think about language, library and data?

31:44 Charles Humble: I'm curious about how you think about language and library and data. In the Wolfram Language, there seems to be much less of a distinction between the three. So you might be working with a function from the function repository and putting in data from the Wolfram Cloud, and there's a blurring of the boundaries, which seems to be part of the nature of the Wolfram Language. Do you think that's intrinsic to the nature of symbolic language?

32:09 Stephen Wolfram: Well, part of it, it's the story of computational language, because what we've done is in Wolfram Language, there is a built-in, well-designed integrated representation of images, videos, audio, graphs, optimization problems, geographic data, all these kinds of things. Those are built in features of the language, which have been the story of my life to try and design in a coherent way. And there are 6,000 built-in functions that represent all these different kinds of things in the world. And that's the main thing people use.

32:42 Stephen Wolfram: Now, we recently introduced this function repository, which is a way of adding functionality on top of that. Here's what's interesting about that, and again, I hadn't really expected it in terms of design. So with libraries, the typical programming language, different story from computational language, typical programming language, it's a small core programming language, and then there's layers and layers of libraries that people use. And there's often a lot of, "Oh my gosh, I've got an incompatible set of libraries." "Oh, this blah, blah, blah, blah."

33:09 Stephen Wolfram: It is all a bit complicated and there's no guarantee that there's any coherence of design between these different pieces of libraries. So the question is, given that we have this vastly higher-level platform to start with where we have a built-in representation of, let's say, images or something, what can one then do? And what I realized is if you look at a lot of libraries out there, you'll find, well, there's this one or two really important functions in that library that do something really cool. And then there's 50 support functions that deal with the fact that, "Oh, there isn't a standard representation of audio and you have to deal with audio import, you have to do this and you have to do that," and they fail.

33:42 Stephen Wolfram: And so I realized that we're in a different situation because the vast majority of what people do is going to be purely within the Wolfram Language. And if I did my job correctly, the thing that you do when you program some random microcontroller, and we have a whole system for doing microcontroller programming in Wolfram Language, that's going to be compatible with the things that we do in machine learning, for example. It's something where if I did my job correctly, these are all coherently designed and you can take the machine learning output and feed it into this other thing, and you can use machine learning to do this or that thing.

34:17 Stephen Wolfram: And you can feed into your machine learning thing, this weird piece of data, that's an audio combined with a this, combined with a that, because all these things have been coherently designed. It's a huge amount of effort to do that coherent design. Given that you've done that, then that one key function that you would have put in this big library is just one function, and you just have to build it on top of this tall platform of other kinds of things. So it's a different experience in terms of the library story now.

34:45 Stephen Wolfram: The function repository is quite new, and, interesting story, we had a precursor of it back in 1989 and it didn't work very well. And it didn't work well because it wasn't well collimated. In other words, it was like, "Well, just put in anything." Wolfram Language is incredibly extensible, so people write all kinds of things. You can redefine the plus operation, you can do whatever you want. And it turns out what works much better is to say, "There is this huge platform, you want to add this one particular piece of functionality, it's going to stick its neck out as one or two functions, just put it in the function repository. It has a standardized way of being documented, and so on." And a standardized way of being used.

35:21 Stephen Wolfram: Now, the thing we haven't yet seen, and it's going to be an interesting piece of language design, is the following track. So it's these different levels of coherence, we see it with data, for example. You have a piece of data, it's in a spreadsheet, you can read it in, then the question is, how do you make that data computable? How do you take those things which actually represent cities in a spreadsheet and make them actual canonical entities that you can do things with? Actually, a new part of that story, if you follow the leading edge of Excel development, you'll see that Microsoft is integrating a bunch of our stuff for doing those kinds of things within Excel as a built-in feature there.

35:56 Charles Humble: That's very cool. I didn't know that.

35:56 Stephen Wolfram: It's a coming attraction. It's out there for test users, but it's not yet fully being talked about, but it's a Wolfram Language play inside Excel.

36:04 How does using Wolfram in Excel compare to what you can do with Siri and Alexa?

36:04 Charles Humble: How does using Wolfram inside of Excel compare to say what you can do with Siri and Alexa?

36:10 Stephen Wolfram: Siri and Alexa use our natural language understanding system, but they're not exposing the actual sort of something where you're actually getting the symbolic entities, but that's something you can do in something like a spreadsheet environment.

36:23 How do you make this data more computable?

36:23 Charles Humble: So as you, as it were, kind of go up the chain and try and make this data more and more computable, what does that look like?

36:31 Stephen Wolfram: For us, it's really a painful thing because I identified at some point, there's 10 stages of making data computable. And the trouble is that at each stage, there's just a lot more work to do. For example, when we put data into Wolfram|Alpha, let's say, we're putting in data on mountains or lakes or something. A typical way somebody will access Wolfram|Alpha is say, "I want to know about Mount such and such." "Okay. Great. We have data on Mount such and such, we're all good."

36:56 Stephen Wolfram: But if we want to put data on mountains in Wolfram Language, we perfectly well know that the use case is, somebody's going to write a program that looks at mountain entities, and they're going to say, "Some geo region defined by some polygon, give me all the mountains inside this polygon." And the level of curation and quality of the data to support that has to be much higher than just, "I've got a mountain, it's got a name, tell me about it." And so what we find, and this is an interesting design process, is as we get it to the point where it is part of our core permanent language, it takes a lot more work to do that.
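A rough sketch of the "give me all the mountains inside this polygon" use case shows why it demands better curated data than a lookup by name. The Python below is an assumption-laden illustration: the mountain records and coordinates are invented, positions are treated as flat (x, y) pairs, and the standard ray casting test stands in for whatever geographic computation a real system would use.

```python
def point_in_polygon(point, polygon):
    """Ray casting test: is the (x, y) point strictly inside the polygon?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Invented records. "Mount D" has a name but no curated position.
MOUNTAINS = [
    {"name": "Mount A", "position": (1.0, 1.0)},
    {"name": "Mount B", "position": (5.0, 5.0)},
    {"name": "Mount C", "position": (2.5, 0.5)},
    {"name": "Mount D", "position": None},
]


def mountains_in_region(mountains, polygon):
    """Names of mountains whose known position lies inside the polygon."""
    return [
        m["name"]
        for m in mountains
        if m["position"] is not None and point_in_polygon(m["position"], polygon)
    ]


if __name__ == "__main__":
    region = [(0, 0), (3, 0), (3, 3), (0, 3)]
    print(mountains_in_region(MOUNTAINS, region))
```

A query by name would still find "Mount D", but the bulk query silently drops it because its position was never curated, which is the gap in data quality being described.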

37:28 Stephen Wolfram: And so it's a continuing issue, I think about it quite a bit, of how do we get something where we can get the, oh, it's a simple thing and we can get it done quickly in a lightweight way versus how do we have something which is a core permanent feature of the language. One feature of our language, which I'm quite proud of, is you can take a piece of Wolfram Language code that was written in 1988, and it will run today with exceptionally high probability.

37:52 How do you maintain backward compatibility in Wolfram Language?

37:52 Charles Humble: That's actually fairly remarkable. It's a constant debate for language designers about how much backwards compatibility to maintain, and it's not a thing that happens accidentally.

38:04 Stephen Wolfram: Right. In this physics project I've been doing, I did a lot of the early work on that in the early 1990s and I have the notebooks I wrote, in fact, they're even up on the web now, and it's really cool. You just read them in and they run.

38:15 Charles Humble: And how was that achieved?

38:15 Stephen Wolfram: Partly it was achieved by not making too many mistakes, and the fact that I built a language before allowed me to have already made a bunch of the mistakes I might've made. But the other thing, which is an interesting language design process, is the following thing. So let's say you have an area which you think is kind of cruddy, and you think, "Ah, we didn't do it quite right." So then what do you do about it? So what I've learnt over the years is that the thing to do is the following, just figure out what the correct design is, figure out where you're going.

38:42 Stephen Wolfram: And then you might say, "Oh my God, we'll never be able to get there. We've got this other thing that works this way, we'll never be able to get to this new design." It takes some cleverness, but it turns out you can essentially always find a bridge.

38:54 Charles Humble: Can you give me an example where you've done that?

38:57 Stephen Wolfram: I'll give you an example of a design mistake. The function Set, which assigns X equals seven, is Set[x, 7]. You can write it as x = 7. The fact that we used up the word Set to represent that was a mistake. In other words, because nobody will ever type it, people will always just type an equal sign. You might as well call that "assign value to variable" and nobody would know the difference, so to speak. And that was a place where we used up a good word. And so in modern times, when we're doing design where we're not quite sure, where we have a concept we think might be generalizable, but we're not sure it's generalizable yet.

39:37 Stephen Wolfram: Don't use the general word yet, use some compound thing, and maybe at some point in the future, there will be a general case that is just the uncompounded word, so to speak. And I don't know, let's say it was audio annotate, and we have a general notion of annotation, which actually we do, but we have a calculus of annotations of things. But I think that may have started life in audio annotate, which was a defensively designed thing because we knew there was a more general concept, but we weren't ready for it yet.

40:06 Stephen Wolfram: And we knew we would get it wrong if we tried to do the design now, as opposed to five years later or something when we had more experience with it. And so it's things like that that help you achieve this compatibility, but it's also this chassis of transformation rules for symbolic expressions. I don't know whether it's good luck or good judgment, but it has worked and keeps on working. And it's an interesting thing, so I think about this a lot: how would we go beyond that? This has been like, "Gosh, haven't you had a new idea in 100 years of the development of mathematical logic? Don't we have a new idea yet?"

40:39 Stephen Wolfram: And it turns out, I will say, that we do have a new idea, and actually, when you think about something like combinators, or something like term rewriting systems, you're usually thinking about tree-structured expressions. And you're saying, "How do I take the tree-structured expression and rewrite pieces of it? What's the generalization of that?" Somewhat amusingly, the generalization is precisely what we've done in our physics project. The generalization is instead of rewriting trees, you're rewriting hypergraphs, and that's precisely what happens in our physics project.
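For readers who have not met term rewriting, the "take a tree-structured expression and rewrite pieces of it" idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not the Wolfram Language evaluator: expressions are nested tuples with the head first, pattern variables are strings ending in an underscore, and the two simplification rules are hypothetical examples.

```python
def is_var(pattern):
    """Pattern variables are strings ending in an underscore, such as 'x_'."""
    return isinstance(pattern, str) and pattern.endswith("_")


def match(pattern, expr, bindings):
    """Return extended bindings if expr matches pattern, otherwise None."""
    if is_var(pattern):
        if pattern in bindings:
            return bindings if bindings[pattern] == expr else None
        return {**bindings, pattern: expr}
    if isinstance(pattern, tuple) and isinstance(expr, tuple):
        if len(pattern) != len(expr):
            return None
        for p, e in zip(pattern, expr):
            bindings = match(p, e, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == expr else None


def substitute(template, bindings):
    """Fill the pattern variables of a right-hand side with their bindings."""
    if is_var(template):
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template


def rewrite_once(expr, rules):
    """Rewrite the first matching subtree: root first, then children in order."""
    for lhs, rhs in rules:
        bindings = match(lhs, expr, {})
        if bindings is not None:
            return substitute(rhs, bindings), True
    if isinstance(expr, tuple):
        for i, child in enumerate(expr):
            new_child, changed = rewrite_once(child, rules)
            if changed:
                return expr[:i] + (new_child,) + expr[i + 1:], True
    return expr, False


def normalize(expr, rules, max_steps=100):
    """Apply rules repeatedly until nothing changes (a fixed point)."""
    for _ in range(max_steps):
        expr, changed = rewrite_once(expr, rules)
        if not changed:
            return expr
    raise RuntimeError("no fixed point within max_steps")


# Hypothetical rules: x + 0 -> x and x * 1 -> x.
RULES = [
    (("Plus", "x_", 0), "x_"),
    (("Times", "x_", 1), "x_"),
]

if __name__ == "__main__":
    print(normalize(("Plus", ("Times", "a", 1), 0), RULES))
```

Applying rules until nothing changes is a small-scale version of evaluating to a fixed point; the hypergraph generalization mentioned in the answer replaces the tree with a collection of relations and matches rules against that instead.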

41:08 Stephen Wolfram: And so then the question is, can you make a programming system that's based on rewriting hypergraphs instead of rewriting trees. It turns out that I tried to do that back in the 1980s, I was thinking about parallel computation, I tried to invent such a thing, I failed. And I've thought about it many times since, and here's where you run into trouble. The place you run into trouble is not in the mathematical definition of how to do things, the place where you run into trouble is us poor humans don't think very well in terms of transformations on hypergraphs just like you show somebody a big pile of combinators, SK Combinators, it's like, "What on earth is this?"
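As a rough picture of what rewriting hypergraphs instead of trees means, the Python sketch below stores a hypergraph as a list of hyperedges (tuples of node numbers) and applies one rule to every hyperedge at each step. The rule used, {{x, y, z}} -> {{x, y, w}, {w, z, x}} with w a fresh node, is a hypothetical choice for illustration and is not claimed to be a rule from the physics project.

```python
from itertools import count


def step(edges, fresh):
    """Replace each hyperedge (x, y, z) with (x, y, w) and (w, z, x), w fresh."""
    new_edges = []
    for x, y, z in edges:
        w = next(fresh)  # a node that has not appeared anywhere before
        new_edges.append((x, y, w))
        new_edges.append((w, z, x))
    return new_edges


def evolve(edges, steps):
    """Apply the rule to all hyperedges, repeatedly, for a number of steps."""
    fresh = count(max(node for edge in edges for node in edge) + 1)
    for _ in range(steps):
        edges = step(edges, fresh)
    return edges


if __name__ == "__main__":
    for n in range(4):
        print(n, evolve([(0, 1, 2)], n))
```

Even for this tiny rule the printed lists of tuples are hard to read by eye, which is a small taste of the human comprehension problem raised in the answer.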

41:43 Stephen Wolfram: As I said, I was just looking this up, it's coming up to the centenary of combinators. I was looking at, what have we not done that was in the idea of combinators. And the one thing we've not done is no variable names. So in other words, we have plenty of cases where we've got Lambdas, but we typically still have names of variables. Now, in the Wolfram Language, there's a hash sign which represents, say, an anonymous, anonymous function. It's interesting, that is like a pronoun. If I say, "Jane chased Bob and she ran fast," we know what that means because we've got a pronoun that has a referent that goes to that thing.

42:17 Stephen Wolfram: But if we say, "The dog chased the cat, it ran fast." It's like we lost it. We couldn't do that. And in Wolfram Language, for example, that still comes up as it does in Lambdas. In Lambda calculus, you've named the thing, and in Wolfram Language, we have these anonymous, anonymous functions. So you have just a hash sign, you can only use that at one level, just like you can only use pronouns at one level, so to speak. And similarly, if you have two Xs on two nested Lambdas, that doesn't work either. And the question is, can you get around that.

42:49 Stephen Wolfram: And old Moses Schönfinkel, interesting chap, I'm trying to track down his history, so little is known about him. He figured out a solution to that. He figured out that you could have the symbolic structure that essentially moves the data around and just specifies, symbolically, how you put this data into that argument position and so on. Unfortunately, when we as humans look at it, it's just frigging incomprehensible. And I can make these graphs, I try and make these representations of things, it's something that for whatever reason, at this time in history, at least, we don't seem to be able to wrap our heads around.
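The variable-free style attributed to Schönfinkel can be tried out with a very small reducer for the S and K combinators, using the standard rules K x y -> x and S x y z -> (x z) (y z). The Python below is a minimal sketch: representing application as a pair `(f, x)` and the helper names `app`, `step`, and `normalize` are choices made for this example.

```python
S, K = "S", "K"


def app(*terms):
    """Left-associative application: app(f, a, b) means ((f a) b)."""
    result = terms[0]
    for term in terms[1:]:
        result = (result, term)
    return result


def step(term):
    """Do one leftmost-outermost reduction; return (new_term, changed)."""
    if isinstance(term, tuple):
        f, z = term
        # K x y -> x
        if isinstance(f, tuple) and f[0] == K:
            return f[1], True
        # S x y z -> (x z) (y z)
        if isinstance(f, tuple) and isinstance(f[0], tuple) and f[0][0] == S:
            x, y = f[0][1], f[1]
            return ((x, z), (y, z)), True
        new_f, changed = step(f)
        if changed:
            return (new_f, z), True
        new_z, changed = step(z)
        if changed:
            return (f, new_z), True
    return term, False


def normalize(term, max_steps=1000):
    """Reduce until no rule applies, giving up after max_steps."""
    for _ in range(max_steps):
        term, changed = step(term)
        if not changed:
            return term
    raise RuntimeError("no normal form within max_steps")


# The identity function written with no variable names at all: S K K.
IDENTITY = app(S, K, K)

if __name__ == "__main__":
    print(normalize(app(IDENTITY, "a")))
    print(normalize(app(K, "a", "b")))
```

`S K K` behaves as the identity without ever naming its argument, which is the data-routing trick described in the answer; larger combinator expressions become unreadable very quickly, which is the difficulty being pointed out.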

43:22 Stephen Wolfram: And this programming with hypergraphs thing, which I think is going to be part of the story of distributed computing, of being able to understand how to do this distributed computing thing, my best hope is that using the language we get and the understanding we get from physics, we will be able to make some progress on that.

43:40 Charles Humble: What do you think the limitation is there?

43:42 Stephen Wolfram: There is a limitation of our patterns of thinking. Can we make a 100-year plan, so to speak, where we gradually introduce people to more and more of these kinds of concepts, to the point where in 100 years, people will be able to program this way, because that will have been a concept that has been introduced, people become familiar with it, then you can keep building on it. And eventually, I don't think we'll all be writing SK Combinators. And you and I won't be around in 100 years.

44:11 Stephen Wolfram: Wolfram Language has been around for 33 years, and my general direction of symbolic computational languages has been around for 40 years. That's a large fraction of the history, that's well more than half of the time programming languages have been around. And it is interesting to me the extent to which things have not moved. I think to me, the thing that is both nice and frustrating is that these ideas are really the right ideas for a lot of kinds of things.

44:39 Stephen Wolfram: The world isn't quite there yet. There are plenty of people who use Wolfram Language and do these magic things with it, but your average, I'm just going to write some random piece of code person, isn't using it yet. And eventually, they will, or they'll use some ripoff of it or something, but I don't know what those timescales are. And I was trying to think about it and I realized, "Oh, my gosh, those timescales might be 50 years, 100 years. I don't know." I can see what the progress has been in 40 years, and there's been progress.

45:07 Stephen Wolfram: There's millions of people that use our language and things, including many people who do it in a very sophisticated way, but it's still shocking in a sense how little progress there has been in the generic programming language idea. There's been a bunch of progress in understanding how to build large systems, so there is progress, but it's slow, it's on timescales of decades or quarter to half centuries and so on. I realized this recently: we had an annual technology conference and I talked about our technology stack and I was describing some of the things that we're doing as artifacts from the future.

45:37 Stephen Wolfram: And what I found amusing was a bunch of our users came up to me and said, "You know what? That's a great description of what I do. I'm in some field and I do these things and it's like whenever things get really complicated, people come to me because I can do these things which in principle should be possible, but people just think that's impossible. That's something we can't do yet, but yet, they're just okay, it's a few lines of Wolfram Language code, we can do it." Oh my gosh, that's a magic thing that seems like an artifact from the future.

46:05 Stephen Wolfram: Now, as a practical matter, as one leads one's life and runs one's company and so on, building artifacts from the future is not necessarily a great business strategy. For me, it's a very satisfying intellectual strategy, and for the people who understand what to do and get that superpower of using it, it's a really worthwhile thing. I'm lucky that I've built a company that's a private company, which is independent and doesn't really have to report quarterly earnings and so on. So we get to think about things on a long-term basis, but "we're building artifacts from the future" is not one of those top IPO prospectus headlines, so to speak.

46:39 You've recently started live streaming your language design sessions on Twitch. What do you get out of that?

46:39 Charles Humble: For sure. We are unfortunately getting awfully close to the end of our time, but I'd love to throw in one more quick question, if I may, which is, you've recently started live streaming your language design sessions on Twitch, and I was interested as to why and what you got out of that?

46:55 Stephen Wolfram: Right. I've done like 500 hours of those now, so I've got more than a toe in that particular water. The original reason was, I just thought, "These discussions were really interesting and I thought people will find these interesting, why not share them?" And what I discovered is that we get a lot of interesting people to tune in and they make suggestions in real time. And the suggestions are pretty interesting, and there are a few features in our language now that were suggested by somebody on live stream. To my knowledge, this is just not something you can see anywhere else.

47:27 Stephen Wolfram: I'd be curious, I'd be interested to see it actually, because I don't know what other people's processes are. And I think it also helps that what we do is pretty intellectual language design. Some parts of it are in the details of, we're designing some crazy feature about cloud connectivity or something that's pretty in the weeds of software engineering, but a lot of it is, "Oh, we're designing something with how biological sequences should be represented." That's one recent thing. And that's something that ends up being just intellectually interesting.

47:53 Stephen Wolfram: And I suppose for me, it probably feels a little bit more meaningful to be spending my time doing language design when it's live-streamed than when it isn't. I had thought people would attack my ideas less if we were live streaming it, but it isn't true. There are people who just viciously and vigorously attack things, which is as it should be, and nobody seems very inhibited by the live streaming aspect of it. And I think that it's been a terrific thing. Generally, we've been live streaming a decent fraction, whenever it isn't just so frigging obscure that nobody's going to care.

48:25 Stephen Wolfram: The things we don't live stream, we could, but we don't, is a lot of things to do with the progress of projects, where it's like, "We've got to this point, what's going to happen next?" And so on. People who are interested in project management might find that interesting, those do involve micro-design decisions. And occasionally, there are things where I'm like, "I should've live streamed this, this would have been fun. I was totally frustrated." We have a very good DevOps team and so on, but there was some issue where our cloud is somehow down and I'm like, "Have you guys fixed this?"

48:53 Stephen Wolfram: "Well, no, we can't figure it out. We're confused. We don't know what's happening." And it's like, "Okay, let me see whether I can solve this." Sort of bad CEO behavior, at some level, it turns out to be good CEO behavior in the end. And I'm like, "Okay, let's get a Zoom session going, let me see what's going on. Let's log into all these machines, let's try and analyze what's happening." And actually in the end, it was a wonderful example of using Wolfram Language because it was some issue with an upstream internet provider passing some kinds of traffic to some peering providers and not to others and so on.

49:24 Stephen Wolfram: And it was a great example. If you try and eyeball it, you don't solve the problem, but if you systematically take the data and you start making visualizations and so on, gosh, that's what's happening. Okay. Call the ISP, tell them, "Go fix this thing." And it was fixed in another 10 minutes or something. I had no idea what the outcome would be in that case, but that was one where it's a shame, we didn't live stream that one. That one would have been interesting because people would have been like, "Oh, you should try this, you should try that."

49:48 Stephen Wolfram: But the process of live streaming design reviews, I think it's been very successful. And I think it's also fun for our users when they see features coming out and new versions, it's like, "I saw that being designed. I had my chance to say that that was a bad idea. I didn't say it." Also, another point is that people understand why we made certain decisions and we are going to start linking, I think the new version is just about to come out, we'll be linking the actual design discussion to the documentation pages about different features.

50:15 Stephen Wolfram: So people say, "What were those guys thinking? This is just a dumb way to do it." They can go watch the three hours where that actually got designed.

50:22 Charles Humble: Stephen, that's absolutely wonderful. Thank you so much. I could carry on talking to you for hours. This has been a fascinating conversation. So thank you so much for joining us this week on The InfoQ Podcast.
