
Damian Conway on the State of Perl, Perl 6, Writing Parsers and DSLs

Interview with Damian Conway by Werner Schuster on Jan 02, 2013 | 51:24

Bio Damian Conway is well known in the Perl community and has worked on Perl 6 for many years; he's a speaker and teacher, author of several technical books and Perl software modules, and runs an international IT training company, Thoughtstream, which provides programmer training from beginner to masterclass level in Europe, North America, and Australasia. His website is: http://damian.conway.org

GOTO Aarhus is the enterprise software development conference designed for team leads, architects, and project management and is organized by developers, for developers. Important topics are presented by the leaders in our community. The conference concept is to present the latest developments as they become relevant and interesting for the software development community.

   

1. We’re at GOTO Aarhus 2012, I’m sitting here with Damian Conway. Damian, who are you?

I’m basically a mouth for hire. I’m a guy that has spent the last 10 years working mainly in the Perl community, teaching and giving keynotes and talking and just coming up with ideas, designing parts of the new Perl 6 language for example. Prior to that, I was an academic at one of the largest Australian universities, where for 10 years I taught computer science. Before that I was a PhD student; before that I was learning my craft. So that’s basically who I am: I’m a hacker turned teacher.

   

2. So what do you teach – Perl or other things?

I mainly still teach Perl, I teach a lot of Perl to a lot of people, but the other big thing that I teach a lot is presentation skills. Having earned my living exclusively by speaking for the last decade, I’ve learned a fair bit about how to give a good presentation, and a lot of people need that knowledge. I mean we’ve all been to presentations where we wished they'd had that knowledge, so one of the big things that I’m doing now is branching out into teaching people how to give especially good technical presentations. So those are the two areas that I’m mostly teaching.

   

3. So when you teach people about technical presentations, what is the most important tip that you can give people or the main problem you see in people?

The main problem I see in the way that people give presentations is that their preparation is all wrong. Most people are told: “You have to give a talk about this or you need to talk about this”, and they sit down and think: “What do I want to say?”, and that is the wrong thing. What they need to do is sit down and think: “What could I possibly say?” They need to go through in their own mind all the things that they could say about the particular topic they are talking about, and then they need to choose from that list of things what they are actually going to say. The problem is, you sit down and you say: “I’m going to think for the next fifteen minutes about what I should say in my talk”.

Well, what you are going to get is kind of a random sampling of your brain about that topic. If you give yourself an hour or two to sit down and rehearse everything that you know about that topic, kind of just explore your own mental framework, then you can zone in on the things that are actually important rather than the things that were important to you in the fifteen minutes that you devoted to thinking about it. So I guess that I’d say that most people don’t spend enough time planning what they are going to say and then they don’t spend enough time preparing it, and they don’t spend enough time rehearsing it and it shows. Ultimately it’s just a matter of we don’t take presentations seriously enough, we don’t devote the amount of time that we need to. I generally say in my classes: “If you are going to give a one hour presentation you need to do at least twenty hours of preparation for that” and very few people do that I find.

   

4. Your other business or interest is Perl. Now we are here on InfoQ which is an enterprise site, and many of our readers or of our audience will say “What is Perl, I never heard of that, I’m too young” or “Isn’t Perl dead?” - What do you answer to that?

I think they are perfectly right that they may never have heard about it, because they are too young and they went through their college degree or whatever they did and Perl wasn’t taught to them. What was taught to them was Java followed by JavaScript, and the problem is that you could say the same thing: “I’ve never heard of Clojure, I’ve never heard of Groovy, I’ve never heard of Ruby, I’ve never heard of these other languages”. I think this is a real problem. Back in the day people had to be masters of many different languages because each of them did one thing fairly well.

Nowadays most languages do most things reasonably well so you can get away with only learning one or two and not have to know the rest. For those who are saying: “I remember Perl, I was doing Perl back in 1995, surely that’s died out now”: no, it didn’t. Perl did go quiet for some time; there was a 6 or 7 year period where there were no major releases of Perl. It was Perl 5.8, and it was Perl 5.8 in 2002 and it was Perl 5.8 in 2007, and that often gives the impression that the language died or is not under development. But since about 2007, when Perl 5.10 first came out, we’ve seen an enormous acceleration in the development and the extension of the Perl language. I think that what we haven’t done in the Perl community is market that terribly well, we haven’t got the message out, but no, Perl wasn’t stagnating, it wasn’t dying, it was in a chrysalis, it was preparing to emerge in the next stage of its evolution.

And what we have been doing, both in Perl 5 and in the new Perl 6 language that I’ve also worked on over the last decade, has been to take stock and look back at our own fairly long history. Perl’s been around for 25 years now. We looked at what people have done, what people have liked, what hasn’t worked for people, looked at what other languages have been doing and where they’ve been eating our lunch in some respects, and moved the language forward in ways that meet the needs of modern dynamic language programmers. So, 7 or 8 years ago you might have said Perl looks like it’s on the way out; now I would say that Perl is experiencing a renaissance, a real rebirth and reimagining of the language, and we are seeing lots of useful new features going in, lots of powerful tools. The other thing about Perl of course is that we have the CPAN, which is our repository of donated software, basically open source software. That is now topping 10 GB of source code.

There are not very many other languages that have that amount of libraries available to them. Even Java I think is struggling to have quite that much, it might be close to that, but the thing about this library is that it’s freely available and it’s universally tested: everything that goes on the CPAN has a test suite associated with it. So we have not just a still lively and powerful language but an amazing toolkit. Some people come back to Perl not because they particularly like the language, and we accept that some people don’t like the language, but because their preferred alternative, be it Python, Ruby or JavaScript or whatever it happens to be, doesn’t have the tools that they need to do a particular job, and the CPAN has a lot of tools.

   

5. You mentioned that since the release of Perl 5.10 the language was speeding up again, was getting more interest, but the question is of course there is always the spectre of Perl 6, so what is Perl nowadays? Is it Perl 5, is it Perl 6, is there something in between? Do features get added to Perl 5, what is the stage there?

That is an excellent question, and this represents probably the worst marketing mistake that the Perl community made in the last decade. Perl 6 is not the next version of Perl, and that is confusing because Perl 5 was the version after Perl 4 and Perl did work that way for some time. Initially we thought that was what it was going to be, we thought this is what we are going to migrate Perl to, but what Perl 6 ended up being was in fact like a sister language to Perl 5. In the same way, you wouldn’t really say that C++ was the successor of C, because if it had been then no one would be doing C coding anymore and everyone would be doing C++. It doesn’t work that way; it’s just another choice with a different set of tools, a different set of strengths and frankly a different set of weaknesses.

Perl 6 is the same sort of thing. We started off thinking about the successor of the language, but what we realized was that we needed a parallel language, a companion language if you like, that had a different optimization for different kinds of tasks. So we wanted to look at, for example, providing strict typing in the language, dynamic typing but still strict. The problem with that is that there was really no way of retrofitting that on to Perl 5. We don’t want to lose Perl 5 because, like I said, we’ve got 10 GB of software out there, we’ve got hundreds of thousands of users who have mission critical software written in Perl, but we want to give them another option, we want to give them a way of going forward with some 21st century tools. So Perl 5 is really good for procedural programming, you can do it very quickly, get something up and running and working quickly in Perl 5. If however you want to do functional programming or very hardcore object oriented programming, Perl 5 really doesn’t have the tools for that.

So we wanted to add those features into a language, but if we’d done it to Perl 5, we would have been creating Frankenstein’s monster – “Let’s bolt OO on this side and functional on this side” – and it doesn’t really work. You need to rethink the entire language, so that it’s object oriented and functional all the way down, and I guess a lot of other object oriented languages really failed to do that. In almost every other object oriented language you look at, at some level you stop having objects and you get to native types, and you have this discontinuity where you can do all of these clever things with all the object types, or reference types as they’re often called in other languages, but as soon as you get down to the basic types, you can’t do these things anymore, and we didn’t want that. If we were going to do this, we wanted to take a leaf out of Smalltalk’s book and say: “Right, it’s objects all the way down”.

And that gives you enormous power, because it means that a function in Perl 6 is an object, so you can pass a function around just as easily as you can pass an object. You can call methods on a function, not call the function itself, and say: “All right then, what is my return value or what are my arguments?”, or I can use a function to create higher order functions where I pass other functions into a function. So from that point of view, where is Perl now? Perl 5 is an active language in active development which is pulling ideas back from Perl 6 where they are appropriate and compatible. Perl 6 is a new language which is on the verge of being production ready; in the next twelve months or so we can probably start using it. It’s already released, we’ve had probably 35 or 36 releases of various Perl 6 compilers, it runs quite well, you saw me demonstrating Perl 6 code just the other day in the keynote. It’s not as fast as Perl 5, but depending on what you use it for, it might be faster than that. It’s not a production ready language yet, but a newly emerging alternative, if you like, to Perl 5 for when you need the kind of features that it is optimized for instead: a sister language.
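The idea of a function being an object that you can question and pass around can be sketched in Python (an analogy only; this is not Perl 6 and not code from the interview). The names `add` and `compose` are made up for illustration.

```python
import inspect
from typing import Callable


def add(x: int, y: int) -> int:
    """An ordinary function; in Python it is also an object."""
    return x + y


def compose(outer: Callable, inner: Callable) -> Callable:
    """Higher-order function: takes two functions and returns a new one."""
    def composed(*args, **kwargs):
        return outer(inner(*args, **kwargs))
    return composed


# Ask the function object about itself instead of calling it.
sig = inspect.signature(add)
param_names = list(sig.parameters)    # names of the declared parameters
return_type = sig.return_annotation   # the declared return type

# Pass functions around like any other value.
add_then_double = compose(lambda n: n * 2, add)
result = add_then_double(3, 4)        # add runs first, then the doubling
```

Here `inspect.signature` plays the part of "calling methods on the function" to learn its arguments and return type, and `compose` is a small higher-order function.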

Werner: I like the comparison of C and C++, that makes it very clear to me what I can use for what.

That comparison works in other languages as well. Originally, for example, Python 3 was supposed to be the replacement for Python 2; so far that doesn’t seem to have happened. The Python 2 community is just as strong and there is still development happening there. Python 3 is developing as well, but it’s currently looking like they might end up being sister languages as well. Maybe eventually they will move across to 3, but from my outsider’s viewpoint that doesn’t seem to be happening. So I think this happens in a lot of different languages, where we get to the point where we have a viable version of the language like C, like Python 2, like Perl 5, and we want to do something new with it, something better, we want to remedy some mistakes that were made in the original design, so we create a new language on the assumption that everyone will migrate to it, and it doesn’t happen.

You look at this and you say: “There is still a lot of COBOL code, there is still a lot of Fortran code around. Weren't there languages that were supposed to take over from those? Yes, but they don’t always do so.” If something is really good in its niche, it continues to be good in its niche, no matter what the alternatives are. Sharks in the ocean haven’t changed their design for millions of years, even though other creatures have grown up around them, because what they do, they do incredibly well. So I think for Perl, Perl 5 does what it does incredibly well, but there are things that it doesn’t do incredibly well and that is what Perl 6 is going to be for. And the same thing is true for C and C++, Python 2, Python 3, Modula-2, Modula-3, you can keep on listing them.

Werner: Ruby I think is in a similar position, they also went from 1.8 to 1.9 and took years to get anyone to use the new version.

Exactly right, and again that was all about sort of retconning things that didn’t work as well as they had hoped, but it is very hard to get people to move forward. It’s even harder when it’s not a big version difference, 1.8 to 1.9; you think everyone will do that, but if the language is sufficiently different and that difference doesn’t meet your new needs, there is no incentive to move.

   

6. Looking at Perl 6, why is it called Perl? What does it share with Perl 5, what is the DNA that it shares, that makes it Perl?

Why didn’t we just give it a totally different name? Sometimes I think we should have, I think it would have been easier and would have lowered expectations and reduced the amount of fear, uncertainty and doubt as well, but the fundamental thing that it shares is the philosophy of design that Perl has always had. So one of Perl’s design philosophies is that easy things should be easy and hard things should be possible, and another one is that there should be more than one way of doing it, and that is quite different from most other languages. In most other languages there is going to be one Java way of doing this, there is going to be one Python way of doing this. In Perl we say: “No, if you want to do things functionally, we want to give you functional programming, if you want to do them object oriented we want to give you that, if you want to do them purely procedurally or even declaratively, we want to give you ways of doing that.”

So these philosophical underpinnings are equally visible in Perl 6. Everything that we have done in the last 12 years now to design Perl 6 has been about being true to that original vision that Larry Wall had, of a language which allows you to get things done without getting in the way, of having compact syntax and of optimizing the syntax for the things that you do all the time, so the things that you do all the time take less code to write than anything else, and it’s always been willing to sacrifice other things to achieve that. So that hasn’t changed at all. Some practicalities haven’t changed: it still has the $ sign, the @ sign, the % sign in front of the variables, which makes it very unusual among anything but shell languages. But we simply find it so useful to be able to interpolate things without having to add anything to them. We cleaned that up because we realized that it didn’t work terribly well in Perl 5 and no one understood what it meant, so what we tried to do was keep it but make it simpler. That kind of approach has been consistent in all of Perl development from Perl 1 right through. Make it easy, get out of the way, provide built-in features that mimic the things that people want to do.

So in Perl 5 and in Perl 6 you don’t need to create your own array classes or your own associative array classes; they’re built into the language. You don’t need to write your own sorting algorithms, they’re built into the language, and we want to keep doing that. We want a language which is a large language, which has quite a lot of syntax and quite a lot of tools in the language, so you don’t have to pull in 500 modules or libraries or whatever they’re called in your language to do the job. You can say: “The great advantage of C is that it is a very simple language, you can learn it in a week and you’ll understand it”, and LISP is even more so: you can learn the principles of LISP in an hour.

The problem is that to do anything in those languages you need libraries, you need to have hundreds of libraries to get the job done, and in a language like Java, which is not a particularly simple language, it turns out you still need hundreds or thousands of libraries to get the job done. So in Perl, the basic jobs that we want to do, we want them to be in the language itself but without cluttering the language up with a thousand different inconsistent tools, and that is what is taking so long to work out for Perl 6: How do we make it consistent and yet powerful and yet built-in, all at the same time?

   

7. For the language geeks in the audience, what are the buzzwords that it hits, that Perl 6 hits, the paradigms?

Why would you think about using it rather than Perl 5 or Ruby or Python or something like that? Some of the big things are different, and I’ll give you a diff against Perl 5 because that is probably the easiest; if they are interested, they probably know something about Perl 5. The object model of Perl 6 is vastly better than Perl 5’s. For a start it has one, it doesn’t just use the basic features of the language and pretend that they are real objects. So we have proper declarative classes, proper declarative methods and proper declarative fields in the classes, and the objects that get automatically created are properly encapsulated. They are opaque objects, you cannot just reach in and rummage around, which is a very big problem in Perl 5.

We have much better support for functional paradigms, from the very simple thing of actually now having parameter lists in Perl 6, which Perl 5 still doesn’t have, to the very complicated things of making it easy to pass functions around as objects and therefore make higher order functions very powerful. Furthermore we have a lot more higher order functionality built in, and we have a lot better support for parallel styles of programming, both data parallelism and also procedural parallelism. The various models that we have for doing threaded coding are a lot more powerful and frankly a lot safer. We don’t have these ideas of: “Let’s share all our environment”, because of course then you get locking issues. We try to focus on giving people very easy ways of doing data parallelism, of taking data and saying: “Right, I want to apply this scalar operation to a vector of data, how can I do that without having to rewrite a whole lot of code?”.

And in Perl any scalar operation that you have, be it a function or an operator, you can convert to a vector function that will take data of arbitrary dimension and just apply it to each cell, effectively in a kind of array programming paradigm, with almost no effort whatsoever; it’s essentially typing 2 or 3 characters to do it. Those are the main paradigm improvements. Adding a type system was a big must of the Perl 6 design, so we built a type system that frankly isn’t as good as Haskell’s type system, no system is as good as Haskell’s type system, but also, unlike Haskell’s type system, ordinary human beings can understand it. What we wanted to create was a modern strict dynamic typing system that gives people very good introspection, that enables compile time checking where that is feasible, but falls back on runtime checking where a dynamic language will just require runtime checking.
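The conversion of a scalar operation into a vector one, mentioned at the start of this paragraph, can be approximated in Python as follows. This is a rough sketch under assumptions: `hyper` is a hypothetical helper name, nested lists stand in for multi-dimensional data, and the work is done sequentially, whereas the point in Perl 6 is that the language does this for you in a couple of characters.

```python
from typing import Any, Callable


def hyper(scalar_fn: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Wrap a one-value function so it maps over nested lists of any depth."""
    def vectorized(data):
        if isinstance(data, list):
            # Recurse into each cell, preserving the shape of the data.
            return [vectorized(item) for item in data]
        return scalar_fn(data)  # a leaf: apply the scalar operation
    return vectorized


def square(n):
    return n * n


hyper_square = hyper(square)
flat = hyper_square([1, 2, 3])
nested = hyper_square([[1, 2], [3, [4, 5]]])
```

The scalar function `square` is never rewritten; only the wrapper knows about the structure of the data.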

So going from basically no type system at all in Perl 5 up to a very sophisticated and powerful and introspective type system in Perl 6 was a big improvement. So I think those are kind of the marquee buzzwords that should have people interested in at least looking into Perl 6, but then in addition to that there is scratching the one thousand or so itches that Perl 5 had, the things it just didn’t do really well.

   

8. So going back to the type system, is it an optional type system or how does it work? I mean the language is still dynamic?

It was kind of a lie when I said that Perl 5 doesn’t have a type system; it does, but not in the sense that most people think. So for example if I take a reference, which is like a pointer for those who don’t have references in their language, I take a reference or a pointer to an array and then I take another one to an associative array, which we call a hash. If I pass those around and I try to use the array reference as if it were a hash reference, I get an exception. In that respect that is a strong type system: you cannot just take a series of bits and reinterpret what they mean, the way you can in C for example with a cast. And it happens at runtime, we check whether you are doing the right thing at runtime, so that is a dynamic type system.

In that type system in Perl 5 there are basically only three types: you are a scalar, you are a list or you are an associative array. In Perl 6 we wanted to improve that so you can say: “You are not just a scalar, you are a string scalar, or you are a string scalar that is limited to 20 characters, or whatever it happens to be that you want to do”. So what we wanted to do is say: “We want a type system that you can escalate, we still want you to be able to say this variable will take anything that you put in it”, just the way that you can do with many dynamic languages, where the variables themselves often don’t have types associated with them; they are simply generic containers that take things that have an intrinsic type. And they more or less assume the type of whatever value they are storing. That can still be done in Perl 6, it’s the equivalent of the object type in other languages.

In Perl 6 we wanted to allow the possibility to say: “No, this variable only stores strings”, and of course anything derived from the string class in a polymorphic way. In Perl 6 if you just say: “my $var” you get a variable whose intrinsic type is a type that we call Any, and it’s the base type of most of the other types at least. So the point is that an Any will store almost any value you want, but then if you say: “my string $var” you are saying: “I want to restrict the types that can be stored in this”, and there is a whole hierarchy of how strict and precise you want to be. Then you can say: “my string whatever where length < 20”, and it will literally, at runtime, whenever you assign to that variable, not only check that it’s a string but also that it obeys this extra predicate that you specify, which is that the length must be less than 20. What we wanted was a type system that wouldn’t get in your way until you wanted that, and when you wanted that you could be as strict and as precise as you wanted or needed to be, and it would all just work together.

And if there were going to be problems with that, if you’ve typed your variable and then passed it a value that was of the wrong type, as long as the compiler could know at compile time what those types were, you get a compile time error on that. If it couldn’t know at compile time, if it was dynamically generated data, then you get a runtime exception on it. You’re still getting strict typing, but it’s strict typing that sometimes happens at compile time when it’s possible but always happens by runtime if necessary.
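The escalation described in this answer, from "takes anything" to "only strings" to "strings shorter than 20 characters", can be imitated in Python with runtime checks. This is only a sketch under assumptions: `TypedVar`, `assign` and `rejects` are hypothetical names, and it covers just the runtime half of the story, not the compile time checking.

```python
from typing import Any, Callable, Optional


class TypedVar:
    """Hypothetical container that checks every assignment at runtime."""

    def __init__(self, type_: type = object,
                 where: Optional[Callable[[Any], bool]] = None):
        self._type = type_    # `object` accepts anything, like an untyped variable
        self._where = where   # optional extra predicate on the value
        self._value = None

    def assign(self, value):
        if not isinstance(value, self._type):
            raise TypeError(
                f"expected {self._type.__name__}, got {type(value).__name__}")
        if self._where is not None and not self._where(value):
            raise ValueError(f"value {value!r} fails the constraint")
        self._value = value
        return self

    @property
    def value(self):
        return self._value


def rejects(var: TypedVar, value) -> bool:
    """Return True if assigning `value` to `var` raises a check error."""
    try:
        var.assign(value)
    except (TypeError, ValueError):
        return True
    return False


anything = TypedVar()                      # stores any value
anything.assign(42).assign("text")

only_strings = TypedVar(str)               # restricted to strings
short_string = TypedVar(str, where=lambda s: len(s) < 20)  # plus a predicate
short_string.assign("hello")
```

Each level adds a restriction without changing how the less strict variables behave, which is the "gets out of your way until you want it" property described above.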

   

9. So you can make the errors appear at compile time if you annotate everything?

If everything is typed in the system, then yes, the errors will be detected at compile time, but that requires that you put a type on absolutely everything: on every variable, and you have to declare a return type for every subroutine. Otherwise at some point, if you don’t declare it, it will be Any, and at that point it’s effectively reduced the typing to “Is this a valid value?”, and that doesn’t help you when you want to say: “Could I put it in a string?”, because then it becomes “Well, some Anys I can put in a string and some Anys I can’t". Therefore a compiler can't say at compile time. We haven’t spec-ed it yet, but it might be possible that a compiler could warn you at compile time, saying: “Look, potentially this assignment could be problematic because you are assigning a less specific type to a more specific type and we don’t know whether that is going to work or not". But certainly it shouldn’t be an error, you shouldn’t be prevented from doing so, you should just be warned “This might be a problem”. As of yet that doesn’t happen in any of the implementations; there is nothing to prevent it, it’s just that you’ve got to get around to implementing everything.

   

10. One more question about the type system, the OOP system, the object orientation system: single inheritance or multiple inheritance?

Excellent question. We don’t like multiple inheritance; like the rest of the entire universe, we’ve realized that multiple inheritance is not a good way of doing things. But unlike the rest of the universe, we are not going to be fascist about it, we are not going to prohibit you from doing multiple inheritance if this is one of those rare cases where multiple inheritance would in fact be the easiest and best solution for you. So Perl 6 does provide multiple inheritance but we don’t encourage it, we don’t recommend it. What we provide instead is a full system of, and this is again something that is different in everyone’s terminology, what we call 'roles', but they are mixins or traits or interfaces or whatever you happen to call them in your particular language. So we have a complete component based system, where you can say: “What I want to build is not an entire class but a building block for a class”, and then I can just compose that in to an existing class to create a new class.

So I singly inherit the main fact that this is a user, for example, but now they're a tracked user, so I inherit from user and I compose in the tracked role or the tracked trait or the tracked mixin or the tracked interface. That kind of component model of OO, which has become very popular in the last 5 to 10 years, Perl 6 has extremely good support for, and all of our own built-in classes that we write in Perl 6 are specified in terms of these roles, which are just composed in. And most of the type checking turns out to be role checking rather than actual type checking. We don’t care in many cases whether you are inheriting from the array class, we care whether you are iterable, and you are iterable if you have the iterable role, which the array class does, but other things do as well. Sets do, bags do, even associative arrays do. But there are other roles that an array class would have in it, such as ordered-ness, that these other ones wouldn’t have, so we encourage the use of that rather than inheritance.
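The idea of checking for a role rather than for an ancestor class has a close analogue in Python's abstract base classes. The sketch below is Python, not Perl 6: `Iterable` and `Sequence` stand in for the iterable and ordered roles, `Counter` stands in for a bag, and `User`, `Tracked` and `TrackedUser` are hypothetical names. Python has no separate role construct, so the mixin is written with its ordinary class syntax.

```python
from collections import Counter
from collections.abc import Iterable, Sequence


def describe(container) -> tuple:
    """Return (is_iterable, is_ordered_sequence) for a container."""
    return (isinstance(container, Iterable), isinstance(container, Sequence))


checks = {
    "array": describe([1, 2, 3]),
    "set": describe({1, 2, 3}),
    "bag": describe(Counter("aab")),
    "hash": describe({"a": 1}),
}


class Tracked:
    """A building block for a class, not a complete class on its own."""
    def track(self, event: str) -> None:
        self.events = getattr(self, "events", []) + [event]


class User:
    def __init__(self, name: str):
        self.name = name


class TrackedUser(User, Tracked):
    """Inherits from User and mixes in the Tracked behaviour."""


visitor = TrackedUser("werner")
visitor.track("login")
```

All four containers pass the iterable check, but only the list also counts as an ordered sequence, which mirrors the distinction drawn above between arrays and sets, bags and hashes.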

We think that composition is a better tool than inheritance and so we focused all that support on that. We still do provide multiple inheritance, and one of the things that we stole very happily from Python was that our multiple inheritance now uses C3 resolution, rather than in Perl 5 where we just tried the left most ancestor and if we didn’t find anything we tried the right most. Now we use the C3 algorithm, which produces much less confusion and far fewer problems if you are doing multiple inheritance.
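Since C3 resolution came from Python, its effect is easy to see there. In this small sketch (Python, with made-up class names) a diamond hierarchy shows the difference: a plain depth-first, left-to-right search would reach `Base` through `Left` before ever looking at `Right`, while C3 visits `Right` first.

```python
class Base:
    def who(self) -> str:
        return "Base"


class Left(Base):
    pass                      # does not override who()


class Right(Base):
    def who(self) -> str:
        return "Right"        # more specific than Base


class Child(Left, Right):
    pass


# The linearized lookup order that C3 computes for Child.
mro_names = [cls.__name__ for cls in Child.__mro__]

# The method is found on Right, not on the shared ancestor Base.
answer = Child().who()
```

C3 guarantees that a class always appears before its own ancestors in the lookup order, so the override in `Right` is not shadowed by the shared `Base`.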

   

11. Are you single threaded, do you use process parallelism, how does it work, if I want to do stuff in parallel, what do I do?

Werner's full question: Enough about inheritance and OOP, let’s talk about another big buzzword: parallelism. You talked about that. Are you single threaded, do you use process parallelism, how does it work, if I want to do stuff in parallel, what do I do?

So it’s Perl, so there is more than one choice, there are plenty of choices. So we have basic constructs that provide data parallelism. You can say: “Take all this data and treat it as a single thing, and every time I pass that single thing into a subroutine, automatically thread that subroutine over each individual value, compute all the results in parallel and then collect together all those results back in to one of these single data structures, which you can then pass along to the next one.”

We also provide vector or array styles of parallelism where you can say: “Let me take a basic function or a basic operator even, and let me say this normally expects to take one argument, I want you now to accept a list of arguments in that slot and I want you to thread out and process in parallel again.” So you can either put your parallelism in the data and say: “This is intrinsically parallel data, do everything to it in parallel”, or you can put it in a particular function where you say: “This is an inherently parallel function or method or whatever you want to use, parallelize it over whatever data I happen to give you”. So we are trying to support both. We are currently at the point of specifying, but we will also still be providing explicit mechanisms, because those are both basically implicit mechanisms. You don’t have to work out your control structures for that, you don’t have to work out your hierarchy of server and client, you don’t have to control your threads, you don’t have to worry about rejoining them, you don’t have to farm them, limit them or anything, it just takes care of it automatically.
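The "thread it over each value and collect the results" style can be approximated in Python with a thread pool. This is a sketch under assumptions, not how Perl 6 implements it: `parallel_map` and `word_length` are hypothetical names, and unlike the implicit mechanisms described above, the pool here still has to be requested explicitly.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(fn, values, max_workers: int = 4) -> list:
    """Apply fn to every value on a pool of threads, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map schedules the calls concurrently and yields the
        # results in the same order as the inputs.
        return list(pool.map(fn, values))


def word_length(word: str) -> int:
    return len(word)


lengths = parallel_map(word_length, ["perl", "python", "erlang"])
```

The caller never starts, joins or limits a thread by hand; the `with` block waits for the workers and the results come back as one list that can be passed to the next step.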

But in many cases people need that fine level of control, so we still will have a thread model; it will probably be based on ideas from languages like Erlang. So when we wanted to come up with an explicit threading model, we wanted to go to the languages that do it well, not the languages that do it really badly, and to be honest, Perl 5’s threading model is not terribly good, it has some significant problems and it can be very hard to use. So we wanted to look at what languages get it right and what primitives they have. We will want to be providing, for example, queue primitives, so that you can communicate between your threads without having to worry so much about synchronization or about locking or about any of these issues. What we want to provide is the kind of threading that you need. If you are just an end user who just wants to get their job done, it’s: “Look, I’m giving you a lot of data and if you can do it in parallel, do it in parallel for me, if you can’t, just don’t bother with it”. Or if I really want to say: “I’m going to have two separate servers, a thousand clients and I’m going to control the interactions between them, the communication between them directly”, we want to support both of those paradigms.
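Communicating between threads through queues, rather than through shared state and locks, looks roughly like this in Python. It is a sketch of the general idea only; the Perl 6 primitives were still being specified at the time, and `worker` and `run_pipeline` are hypothetical names.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive numbers, send back their squares, stop on the None sentinel."""
    while True:
        item = inbox.get()
        if item is None:
            break
        outbox.put(item * item)


def run_pipeline(numbers) -> list:
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for n in numbers:
        inbox.put(n)          # messages are the only thing the threads share
    inbox.put(None)           # tell the worker there is no more input
    thread.join()
    results = []
    while not outbox.empty():
        results.append(outbox.get())
    return results


squares = run_pipeline([1, 2, 3])
```

The queue does its own synchronization internally, so neither side of the conversation takes a lock explicitly.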

Werner: So it’s good to steal from Erlang, or copy from Erlang; everybody does.

Everyone does, we steal from everybody, we are not prejudiced against any language, we’ve stolen from Python, from Ruby, from Erlang, from Smalltalk. We steal from all good languages, we even stole from Pascal. Frankly all we stole from Pascal was the := and we changed what it meant. But we did want to have at least something from that language as well.

   

12. The last language feature I want to touch on, and I'm not sure it is actually still in Perl 6: I think there was some talk about adding support for PEGs or parsing expression grammars. Is that still the case or is that something I read 10 years ago?

It’s kind of still the case and it’s kind of not still the case. One of the critical things that Perl 5 has always excelled in is regular expressions. We’ve more or less set the benchmark for regular expressions over almost all of our history; about our only real competitor would be Snobol or something like that. When it came time for Perl 6, we said: “We need to ramp up to the next level, we need to add in things that aren’t there yet”, and of course grammars are the key feature there that we really do need to support. Regexes get you only so far, and you need another layer of organization on top of regexes that allows you to do good recursion and so forth. When we sat down and designed the Perl 6 regex syntax, we looked at all of the different grammar models that were available out there at the time, including PEGs, including Earley Grammars, including the traditional kind of LALR(1) Grammars, Recursive Descent Grammars, the whole range.

And what we wanted to achieve was something that would give people the power of most of these, without the awkwardness of any of them. What we’ve ended up with is not one pure paradigm; we’ve ended up with kind of a hybrid model where we say that there are certain linear parts of a regex, and they can be implemented in a pure LR kind of grammar, so that you get linear time parsing no matter what. The problem with that is that there are certain things that a pure LR grammar will not parse; it doesn’t matter what you want to do, you simply can’t do that. So we then talk about the nonlinear or procedural parts of a grammar: any grammar where you want to be able to embed code to get executed when you match a rule can’t be reduced down to LR, it has to be at least at some level recursive or something like that.

So what we try to do then is to provide top down constructs that allow you to do that, but below that, what we're trying to achieve is that the bottom layer, the leaf nodes of your parsing, will be done in a bottom up kind of way, in a state machine kind of way where it’s linear performance. Then you cut down the amount of exponential explosion of possible matching, and you also cut down at the bottom level the exponential explosion of possible states to be matched. And it’s all just totally integrated: you don’t have to think “Now I have to switch from top down to bottom up parsing”. If you use certain constructs, it will build them bottom up; if you use other constructs, it will build them top down. So that reflects many of the things that you see in things like PEGs or Earley Grammars or these far more sophisticated kinds of approaches, but it gives us the flexibility to still give you good ways of doing what are otherwise very difficult things to do in those pure paradigms.

I’m actually really excited about how powerful our new regex/grammar features are, and a lot of the code that I’m writing now basically consists of: “Well, I’m going to write one big grammar for it” and then anything that I need to do procedurally or functionally I’ll just do inside the grammar at various points, and that seems to work really well.

   

13. That's kind of a grammar driven programming, treat everything as a grammar?

Yes, very much so. If the information that you are dealing with has a grammatical structure, then a natural way of dealing with it is to write a grammar for that structure and then, at each point in the structure that you are building, apply some kind of action to it. Since you can do that right in the grammar in Perl, very often your program consists of: parse the input, either data or a data structure (because we have support for: “Ok, don’t just parse a string, but parse a more complicated data structure”), pass that into the grammar, have the grammar analyze and deconstruct that, have it then apply transformations or whatever processing you want to do, and have it reconstruct. That is the other thing that Perl 6 grammars do that Perl 5 regexes don’t do: whenever you run a grammar over anything you get back a parse tree. And once you have a parse tree, all kinds of other transformations become trivially easy to do.

A lot of code consists of: analyze, then get back a parse tree, then apply global transformations, and then emit by walking the tree. Or it consists of: analyze, and in the analysis, at each node, do some local transformation which produces the new result. So for that kind of processing, and that is something that Perl always has been about, which is changing data from one format to another, that becomes really trivial when you just write a grammar for it and the grammar effectively emits your transformed data.
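
The “analyze, get back a parse tree, then walk the tree” pattern can be illustrated with a small hand-written recursive descent parser in Python. This is only a sketch of the general idea, not of Perl 6 grammars themselves; the toy arithmetic grammar and every function name here (`tokenize`, `parse`, `evaluate`, `emit_prefix`) are assumptions made for the example.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[+*])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_number(tokens):
    if not tokens or not tokens[0].isdigit():
        raise ValueError("expected a number")
    return ("num", int(tokens.pop(0)))


def parse_term(tokens):
    # term := number ('*' number)*
    node = parse_number(tokens)
    while tokens and tokens[0] == "*":
        tokens.pop(0)
        node = ("mul", node, parse_number(tokens))
    return node


def parse_expr(tokens):
    # expr := term ('+' term)*
    node = parse_term(tokens)
    while tokens and tokens[0] == "+":
        tokens.pop(0)
        node = ("add", node, parse_term(tokens))
    return node


def parse(text):
    """Analyze the input and return a parse tree of nested tuples."""
    tokens = tokenize(text)
    tree = parse_expr(tokens)
    if tokens:
        raise ValueError(f"unexpected token {tokens[0]!r}")
    return tree


def evaluate(tree):
    """One tree walk: compute the value of the expression."""
    if tree[0] == "num":
        return tree[1]
    left, right = evaluate(tree[1]), evaluate(tree[2])
    return left + right if tree[0] == "add" else left * right


def emit_prefix(tree):
    """Another tree walk: emit the same data in a different format."""
    if tree[0] == "num":
        return str(tree[1])
    op = "+" if tree[0] == "add" else "*"
    return f"({op} {emit_prefix(tree[1])} {emit_prefix(tree[2])})"


if __name__ == "__main__":
    tree = parse("1 + 2 * 3")
    print(tree)
    print(evaluate(tree))
    print(emit_prefix(tree))
```

Once the tree exists, each new transformation is just another short function that walks it, which is why the format-conversion work described above becomes easy.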

Werner: I find that really exciting, in other languages you always need to import some parsing library or you have to write your own PEG library, which then needs to be optimized and do all the things that you explained.

For 20 years we’ve had this in Perl: our regexes are there, you don’t have to import anything, write anything or optimize anything, so there is a lot of power that we have there already. What we wanted to do though is to ramp it up to the next level and say: “We want to be able to parse pretty much any language”, because our real problem was that we wanted to parse Perl 6 using a Perl 6 grammar. We wanted Perl 6 to be largely self-hosting, but more importantly, we wanted the Perl 6 grammar to be available as a feature in Perl 6 itself. Because if you have a language grammar, and not every language even has one, then every other tool that you want to build becomes much easier to build. If you want to build a refactoring editor, if you want to build analysis tools, if you want to build optimization tools, the first step is: given an arbitrary program, what is the internal data structure of it, so that I can rearrange it or simplify it or even just pretty print it, for example.

Once you have a grammar that is the official real grammar of the language, which you can access from inside the language, you can write a Perl 6 program that reads any other Perl 6 program and manipulates it as well as Perl can manipulate any other kind of data. And then a lot of power becomes available: you can grow a development environment that is very sophisticated, that has very sophisticated tools, and you can do it very easily, and you can adjust the tools. If there is a tool that almost does what you want with Perl source code but not quite, you can inherit the grammar, because in Perl 6 grammars are objects so you can inherit them. You inherit it, add in some new rules to the grammar (a rule is the analog of a method), and now you have a new grammar but you are only writing maybe 5% of it.
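
The “inherit the grammar and add a few rules” idea can be sketched with ordinary Python classes, where each rule is a method. This is an analogy under stated assumptions, not Perl 6 code: `ConfigGrammar`, `ExtendedConfigGrammar` and the tiny `key = value` format are all hypothetical and exist only for this example.

```python
import re


class ConfigGrammar:
    """A toy grammar: each rule is a method returning a parsed value or None."""

    def number(self, text):
        return int(text) if re.fullmatch(r"\d+", text) else None

    def value(self, text):
        return self.number(text)

    def pair(self, line):
        match = re.fullmatch(r"\s*(\w+)\s*=\s*(.*?)\s*", line)
        if not match:
            raise ValueError(f"not a key = value line: {line!r}")
        parsed = self.value(match.group(2))
        if parsed is None:
            raise ValueError(f"cannot parse value: {match.group(2)!r}")
        return match.group(1), parsed


class ExtendedConfigGrammar(ConfigGrammar):
    """Inherit every rule, add one new rule, and extend one existing rule."""

    def string(self, text):
        match = re.fullmatch(r'"([^"]*)"', text)
        return match.group(1) if match else None

    def value(self, text):
        parsed = super().value(text)
        return parsed if parsed is not None else self.string(text)


if __name__ == "__main__":
    print(ConfigGrammar().pair("port = 80"))
    print(ExtendedConfigGrammar().pair('name = "web"'))
```

The derived grammar reuses `pair` and `number` unchanged and only supplies the small part that differs.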

In Perl 6 we expect that there will be all kinds of domain specific languages implemented right within Perl 6. You will be able to write Perl 6 and then suddenly flip to a domain specific language. And the way you’ll do that is simply by saying: “Take the parser and compiler for Perl 6, take the grammar, add in some extra rules that implement the domain specific language, pass that to the compiler”, and the internal parse tree that it emits will just have the transformation of the domain specific parts into standard Perl, which will be done in the grammar itself in actions. And then the standard syntax tree goes on to the interpreter or to the compiler or whatever it is. So we see that Perl 6 is not just going to be a standard language but a language where you, and the word we use is 'braid', braid into that language the possibility of “In this block I’m going to write in SQL, or in this block I’m going to write in Java, or in this block I’m going to write in Ruby.”

Why? Because there is some tool that I want to use that is much better done in those languages. So what we want to have is a way that you can just say: “At this point in the code use Ruby”, and what will happen is that the “use Ruby” command will tell the original compiler that it needs to use an extended grammar in this scope. And because the grammars are just objects and because objects are composable in Perl 6, it would literally at runtime compose in the Ruby grammar, parse the Ruby, emit the underlying opcodes or whatever they happen to be, bytecodes, and just work. So we expect that one of the real powers of Perl 6 will be that it will be a great language, not just for gluing different data together, but for gluing different languages together. We see that as a very powerful thing, but to do that we needed very powerful parsing facilities built right into the language.

Werner: It’s absolutely exciting.

Whether we can actually bring it off and bring it to production-ready is another question. Well, I’ll just get up now and go and do something.

Werner: I really like the idea of having the Perl grammar be available in the language itself which is something that nobody else seems to have figured out, because no other language has a grammar available.

Except for LISP.

Werner: But it has its data structures available, not the reader syntax.

But you can write a LISP interpreter in about ten lines of LISP, and at that point you basically have a parser for it. The point is that LISP consists entirely of parentheses and atoms, so it is not really hard to write a parser for that. But I agree with you: no language with a sophisticated syntax has its parser immediately available to you in that language, except for C. If you think about it, C uses Lex and Yacc, basically, or modern versions of them. But Lex and Yacc are really just C emitters anyway and they are written in C, so at some level I have that. But you are right, there is no other language that I know of where you can be in the middle of the program and say: “Oh, at this point I need to parse more of this language, here is a tool built into the language which just does it.” The other thing that is powerful for us too is that once we get to that level of self-hosting, development and maintenance of the Perl 6 interpreter itself gets very much easier. One of the big problems that we had with Perl 5 is that the C implementation of Perl 5 is so complicated and so sophisticated and has grown so organically over time that there are only relatively few people in the entire world who actually understand how it really works, and who are therefore competent to make sensible changes to it. We’re probably talking about only a couple of hundred people who we could rely on to be able to do that.

If we can get a lot of Perl 6 eventually written in Perl 6, then we think that the potential pool of people who can take over the maintenance, add new features and improve the language is likely to be very much bigger. And we think that is an important feature for the future, because it doesn’t matter what language we are talking about: a language is only as good as its implementation, and the implementation is only as good as its implementers, and the implementers are only as good as the community of implementers that you can draw on to do that. How many people actually understand how the JVM works? How many people actually understand how the Python interpreter works?

For every one of those communities it’s a point of failure that (God forbid!) if they all got together and a bus hit them or a meteorite, then who is going to be doing the development of that or the Linux kernel or anything else? You need to have a big community of people who understand it, and therefore you need to have it written in such a way that it’s understandable and so we think that putting a lot of the complex parts of Perl 6 eventually into Perl 6 is going to make them a lot more readable, a lot more maintainable and a lot more extensible. We hope so anyway and it is exciting.

   

14. Absolutely, yes. I think we’ve given the audience enough reasons to check out Perl 6. As a final question: we mentioned that Perl is perceived as sort of hidden today, it’s not very visible. Where are all those people that contribute to CPAN, and what do they do? Do they build websites, do they analyze DNA data?

The answer is that the reason that Perl is often perceived as unseen is that it is everywhere. It’s like fish not seeing the water. Even in organizations that say: “No, no, we are a Python shop, or a Python and Java shop” or something like that, the reality is that somewhere deep down inside, some of the tools that they absolutely rely on are written in Perl. It might be their system administrators who are doing that, it might be their tool smiths who are doing that, it might be people who just needed to use a particular CPAN module that was the only really good way of solving their particular problem and so they had to do it in Perl. As I travel around, I get to see a lot of this because organizations contact me and say: “Look, we want you to come in and teach us” and it’s organizations that I wouldn’t have expected. So where are the big users of Perl, perhaps we can ask? There is still a lot of backend Perl and a lot of website tasks. Web based companies still do a lot of it.

A lot of that is now legacy and those companies are using other things as well, but it’s still there and it’s not going away any time soon. There is a lot of Perl used in the financial industry, in the big banks, in the trading houses, on Wall Street in general; there is a lot of Perl because they deal with an awful lot of data and a lot of incompatible formats. There is quite a lot in the life sciences, particularly in things like bioinformatics, so genetic research and things like that are big consumers of Perl, because there are good tools and Perl is fast enough for a dynamic language to be able to deal with the data sets that they need to deal with. So those kinds of areas are the areas where I see a lot of it, but then again I get brought into small companies that you would never have thought had it at all.

It riddles the world, in other words. Almost anyone who needs to do something dynamically and who needs to do it fast is looking at doing it in Perl, because it’s still one of the fastest dynamic languages, despite the many real strides that languages like Python and Ruby and even JavaScript have made in recent years in improving their speed. If you just look at the average speed of these things, Perl is still right up there as one of the fastest languages, and sometimes that is what people need: we need to do dynamic styles of programming, but we need to do it quickly, and for that purpose Perl is often the right choice. It’s kind of gone underground, but it’s like saying: “Where is C being used these days?” Well, C is still being used everywhere where you need speed, and it doesn’t matter if you are a Java shop or what, there will be 5 or 10 percent of your code that absolutely has to run blazingly fast and that will be implemented in C. So it’s the same sort of thing but at the next level up.

Werner: That was definitely very interesting; I think the audience will have a reason to check out Perl 6 as soon as it’s out.

It’s out now: they can download a working Perl 6 compiler and play with it. It’s not highly optimized yet and it’s still got a few features that are missing, but if you want a keyword for that, the easiest one is the word “Rakudo”, and if you want to check it out, that keyword will get you to Perl 6.

Werner: So let’s get googling “Rakudo” and thank you Damian!

Thanks very much!

InfoQ.com and all content copyright © 2006-2013 C4Media Inc.