
Steve Vinoski and Bob Ippolito on Async I/O in Python and Node.js, Web Development in Erlang


1. We are here at Erlang Factory 2011 in London and I am sitting here with Steve Vinoski and Bob Ippolito. So Steve who are you?

Steve Vinoski: I have worked in distributed systems for longer than I care to admit, 20 years or so, did a lot of C++ programming for years, some Java and then found Erlang. It was quite a discovery because I'd been trying to build systems like that for years and about 5 years ago I started looking at it and just really fell in love with it. So I have been doing Erlang programming in multimedia delivery for the past 4 years and I am starting at Basho next week, so I am working on Riak, staying in the Erlang community.

Bob Ippolito: I am Bob Ippolito, CTO and cofounder of Mochi Media. I have been a long time programmer since I was about 9, but professionally I had used primarily Python. But about 6 years ago I found Erlang when I was trying to build scalable and distributed systems and I fell in love with it. It’s great at what it does and almost all of our servers are written in Erlang.


2. Steve, why Erlang and not Haskell? Isn’t a statically typed programming language more up to someone who is coming from a C++ background?

Steve Vinoski: I’ve looked at Haskell but I never really tried to use it. I’ve been reading Bryan O’Sullivan’s book, "Real World Haskell", and haven’t quite made it through yet. For me, I’d done a lot of work in C++, but I’d always done a lot of Perl and Python and some Ruby as well. At my old company, IONA Technologies, a middleware company, we had a lot of C++ code and a lot of Java code, and I was looking at some of the examples we’d give to our customers and they would be 70-100 lines long just to do something really simple. So I started looking at Ruby in particular to see if we could use dynamic languages to cut down the lines of code and make things easier for the customers to understand. I found you could get about a 10x reduction in lines of code, and the lack of type checking really wasn’t an issue.

Then, when I found Erlang, it was, as Bob said, really good at what it does: distributed systems, concurrency and all that, and we had a lot of that in our middleware systems. It seemed like such a natural fit that I never really thought about the type checking much, because of the work I’d done in Ruby, Python and Perl. I know some people are adamant about it, but I just like things that work.


3. You also write a column called Functional Web?

Steve Vinoski: Yes Functional Web and it’s for IEEE Internet Computing Magazine.


4. So you are trying to stray into other functional languages or just trying to show them?

Steve Vinoski: In the column we cover a variety of languages, so we’ve covered Scala, Erlang, Clojure and Haskell, and had multiple columns on each language, I think. The Haskell ones obviously I didn’t write; I have guest columnists that join me sometimes. They write the columns themselves, or I coauthor if I know something about what they’re covering. Stefan Tilkov and I wrote a column about Node.js, for example. That is JavaScript, and people kind of reacted and said JavaScript isn’t functional, but it does have functional elements to it. So yes, it’s just sort of an exploration of what you can do with these languages in the web development space.


5. Bob, you come from the Python space, a dynamic language. Why Erlang and not Node.js, which is what everybody is using today?

Bob Ippolito: When I was doing my research to find an environment that I could write scalable programs in, this was in 2006, when JavaScript was still basically a joke. Ajax was just catching on, and using JavaScript on a server was almost unheard of, except as a scripting language inside of Java. So that is one of the reasons why not Node.js. The other reason is that I had significant experience writing programs in Python that were as scalable as you can get, and the networking libraries available there, such as Twisted, are basically the same as what you have now in Node.js. It’s all callback-based, event-driven programming, and even in the Python world people don’t like it.

They prefer the very straightforward serial, threaded approach. And Erlang gives you that approach, but it does it in a scalable way, you are not allocating megabytes of thread stacks for each socket. You are only allocating a few hundred words. So basically it’s the best of both worlds, but of course there is a paradigm shift with the functional programming language versus Python.
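The difference in style Bob describes can be sketched in Python. The toy event loop below is purely illustrative: `call_later`, `run_loop` and the handler names are invented for this sketch and are not Twisted or Node.js APIs. It only shows how callback-driven code splits one logical flow into pieces, while the sequential version reads top to bottom.

```python
from collections import deque

# A toy event loop: callbacks are queued and run later.
pending = deque()

def call_later(fn, *args):
    pending.append((fn, args))

def run_loop():
    while pending:
        fn, args = pending.popleft()
        fn(*args)

# Callback style: each step hands its result to the next callback.
def read_request_cb(raw, on_done):
    call_later(on_done, raw.strip())

def handle_cb(raw, on_reply):
    def got_request(request):
        call_later(on_reply, "echo: " + request)
    read_request_cb(raw, got_request)

# Sequential style: the same logic, written as straight-line code.
def handle_seq(raw):
    request = raw.strip()
    return "echo: " + request

replies = []
handle_cb(" hello \n", replies.append)
run_loop()  # nothing happens until the loop drains the queue
```

Both versions produce the same reply; the point is only that the sequential form is what a per-connection thread or an Erlang process lets you write.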


6. I think Python is sometimes called a LISP with a syntax. Is that true or am I misremembering that? It’s also considered somewhat of a functional language because there are lambdas and things like that.

Bob Ippolito: I haven’t really heard that much recently; there are a lot of functional elements to Python. The built-in libraries are mostly composed of functions, or the built-in functions are mostly functional, like there is map and what not, whereas in another language like Ruby for example it exposes some objects that have class methods that you would call. So there are definitely functional elements to Python and you can program in a functional style, but I think that most people program in a sort of imperative object oriented style whereas the functional style is more maybe how you might implement the method, but you still have the classes and methods involved.
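As a small illustration of those functional elements, the sketch below uses Python built-ins that take data as an argument (`map`, `max`, `functools.reduce`) next to the method-call style. The word list is made up for the example.

```python
from functools import reduce

words = ["erlang", "python", "ruby"]

# Built-in functions take the data as an argument...
lengths = list(map(len, words))
longest = max(words, key=len)  # first of the longest words
total = reduce(lambda acc, n: acc + n, lengths, 0)

# ...while the object-oriented style calls a method on the value itself.
shouted = [w.upper() for w in words]
```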


7. In your column Functional Web, one of the good aspects of Erlang is the crash-proof programming, multiple processes. Is that something that you see in other languages, have other languages picked it up in some way?

Steve Vinoski: They see it in Erlang and they try to emulate it. I think with most languages, when you are running a website you have multiple servers anyway, and the element of having always-available services is just kind of built into the domain, if you will. People know they have to keep these websites up, so they go to whatever lengths they need to keep them up. So you are looking at load balancers, multiple servers and machines, and multiple processes per machine. I would say it’s not as big a deal in the web server world which language you are using, because you know you have to keep the thing up anyway.


8. I guess you also have to think in the similar way to the supervisor trees in Erlang where you have to make sure that, if something crashes, you have to start it up. It’s not like in Java where you sort of cross your fingers and hope that nothing crashes.

Steve Vinoski: People use various operating system capabilities to make things restart on demand and have processes that watch other processes and that sort of approach and that is very similar to what Erlang gives you out of the box.
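The "watch it and restart it" idea can be sketched in a few lines of Python. This is a deliberately simplified, in-process analogue, not how Erlang supervisors or any OS process monitor is implemented; `supervise`, `flaky_worker` and the restart limit are hypothetical names chosen for the example.

```python
def supervise(worker, max_restarts=3):
    """Run worker(); if it raises, restart it, up to max_restarts times."""
    restarts = 0
    while True:
        try:
            return worker(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise  # give up once the restart limit is exhausted
            restarts += 1

attempts = {"count": 0}

def flaky_worker():
    # Hypothetical worker that crashes twice before succeeding.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("crashed")
    return "ok"

result, restarts = supervise(flaky_worker)
```

A real supervisor would typically watch separate processes and bound restarts within a time window; the sketch keeps only the restart loop.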


9. What is your toolkit - the Mochi toolkit?

Bob Ippolito: We’ve got several libraries with the name Mochi in them. MochiKit is a JavaScript framework that I wrote back in 2005, before there were many other Ajax toolkits to choose from. In Erlang we have a web framework we call MochiWeb. It’s more of a library for building web servers than really a framework, but it does include a large library of utility functions for doing common things like cookies and encodings and stuff like that.


10. Is that somehow comparable with what you can do in Python with WSGI or is it at a different level?

Bob Ippolito: It’s sort of at a different level. WSGI is more about abstracting how you call Python code and return a result, whereas MochiWeb is more concerned with the actual HTTP layer. It manages what is basically a web server and gives you a very thin veneer to talk to that socket once the web server is talking to the web browser.
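For readers who have not seen WSGI, the calling convention Bob refers to looks roughly like this: the application is a callable that receives an `environ` dictionary and a `start_response` function and returns an iterable of bytes. The sketch below calls the application by hand with a stand-in `fake_start_response` (a name invented here), so no socket or server is involved.

```python
def app(environ, start_response):
    """A minimal WSGI application: a callable taking environ and start_response."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# Call it the way a WSGI server would, capturing what it reports.
captured = {}

def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers

chunks = app({"REQUEST_METHOD": "GET", "PATH_INFO": "/games"}, fake_start_response)
```

Nothing in the application touches the HTTP connection itself, which is the layer Bob says MochiWeb is concerned with.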


11. What are some example applications for MochiWeb?

Bob Ippolito: The first application we built with MochiWeb was when we rewrote our Python web service that collects analytics from Flash games; the second one was our ad server that serves ads into Flash games. Since then we’ve got a whole number of other services. Most of them speak to Flash clients using binary protocols, which is something Erlang is great at, but HTTP is really convenient for talking to anything on the internet, so that is why it’s a web server and not anything else.


12. How does MochiWeb stack up to Rails?

Bob Ippolito: I would put MochiWeb more in the category of, I don’t even know what web servers are named in Ruby, maybe something like WEBrick; I am not a big Ruby user. So it’s really just the lowest-level component that Rails is going to use. It doesn’t provide the rest. The only comparison I would make is that MochiWeb also ships with a nice script that allows you to create a new application that depends on MochiWeb. That is one of the things that I guess Rails innovated: being able to create a project from a template, so that instead of staring at a blank folder, people would be staring at some source code files they could just make small edits to. I think that is the only real similarity.

It doesn’t provide any sort of database abstractions or templating libraries, so none of the conveniences that Rails gives you, but you certainly could build something more Rails-like on top. I believe Nitrogen might be a better example, which is a web framework that uses MochiWeb internally.


13. So Nitrogen builds abstractions on top of MochiWeb? Everything like routing or HTML generation - that is what Nitrogen does?

Bob Ippolito: Exactly right. I believe so. I haven’t personally used Nitrogen myself, I’ve only looked at it sort of in passing, so I can’t speak to exactly what features it has or doesn’t have.


14. With MochiWeb being a kind of web server building kit, Steve, you are on the Yaws team. How does that compete with MochiWeb?

Steve Vinoski: In terms of Yaws and MochiWeb I think some people use one, some people use the other. I think, and Bob can correct me if I am wrong, Bob had used Yaws in the beginning and needed more of an embedded server, so instead of having a standalone server with his code stuffed inside, he wanted his code to have a server inside. At the time Yaws had some capability for that, but it was a little clunky back then; that was before I joined the team. So I think Yaws has a ton of features that MochiWeb probably doesn’t have, just because of that difference between the embedded focus and more of a standalone focus.


15. So Yaws is trying to be Apache in some way?

Steve Vinoski: Yaws was started by Claes Wikström, who goes by the name Klacke. In 2001 he was going to build a website that would let floorball players register, say, sign up with "I’ll be there on Thursday night" or whatever. He started with Apache and the LAMP stack, basically, and in his own words he was horrified, so he started building Yaws instead. He’d done a lot of Erlang to that point; I mean, he invented parts of Erlang like the bit syntax, distributed Erlang, Mnesia and other things that we use every day. So he was very familiar with Erlang but had never done any web programming with it before, and he started building Yaws. The first commits were in January 2002. He never finished the floorball site, though; he finished Yaws instead.


16. That is a tradeoff, but the Erlang community got something out.

Steve Vinoski: Yes, right. And I was looking for a web server in 2007; I needed something that could really scale well. I was doing some work on set-top boxes at the time, so the plan was to have thousands of set-top boxes coming into a particular machine, and I was looking for something that could do, say, 30000 connections relatively easily. I did some testing, and there is this famous Apache versus Yaws graph that shows Yaws going to 80000 connections. I was able to get to 30000 reasonably easily, so I chose Yaws at that time. When I started working with it I found a couple of things and sent some patches to Klacke, and I think in 2008 he made me a committer.


17. Is there any special approach in Yaws that makes use of Erlang features that allows it to scale? How’s it get fast compared to other servers?

Steve Vinoski: A lot of the stuff is built into the VM itself. In the talk that I gave here at Erlang Factory I said that writing web servers in Erlang is actually pretty straightforward. The HTTP packet decoding is built into the VM, so you can set up your socket and say, "I am expecting HTTP packets", and it will give you the headers and things as data structures. So that’s taken care of for you. The polling of the socket and all that is in the VM, so you are really just getting these messages from the VM that say, "Here is an incoming HTTP request." Yaws keeps a process pool, so it has a little cache of Erlang processes that are listening for or accepting incoming connections.
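To give a feel for what "headers and things as data structures" means, here is a rough Python sketch that turns raw request bytes into a small dictionary. It is only an analogue of the shape of the result: in Erlang this decoding is done by the VM, as Steve says, and `parse_request` is a hypothetical helper written for this example, not a full HTTP parser.

```python
def parse_request(raw):
    """Split raw HTTP request bytes into a request line and a header dict."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "path": path,
            "version": version, "headers": headers}

request = parse_request(
    b"GET /index.html HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
)
```

The server code then works with `request["method"]` or `request["headers"]["host"]` instead of raw bytes.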


18. OS processes or Erlang processes?

Steve Vinoski: No, Erlang processes, it’s just one OS process. So I think that helps even though creating a process in Erlang is extremely fast, we just keep a small pool handy and they just pick up and accept. But yes, a lot of it is really due to the VM itself.


19. How do the Mochi products make use of Erlang, special Erlang features?

Bob Ippolito: In most of our uses of MochiWeb we’re speaking with Flash clients, and in many cases we’re actually dynamically generating various binary formats that Flash uses, in some cases the Flash bytecode itself. Erlang was really helpful to us there, due to the way it allows you to send IO lists and the way its binary syntax lets you very easily disassemble and assemble even the strangest binary structures. I had some prototype Python code that was maybe 3 or 4 times as long and much less straightforward. The SWF format is as bad as you can get, with variable bit length fields here and there, some places little-endian and some places big-endian, and Erlang just chews right through that stuff. Other than that, we have a lot of custom in-memory databases, and various other servers have taken advantage of Erlang distribution, concurrency and ETS tables. Basically almost everything Erlang has to offer, we’re using somewhere in one of our servers.
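To show why this kind of parsing gets verbose outside Erlang, here is a Python sketch that decodes a record mixing big-endian, little-endian and bit-packed fields. The layout is hypothetical, invented for this example rather than taken from the SWF specification, and `read_bits` and `parse_record` are illustrative helpers.

```python
import struct

def read_bits(data, bit_offset, nbits):
    """Read nbits as an unsigned integer, most significant bit first."""
    value = 0
    for i in range(bit_offset, bit_offset + nbits):
        bit = (data[i // 8] >> (7 - i % 8)) & 1
        value = (value << 1) | bit
    return value

def parse_record(data):
    # Hypothetical layout: u16 big-endian tag, u32 little-endian length,
    # then a 5-bit field width followed by two unsigned fields of that width.
    tag, = struct.unpack_from(">H", data, 0)
    length, = struct.unpack_from("<I", data, 2)
    bits = data[6:]
    width = read_bits(bits, 0, 5)
    x = read_bits(bits, 5, width)
    y = read_bits(bits, 5 + width, width)
    return {"tag": tag, "length": length, "x": x, "y": y}

sample = bytes([0x00, 0x2A, 0x10, 0x00, 0x00, 0x00, 0x19, 0x60])
record = parse_record(sample)
```

The byte-aligned fields map onto `struct` format strings, but the bit-level fields need a hand-written reader, which is the part Erlang’s binary syntax expresses directly in a pattern.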


21. Why maybe? The JVM is great, it’s amazing, it’s stunning. We’re at InfoQ, we have to say that.

Bob Ippolito: I think it’s a very interesting project. I am not entirely sure that it’s ready to run all of our production software right now, especially with the rapid adoption of NIFs (native implemented functions, basically C code) by many of the projects that we use. I think those would be a lot more cumbersome to integrate with Erjang or, if not cumbersome, then slow, since calls would have to go through JNI.


22. So in Erlang you do load native functions into the VM? It’s a custom written function or are these provided by Erlang?

Bob Ippolito: In some cases. These are custom written C functions that are loaded into the Erlang VM.


23. They are kind of like Erlang drivers?

Bob Ippolito: Basically they can be used like drivers or they can be used just as sort of a replacement of something that would be slower to do in Erlang like calculating a hash or a checksum those kind of byte based numerics are often much faster to do in C.


24. I think Steve here wrote a SHA-1 library. Did you write this as a NIF or some external way?

Steve Vinoski: It’s SHA-2, so it’s the 256, 384 and 512 functions. I originally wrote it, I think, just on a whim; I saw some questions on a list from someone who was looking for these and said, "I could do that in Erlang". So I wrote it initially in pure Erlang and it worked and some people used it, but it’s slow. As Bob said, some of those kinds of calculating functions are just slow in Erlang; people know that. So recently I rewrote it in C, using NIFs, and it’s quite a bit faster. If you run the test suite for the functions, I think it takes about 30 seconds on the original code and less than a second on the NIF code.
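The same trade-off shows up in other dynamic languages. As a point of comparison only, Python exposes the three SHA-2 variants Steve mentions through its standard `hashlib` module, which in CPython is backed by native code rather than pure Python. The message below is just a sample input.

```python
import hashlib

message = b"abc"

digests = {
    "sha256": hashlib.sha256(message).digest(),
    "sha384": hashlib.sha384(message).digest(),
    "sha512": hashlib.sha512(message).digest(),
}

# Digest length in bits matches the number in each function's name.
sizes = {name: len(digest) * 8 for name, digest in digests.items()}
```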


25. What do I have to watch out for when I write a NIF? A NIF is inside the Erlang process so I can step on some of its toes?

Steve Vinoski: Yes, definitely, it’s C code, so you can be as dangerous as you like and if you are too dangerous you are going to crash the VM and that’s what makes it hard. NIFs are a lot easier than drivers. Drivers have a particular kind of an entry table that you have to fill in certain functions and the VM is going to call you back. A NIF just looks like an Erlang function from the outside and it happens to go into the C code. So you can have threads and all kinds of stuff in NIFs as well, but it’s generally not used that way and for me personally when I have to write things that need threads I’d go the driver route.


26. So do NIFs have some sort of API to the Erlang VM or do they just call it in some way?

Steve Vinoski: There is an API. There are types you have to use to represent Erlang terms within the C code, then there are functions that let you operate on the terms and create terms, and there are functions that help you with allocating memory and those sorts of things. It’s a really nice API and NIFs are pretty simple to write. So I think, as Bob said, you are starting to see more people doing that for certain elements of their computation or the problem they are trying to solve, just because you can go faster sometimes or they need to integrate with something that is written in C. For example, I had to write a UUID NIF just to call the UUID library on Linux for one of my projects, so it’s very simple.


27. So this would be something that has to be called frequently that we couldn’t put in an outside process and communicate with it?

Steve Vinoski: Right, it’s very fast.

Bob Ippolito: Frequently or with a large amount of data, you don’t always want to send megabytes over a socket when you don’t have to.


28. How does the NIF get data from Erlang? Does it get a binary dump of the data or is it a link to the Erlang data, if I pass some data from Erlang into a NIF what do I get?

Bob Ippolito: I believe you get direct references to the terms, so it’s basically zero copy. You may be allocating data on your way out, but on the way in there should be no copies.
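The distinction between passing a reference and passing a copy can be illustrated in Python with `memoryview`. This is only an analogy for the idea Bob describes, not a description of how the NIF API hands terms to C code; the payload contents are made up.

```python
payload = bytearray(b"header:megabytes-of-data")

# A memoryview slice refers to the same underlying buffer; no bytes are copied.
view = memoryview(payload)[7:]

# Converting a slice to bytes allocates a separate object with its own copy.
copy = bytes(payload[7:])

# Writing through the original buffer is visible in the view, not in the copy.
payload[7] = ord("M")
first_in_view = bytes(view[:1])
first_in_copy = copy[:1]
```

With a large payload, the view costs the same as with a small one, while the copy grows with the data, which is the cost Bob is pointing at when he mentions sending megabytes over a socket.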


29. So you can actually fiddle with the immutable code data structures?

Bob Ippolito: I believe that anything is possible in C, but it’s certainly discouraged.

Oct 10, 2011