
Avi Bryant on DabbleDB, Smalltalk and Persistence


1. We are here at QCon 2008 in London. We are sitting here with Avi Bryant. How about you introduce yourself?

Sure. I am Avi Bryant. I am the co-founder and co-CEO of Dabble DB and I am also the creator of the Seaside framework for Smalltalk. So my time is sort of split between working on the product Dabble and working on the framework and platform Seaside and Squeak that the product is on.


2. So, the Seaside web framework on Squeak: what's the chicken-and-egg situation with Seaside and Dabble DB?

It's quite clear, I mean Seaside came first; Seaside originally came about as a framework to support consulting work that we were doing. And I did consulting with the framework for probably three or four years before starting to work on Dabble, and we then really sort of moved the direction of the company towards only doing Dabble, and that's really all our focus right now. But the framework came first.


3. So can I ask how did clients react when you said you wanted to implement something with Seaside or did that not bother them?

Most of the work that we were doing was either for clients who really didn't care about the technology, they just wanted a solution, or for clients who had already made the decision, one way or another, to use Seaside or at least to use Smalltalk, and so it was a natural fit. So that wasn't a problem that we ever had in consulting, though certainly I know it's something that people are very concerned about with unusual technologies. But really, when you are a small company you don't need to find that many clients that are OK with your technology choices. And I think that in some ways it helps if you are in a niche technology, because there aren't so many people they can come to, and in some ways you may be more likely to get work by choosing a less mainstream technology.


4. So I think your background is generally Smalltalk or how did you get to Smalltalk?

Yes, I was a Ruby hacker; this is sort of back in the pre-Rails Ruby days. And an Objective-C hacker, and also getting interested in XP, getting interested in wikis, getting interested in sort of a lot of things whose roots, one way or another, I could trace back to Smalltalk, and so I decided that I wanted to follow the roots back and see what was there. And I really just never left. It was a period where I was doing a lot of exploration in a lot of different languages, and that all just kind of collapsed, the way a wave function does, into Smalltalk. And so for the last seven years or so that's pretty much exclusively what I have been working on.


5. So you're a second generation Smalltalker, not an old graybeard, you're a new generation?

Definitely a new generation, and that's really interesting to see. I've had a lot of people telling me that having Seaside there, having a good solution for doing web development in Smalltalk, has really helped a newer generation of Smalltalkers sort of come about. And also, to some extent, it has had people return to Smalltalk who had maybe left for the greener pastures of Java or elsewhere at one point.


6. It encouraged them to see it's possible again.

Yes, exactly. I gave this talk at Smalltalk Solutions a number of years ago where I basically stole the argument that Paul Graham makes in "Beating the Averages", which is that for web applications it really doesn't matter what technology you use, because the user has no way of knowing. It's not like, you know, Ted [Neward] in the previous interview was saying "Well, you can't deploy an application with Squeak because the user has to bring up the Squeak application and they have no clue what to make of it", and that would be true if you were actually shipping them the application on Squeak. But of course if you are just writing the application on Squeak and users are interacting with it over a web browser, which is the case for Dabble DB, none of our users know that it is Smalltalk, none of our competitors know, none of the people that we talk to. It's actually really interesting, the reaction that we get, because it's sort of disbelief and sort of amusement, but also I think a kind of respect, and I mean people really like the fact that people are still using Smalltalk. Because I think a lot of people have these fond memories of Smalltalk as this system that could have been big and wasn't.


7. You mean old Smalltalkers, or former Smalltalkers, or in general?

Not necessarily, I mean just people who were there, who maybe never used it but thought it was an interesting thing. But people never have a negative reaction to it, which is not necessarily something I expected.


8. OK, so you think the roots of Seaside are in Objective-C based frameworks?

Yes, I mean, as I said, I was sort of an Objective-C hacker at the time, and the Apple (originally NeXT) framework was called WebObjects. It's still around, although Apple ported it to Java, which I think was a mistake. Before Cocoa was really big, and before there were a lot of Objective-C programmers because of that, I think they saw Java as being the future and they ported it to Java. Since then I think Objective-C has really seen a resurgence, because Cocoa has really won as the way of developing applications on OS X, and I think they probably wouldn't have done that if they knew what they know now. But that really sort of killed it in a sense, because all of the Objective-C WebObjects developers who loved it now had to contend with it being in Java. But yes, WebObjects was what really opened my eyes to the idea of having extreme amounts of session state, in the form of a tree of stateful objects that represented parts of the web page, and that by keeping that around between requests you could keep much more information there and have a much higher level of abstraction when you were building a web application than with the more traditional approach, which is to have a small amount of session state, either stored in cookies or stored on the server, but have the vast majority of the state be passed in the URL or in the form parameters. That core idea certainly was taken from WebObjects.


9. This brings us to one of Seaside's distinguishing features: its use of continuations. Did WebObjects use continuations in some way, or did it emulate them?

No, WebObjects didn't use continuations, and Seaside doesn't have to use continuations now either; in the more recent versions of Seaside you can optionally use them where you think they are appropriate. But they are really no longer a core part of the framework the way they were when we started out. And I think part of that is just that web interaction styles have changed a lot. So I guess I should step back and say that the purpose of continuations really was to allow you to express multi-page interactions in a kind of natural, modal way in the code, so that rather than having these very separate pieces of code tied together in a kind of state machine style for the multiple pages, it would look more like putting up a modal dialog: you say "Call this page and get an answer back, and based on that answer call this page and get an answer back, and then keep going". This could just be one method that was expressing this complex flow. The thing is that those kinds of multi-page interactions are getting much less common in web applications because of Ajax; it would be more likely that you would have a complicated piece of JavaScript on the client side, maybe using some asynchronous background requests, that would do this kind of complex interaction, for a shopping cart for example, rather than having multiple pages. And so certainly in Dabble we essentially don't use continuations, although we do use Seaside. The thing that is critical to Seaside, including the recent versions, and I think the one idea that I would like to see wider adoption of, is the use of callbacks rather than sort of meaningful parameters for things like links and form inputs. So rather than worrying about what URL a given link has, you say, much as you would if you were doing, say, Swing development or some other kind of client-side desktop application development, "When someone clicks on this link, here is the closure that I want you to evaluate".
Like an event listener.
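To make the "call this page and get an answer back" style concrete, here is a minimal Python sketch. It is not Seaside code and not from the interview: it uses a generator as a stand-in for a continuation, and the names `checkout_flow` and `run_flow` are hypothetical. A Python generator can only be resumed once from each point, so unlike a real continuation it cannot be re-entered when a user goes back in the browser.

```python
from typing import Generator, List, Optional, Tuple


def checkout_flow() -> Generator[str, str, str]:
    """One function describing a three-page interaction.

    Each `yield` shows a page and suspends the flow; the value sent back
    in is the user's answer from that page.
    """
    address = yield "Enter shipping address"
    payment = yield "Enter payment method"
    confirmed = yield f"Ship to {address}, pay with {payment}? (yes/no)"
    return "order placed" if confirmed == "yes" else "order cancelled"


def run_flow(flow: Generator[str, str, str],
             answers: List[str]) -> Tuple[List[str], Optional[str]]:
    """Drive the flow the way successive requests would.

    Returns the pages that were shown and the final result (None if the
    flow is still waiting for more answers).
    """
    pages = [next(flow)]              # first request renders the first page
    try:
        for answer in answers:        # each later request resumes the flow
            pages.append(flow.send(answer))
    except StopIteration as done:
        return pages, done.value
    return pages, None
```

Calling `run_flow(checkout_flow(), ["12 Main St", "card", "yes"])` walks through all three pages inside a single function body, which is the property the state-machine style lacks.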


10. You associate it with a GUI element.

That's exactly right. So every GUI element in the Seaside application has a listener attached to it, and in Smalltalk that just means it has a block attached to it. And when you submit a request, all that happens is that all of the event listeners associated with whichever links or form fields you submitted get evaluated.


11. So in the background you automatically create some URL that maps to this specific event handler. Does that also include state in some way?

Well, the listener can close over as much state as it needs to. So a block will have references to any server-side state that is necessary, and Seaside also maintains for each session sort of a tree of stateful components, the same way, again, that a desktop application would have a widget tree. And so you've got the stateful widget tree attached to the session, and the expectation is that what these events are going to trigger is probably changes to the state of that tree, and then you have a phase where you go through and render the tree again in HTML. So that's really the cycle in Seaside: you have this tree of widgets, the tree of widgets renders itself, in the act of rendering itself it registers all these listeners, a request comes in and the listener gets triggered, and that changes the tree somehow, so you render it again as some new page.
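The render, register, trigger, render-again cycle described here can be sketched in a few lines of Python. This is an illustration only, not Seaside's API: the class names `Canvas`, `Session`, and `Counter`, and the `?cb=` URL parameter, are all assumptions made for the example.

```python
import itertools
from typing import Callable, Dict, List, Optional


class Canvas:
    """Collects HTML fragments and the callbacks registered while rendering."""

    def __init__(self) -> None:
        self.parts: List[str] = []
        self.callbacks: Dict[str, Callable[[], None]] = {}
        self._ids = itertools.count(1)

    def text(self, s: str) -> None:
        self.parts.append(s)

    def link(self, label: str, callback: Callable[[], None]) -> None:
        # The URL carries only an opaque key; the meaning lives in the closure.
        key = str(next(self._ids))
        self.callbacks[key] = callback
        self.parts.append(f'<a href="?cb={key}">{label}</a>')


class Counter:
    """A stateful component kept on the server between requests."""

    def __init__(self) -> None:
        self.count = 0

    def _add(self, delta: int) -> None:
        self.count += delta

    def render(self, html: Canvas) -> None:
        html.text(f"Count: {self.count}")
        html.link("++", lambda: self._add(1))
        html.link("--", lambda: self._add(-1))


class Session:
    """Holds the component tree and the callbacks from the last render."""

    def __init__(self, root: Counter) -> None:
        self.root = root
        self.callbacks: Dict[str, Callable[[], None]] = {}

    def handle(self, cb_key: Optional[str] = None) -> str:
        if cb_key is not None:
            self.callbacks[cb_key]()      # 1. trigger the listener
        canvas = Canvas()
        self.root.render(canvas)          # 2. re-render the tree
        self.callbacks = canvas.callbacks  # 3. remember the new listeners
        return " ".join(canvas.parts)
```

A request for `?cb=1` after the first render runs the `++` closure and then re-renders, so the page shows `Count: 1` without any code ever inspecting what the URL "means".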


12. So by registering you keep all the state around, keep the closure of the state?

Yes, it is very resource intensive in that you have a lot of state associated with each session and that state in general needs to be kept in memory on the web server.


13. Can you mess with the closures and their state in some way? Can you access them so you can persist them in some other form, for instance to replicate them or store them in a database, or is that not even a problem?

No, I mean that is a reasonable question. It's probably useful for me to talk about how Dabble DB works. So the thing to understand about Smalltalk is that Smalltalk is a virtual machine, and like other virtual machines, VMWare for example, it can snapshot the entire memory state of the running machine. In Smalltalk we call these images, and so if you save an image to disk, that will have all of the in-memory state associated with that process: the current execution stack, the current program counter, and, if it has a GUI, where the mouse was at that point in time, I mean just everything. Just like if you are using Windows on VMWare and then you save it. And so for Dabble DB we have one of these memory images for every customer, for every database. And that includes their data, it includes basically a checkout of the code, and it includes all of their session state. We keep this in memory during the time they are using the application, but we can save it to disk at any point and we do that quite frequently, so if they are idle even for a couple of minutes we basically swap them out to disk, and of course once it is on disk we could migrate it to another server. If that server goes down we could bring it up on a different server, so you have sort of failover and persistence both of the session state and of course of all the other state associated with their database.


14. Can you checkpoint them in some place so you could even if they, if the image is running could you copy it away in some safe place?

Yes, we have very frequent checkpoints as well. And it's very, very fast both to checkpoint and to come back from a checkpoint. I mean basically, almost literally, all that happens when you are checkpointing is a core dump: the entire memory state just gets written to disk, so that's quite fast. And then you just mmap it back into memory when the virtual machine comes up. So these VMs spin up with almost no perceptible load time. And that's why we can get away with just bringing them down whenever possible. So on any given server for Dabble DB we'll have thousands and thousands of these images, for thousands of customers who are at any one time mapped to that server, but maybe only twenty or thirty of them will be in memory. Typically these would be maybe hundred-meg memory images, and we'll have twenty or thirty of them, so we're using 2-3 gigs of memory.
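The checkpoint-and-map-back pattern can be sketched in Python as follows. This is only an analogy and rests on clearly different assumptions: a Squeak image is a raw dump of the VM's object memory, whereas this sketch serializes a Python object with `pickle`; the function names `checkpoint` and `restore` are hypothetical.

```python
import mmap
import os
import pickle


def checkpoint(state, path: str) -> None:
    """Write the whole state to disk, replacing the previous snapshot atomically."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())      # make sure the bytes are on disk
    os.replace(tmp, path)          # readers never see a half-written snapshot


def restore(path: str):
    """Map the snapshot file into memory and rebuild the state from it."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return pickle.loads(view)
```

The figures in the answer are consistent with each other: twenty to thirty resident images at roughly 100 MB each come to about 2 to 3 GB.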


15. So how do you decide when to launch these images? Is there some other web server in front of them?

Yes, Apache is running, and so the request comes in to Apache, and the way that we have it set up is that each of those images has a particular port that it's configured to listen on. And so when a request comes in, Apache will look at the URL and figure out which port it should be proxying the request to, but if nothing is listening on that port it knows it has to spin up the image first. There is a recent project for Rails, written in Ruby, that does similar things, and I am trying to remember the name of it.
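The routing decision described here (map the URL to a port, start the image if nothing is listening) might look like the following Python sketch. The interview does not say how the URL is mapped to a port or how Apache starts an image, so the path-prefix lookup, the `ports` table, and the `start_image` callback are all assumptions for illustration.

```python
import socket
from typing import Callable, Dict


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.2) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0


def backend_port(url_path: str, ports: Dict[str, int]) -> int:
    """Pick the port from the first path segment (an assumed URL scheme)."""
    customer = url_path.strip("/").split("/")[0]
    return ports[customer]


def route(url_path: str,
          ports: Dict[str, int],
          start_image: Callable[[int], None],
          probe: Callable[[int], bool] = is_listening) -> int:
    """Return the port to proxy to, starting the image first if it is not up."""
    port = backend_port(url_path, ports)
    if not probe(port):
        start_image(port)          # hypothetical hook that launches the VM
    return port
```

Passing the probe in as a parameter keeps the decision logic testable without opening real sockets.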


16. Rack, is it?

No, it's something like SwitchPipe, maybe. Yes, that is right.


17. And Peter Cooper I think is working on it.

Yes, so the SwitchPipe architecture actually is the closest thing I have seen to what we do in Dabble DB. It's not identical but it's fairly similar. So you can think of us as having a SwitchPipe in front of potentially thousands of VMs, of which only a handful will be operating at any one time.


18. So you basically hope that not all your users have to access the databases at the same time?

No, I mean we are just relying on the statistics of it. And of course if that happened, then we would have to bring up, we would have to migrate them to other servers. Which is something that we can do very quickly.


19. So you could just get in the new server and copy the images there?

That's exactly right. But so far, the ratio of servers to customers is such that I don't think we have ever seen more than maybe thirty-five running images on any one server, and that is certainly within what we can handle.


20. It is interesting with these images, it sounds like Amazon EC2 with the VMWare images in a way. Just with a twenty-five-year-old technology, Smalltalk.

Exactly. One of the things that is perpetually fascinating is to see how many things that were in Smalltalk twenty, twenty-five years ago are now becoming mainstream and accepted. Obviously just-in-time compilers and bytecode compilation, and I am not saying that Smalltalk invented these things, but they were things that Smalltalk used quite a while ago, and virtual machines and virtual images. Virtualization on the server is something that is very common, and we have virtualization on the server. It's just that the machines that we're virtualizing are Smalltalk machines, they are not Linux machines.


21. You are using Squeak to run Dabble DB. Have you tried using Seaside on other Smalltalk implementations? Are there many Smalltalk implementations?

Yes, there are many Smalltalk implementations, and I think all of them at this point, with the possible exception of VisualAge, have a Seaside port and have people maintaining that port. So VisualWorks has a fairly active community of people using Seaside on it, though I have never used that. GNU Smalltalk has just announced, maybe last week, that they have a Seaside port. Dolphin has a Seaside port that I did a little bit of work on, and that seems to get a certain amount of use; obviously Dolphin is a Windows-only Smalltalk, so there are fewer people writing server applications on that. And Gemstone for the last year has had a Seaside port, and actually, of all the other Smalltalks, the only one that I've ever done any real work with is Gemstone's Seaside. And I think that's a very natural fit, in that Gemstone is a Smalltalk without a UI, it's just a server-side platform, but it's very well suited to the server side, and to have Seaside running on top of that, especially with their persistence engine, makes a huge amount of sense.


22. Well talking about persistence, you do persistence basically by saving images.

The first level of persistence for us is just that we checkpoint the image, because that's something that we can do extremely quickly and so extremely frequently. We do also have background processes that are extracting the data from the images and storing it in other forms that are more compact and don't have all of the baggage that a Smalltalk image carries with it. We can use these for backup, and we can also use these if we need to deploy a new version of the code: we can build a clean image that doesn't have any data in it, push that out to everyone, and then incorporate their data into it to build them a new environment, and that all sort of happens automatically.


23. What does this extracted storage look like? I can't picture it, is it a graph?

Well, it's still a binary format, still a binary representation of objects, but it's one that is not entangled with all of the Smalltalk process machinery and all of the code. I mean, in the Smalltalk image there is a full copy of all of the code of the application, which has some interesting properties. It means that every customer, every one of these thousands and thousands of images, has potentially (we try for this not to be true, but potentially) a different version of the code. And so if we were having a problem with one database in particular, we could just go in and put breakpoints, or test code, or experimental modifications on their version of the code and only their version of the code, which has been in some cases extremely useful.


24. This brings us to a point - how do you push out new versions of say Dabble DB? If you have thousands of images out, do you have to upgrade them?

Right. So we have this version of the data that is separate from the image. And so what we do is just push out a new version of the image to everybody and then combine that with their data, and then they've got a new image. But I mean different people do different things. So there is Monticello, which is a version control system for Smalltalk.


25. I think you wrote it [Monticello]?

Yes, I developed it along with Colin Putney, and it's the most common one used in Squeak right now. It is able to take a running Smalltalk image and update it to a new version of the code, which is to say it might have to remove a bunch of methods that are in the image as well as add methods, and the Smalltalk image has to know how to handle it if you added instance variables for existing objects, how to migrate existing instances from one version to the other. It's a very different problem to push out a new version of the code when you've got these constantly running virtual machines than it is when you kill all the processes, start them up again, and load the code in from scratch.


26. Does updating the class structure work in the Smalltalk world? I mean updating methods works, but instance variables? Does that work?

Yes, the naïve thing that happens in Squeak is that if you add any new instance variables, all existing instances will have them as nil. And so what you tend to do is use a lot of lazy initialization approaches. If you are adding a new instance variable you'll make sure that its accessor has lazy initialization, so that if it finds it nil, it can do the right thing. We also have for Dabble DB a series of migrations that are similar to what you would have for database migrations. When you are using a version of the code for the first time, it will check to see if there are any migrations that need to be applied. And if so it will apply them, and sort of go through all of the existing instances and make whatever modifications are needed for them to work with the new code.
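Both techniques mentioned here, lazily initialized accessors and versioned migrations over existing instances, can be sketched in Python. This is an illustration under assumptions, not Dabble DB's code: the `Musician` class, the two migration steps, and the version numbering are invented for the example, and a missing Python attribute stands in for a Smalltalk instance variable that is nil.

```python
from typing import Callable, Iterable, List, Tuple


class Musician:
    def __init__(self, name: str) -> None:
        self.name = name          # the original code version had only this field

    @property
    def instruments(self) -> list:
        # Field added in a later code version: instances created before the
        # upgrade have no value for it yet, so fill it in on first access.
        if getattr(self, "_instruments", None) is None:
            self._instruments = []
        return self._instruments


def strip_name(m: Musician) -> None:
    m.name = m.name.strip()


def split_name(m: Musician) -> None:
    m.first, _, m.last = m.name.partition(" ")


# Ordered (version, step) pairs; each step upgrades one existing instance.
MIGRATIONS: List[Tuple[int, Callable[[Musician], None]]] = [
    (1, strip_name),
    (2, split_name),
]


def apply_migrations(instances: Iterable[Musician], current_version: int) -> int:
    """Apply every migration newer than current_version; return the new version."""
    instances = list(instances)
    for version, step in MIGRATIONS:
        if version > current_version:
            for obj in instances:
                step(obj)
            current_version = version
    return current_version
```

Lazy initialization handles the simple "new field, sensible default" case with no upgrade step at all, while the migration list covers changes that need to rewrite existing data.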


27. So in Ruby we have a feature called ObjectSpace, do you have that kind of feature in Smalltalk?

That's right you can iterate through all of the instances.


28. Is there a special name or is it just?

Well, the method in Smalltalk is allInstances. So you can send allInstances to a class and you get all of its instances.
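For readers who know neither Smalltalk's allInstances nor Ruby's ObjectSpace, a rough Python analogue is sketched below. The helper name `all_instances` is hypothetical, and the sketch only sees objects that Python's garbage collector tracks, so it is an approximation rather than an equivalent.

```python
import gc
from typing import List, Type, TypeVar

T = TypeVar("T")


def all_instances(cls: Type[T]) -> List[T]:
    """Return the live, collector-tracked objects whose type is cls or a subclass."""
    return [obj for obj in gc.get_objects() if issubclass(type(obj), cls)]
```

A migration step like the ones described in the previous answer could use such a helper to find every existing instance of a class that needs updating.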


29. So you talked a lot about Dabble DB. Can you explain what it actually is? What's its area of usage?

Yes, sorry. So Dabble DB is a tool for collaboratively managing data on the web. And the market really is people who right now are probably using spreadsheets, emailing around spreadsheets, to manage whatever the really specific data is for their organization. So this isn't for sort of generic project management or things that there might be a kind of vertical application for; it's for "I am managing a symphony orchestra, and I need to have a database of who plays what instrument, and which concerts they are going to be playing, and who the donors to the symphony are, and whether there are any relationships between the donors and the musicians that I need to track", and these end up being things that are extremely specific to this organization. But the people, especially somewhere like an orchestra or a small business, don't have a lot of IT support, and they probably don't have a lot of money to maintain a traditional Access database or FileMaker database or something like that. Certainly they don't have anyone who is going to build them a custom web application for it. And so Dabble DB is a tool that lets them collaboratively, online, build a mini application that has their data model, and they can do a lot of the things that they might expect from a custom web application. They can put forms on their website that feed stuff into the database, they can get reports out of it, they can get visualizations like maps or charts out of it, but without having to know how to write any code, and without being forced to make the kind of upfront decisions you might for a database, which are going to cause trouble when they inevitably realize later that they need to extend the system or change the way the system works. And so we worked a lot on having a real-time exploratory interface to the data, and on having very flexible migrations if you need to change the structure of the data model to support that.
And also on having a much deeper notion of data types than most databases do. So for example we have a location data type, where if you type in an address it knows that "Oh, this is an address in New York state". And so if you're grouping by country it will come up in the US, if you're grouping by continent it will come up in North America, if you group by state it will come up in New York, or it can show you a map that rolls up all of your sales state by state, or whatever. All just from putting in an address, which is a kind of data that most people have, but traditionally it would probably just end up as a text field in the database.
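The idea of a location value that can be grouped at several levels can be sketched in Python. How Dabble DB actually resolves an address is not described in the interview, so in this illustration the state, country, and continent are simply filled in by hand; the `Location` class, the `roll_up` helper, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Location:
    """An address together with the coarser regions it belongs to."""
    address: str
    state: str
    country: str
    continent: str


def roll_up(sales: Iterable[Tuple[Location, float]], level: str) -> Dict[str, float]:
    """Total the sales amounts grouped by one level of the location.

    `level` is one of "state", "country", or "continent".
    """
    totals: Dict[str, float] = defaultdict(float)
    for location, amount in sales:
        totals[getattr(location, level)] += amount
    return dict(totals)
```

The same rows can then be grouped by state, country, or continent without the user ever storing those columns separately, which is the contrast with a plain text field.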


30. So basically you mentioned that you are not supposed to program it. And they can't program it, it's not a feature.

Yes, our design rule is that it should not be possible to have a syntax error. And so there are formulas, in that if you have multiple columns of data you can say "I want to create a new column that is this one times this one". Or if you have, for example, a value that is a date range, you can extract a new column that is the duration of that date range, and build up things like that, similar to what you might do in a spreadsheet, but it's all done in an interactive, point-and-click kind of style rather than by typing something in.
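One way to see why a point-and-click formula cannot contain a syntax error is that the user only ever picks from existing columns and a fixed set of operations. The Python sketch below illustrates that idea; it is not Dabble DB's implementation, and the `OPERATIONS` table and both helper functions are hypothetical.

```python
import operator
from datetime import date
from typing import Callable, Dict, List

# The only operations a user can pick; nothing is ever parsed from typed text.
OPERATIONS: Dict[str, Callable] = {
    "times": operator.mul,
    "plus": operator.add,
    "minus": operator.sub,
}


def derive_column(rows: List[dict], new_name: str,
                  left: str, op_name: str, right: str) -> List[dict]:
    """Add a column computed from two existing columns and a chosen operation."""
    op = OPERATIONS[op_name]
    return [{**row, new_name: op(row[left], row[right])} for row in rows]


def duration_column(rows: List[dict], new_name: str,
                    start: str, end: str) -> List[dict]:
    """Add a column holding the length in days of a date range."""
    return [{**row, new_name: (row[end] - row[start]).days} for row in rows]
```

For example, choosing `qty`, `times`, and `price` from menus yields a `total` column, and choosing the start and end of a date range yields its duration.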


31. It's a very interesting application, and you obviously make your living from it.

Yes, it's my focus right now, absolutely. It's what the company is doing.


32. And since it's running on Squeak, it also shows that you can run a platform on Squeak, which isn't particularly known for its stability.

Well, I think people get put off by the way Squeak looks, and reasonably so. It is a real barrier to entry that when you bring up a Squeak image it looks completely different from what you are expecting, and it probably looks like it was geared towards, I mean a lot of the design choices were aimed towards, children rather than professional software developers. On the other hand, the technology is extremely solid. The core VM has a great garbage collector and a very solid implementation. The I/O support isn't as good as, say, the JVM's, I mean the filesystem support is not as good and the socket support probably isn't as performant, but it's certainly good enough, and really we've had no problems with it as a platform. If we had, we would have just moved to a commercial Smalltalk, but so far we haven't had to, and I mean if there were any problems we would have seen them by now.


33. Sounds good. So you also don't miss any kind of concurrency such as threads. I mean Squeak has a thing, processes or something else.

Well, Squeak has green threads, and I think for a web application there's no reason to use anything else. Which is to say that we have at any one time, as I said, twenty or thirty VMs running at once. That makes ample use of however many processors the machine might have. To have real threads, native threads, within one of those VMs would be a waste. I mean there is no point. And the flip side is that because all the threads are within Smalltalk, they are very lightweight, which means you never worry about thread pooling. Because they are extremely cheap to create, you can have designs that, if necessary, spin off two thousand threads, and it just works. I mean this is a lot of the same stuff that people are discovering with Erlang, that having lightweight processes can actually be very valuable. And as long as your architecture is such that you do have a few native processes, so that you can take full advantage of the multi-CPU architecture, I think there is nothing wrong with that.
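The "spin off two thousand threads and it just works" point can be illustrated in Python with asyncio tasks, which are likewise scheduled inside a single OS thread. This is an analogy only: asyncio coroutines are cooperatively scheduled and are not the same thing as Squeak processes, and the function names here are hypothetical.

```python
import asyncio
from typing import List


async def worker(n: int, results: List[int]) -> None:
    """A tiny unit of work that yields to the scheduler once before finishing."""
    await asyncio.sleep(0)
    results.append(n)


async def spawn_many(count: int) -> List[int]:
    """Start `count` lightweight tasks at once and wait for all of them."""
    results: List[int] = []
    await asyncio.gather(*(worker(n, results) for n in range(count)))
    return results


def run(count: int = 2000) -> List[int]:
    """Run the whole batch on one event loop in one native thread."""
    return asyncio.run(spawn_many(count))
```

No pool is needed because creating a task costs little more than allocating an object; multi-core use comes from running several such processes side by side, as with the twenty or thirty VMs mentioned above.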


34. So you don't see any problems with blocking I/O calls or other things?

No, all of the I/O is non-blocking. The VM takes care of that; I mean it looks to your Smalltalk process like it's blocking, but other Smalltalk processes run just fine at the same time. And realistically any one of our VMs is probably rarely running even multiple lightweight Smalltalk processes. In general there are probably only a couple of users at a time using any one database, and each database has its own image and its own VM, so the chances of there being concurrent HTTP requests inside one VM are fairly low. It's not something we enforce, I mean if there are, there are. But it tends not to happen. So it's really just kind of a moot point. I do find, for whatever reason, and I haven't really looked into this, that you get better throughput if you have your requests spread over a number of Squeak VMs. Having twenty concurrent requests being serviced by twenty Squeak VMs on the same machine performs better than having twenty concurrent requests serviced by, let's say, four Squeak VMs on the same machine, even though in theory the four Squeak VMs ought to be able to exercise the four CPUs or whatever the machine has. The concurrency in the Linux kernel works better than the concurrency within a Squeak VM, but that's fine, I mean we just know that and take advantage of it.


35. These twenty images would have to be different applications, different Dabble DB instances? Could be?

Yes, so I should say that what we expect, the profile of customers that we expect is a reasonably small team with a reasonably small data set. So that it makes sense. The number of people concurrently accessing the database is something that can be handled by one Squeak VM, the dataset is something that can all fit into memory, into the Squeak image. And the thing is that there are obviously millions of people for whom that's true. I mean there are millions of people with these small data management problems, where they have under a hundred megs of data, under twenty people that need to access this data, and those are really the people that we are targeting which isn't to say that we don't support people with larger data sets or larger teams, but it's not the majority of our customers.


36. Is there a 64 bit version of Squeak?

There is a 64 bit version, but we use the 32 bit version of Squeak, and so we have sort of a hard upper limit: we can't have an image that gets bigger than four gigs.


37. Is it four gigs or two gigs [for maximum Squeak image size] ?

I believe it's four gigs, but to be honest with you we never come close. The largest images we see are hundreds of megs, not gigabytes.


38. So let's say you would have a big customer who wanted to do some big persistence, you've mentioned Gemstone. Is that Smalltalk?

Yes, so Gemstone is a Smalltalk implementation that is designed so that, rather than having the entire virtual image in memory, it has a virtual virtual image: a sort of infinitely large virtual image that is shared between many different VMs running at the same time, each of which has only its current working set loaded into memory. And so if I am deploying a web application, then I would have however many VMs, again probably twenty or thirty VMs running on a machine, but rather than them mmap-ing the entire image into memory with all the objects, they would send requests for objects to basically a database server as they needed them. And so if one VM is working on one customer's data, or even just a part of the data, it will sort of be bringing in those objects lazily as they get accessed. So if I have one object that refers to another one, it's only when you need to traverse that reference that it would go and fetch it. What this means is that you can have an object space that is terabytes big, and you can spread the load of accessing that object space over however many processes or even machines you need, because you can have many VMs that are all accessing that same shared set of objects. And so that would be the obvious thing. They have done a lot of work recently on supporting Seaside and on supporting Monticello, which is the version control system I mentioned, and generally on supporting a compatibility layer, so loading Squeak code into it is very simple. So it would be totally reasonable for us, I think, to port Dabble to Gemstone, should the need arise to support a very large customer. That's not something that we necessarily have any plans to do right now, because that is not what our customer base is.
But if there were a customer out there who came to us with some real need for a large dataset, then it would make sense, and we could do that.
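The "fetch an object only when its reference is traversed" behaviour can be sketched with a small proxy in Python. This is an illustration of lazy loading in general, not of Gemstone's actual mechanism or API: the `ObjectStore` and `Ref` classes are hypothetical, and a dictionary stands in for the shared object server.

```python
from typing import Any, Dict


class ObjectStore:
    """Stands in for the shared object server; counts how often it is asked."""

    def __init__(self) -> None:
        self._records: Dict[str, Any] = {}
        self.fetches = 0

    def put(self, oid: str, value: Any) -> None:
        self._records[oid] = value

    def fetch(self, oid: str) -> Any:
        self.fetches += 1
        return self._records[oid]


class Ref:
    """A reference that loads its target from the store on first traversal."""

    def __init__(self, store: ObjectStore, oid: str) -> None:
        self._store = store
        self._oid = oid
        self._loaded = False
        self._target: Any = None

    def get(self) -> Any:
        if not self._loaded:
            self._target = self._store.fetch(self._oid)  # fault the object in
            self._loaded = True
        return self._target
```

Each VM then holds only the objects it has actually touched, which is what lets the shared object space be far larger than any one process's memory.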


39. So how different is Gemstone from Squeak? I mean it's quite an old Smalltalk, I think, twenty years?

I'm not sure exactly how old Gemstone is, but it has also been around for quite a long time, and I mean there is certainly the architectural difference that I mentioned, of having many VMs running but all kind of transactionally sharing the same large set of objects, so that's quite different. Squeak expects all of its objects to be in memory. Squeak, I should say, like almost any language you are used to, expects all of its objects to be in memory. Gemstone can swap them in as it needs.


40. I should say Gemstone is also known as an object oriented database.

Yes, but I think it's a little bit dangerous just to think of it as an object oriented database, because it is also a dynamic language VM and the two are very tightly coupled. So I think it's almost better to think of it as a VM that has persistence baked in right from the start. There are object databases out there that are separate from the VM and where you have some kind of a client interface to them, and that ends up being a little bit of a different situation than Gemstone's. Gemstone also is a native code compiler and so performs better than Squeak does. And Gemstone is 64 bit, and I think it's mostly used these days in 64 bit. Squeak is 32 bit. And I mean the obvious difference is that Gemstone is a commercial product and Squeak is open source.


41. So actually we talked about the compatibility of Squeak and Gemstone. How compatible are they? You mentioned porting Dabble to Gemstone. How big of a difference do you think there is?

It seems to me, and I don't have a huge amount of experience with this, that it's pretty easy at this point to commit code from Squeak and load it into Gemstone. Obviously there are some libraries that are available for Squeak and not for Gemstone, and if you were using those it might be a little bit harder. But I mean, I think one of the nice things about the Smalltalk world is that, like the Java world and somewhat unlike the Ruby world, there is a bit of an obsession with keeping everything sort of in pure Smalltalk. And if you need a PDF generation library or an XML parser, rather than linking in some C library you would just build it in Smalltalk, rewrite it in Smalltalk.


42. You named that turtles all the way down.

Yes, I mean that is part of it. And right, Squeak goes to this extent: I mean, the compiler is also written in Smalltalk, the development environment is written in Smalltalk, everything in a Smalltalk image typically is written in Smalltalk. And what this means is that as long as Gemstone or any Smalltalk can understand the kind of trivial stuff, like the format that you commit Squeak source code in, as long as it loads that in, as long as it understands the syntactic peculiarities (so Squeak sometimes uses underscore for assignment instead of colon equals), as long as the parser can deal with that, then you just have to bootstrap however many libraries you need, because it's Smalltalk all the way down. I don't think it would be a massive engineering effort to get Dabble going on Gemstone. I think it would be reasonably straightforward; I mean, it would take a little bit of time.
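
Editor's illustration: the underscore-versus-colon-equals peculiarity mentioned here can be handled by a small source rewrite. The Python sketch below is a toy, not how Gemstone or any real Smalltalk parser does it. The function name `normalize_assignments` is hypothetical, and the sketch assumes the older Squeak convention in which an underscore outside strings, comments and character literals always means assignment (that is, identifiers never contain underscores).

```python
def normalize_assignments(source: str) -> str:
    """Rewrite `_` assignments as `:=`, leaving literals and comments alone."""
    out = []
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if ch == "$" and i + 1 < n:
            # Character literal such as $_ or $': copy both characters as-is.
            out.append(source[i:i + 2])
            i += 2
        elif ch in ("'", '"'):
            # String literal ('...') or comment ("..."): copy verbatim up to
            # the closing delimiter, or to the end if it is unterminated.
            end = source.find(ch, i + 1)
            if end == -1:
                end = n - 1
            out.append(source[i:end + 1])
            i = end + 1
        elif ch == "_":
            # Assumption: a bare underscore in code is an assignment.
            out.append(":=")
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Made-up sample line mixing both assignment styles, a string and a comment.
sample = "x _ 3. y := 'a _ b'. \"c _ d\" z _ $_"
converted = normalize_assignments(sample)
```

A doubled quote inside a string (`'it''s'`) is treated as two adjacent strings, which still copies the text through unchanged. Code that uses underscores inside identifiers would need a real tokenizer instead.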


43. So, I will ask our audience, are there any questions? So, rolling all the way back to when you were talking about how you deploy Seaside with Dabble: is that a standard or common deployment process?

No, I don't believe so. I know of one other startup that does use something quite similar, but that's because they talked to us about it. And I think probably it's more common for people to use, for example, a relational database for storage, or kind of an external object database, like OmniBase, which is one that is available for Smalltalk, or GOODS, which is kind of a language agnostic object database that I wrote a Smalltalk client for a while ago. And probably just to have four or five Squeak VMs running on a server, multi-threading, something that looks a little more traditional, more like a Ruby deployment; I think that is more common. But we have a very particular problem with very particular constraints and possibilities because of how partitioned our dataset is, and that's really the thing: because of the nature of our business we have thousands upon thousands of separate small datasets, and that lets us have a very different architecture than someone doing, like, a social network, which has one massive, totally interconnected dataset.


44. OK, so now on to the idea of scaling: you said that 99.9% of your customers fit into that... have low interaction. But when you get that one customer who wants to hit one VM a lot, how do you scale that one VM?

I mean, one answer is that that one customer is not someone that we want, right? And it certainly would be a reasonable business decision simply to say these are the customers that we can support and these are the one percent of the customers that we are not going to support. In practice there haven't been any problems like that.


45. You haven't sent that email out yet?

Yes, exactly. We haven't had to make that decision yet, and I hope that if that did happen we would find ways to accommodate them rather than having to tell them to go away. But part of being a business is figuring out who your market is and who it isn't. And if we have to do that, we have to do that.

Jul 21, 2008