Debasish Ghosh on Functional Programming, NoSQL

1. We’re here at the QCon conference, I’m speaking with Debasish Ghosh and we are going to talk about functional programming and you are going to tell us first your credentials, who you are and all the things you’re proud of.

Ok, I’m Debasish, I live in India and I work for a company named Anshinsoft, which has recently been taken over by the Nomura Research Institute group of companies, so logically I am now an employee of the Nomura Research Institute group. In my day job we develop back office systems for trading and settlement, mainly on a JVM stack. We have been doing this for the last 10 years and, as I speak now, 10 to 12 installations of the back office systems we have developed are live all across the world. Besides my day job I am an avid follower of functional programming. I wrote a book a couple of years back on domain specific languages, it’s called DSLs in Action, and apart from functional programming I am also interested in some parts of theoretical computer science as well as NoSQL databases, so my interests are a bit varied.

2. Do you get to use your varied interests in the work that you do?

I try to do that. Currently we are evangelizing functional programming at our workplace and we have started the process of moving some parts of our software to Scala, which is an object functional programming language on the JVM, so from that point of view it’s true. Also, about a year back we did, as part of my day job, some work with MongoDB, because we needed to process lots of document oriented data for which we thought that Oracle, the database we currently use, might not be the best choice. So we figured out an architecture where Oracle and MongoDB could coexist, integrated through a messaging bus.

3. In what sense was Oracle not the best choice to use exclusively?

Because Oracle is a relational database and the kind of data processing we were planning to do for that particular subsystem involved semi structured data. The schema was not rigid, the schema needed to be flexible, so it ideally suited a document oriented database, and at that point in time we decided to use MongoDB and we had a nice hybrid architecture consisting of a document database and a relational database. In fact I wrote a paper in ACM about this architecture at that point in time, and the architecture was appreciated because it’s polyglotism for the data part: for those parts of the application where we needed a rigid schema we had the relational database, for the other part we had MongoDB, and we could integrate the two nicely in terms of our architecture.

4. Now you say nicely, so I assume you achieved many or most of the goals you set out to achieve. Were there many pain points in doing that, was it a difficult job, or is it something that in retrospect you can say “I could do that again”?

Yes, in retrospect I think I can say that I would do that again, because the data requirements fit nicely with the use cases and we didn’t try to force fit a relational model onto data which is not relational in nature. So if I am asked to do it today I would do the same thing once again.

5. Let’s move to functional programming, which is one of the things that I love to find out about, I can’t say I know that much about it. To what extent is the functional paradigm helping you with the problems that you face in the real world?

Yes, functional programming needs a different way of thinking than object oriented programming. First of all, in order to use functional programming you need to model your behaviors as pure functions. That, I think, is a radically different kind of thought process from what I do with an object oriented model. In a particular application, most parts of the application consist of pure behaviors, which we can model through pure functions, and the rest of the parts are side effects. So I think it is a huge thing if the language that I’m using, or the type system that is part of the language, helps me decouple the side effecty parts from the pure parts. That, I think, is a huge advantage of functional programming, because the pure part of your application, which you model as pure functions, you can reason about much more easily if it doesn’t have any side effects. You can write the unit tests much more easily, you can write properties and feed those properties into your QuickCheck kind of test model. So this I think is the most important part, the most glaring advantage that I find with functional programming: it helps keep most of your model very pure.

6. In a typical enterprise application, what you are saying is that a large percentage of your model can be represented with pure functions. […] How do you separate out the pure functions from the side effect parts and get a rich enough volume of material that you have to deal with into the pure functions?

Barry's full question : In a typical enterprise application, what you are saying is that a large percentage of your model can be represented with pure functions. Convince me of that somehow, I’m stuck in the imperative paradigm, the Von Neumann scheme, do this then do that, it’s difficult for me to think in terms of pure functions, although I try my best. Aren’t most applications a bunch of side effects? How do you separate out the pure functions from the side effecty parts and get a rich enough volume of material that you have to deal with into the pure functions?

Yes, the most important part here is to isolate the domain model, to have a clean domain model. A domain model should model the pure behaviors of the system, there is no database there. When I’m talking about a trading system, there are a few steps, a few things that the domain model needs to do as part of the regular business logic. So I can isolate those parts as pure functions and at the end of it I can do the unsafe parts, which consist of writing into the database, writing into the file system or interacting with the user, because these are the things which are not pure. So behind the impure part, I can have the pure part modeled as a combination of pure functions which can be glued together using combinators like map or bind. As I was saying in my talk the day before yesterday, we can view these functions as being glued together by means of combinators.
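
Editor's note: the split described here can be sketched roughly as follows. This is only an illustration in Scala, not code from the interview or the talk; the `Trade` and `Settlement` types, the flat commission rate and the `println` standing in for a database write are all assumptions made for the example.

```scala
// Hypothetical trade-processing domain; every name here is illustrative.
final case class Trade(account: String, instrument: String, quantity: BigDecimal, unitPrice: BigDecimal)
final case class Settlement(account: String, netAmount: BigDecimal)

object TradeDomain {
  // Pure: the result depends only on the argument.
  def principal(t: Trade): BigDecimal = t.quantity * t.unitPrice

  // Pure: an assumed flat 0.1% commission.
  def commission(amount: BigDecimal): BigDecimal = amount * BigDecimal("0.001")

  // Pure: reject non-positive quantities without throwing.
  def validate(t: Trade): Option[Trade] =
    if (t.quantity > BigDecimal(0)) Some(t) else None

  // Pure steps glued together with map; flatMap (bind) would chain
  // steps that can themselves fail.
  def settle(t: Trade): Option[Settlement] =
    validate(t)
      .map(principal)
      .map(p => p + commission(p))
      .map(net => Settlement(t.account, net))
}

object TradeApp {
  // Impure edge: the only place that performs an effect.
  // println stands in for a database write.
  def main(args: Array[String]): Unit = {
    val trade = Trade("ACC-1", "IBM", BigDecimal(100), BigDecimal("12.50"))
    TradeDomain.settle(trade).foreach(s => println(s"persisting $s"))
  }
}
```

Everything in `TradeDomain` can be unit tested or property tested without a database, which is the advantage discussed in the previous answer.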

And the advantage with functional programming is that it gives you enough of a mathematical basis. These things like functors, applicatives, monads, monoids all have a basis in set theory and algebra, so from that point of view, if you can implement your abstraction as an applicative or as a monoid, you get tons of things for free, you don’t need to write that boilerplate code as part of your application logic. So your application logic becomes pure, it only focuses on the business part, it only focuses on the domain logic itself, so the surface area of your API implementation reduces, and the person who is reading through your code, be it the user or the maintainer, has less code to read, less code to understand, because the boilerplate stuff or the glue parts are all taken care of by the abstraction which lies behind. Consider the example of a monoid: a monoid has an identity and it has an associative binary operation.

So conceptually it’s very simple, but if you can implement your abstractions in terms of a monoid, you get these kinds of things for free. The example that I was giving in my talk is a use case that comes up very frequently in any enterprise application: validation, validating any business object, validating any domain abstraction. Ideally, when I am looking at a screen, I have a bunch of things to fill out, and when the user submits, you would expect all validation errors to come at once and not one by one. It’s very easy to implement even with an imperative mindset, but the problem is that things like the accumulation of validation errors get entangled with the business logic, with the domain logic. If I can model it as a semigroup, if I have a semigroup behind, that semigroup can take care of the appending part. So these are simple abstractions, monoid or semigroup, but they can play a major role in removing the boilerplate stuff from your domain model.
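
Editor's note: a minimal Scala sketch of the idea, under stated assumptions. The `Semigroup` and `Monoid` traits below are hand-rolled for the example (libraries such as Scalaz or Cats provide their own versions), and the `Account` form and its rules are hypothetical.

```scala
// A semigroup only knows how to combine two values associatively.
trait Semigroup[A] {
  def combine(x: A, y: A): A
}

// A monoid is a semigroup that also has an identity element.
trait Monoid[A] extends Semigroup[A] {
  def empty: A
}

object Instances {
  // List concatenation is associative and Nil is its identity.
  def listMonoid[A]: Monoid[List[A]] = new Monoid[List[A]] {
    def empty: List[A] = Nil
    def combine(x: List[A], y: List[A]): List[A] = x ++ y
  }
}

// Hypothetical form object, for illustration only.
final case class Account(no: String, name: String, openingBalance: BigDecimal)

object AccountValidation {
  type Errors = List[String]

  // Each rule is pure and knows nothing about how errors are accumulated.
  val rules: List[Account => Errors] = List(
    a => if (a.no.isEmpty) List("account number is required") else Nil,
    a => if (a.name.isEmpty) List("name is required") else Nil,
    a => if (a.openingBalance < BigDecimal(0)) List("opening balance must not be negative") else Nil
  )

  // Generic glue: fold any list of values using the supplied monoid.
  def combineAll[A](xs: List[A], m: Monoid[A]): A =
    xs.foldLeft(m.empty)((acc, x) => m.combine(acc, x))

  // The monoid does the appending; Left carries every error at once.
  def validate(a: Account): Either[Errors, Account] = {
    val errors = combineAll(rules.map(rule => rule(a)), Instances.listMonoid[String])
    if (errors.isEmpty) Right(a) else Left(errors)
  }
}
```

The rules never mention appending, and `combineAll` never mentions accounts, so the accumulation logic stays out of the domain code.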

7. Now, are there applications for which functional programming doesn’t pay off, where the amount of access to a database or access to the user screen is so heavy that most of the work that you are doing can’t be represented as pure functions?

As long as you have a decent domain model, as long as you have some domain logic, you can at least take that domain logic out of the rest of the application and model it as pure functions using functional programming. So I don’t really see any meaningful application which cannot be modeled using functional programming. Maybe for some parts, such as a rule based system, logic programming can be much more effective, but as such I don’t see any meaningful application to which you cannot apply functional programming.

8. Do you find, as an advocate for functional programming, that there are people who just can’t get it, who just can’t get their heads wrapped around that way of thinking, and who even if they see clear examples still refuse and still just don’t?

I think this is more of a mindset kind of thing, personally I find mastering the internals of frameworks, big Java frameworks, like Spring and Hibernate, much more difficult than understanding the basics of functional programming.

Barry: The level of abstraction that one has to start using with functional programming is higher.

You need to devote time, it can’t be mastered in a day, but I don’t think the basics of functional programming are so difficult that one cannot master them given enough time. If I can master how to optimize a Hibernate based or a Spring based application, then I definitely should be able to understand what functional programming means, or at least the basics of it.

9. […] “You don’t have to waste your time describing Haskell, we’re ok with that, but we don’t understand monads, we want you to tell us about monads and what they are, and why they get around the problem of side effects”. […]

Barry's full question : I, at one time, pretended to be able to teach a Haskell course. The people who came into the course started by saying “You don’t have to waste your time describing Haskell, we’re ok with that, but we don’t understand monads, we want you to tell us about monads and what they are, and why they get around the problem of side effects”. And I didn’t have an effective answer, so that’s my question for you. Do you need a blackboard?

No, it’s ok. Actually, the basic thing is, don’t consider monads to be a separate beast. They are yet another one of the abstractions which functional programming offers you. Monads offer you the convenience that you can bind separate computations together, and the type system can help you in isolating side effects from the pure functions. So when I have IO, my type system tells me that’s an IO, and if I do not annotate my types with IO, the compiler will not allow me to do any I/O related stuff within that function. So in this sense the isolation, the decoupling part, is provided by the type system itself: it enforces that you cannot do any I/O stuff within this function unless the return type is an IO kind of thing.

10. So it sounds as if it does for side effects what strict static typing does for ordinary imperative programming, in a way. It makes it so that the compiler or some agent within the computation can check to make sure that you’re not doing something wrong.

Yes, it’s the type system that ensures that you need to annotate your type with an IO if you are doing any I/O within the program. So if by mistake you bring some I/O stuff within your function but you forget to annotate it with an IO, in that case the compiler complains. It’s really this effect system, the IO effect, which forces you to have that IO effect as part of the function signature in order to do I/O within that program. So from this point of view it forces you to mark that part of the program as side effecting.
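
Editor's note: the answer above describes Haskell, where the compiler enforces the separation. The Scala sketch below only imitates the idea with a hypothetical, hand-rolled `IO` type; Scala's compiler does not stop a plain function from performing effects, so here the discipline is a convention, and all names are assumptions for the example.

```scala
// A hypothetical, minimal IO type: a description of an effect, not the effect itself.
final class IO[A](val unsafeRun: () => A) {
  def map[B](f: A => B): IO[B] = new IO[B](() => f(unsafeRun()))
  def flatMap[B](f: A => IO[B]): IO[B] = new IO[B](() => f(unsafeRun()).unsafeRun())
}

object IO {
  // The by-name argument is not evaluated until unsafeRun is called.
  def delay[A](a: => A): IO[A] = new IO[A](() => a)
}

object Pricing {
  // Pure: no IO in the signature, so callers can assume there are no effects.
  def applyDiscount(amount: BigDecimal): BigDecimal = amount * BigDecimal("0.9")

  // Effectful: the return type announces that running this touches the outside world.
  def readAmount(source: () => String): IO[BigDecimal] =
    IO.delay(BigDecimal(source()))

  def report(amount: BigDecimal): IO[Unit] =
    IO.delay(println(s"discounted amount: $amount"))

  // map lifts the pure step into IO; flatMap binds the next effect.
  // Nothing runs until someone calls unsafeRun on the result.
  def program(source: () => String): IO[BigDecimal] =
    readAmount(source)
      .map(applyDiscount)
      .flatMap(discounted => report(discounted).map(_ => discounted))
}
```

Calling `Pricing.program(...)` only builds a description; the read and the print happen when `unsafeRun()` is invoked at the edge of the application.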

Barry: And to announce that you are doing that to the system.

Yes, that’s true. But it’s not only I/O, monads are an abstraction.

Barry: I/O is the generic, the easiest to understand example.

Yes, that’s true. But monads are a generic abstraction which allows you to do a couple of things that help you compose larger abstractions from smaller ones. There’s a hierarchy: there’s the functor, then we have the applicative, which is a beefed up functor, then there’s the monad, which is a beefed up applicative. So a monad specializes an applicative. There are monads and there are applicatives, you can use either of them, and the part that we need to take care of is that you apply the specific model of computation which is most suitable for the model that you are defining. For example, applicative gives you this sequencing effect, so in the validation example that I was talking about, you need to sequence through all the validations, and there applicative is the suitable model. But if the part of the computation that you are trying to model is such that you need to fail fast, on the first issue you bail out, then it will be a better candidate for a monad. So it’s really the model of computation that suits your exact need.
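
Editor's note: the two models of computation can be contrasted in a few lines of Scala. The checks are hypothetical, and the `accumulate` function is a hand-written stand-in for what an applicative validation type from a library would provide.

```scala
object OrderChecks {
  // Hypothetical checks; each returns Right on success or Left with one error.
  def positive(qty: Int): Either[String, Int] =
    if (qty > 0) Right(qty) else Left("quantity must be positive")

  def knownSymbol(sym: String): Either[String, String] =
    if (sym.nonEmpty) Right(sym) else Left("instrument is required")

  // Monadic style: flatMap stops at the first Left, so later checks never run.
  def failFast(qty: Int, sym: String): Either[String, (Int, String)] =
    for {
      q <- positive(qty)
      s <- knownSymbol(sym)
    } yield (q, s)

  // Applicative style: both checks are independent, both always run,
  // and their errors are appended.
  def accumulate(qty: Int, sym: String): Either[List[String], (Int, String)] =
    (positive(qty), knownSymbol(sym)) match {
      case (Right(q), Right(s)) => Right((q, s))
      case (a, b)               => Left(a.swap.toOption.toList ++ b.swap.toOption.toList)
    }
}
```

With an invalid quantity and an empty symbol, `failFast` reports only the quantity error, while `accumulate` reports both.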

Barry: Now, I understand that the functional paradigm helps enormously with parallel computation.

Yes, because of this purity thing, you can reorder certain operations, so you can distribute them much more easily across your cores. Since functions are pure, they depend only on the input, they don’t depend on any external context. So you can run them as many times as you wish, you can run them independently, you can run them in parallel.

11. And in practice, does that situation where you can make use of that parallelism show its face very frequently when you’re writing a function?

Absolutely, with the proper type of abstractions. Rich Hickey was talking yesterday about reducers, which he designed in Clojure, and within the reducer implementation he’s doing a Fork/Join kind of thing. And he can do that Fork/Join thing because he is dealing with pure functions: he can distribute the work across the various cores without the fear that some external context will interfere. In Scala also, for example, if you are processing a Scala collection and the processing is pure, you can just do a .par and the processing will be parallelized.

Barry: Sounds like magic.

Sounds like magic, right. Internally it’s also doing the same kind of thing, a Fork/Join kind of thing, distributing the tasks across the various cores.
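
Editor's note: since Scala 2.13 the `.par` method lives in the separate scala-parallel-collections module, so this sketch uses standard-library Futures instead to show the same point: a pure function can be fanned out across threads and still give the sequential answer. The `accruedInterest` function and its 5% rate are made up for the example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelPricing {
  // Pure: depends only on its input, so calls can run in any order on any core.
  def accruedInterest(principal: Long): Long = principal * 5 / 100

  // Sequential baseline.
  def sequential(xs: Vector[Long]): Vector[Long] = xs.map(accruedInterest)

  // Each element is evaluated in its own Future on the global thread pool;
  // Future.traverse collects the results in the original order.
  def parallel(xs: Vector[Long]): Vector[Long] =
    Await.result(Future.traverse(xs)(x => Future(accruedInterest(x))), 10.seconds)
}
```

Because `accruedInterest` reads no shared state, `parallel` and `sequential` return the same vector regardless of how the work is scheduled.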

12. Ok, so we’ve talked a bit about NoSQL, MongoDB, functional programming, what else are you excited about these days?

DSLs - domain specific language.

Barry: Ok, tell me a little bit about that, about your interest in that.

Actually, my main interest in domain specific language came out of an urge to write expressive APIs. My idea was that in my regular day job, since I’m designing trading systems, settlement systems, there are lots of business analysts with whom we need to interact because they are the persons who understand the business, they will give us the requirements. But I should be able to write some code, the core business logic which the business analyst should be able to read and verify. That was the thing which I targeted for, that was the thing which excited me most.

Barry: This is an old goal, by the way.

This is an old goal.

Barry: I’m thinking of Cobol, I’m thinking of all kinds of efforts.

Yes, but here what we are doing is separating the entire thing into a domain model, which is an abstraction of the business, and on top of this I’m giving a façade of linguistic abstraction which I call the DSL, which interacts in a very user friendly way with the user. So behind the DSL, behind the expressive API, there is a complicated implementation, but the business user doesn’t need to bother about that implementation. I need to carve out that API, I need to design that API in such a way that it appeals to the business user, that it sounds like a natural language to him. It’s really what Eric Evans calls the ubiquitous language of the domain that needs to be expressed, so that the business user is able to understand it.
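
Editor's note: a very small Scala sketch of a domain model with a DSL-style façade on top. The order vocabulary (`buy`, `sell`, `sharesOf`, `atLimit`) is invented for illustration and is not the DSL built by the interviewee's team.

```scala
// Hypothetical domain model: the abstraction of the business.
final case class Order(side: String, quantity: Int, instrument: String, limit: Option[BigDecimal] = None) {
  // Reads as a trailing clause: "... at limit 120.50".
  def atLimit(price: BigDecimal): Order = copy(limit = Some(price))
}

// Façade: a thin vocabulary layered over the model.
object OrderDsl {
  // Intermediate builder so that an order reads left to right.
  final class Quantity(side: String, quantity: Int) {
    def sharesOf(instrument: String): Order = Order(side, quantity, instrument)
  }

  def buy(quantity: Int): Quantity  = new Quantity("BUY", quantity)
  def sell(quantity: Int): Quantity = new Quantity("SELL", quantity)
}

object OrderDslExample {
  import OrderDsl._

  // The sentence a business analyst would read and verify.
  val order: Order = buy(100).sharesOf("IBM").atLimit(BigDecimal("120.50"))
}
```

The case class can change its internal representation without affecting the sentence a reviewer reads, which is the separation between model and linguistic façade described above.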

13. Do you have experiences, positive and negative, either side of this?

Yes, I have positive experiences. We had a domain specific language for writing tests, and the business users were able to carve out unit tests independently of the developers, just by having a look at the APIs and what they do.

14. And did it take an enormous amount of work to create the DSL in order to have this happen?

It took some amount of time, but for us, since the project was a big one, it paid off. It may not pay off if it’s a very small project, but in our case it’s a big project, hundreds, thousands of man years of effort, so we invested in this to come up with this kind of a language.

15. So, you’re a real life practitioner, you’re out there in the field, you also have these interests in what I consider to be fascinating theoretical avenues, where do you see the world going in terms of IT, computing, languages, all that stuff? Where’s your crystal ball?

I’m quite excited about functional programming making its way more and more into the mainstream. A couple of days back I checked the TIOBE index and found that Haskell is just lurking around that top 20 range.

Barry: I like that.

This is a huge step, I think it ranked around 25 or 30, something around that, so this is what excites me right now, I’d love to see more and more of functional programming being mainstream because it allows me to express the domain model, express the domain logic succinctly, correctly and it allows me to reason about my model. So that’s the thing which I would like to see more.

Barry: Ok, I think it’s a great vision, thank you so much for being with us.

Thank you, it was a pleasure.
