Bio Sadek Drobi, CTO of Zengularity, is a software engineer specialized in design and implementation of enterprise applications with a focus on bridging the gap between the problem domain and the solution domain. He works on the design and implementation of the Play framework as well as prismic.io ( http://prismic.io/ ). twitter: @sadache blog: http://sadache.tumblr.com company: www.zengularity.com
I’m Sadek Drobi and I’m CTO of a company called Zenexity. People probably know me from the Play Framework, which we created, especially Guillaume Bort, the founder of the project, and me. I was especially involved in Play 2, which we implemented mostly in Scala, with some Java, to come up with a web framework for the JVM. So that is basically what I’m known for, and that is what I have been working on for the last year.
Some of them are purely functional programming. Basically I showed the JSON parser, which is purely an Applicative, and in some cases maybe an Alternative; these are functional programming abstractions. I didn’t mention that they are Applicative because that tends to scare people a little bit, but Applicative allows us to take individual parsers and combine them into bigger parsers, and so on. These properties are very nice because they are composable. Imagine that you have JSON: instead of putting all the parsers in one place, you can put them into variables, pass them around and reuse them. It is in a way as flexible as lists and the functions on them, and parsers, or parser combinators, are very flexible as well. These kinds of properties are purely functional programming concepts applied to some problem. Of course there is a lot of reactive programming too, but I used Iteratees and Enumeratees to show how these things can be interesting in the real-time web, where you want to stream data, push it into streams, and filter and manipulate these streams using the same kinds of abstractions that are also used for parsing. It’s all about functional programming: use it for a list, use it for a stream of data, use it for parsing JSON. All these things are simply functional programming concepts.
4. The functional reactive part, you mentioned Iteratees and all these things, how do they help with pulling data from multiple sources, what is the functional bit in there, isn't that mostly just imperative?
Take a very small example. If you get a list in Java and want to transform it, you write a loop: you create a new list, and for each element you make some transformation and add the result to that list. That is a very imperative way of doing it. With functional programming you say “List.map” and you give it a function that describes exactly how to transform every element. That is a higher-order function, and that leads us to Monads and all these kinds of things. The interesting thing about it is that it is higher order, and there are other higher-order concepts that help us do this kind of transformation and composition of different tools to make a parser. For instance, JSON could have several parts in it. Let’s say in a tweet you have the main information, but then you have media attached. You can write different parsers for these two parts and then combine them, saying: “My JSON is both of these.” Someone else might take the media part and another part and say: “This is my JSON, rather than that one.” And even inside this small media part there can be different parsers combined: it parses a string, a string and a third string, and that makes the media part, and so on. So that is the functional part about it.
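To make the tweet example concrete, here is a minimal sketch of composable readers in Python. It is an illustration only and not Play's Scala JSON API: the helper names `field`, `at` and `combine`, the field names in the sample tweet, and the convention that a reader returns `None` on failure are all assumptions made for this example.

```python
from typing import Any, Callable, Optional

# A reader takes an already-decoded JSON value and returns a result,
# or None when the input does not have the expected shape (assumed convention).
Reader = Callable[[Any], Optional[Any]]


def field(name: str, expected_type: type) -> Reader:
    """Reader for a single typed field of a JSON object."""
    def read(js: Any) -> Optional[Any]:
        if isinstance(js, dict) and isinstance(js.get(name), expected_type):
            return js[name]
        return None
    return read


def at(name: str, inner: Reader) -> Reader:
    """Apply another reader to the value stored under `name`."""
    def read(js: Any) -> Optional[Any]:
        if isinstance(js, dict) and name in js:
            return inner(js[name])
        return None
    return read


def combine(build: Callable[..., Any], *readers: Reader) -> Reader:
    """Applicative-style combination: run every reader on the same input
    and call `build` with all results only if every reader succeeded."""
    def read(js: Any) -> Optional[Any]:
        results = [reader(js) for reader in readers]
        if any(result is None for result in results):
            return None
        return build(*results)
    return read


# Small readers live in variables and can be reused in bigger ones.
media_reader = combine(
    lambda url, kind: {"url": url, "kind": kind},
    field("url", str),
    field("type", str),
)
tweet_reader = combine(
    lambda text, user, media: {"text": text, "user": user, "media": media},
    field("text", str),
    field("user", str),
    at("media", media_reader),
)

tweet = {
    "text": "hello",
    "user": "sadache",
    "media": {"url": "http://example.com/a.png", "type": "photo"},
}
parsed = tweet_reader(tweet)
```

Someone who only cares about the media part can reuse `at("media", media_reader)` on its own, which is the kind of recombination described above.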
5. You were hosting the (Post)Functional track here and you had people like Erik Meijer who promotes the Reactive Extensions, do you like Reactive Extensions, their approach or is your approach different?
This is a very interesting area. We are talking about streams of data and how we can handle them, and for me the point is to handle them more declaratively, like we would handle a list: filter, map, whatever it is, and we get what we want. Which solution is better? You need to make a trade-off and say: “I’m going to accept these properties and throw away those properties” so you can get what you want. To me it is unclear for now what the best approach in this area is. You not only have Reactive Extensions and Iteratees, you also have a lot of other abstractions, like Conduits, Pipes, Machines and many other approaches, trying to solve exactly this problem but from different perspectives.
If I look at the Reactive Extensions concept, it’s slightly simpler to understand because it’s smaller. Iteratees tend to be a bit more powerful, and they can do things that you can’t do with Reactive Extensions as implemented, but they are a bit harder to understand: you need to understand the plumbing inside, and you need to think a little bit about these abstractions. So that is the trade-off. Some properties that Iteratees have and Reactive Extensions don’t are interesting and very important for us, and that is why we use this abstraction. Basically, when a producer sends a message to a consumer, the consumer has the opportunity to tell the producer whether it is ready for the next message. That has been fundamental for us in implementing a web server that doesn’t leak memory when you have a lot of throughput, and these kinds of problems.
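The idea that the consumer tells the producer whether it wants another message can be sketched as follows. This is a synchronous simplification in Python, not Play's Iteratee API; the names `Done`, `Cont`, `take` and `run` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from itertools import count
from typing import Any, Callable, Iterable, Tuple, Union


@dataclass
class Done:
    """The consumer is finished and does not want any more input."""
    result: Any


@dataclass
class Cont:
    """The consumer is ready: call `feed` with the next element."""
    feed: Callable[[Any], "Step"]


Step = Union[Done, Cont]


def take(n: int, acc: Tuple = ()) -> Step:
    """A consumer that accepts exactly n elements and then reports Done."""
    if n == 0:
        return Done(list(acc))
    return Cont(lambda item: take(n - 1, acc + (item,)))


def run(source: Iterable, step: Step) -> Tuple[Step, int]:
    """A producer that pulls from `source` only while the consumer is in
    the Cont state, so nothing is produced that nobody is ready for."""
    pushed = 0
    iterator = iter(source)
    while isinstance(step, Cont):
        try:
            item = next(iterator)
        except StopIteration:
            break
        step = step.feed(item)
        pushed += 1
    return step, pushed


# The source is infinite, yet the producer stops after three elements
# because the consumer answered Done.
final_step, pushed = run(count(), take(3))
```

Because the producer checks the consumer's state before producing each element, an unbounded source never piles up unconsumed messages in memory.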
Werner: It seems that essentially what you are trying to do is to move away from telling the computer what to do and just declaring, making declarative statements, saying: “Pull these things in and …”.
I like to have an idea of what the computer would do in this kind of situation when I ask it to do something. Erik [Meijer] likes to talk about leaky abstractions, and I like that: when I see filter, I’m not completely ignorant of how it would happen. I know the mechanics, maybe not the details, of how it would filter a stream. And since I have some idea of that structure and the way it functions, I don’t want to write that code. I just want to say: “Filter, and this is the predicate you have to use” and then everything is taken care of. Especially with reactive streams you have a lot of concurrency problems that you need to solve, and as a developer I don’t want to be writing this kind of stuff.
Werner: Erik also mentioned that Netflix uses Rx and they basically use it because they don’t want the large group of developers to deal with threads and locks and stuff like that and move all that, the dangerous parts into Rx.
Exactly, and that is implemented in the library. You don’t want to worry about it, because even smart developers will make a lot of mistakes and introduce bugs in a very concurrent environment.
Werner: Nobody wants to do all the boring locking and things like that; you want to do interesting things, you want to solve a problem, not fiddle with bits.
There are some people who are interested in solving concurrency problems, but it’s not one day’s work; it takes a lot of time to ensure that this thing works properly, the way you expect. And once you have implemented it, you don’t want to implement it a second time for the same kind of thing. If you are filtering, you don’t want to write the whole code again and again.
Werner: And those developers should work on the implementation of Rx or other things, so you can reuse it.
Yes in libraries, so that other developers can take advantage of this kind of work.
6. So talking about solving problems, you mentioned to me earlier that you don’t like the way content management systems (CMS) work, so you created your own one. Did you create your own [CMS] or did you create something similar?
That’s a very big change of topic. Basically, in all our projects you have content everywhere; no matter what your business is, you’ll have some content. The way it works today, you pick a CMS of your choice, and I’m not going to give names, but often I guess they are trying to solve too many problems. They are not focused; they try to solve any problem that happens to be around them and keep adding features. I guess we should step back and see what the hard problems are that we need to solve in content management, and what is easy to solve, and leave that out.
Well, that is what people are doing today because of the constraints that you have in a CMS, because you can’t separate content from presentation, but it’s a very bad thing. In the content industry, for instance NPR, the radio network, has content and manages it in exactly that chunked way. They say: “OK, our content should be structured; we should have structured content chunked into different pieces, and that is the content.” Now they need to integrate it with a lot of environments and devices: iPhone, iPad, iTunes, TVs, websites and so on. Since their content is structured and has proper metadata on it, whoever integrates it will understand it and show it in the appropriate way. In iTunes maybe you’ll just show the title and the time, and that is it; you won’t show the description until you click on it. On an iPad maybe you have more space to show more content, and the website will be completely different.
You can even say: if the content management system is doing structured content and offering me an API, I integrate it into all my devices, but I can also offer my content to third parties, who integrate it into their own solutions, and that is the good part of it. You get an API for free when you work that way. So I guess content management systems should not be about presentation. They should be strictly about how we create content and how we manage it: the staging, the publication, the releases. And they should offer an API that is easy to use and sophisticated enough for other devices, multichannel, to query that content in a sophisticated way, make some very advanced searches into it, pull it in and show it on the screens.
That is very good. So you can think of it as a document database with services around it that make filling this database and querying it easy for the domain of content management. Where we store the documents is one problem, since we have to store them somewhere, but we should also facilitate creating content, modifying content, publishing content, making it available through these APIs, searching into it and so on. There is a concept called a content repository; so it is not only a content repository, it should also make all the life around content easy and manageable. Someone in the content strategy community said that most CMSs look as if the database got drunk and vomited the content onto the screen. That is not what you want. You need a nice interface where you can manage your content, not just the database as it is, thrown at the interface, telling the users: “Here is the content, do something with it.” You want to make this task easy; that is what you are trying to solve.
The presentation part you are not solving, because it’s more or less solved and there are a lot of frameworks doing that. You are focusing on the content problem, and that is why you also need to provide collaboration and all these things. In some sense you can imagine it like what GitHub added to Git. Git is an excellent solution for storing your code and doing versioning, but you also need some way to manage it, and that is a management tool. The CMS does that on top of a document-based database.
Imagine it as, I don’t know if it’s thin or thick, a kind of management tool that takes this content, stores it, makes sure it is stored in the right way and offers you these features around content. And there are a lot of these features that you should be offering if you are serious about content management.
I didn’t mention that we are working on this kind of system right now. Basically, as a user you get your own repo, you log in to that repo, and what you see is all your documents. You can search these documents, find the appropriate ones and create new ones, but first of all you can also create schemas. You can say: “For me, a product has a title, a description, a specification and an FAQ; these are the parts of my product, and every product should have them.” Now you have made this schema, the type of the document, and users can go ahead and create instances of these documents. They say: “I’m creating a new product” and a rich interface shows them the editor for that document.
Now the editor of that document needs to be interesting and appealing for them, and it also needs to help them structure their content and put the appropriate metadata on it. Maybe you can call it a structured editor; it’s not WYSIWYG. A WYSIWYG editor mixes presentation with content, and this is exactly what we are trying to avoid here, which is why it is an editor more focused on structure. So you type some content and save it, and we keep all versions of these documents for you, because this is also crucial in content management. You make some modifications, then you abandon them and go back, and maybe later you say: “That was interesting anyway” and you go back to them, and so on. We need to make that very easy, because you can’t collaborate if each time you touch the content you are scared that something will change. No, we don’t care.
Make your versions of the document, it’s alright, they’re not going anywhere, they’re just for you. Then at some point you can say: “I would like to suggest this for that particular publication” and someone will approve it or not, depending on your workflow, and then it goes online. In the meantime you can go on creating your documents, and that is also important: versioning and making collaboration easy. But the person reviewing these changes should not have to read the whole document; he should see only what changed between the versions. So you also need a kind of diff summary on these documents, so that someone who is maybe the Marketing Director or the Content Manager in that company can see visually what happened in that document, and that improves collaboration. Then he can say: “OK, he fixed some typos, I’m going to integrate that, that is alright” or “No, I’m not sure” and he will comment and say: “This is not right, you should …” or “Oh, he changed the image here, he changed the link, OK, that is alright” and so on. So that is the tooling around this document database, if you like, which allows you to have better collaboration and storage around content.
So we have, and I guess anyone who wants to solve the CMS problem should have, an API that helps you search these documents, find them and get only what is necessary for you, because documents can be big and you might say: “I only need the title” to save some bandwidth, and so on. So we offer a web API, a fully RESTful API, meaning that it is based not only on links and GET methods and so on, but also on forms. If you call “/API”, it gives you several forms that you submit to get to the next step. That means there is only one URL that you need to know; all the rest, all the other APIs, follow from that document. So “/API” gives you a document. You can imagine it as a website: this is the main page, and it guides you to different forms, for instance a form for searching collections, a form for searching documents and so on. From there on you use that form to get where you want; that is a RESTful API.
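A client of such a form-driven API might look roughly like the sketch below. The shape of the root document, the form names, the field names and the paths are all invented for illustration; the interview does not specify them, and no network call is made here.

```python
from typing import Any, Dict, Tuple
from urllib.parse import urlencode

# Hypothetical root document, standing in for what "/API" might return.
# Every key and path below is an assumption made for this sketch.
ROOT: Dict[str, Any] = {
    "forms": {
        "documents": {
            "method": "GET",
            "action": "/api/documents/search",
            "fields": {"q": {"default": ""}, "ref": {}},
        },
        "collections": {
            "method": "GET",
            "action": "/api/collections/search",
            "fields": {"name": {}, "ref": {}},
        },
    }
}


def build_request(root: Dict[str, Any], form_name: str, **values: str) -> Tuple[str, str]:
    """Fill in a form found in the root document and return (method, url).

    The client hard-codes only the entry point; the action URL and the
    accepted fields are discovered from the document itself.
    """
    form = root["forms"][form_name]
    unknown = set(values) - set(form["fields"])
    if unknown:
        raise ValueError(f"fields not offered by form {form_name!r}: {sorted(unknown)}")
    params = {}
    for name, spec in form["fields"].items():
        if name in values:
            params[name] = values[name]
        elif "default" in spec:
            params[name] = spec["default"]
    return form["method"], form["action"] + "?" + urlencode(params)


method, url = build_request(ROOT, "documents", q="title", ref="abc")
```

The design point is that if the server moves the search endpoint or adds a field, only the root document changes; a client written this way keeps working without a code change.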
So we offer this API, but we also offer development kits, no matter the community: for Ruby, Ruby on Rails or other frameworks, for PHP, for Java, for Scala, for Python; we offer them for all these technologies. They are wrappers around the web API, but they look the same: we use the same structure in the web API and in the kits, so you don’t have any strange mapping between them. You’ll find the same concepts, like forms, in your native API.
11. Here is another question: say I use your system and I create a new document instance in that system, and I have a website over here which uses your API, can your content system ping my web service to say: “Hey there is some new data here" or how is the interaction, is there some form of push from the content system?
It’s very easy to do push on top of what we did. What we did is that there is only one URL that changes when something is published. You have this main URL, “/API”, and “/API” is a document, a JSON document or an HTML document depending on what you ask for, and in that document there is a part that tells you: “This is the master” meaning this is what should be public in your integration, wherever it happens. That master ref gives you a way to get the documents for that particular release. When something changes, this is the only document that changes. Of course we create a lot of other documents, but all the old ones still exist as the old version, and that is our way of staying consistent: if you get this ref and then query everything using that ref, even if a new ref appears in the meantime, it doesn’t impact you, because you are still talking to the old version, all of which is consistent.
Imagine you are building a website and you have a web page. You get the ref at the beginning of the session and use it for all the content of this web page; you are guaranteed that all the content will be coherent, because it all comes from the same ref. Mainly what I’m saying here is that documents are immutable: whenever we change anything, it becomes a new copy of that document, which has a lot of nice properties, including that you will never get something incoherent inside the same ref. Now imagine someone pushes new content to the master ref. If you ping “/API”, you’ll see that the new ref is different from the old ref; when you see this, you know that there is new content, and you go ahead and pull that content in the same manner.
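The consistency argument can be illustrated with a small in-memory model. This is a sketch under assumptions, not the product's implementation: the `Repo` class, its method names and the `ref-N` naming are hypothetical.

```python
from typing import Dict, Optional


class Repo:
    """Toy content repository in which every publication creates a new
    snapshot and existing snapshots are never modified."""

    def __init__(self) -> None:
        self._snapshots: Dict[str, Dict[str, str]] = {}
        self._counter = 0
        self.master = self._new_snapshot({})

    def _new_snapshot(self, documents: Dict[str, str]) -> str:
        self._counter += 1
        ref = f"ref-{self._counter}"
        self._snapshots[ref] = dict(documents)  # private copy, never mutated later
        return ref

    def publish(self, doc_id: str, content: str) -> str:
        """Copy the current master, apply the change, and move master."""
        documents = dict(self._snapshots[self.master])
        documents[doc_id] = content
        self.master = self._new_snapshot(documents)
        return self.master

    def get(self, ref: str, doc_id: str) -> Optional[str]:
        """Read a document as it was in the release identified by `ref`."""
        return self._snapshots[ref].get(doc_id)


repo = Repo()
repo.publish("home", "Welcome, version 1")
session_ref = repo.master          # taken once, at the start of a page render

repo.publish("home", "Welcome, version 2")   # someone publishes mid-session

seen_by_session = repo.get(session_ref, "home")
seen_by_new_visitor = repo.get(repo.master, "home")
has_new_content = session_ref != repo.master
```

A page rendered entirely with `session_ref` still sees version 1 everywhere, while comparing `session_ref` with the current master is enough to learn that something new was published.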
And this is very easy to turn into a push module, because we know exactly when something has changed, thanks to these refs: you compare the ref you have to the ref you get now; if it is the same, nothing has changed, and if it is not the same, something has changed. Normally pinging is alright for us, but if you don’t want to keep pinging and just want a push module, it’s very easy to implement. All these APIs except “/API” have cache-forever semantics, because they never change; we create new ones, but we don’t change the existing ones. That makes it very interesting for caching, and all proxies will cache them. We get two seemingly contradictory properties: we have cache-forever on old documents, but at the same time whenever you publish, you get it on your website straight away. There is no delay, no need to wait for cache invalidation for instance, and that is what is interesting about this structure.
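Turning the ref comparison into push notifications could look something like this sketch. The `RefWatcher` class and its methods are hypothetical; `fetch_master_ref` stands in for whatever call reads the master ref from the entry document, and here it is faked with a fixed sequence so the example runs without a network.

```python
from typing import Callable, List


class RefWatcher:
    """Polls for the master ref and notifies listeners when it changes."""

    def __init__(self, fetch_master_ref: Callable[[], str]) -> None:
        self._fetch = fetch_master_ref
        self._last = fetch_master_ref()   # ref known at start-up
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        self._listeners.append(listener)

    def check(self) -> bool:
        """One polling step: return True and notify if the ref changed."""
        current = self._fetch()
        if current == self._last:
            return False
        self._last = current
        for listener in self._listeners:
            listener(current)
        return True


# Fake entry point: the master ref stays "r1" for a while, then becomes "r2".
fake_refs = iter(["r1", "r1", "r2"])
watcher = RefWatcher(lambda: next(fake_refs))

notifications: List[str] = []
watcher.subscribe(notifications.append)

first_check = watcher.check()    # still "r1": nothing to push
second_check = watcher.check()   # now "r2": listeners are called
```

In a real deployment `check` would run on a timer or be triggered by the publishing side; either way the listeners only ever need the new ref, since everything behind a ref is immutable.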
Yes, absolutely. There are a lot of ideas like the ones Rich [Hickey] has been talking about, and there are things that also come from Git; of course it makes sense, and that is what we are trying to integrate into our…. Another thing we are trying to do: when you are editing content in your repo, you are in the back end modifying your content, and that has a life cycle completely different from the life cycle of the integration API. In the back end you are changing content live, and the two environments have different characteristics and different constraints. What we actually do is, for each publication, each release, take a snapshot of the world of the repo and deploy that. That means your changes will not slow down the API; they have nothing to do with it. The index is just a snapshot of the repo that is extracted and then deployed, and then you talk to that one, and any time you publish new content, another snapshot is taken.
And that solves a lot of problems, of course, because progressive indexing is very hard and takes a lot of time, and if the index gets corrupted you have to rebuild it from scratch, which takes an extremely long time. That is what we are not doing. What we want to do instead is take snapshots and extract them, and this way we separate these two worlds. Also, imagine that for some reason a machine crashes and we have to swap machines. It’s very easy: just download these snapshots, throw them at the new machine, and they are ready to be deployed and queried. So that gives us a lot of flexibility, scalability and performance.
No, we have not settled on a name yet. There are several choices and we are thinking about some names, but we are not talking about the name, because if I tell you one, maybe when we publish it the name will not be the one we chose. We have thought of different names; I can’t tell which one we will be using. [Editor's note: the product is available now, the name is prismic.io, the website is http://prismic.io ]
Werner: We'll add it to the transcript.
Maybe we can add the link or something like that, but the most important thing about our product, for me, is that you go and read about structured content and the separation between content and form. The product is an example of doing that; we think we are doing it the right way, and that is why we are building the product. At the time of this interview there are quite a few talks about content strategy, about why you should separate these and why it is very important for you and your company to separate content from form, so you should be reading about that. Maybe by then we will also have some content talking about it. So basically, read more about these subjects, and if we happen to have the URL, maybe it will be communicated; it will just be one example of this kind of approach.
It’s a very hard question, because there are so many interesting Monads and I don’t know what “favourite” means here: is it the one that represents me, or is it that I like this Monad because it seems cool? All Monads are nice. Some will tell you the Continuation Monad, because it’s the Monad of all Monads; you can build any Monad with a Continuation Monad. There is the Free Monad, which I really like because it has some very nice semantics. It’s very hard. Let’s say I like the Identity Monad. The Identity Monad is very boring, it doesn’t do anything, but it carries the concept; it’s the simplest Monad ever, but it doesn’t do a lot for you.
16. What does it do?
Actually it does nothing. It basically illustrates the Monad structure in a very easy way, but you can’t do much with it. I like it because it’s a very good example of the concept of the Monad: its structure and the plumbing inside. The Option Monad also gives you these kinds of properties.
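For readers who have not seen these two Monads, here is a minimal sketch in Python. The class names and the `bind` method name are choices made for this illustration (Scala calls the operation `flatMap`), and `safe_div` is a made-up helper.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Identity:
    """The simplest Monad: bind just applies the function to the value."""
    value: Any

    def bind(self, f: Callable[[Any], "Identity"]) -> "Identity":
        return f(self.value)


@dataclass(frozen=True)
class Some:
    """Option with a value: bind passes the value on."""
    value: Any

    def bind(self, f: Callable[[Any], Any]) -> Any:
        return f(self.value)


@dataclass(frozen=True)
class Nothing:
    """Option without a value: bind skips the function entirely."""

    def bind(self, f: Callable[[Any], Any]) -> "Nothing":
        return self


def safe_div(a: float, b: float):
    """Division that returns Nothing instead of raising on a zero divisor."""
    return Nothing() if b == 0 else Some(a / b)


# Identity: same plumbing as any Monad, but nothing extra happens.
identity_result = Identity(3).bind(lambda x: Identity(x + 1)).bind(lambda x: Identity(x * 2))

# Option: the same chaining shape, but a failure short-circuits the rest.
ok = Some(10).bind(lambda x: safe_div(x, 2))
failed = Some(10).bind(lambda x: safe_div(x, 0)).bind(lambda x: safe_div(x, 5))
```

Both chains have the same structure; the only difference is what `bind` does between the steps, which is the "plumbing" referred to above.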
Werner: Except that one does something.
That one does something. Let’s say the IO Monad. The IO Monad is my favourite Monad, because with IO you can do anything, and it is a Monad that scripts your actions; it’s like writing a script of what you want to do with IO.
Werner: Without the IO Monad we'd just be heating up the CPU...
You can observe them only once you have run that program, eventually, at the end. But while you are writing it, you can imagine that you are writing a script: you are composing your program, you are writing code that writes programs. That is the way you can imagine it.
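The "code that writes programs" view can be sketched like this. It is a toy model in Python, not Haskell's `IO` type; the `IO` class, `pure`, `bind`, `unsafe_run` and `write` are names assumed for the example, and a list stands in for the outside world.

```python
from typing import Any, Callable, List


class IO:
    """A description of an action. Building one performs nothing;
    effects happen only when unsafe_run is called."""

    def __init__(self, run: Callable[[], Any]) -> None:
        self._run = run

    @staticmethod
    def pure(value: Any) -> "IO":
        """An action that has no effect and just yields `value`."""
        return IO(lambda: value)

    def bind(self, f: Callable[[Any], "IO"]) -> "IO":
        """Sequence two actions: run this one, feed its result to f,
        then run the action f returns. Still only a description."""
        return IO(lambda: f(self._run())._run())

    def unsafe_run(self) -> Any:
        return self._run()


log: List[str] = []   # stands in for the outside world


def write(line: str) -> IO:
    return IO(lambda: log.append(line))


# Composing the script: nothing is written yet.
program = write("first").bind(lambda _: write("second")).bind(lambda _: IO.pure(42))
log_before_run = list(log)

# Running the script: now the effects can be observed.
result = program.unsafe_run()
log_after_run = list(log)
```

`program` is an ordinary value that can be stored, passed around and combined with other scripts before anything is executed.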
Werner: Thank you very much Sadek!
Thank you for having me!